logo To Foot
© J R Stockton, ≥ 2005-11-10

Borland Pascal Optimisation 2.

No-Frame * Framed Index * Frame This
Links within this site :-

Help Parameter?

A DOS program should show Help on being given a parameter of /? or similar;

for J := 1 to ParamCount do
  if Pos(ParamStr[J]+#32, '/? /h /H -h -H') then begin
    Writeln(HelpText) ; HALT(1) end ;

appears to be an effective way of testing.

if Pos('/?', PString(Ptr(PrefixSeg,$80))^)>0 then HelpProc ;

(idea stolen from Franz Glaser 1999-12-05)

Sorting

Many sort algorithms of various complexities and efficiencies exist. If you follow the following two rules, they will run faster, and a simpler algorithm may be quick enough :-
* Never move the data (unless small); swap indexes or pointers to it instead.
* Compute sort-keys in a preliminary pass (generally).

See also in JavaScript Sorting and Order.

Searching

See at NIST.

String Handling

In (and up to ?) BP7, Sys.Delete uses a slow, "Copy + Copy" algorithm (AIUI, Delphi is better). I, and others, have faster methods.

In (and up to ?) BP7, Sys.Insert uses a slow, "Copy + Source + Copy" algorithm (AIUI, Delphi is better). I have developed a faster method.

I think I can combine strings better, too.

Mod & Div

Michael Salem suggested the following, with a version for longints (details) :-

var Quotient, Dividend, Divisor, Remainder : word ;
 ...
Quotient := Dividend div Divisor ; asm mov word ptr Remainder, DX end ;

That also works for Delphi; and there is now a Delphi longint version.

Procedure? It would need to be a pseudo-procedure generating in-line code (like Ord, Addr, etc.), to avoid call overhead. For Pascal, one could I suppose write an inline directive; for later Pascal, an asm..end to do the division and keep both results ; for 32-bit Delphi, another one. One could use {$IFDEF VER##) to prevent it compiling under unknown (higher) Delphis. Delphi 5 seemingly has procedure DivMod in SysUtils.

Function Result Pointer

Sometimes it is helpful to refer to the incipient result of a function in a way not provided in TP/BP; particularly for string functions. The following method comes from news:c.l.p.b; P^ can be used more freely than the function name :-

var P : ^string ;
begin asm
    les di, @Result ; mov word ptr P,di ; mov word ptr P+2,es end {FP/OR} ;
  P^ := S1 ;
   ...

Windows 95/98 Time-Slicing

In news:comp.lang.pascal.borland, Andreas Killer, in answer to : "I'm experiencing a problem with my protected mode application under Windows 95/98; the system becomes very slow", wrote (1998-10-17) : You have to release the current virtual machine time-slice while you are waiting for a keystroke, i.e. like this :-

repeat asm  mov ax, 1680h ; int 2Fh  end until KeyPressed ;

Untested by JRS.

Multiplication

function Imul(x, y : integer) : longint ;
  inline(
    $5B/      {POP  BX}
    $58/      {POP  AX}
    $F7/$EB   {IMUL BX}
    ) ;

by Osmo; untested by JRS.

Squaring

function Isqr(x : integer) : longint ;
  inline(
    $58/      {POP  AX}
    $F7/$E8   {IMUL AX}
    ) ;

by Osmo; untested by JRS.

Raising to a Power

For XY, where Y is not small, it is inefficient to use

P := 1 ;
for j := 1 to Y do P := P*X end ;

If Y is fixed, note these examples :-

X9 := Sqr(Sqr(Sqr(X)))*X ;
X13 := Sqr(Sqr(Sqr(X)*X)))*X ;

If Y is variable, see ExpNum in Pascal Maths.

Lookup

It's often worth watching out for the possibility of using a lookup table of some sort or other. Even in cases where full run-time checks may make it slower, ISTM that it improves the chances of getting the answer (in non-trivial cases) consistently right - or maybe consistently wrong, which is often easier to debug than inconsistently!

For the integer square root of an integer :-

var J, JP : byte ; K, J2 : word ; A : array [0..maxint] of byte ;
  begin K := 0 ; JP := 0 ;
  for J := 0 to 181 {Trunc(Sqrt(MaxInt))} do begin J2 := J*J ;
    while K<J2 do begin A[K] := JP ; Inc(K) end ;
    JP := J end ;
  for K := K to MaxInt do A[K] := JP ;
  { check: }
  for K := 0 to 100 do Write(A[K]:4) ;
  Writeln ;
  for K := 180*180 to Maxint do Write(A[K]:4) ;
  end.

Cacheing

Never evaluate unnecessarily; consider, where Sqrt is assumed to be slow :-

procedure Srt(const J : integer) : integer (untested} ;
const XJ : integer = 0 ; XR : integer = 0 ;
begin if J=XJ then Srt := XR else begin
  XJ := J ; XR := Sqrt(J) ; Srt := XR end end;

A better example, cacheing an array of answers, is in jad_9514.pas.

Another, untested example :-

function B1(const n, m : word) : word ;
begin
  if (m = 0) or (n = 0) or (m = n) then B1 := 1
    else B1 := B1(n-1, m) + B1(n-1, m-1) ;
  end ;

will be slow for modest values of m, n because of recursion.

With FillChar(C, SizeOf(C), 0) ; first, the following will be much faster :-

function B2(const n, m : word) : word ;
var T : word ;
begin
  T := C[n, m] ; if T>0 then begin B2 := T ; EXIT end ;
  if (m = 0) or (n = 0) or (m = n) then T := 1
    else T := B2(n-1, m) + B2(n-1, m-1) ;
  C[n, m] := T ; B2 := T end ;

The technique would be particularly useful if it were impractical to predict which intermediate values would be needed.

Comparing Memory

I wrote - Has anyone got fast code (asm..end?) for testing the first N bytes of records P[J], P[J-1] for non-equality (repeat ... until <>)? Any optimisation should assume that inequality is rare (but vital!).

Pedt Scragg replied :-

How about :-

function CompareMemory(var block1,block2;size:word):word;assembler;
asm
      mov dx,ds
      mov cx,[size]
      xor ax,ax
      jcxz @@end
      lds si,[block1]
      les di,[block2]
      cld
      repe cmpsb
      je @@end
      mov ax,[size]
      sub ax,cx
@@end:mov ds,dx
end;

Function returns 0 if equal to size bytes or returns actual byte location of inequality if one is found.

And in his later reply :-

Interestingly, I hadn't thought about optimising the code as it was fast enough for anything I used it for. I switched to word based comparisons to see if there was difference. I had got the [obviously erroneous] impression that the extra code needed for odd bytes would negate the speed increase of using cmpsw. The version below is around twice as fast on my computer for reasonable sized blocks with a 14 byte extra code overhead. Unless the code is called many times, the extra speed is probably not necessary.

function CompareMemory2(var block1,block2;size:word):word;assembler;
asm
      mov dx,ds
      mov cx,[size]
      xor ax,ax
      jcxz @@end
      lds si,[block1]
      les di,[block2]
      xor bx,bx
      shr cx,1
      jnc @@run
      inc bx
      cmpsb
      je @@end
@@run:cld
      repe cmpsw
      je @@end
      mov ax,[size]
      shl cx,1
      sub ax,cx
      add ax,bx
@@end:mov ds,dx
end;

HCF, LCM

{ Highest Common Factor (a.k.a. GCD), Lowest Common Multiple }

function HCF1(const x, y : longint) : longint ;
begin if x<y then HCF1 := HCF1(y, x) else if y=0 then HCF1 := x
    else HCF1 := HCF1(x mod y, y) end {HCF1} ;

could possibly be improved (if x=y ... ; y, x mod y ...) but is fast anyway.

function HCF2(U, V : integer) : integer ;
begin repeat
  U := U mod V ; if U=0 then begin HCF2 := V ; EXIT end ;
  V := V mod U ; if V=0 then begin HCF2 := U ; EXIT end ;
  until false end {HCF2} ;

being non-recursive may well be better. In fact, on test, it is the fastest I have : see HCFactor.pas.

function LCM(const x, y : longint) : longint ;
begin LCM := (x/HCF(x, y))*y end {LCM} ; (untested)

Odd

Arno Fehm wrote, in c.l.p.b 1999-03-11, that, instead of k := Odd(xInteger), one can use k := boolean( xInteger AND 1 ), which is faster. It seems that it is shorter, and uses no jumps.

I added : For those who don't rewrite compilers,

function Nodd(Z : integer) : boolean ;
inline($58/$25/$01/$00) ;

generates jump-free code shorter than Odd does; it has a Push and a Pop instead of a Mov (and a good optimiser would combine them).

FillWord

Routine posted 1999-05-09 in c.l.p.b by Pedt Scragg :-

procedure FillWord(var Dest ; Count, What : word) ; assembler ;
asm  LES DI,Dest ; MOV CX,Count ; MOV AX,What ; CLD ; REP STOSW  end ;

This should be useful in particular for filling the screen buffer, where What = (TextAttr shl 8) + Ord(Character). It can be tested with, for example,

FillWord(Ptr(SegB800, 0)^, 2000, ($2400+Ord('*'))) ;

It seems that

procedure FillLongInt(var Dest ; Count : word ; What : longint) ;
assembler ;
asm  LES DI,Dest ; MOV CX,Count ; db $66 ; MOV AX,word ptr What ;
  CLD ; db $66 ; REP STOSW  end ;

also works, though I vaguely recall some warning about the combination of $66 with other prefixes on some CPUs.

UNTESTED : Ignoring those, note that, although the standard FillChar is often attractive, because it works with bytes it may be slower for zeroing than a loop of longints - or of double, or extended. The efficiency of FillChar may depend on version, especially P/D)

  FillChar(ByteArray, 1000, 0) ;

  type LongArray = array [0..15000] of longint ;
  for J := 0 to 249 do LongArray(ByteArray)[J] := 0 ;

  type DblArray = array [0..7500] of double ;
  for J := 0 to 124 do DblArray(ByteArray)[J] := 0.0 ;

  type ExtArray = array [0..6000] of extended ;
  for J := 0 to 99 do ExtArray(ByteArray)[J] := 0.0 ;

And  FillChar(*, *, 255) ;  fills shortint, integer, and longint arrays with -1.

Other parts of Opts : Index * Part 1
Home Page
Mail: no HTML
© Dr J R Stockton, near London, UK.
All Rights Reserved.
These pages are tested mainly with Firefox 3.0 and W3's Tidy.
This site, http://www.merlyn.demon.co.uk/, is maintained by me.
Head.