[fpc-devel] Patch to speed up Uppercase/Lowercase functions
Daniël Mantione
daniel at deadlock.et.tudelft.nl
Sat Jun 11 14:25:56 CEST 2005
On Sat, 11 Jun 2005, Daniël Mantione wrote:
> > So, judge for yourself. I think this is worth the 256 byte lookup table.
>
> ?
>
> 0.948/0.999 = 95 %
>
> So, we get a 5% speed improvement from using a table; this is much worse than I
> thought and can easily be undone in the real world by the increased cache
> thrashing. Of course any speed improvement is welcome, but IMHO this is not
> worth the size increase.
>
> Remember, this is just 1 procedure, and 256 bytes extra is nothing compared to
> the whole unit.
>
> But if we start doing this kind of optimization across the entire unit,
> we'll get a horribly bloated unit.
>
> Also, if speed is really important, nothing can beat a hand-optimized
> assembler routine that does the operation without jump and by means of
> 32-bit registers does 4 chars at once. We have hand optimized string
> routines in the rtl, I don't see why it cannot be done in sysutils.
For what it's worth, I found this on my hard drive. It uses
shortstring and assumes oldfpccall, and, looking at the code, it doesn't
look like it's possible to process 4 bytes at once this way.
Daniël
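function uppercase(const s:string):string;assembler;
asm
mov esi,s
mov edi,@result
lodsb            { al := length byte of the shortstring }
stosb            { copy the length byte to the result }
movzx ecx,al     { ecx := number of characters to convert }
jecxz @a2        { empty string: nothing to do }
@a1:
lodsb            { al := next character }
cmp al,'a'
sbb bl,bl        { bl := $FF if al < 'a', else 0 }
cmp al,'z'+1
sbb bh,bh        { bh := $FF if al <= 'z', else 0 }
not bh           { bh := $FF if al > 'z', else 0 }
or bl,bh         { bl := $FF if al is outside 'a'..'z' }
not bl           { bl := $FF if al is inside 'a'..'z' }
and bl,32        { bl := 32 for a lowercase letter, else 0 }
sub al,bl        { subtract 32 from lowercase letters only }
stosb            { store the converted character }
dec cl
jnz @a1
@a2:
end;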
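function lowercase(const s:string):string;assembler;
asm
mov esi,s
mov edi,@result
lodsb            { al := length byte of the shortstring }
stosb            { copy the length byte to the result }
movzx ecx,al     { ecx := number of characters to convert }
jecxz @a2        { empty string: nothing to do }
@a1:
lodsb            { al := next character }
cmp al,'A'
sbb bl,bl        { bl := $FF if al < 'A', else 0 }
cmp al,'Z'+1
sbb bh,bh        { bh := $FF if al <= 'Z', else 0 }
not bh           { bh := $FF if al > 'Z', else 0 }
or bl,bh         { bl := $FF if al is outside 'A'..'Z' }
not bl           { bl := $FF if al is inside 'A'..'Z' }
and bl,32        { bl := 32 for an uppercase letter, else 0 }
add al,bl        { add 32 to uppercase letters only }
stosb            { store the converted character }
dec cl
jnz @a1
@a2:
end;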
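
The quoted message mentions converting 4 chars at once in a 32-bit register without jumps. The following is only a rough sketch of how that idea could look in plain Pascal rather than assembler; it is not code from this thread and has not been benchmarked. The names UpCase4 and UpperAscii are made up for illustration, the conversion is ASCII-only ('a'..'z'), and it assumes {$mode objfpc}{$H+} so that "string" is an AnsiString.

```pascal
program swarupper;
{$mode objfpc}{$H+}

{ Hypothetical helper: uppercase four ASCII bytes packed in one
  32-bit word, using masks instead of branches. }
function UpCase4(x: Cardinal): Cardinal;
var
  low7, geA, gtZ, isLower: Cardinal;
begin
  low7 := x and $7F7F7F7F;   { drop bit 7 so the additions cannot carry into the next byte }
  geA  := low7 + $1F1F1F1F;  { bit 7 of each byte is set when its low 7 bits are >= 'a' ($61) }
  gtZ  := low7 + $05050505;  { bit 7 of each byte is set when its low 7 bits are > 'z' ($7A) }
  { keep bit 7 only for bytes in 'a'..'z' whose original bit 7 was clear }
  isLower := (geA xor gtZ) and (not x) and $80808080;
  Result := x xor (isLower shr 2);  { $80 shr 2 = $20, the case bit }
end;

{ Hypothetical wrapper: 4 bytes per step, then a byte-wise tail. }
function UpperAscii(const s: string): string;
var
  i, n: Integer;
  w: Cardinal;
begin
  n := Length(s);
  SetLength(Result, n);
  i := 1;
  while i + 3 <= n do
  begin
    Move(s[i], w, 4);        { Move avoids unaligned 32-bit loads }
    w := UpCase4(w);
    Move(w, Result[i], 4);
    Inc(i, 4);
  end;
  while i <= n do            { remaining 0..3 characters }
  begin
    Result[i] := UpCase(s[i]);
    Inc(i);
  end;
end;

begin
  WriteLn(UpperAscii('Hello, fpc-devel {z}!'));  { prints HELLO, FPC-DEVEL {Z}! }
end.
```

Because every byte is handled independently inside the word, byte order does not matter here. Whether this would actually beat the byte-wise loop or the table lookup would have to be measured.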