[fpc-devel] Patch to speed up Uppercase/Lowercase functions

Daniël Mantione daniel at deadlock.et.tudelft.nl
Sat Jun 11 14:25:56 CEST 2005



On Sat, 11 Jun 2005, Daniël Mantione wrote:

> > So, judge for yourself. I think this is worth the 256 byte lookup table.
>
> ?
>
> 0.948/0.999 = 95 %
>
> So, we get a 5% speed improvement from using a table; this is much worse
> than I thought and can easily be undone in the real world by the increased
> cache thrashing. Of course any speed improvement is welcome, but IMHO this
> is not worth the size increase.
>
> Remember, this is just 1 procedure, and 256 bytes extra is nothing compared
> to the whole unit.
>
> But if we start doing this kind of optimization across the entire unit,
> we'll get a horribly bloated unit.
>
> Also, if speed is really important, nothing can beat a hand-optimized
> assembler routine that does the operation without jump and by means of
> 32-bit registers does 4 chars at once. We have hand optimized string
> routines in the rtl, I don't see why it cannot be done in sysutils.

For what it's worth, I found this on my hard drive. It uses shortstring
and assumes oldfpccall, and looking at the code, it doesn't look like
it's possible to process 4 bytes at once this way.

Daniël



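function uppercase(const s:string):string;assembler;

asm
    mov esi,s           { esi -> source shortstring }
    mov edi,@result     { edi -> result shortstring }
    lodsb               { al = length byte }
    stosb               { copy the length to the result }
    movzx ecx,al        { ecx = number of characters }
    jecxz @a2           { empty string: nothing to convert }
@a1:
    lodsb               { al = next character }
    cmp al,'a'
    sbb bl,bl           { bl = $FF if al < 'a', else 0 }
    cmp al,'z'+1
    sbb bh,bh           { bh = $FF if al <= 'z', else 0 }
    not bh              { bh = $FF if al > 'z' }
    or bl,bh            { bl = $FF if al is outside 'a'..'z' }
    not bl              { bl = $FF if al is inside 'a'..'z' }
    and bl,32           { bl = 32 for a lowercase letter, else 0 }
    sub al,bl           { subtracting 32 maps 'a'..'z' to 'A'..'Z' }
    stosb               { store the converted character }
    dec cl
    jnz @a1             { loop until all characters are done }
@a2:
end;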
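
For readers who don't want to trace the cmp/sbb pairs by hand, here is a
rough plain-Pascal sketch of the mask arithmetic the loop above computes for
each character. It is only an illustration: the names UpperByteMask and
MaskDemo are made up for this example, Free Pascal in objfpc mode is
assumed, and there is no guarantee the compiler turns the comparisons into
jump-free code the way the assembler does.

```pascal
program MaskDemo;
{$mode objfpc}

{ Hypothetical helper: same arithmetic as the cmp/sbb sequence, one byte. }
function UpperByteMask(c: Byte): Byte;
var
  belowA, aboveZ, mask: Byte;
begin
  belowA := Ord(c < Ord('a')) * $FF;        { like sbb bl,bl after cmp al,'a' }
  aboveZ := Ord(c > Ord('z')) * $FF;        { like sbb bh,bh / not bh }
  mask := (not (belowA or aboveZ)) and 32;  { 32 inside 'a'..'z', else 0 }
  Result := c - mask;                       { like sub al,bl }
end;

begin
  WriteLn(Chr(UpperByteMask(Ord('q'))));
  WriteLn(Chr(UpperByteMask(Ord('{'))));
end.
```

The lowercase routine is the same idea with the range 'A'..'Z' and an add
instead of a sub.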

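function lowercase(const s:string):string;assembler;

asm
    mov esi,s           { esi -> source shortstring }
    mov edi,@result     { edi -> result shortstring }
    lodsb               { al = length byte }
    stosb               { copy the length to the result }
    movzx ecx,al        { ecx = number of characters }
    jecxz @a2           { empty string: nothing to convert }
@a1:
    lodsb               { al = next character }
    cmp al,'A'
    sbb bl,bl           { bl = $FF if al < 'A', else 0 }
    cmp al,'Z'+1
    sbb bh,bh           { bh = $FF if al <= 'Z', else 0 }
    not bh              { bh = $FF if al > 'Z' }
    or bl,bh            { bl = $FF if al is outside 'A'..'Z' }
    not bl              { bl = $FF if al is inside 'A'..'Z' }
    and bl,32           { bl = 32 for an uppercase letter, else 0 }
    add al,bl           { adding 32 maps 'A'..'Z' to 'a'..'z' }
    stosb               { store the converted character }
    dec cl
    jnz @a1             { loop until all characters are done }
@a2:
end;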
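
On the "4 chars at once" idea from the quoted mail: the cmp/sbb trick above
works on one byte at a time, but a different technique (plain integer
arithmetic on a 32-bit word, often called SWAR) can handle four characters
per step. What follows is a sketch of that other technique, not a rewrite of
the routines above. UpperCase4, UpperCaseSwar and SwarUpper are hypothetical
names, Free Pascal in objfpc mode is assumed, and I have not benchmarked it.

```pascal
program SwarUpper;
{$mode objfpc}

{ Hypothetical helper: uppercase four ASCII bytes packed in one Cardinal. }
function UpperCase4(w: Cardinal): Cardinal;
var
  heptets, geA, gtZ, isLower: Cardinal;
begin
  heptets := w and $7F7F7F7F;       { low 7 bits of each byte }
  geA := heptets + $1F1F1F1F;       { bit 7 set where the byte is >= 'a' }
  gtZ := heptets + $05050505;       { bit 7 set where the byte is >  'z' }
  { keep bit 7 only for bytes in 'a'..'z' whose own bit 7 was clear }
  isLower := geA and (not gtZ) and (not w) and $80808080;
  Result := w - (isLower shr 2);    { $80 shr 2 = $20 = 32 per matching byte }
end;

{ Hypothetical wrapper: four bytes per step, then a plain loop for the tail. }
function UpperCaseSwar(const s: shortstring): shortstring;
var
  i, n: Integer;
  w: Cardinal;
begin
  n := Length(s);
  SetLength(Result, n);
  i := 1;
  while i + 3 <= n do
  begin
    Move(s[i], w, 4);               { unaligned-safe load of four characters }
    w := UpperCase4(w);
    Move(w, Result[i], 4);
    Inc(i, 4);
  end;
  while i <= n do                   { up to three leftover characters }
  begin
    if (s[i] >= 'a') and (s[i] <= 'z') then
      Result[i] := Chr(Ord(s[i]) - 32)
    else
      Result[i] := s[i];
    Inc(i);
  end;
end;

begin
  WriteLn(UpperCaseSwar('Hello, fpc-devel {z}`a!'));
end.
```

Because every step stays inside its own byte (the largest sum is $7F + $1F,
so nothing carries into the neighbouring byte), the result does not depend
on byte order. Lowercase would use the offsets for 'A'..'Z' and add the mask
instead of subtracting it.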





