[fpc-pascal] Re: UnicodeString comparison performance
tom_at_work at gmx.at
Tue Jul 24 12:20:00 CEST 2012
On Mon, 2012-07-23 at 16:58 +0200, OBones wrote:
> Jonas Maebe wrote:
> > On 23 Jul 2012, at 10:58, OBones wrote:
> >> leledumbo wrote:
> >>> I look at the generated code and in the direct one there's additional
> >>> overhead of decrementing the reference counter on each iteration.
> >> I see it too now (I forgot about the -a option).
> >> I can understand why there is a call to the decrementer outside the loop when using the variable, but I don't understand why it pushes it completely out of the loop. I mean, isn't there a reference count problem here?
> > Reference counted data types are returned by reference in a location passed to the function by the caller. The compiler here passes the address of S to the function, so that when assigning something to the function result inside the function, S' reference count gets decreased.
> > The fact that when the result is returned in a temp, this temp also gets finalized on the caller side before the call is just a code generator inefficiency. I've changed that in trunk.
> Thanks for the explanation and the fix.
> Because the finalization happened too early, those memory allocations
> and deallocations were very costly and I found the direct code to be 30
> times slower.
> Doing it this way has the advantage of being inherently thread safe, but
> considering the performance penalty, I have moved to doing it this way:
> Buffer: array [0..1024*1024-1] of WideChar;
> BufferLock: NativeInt;
> function GetSomeString(Index: Integer): UnicodeString;
> while BufferLock > 0 do
> CallToAnAPIThatWritesBackAWideString(@Buffer, Length(Buffer) - 1);
> Result := PWideChar(@Buffer);
> This works, with an equivalent penalty on both methods and is threadsafe
> (I believe).
This code is not thread safe at all. A thread switch after the while
loop and before the increment will not prevent progress on other
threads, so multiple threads can enter the "critical section".
Use EnterCriticalSection/LeaveCriticalSection from the RTL if you need
If you could somewhat decrease the size of your buffer to something
reasonable - do you really expect to translate a string with 1M chars? -
use the stack. Actually even a 2M data object on the stack might be
Further, in any case you'll need some code that works in "all" cases
anyway - what if your string does exceed 1M chars after all? I mean, on
a 1M string, heap allocation will probably be the least of your
This will completely avoid use of any synchronization primitive.
function GetSomeString(index : integer) : UnicodeString;
bufsize = 1024; // as you wish
buffer : array[0..bufsize-1] of WideChar;
pbuffer : PWideChar;
islongstring : boolean;
lengthofstring : integer;
lengthofstring := length(stringfromindex(integer));
islongstring := lengthofstring > bufsize;
if (islongstring) then
pbuffer := getmem(lengthofstring * sizeof(WideChar));
pbuffer := @buffer;
if (islongstring) then
result := pbuffer;
You may want to improve the code further by e.g. checking whether the
memory allocation has been successful or not; or factor out the case
when the string is big into a separate method to improve readability.
More information about the fpc-pascal