[fpc-pascal] Re: UnicodeString comparison performance

OBones obones at free.fr
Mon Jul 23 16:58:00 CEST 2012


Jonas Maebe wrote:
> On 23 Jul 2012, at 10:58, OBones wrote:
>
>> leledumbo wrote:
>>> I look at the generated code and in the direct one there's additional
>>> overhead of decrementing the reference counter on each iteration.
>> I see it too now (I forgot about the -a option).
>> I can understand why there is a call to the decrementer outside the loop when using the variable, but I don't understand why it pushes it completely out of the loop. I mean, isn't there a reference count problem here?
> Reference counted data types are returned by reference in a location passed to the function by the caller. The compiler here passes the address of S to the function, so that when assigning something to the function result inside the function, S' reference count gets decreased.
>
> The fact that when the result is returned in a temp, this temp also gets finalized on the caller side before the call is just a code generator inefficiency. I've changed that in trunk.
Thanks for the explanation and the fix.
In my example the difference is very small, but in my real code, it was 
amplified by the fact that I was doing this:

   SetLength(Result, 1024 * 1024);
   CallToAnAPIThatWritesBackAWideString(@Result[1], Length(Result) - 1);
   SetLength(Result, StrLen(PWideChar(Result)));

Because the finalization happened too early, those memory allocations 
and deallocations were very costly and I found the direct code to be 30 
times slower.
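
For anyone who wants to try the grow/fill/trim pattern above in isolation, here is a minimal sketch. FakeApiWrite is a hypothetical stand-in for the real API call (which is not shown in this message), and WideLen is a small helper written for this example so that it depends only on the System unit. It illustrates the pattern only and does not reproduce the timing difference.

```pascal
program GrowFillTrim;
{$mode objfpc}{$H+}

const
  Capacity = 1024 * 1024; // same buffer size as in the snippet above

// Hypothetical stand-in for the real API: writes a null-terminated wide
// string into Dest, using at most MaxChars characters plus the terminator.
procedure FakeApiWrite(Dest: PWideChar; MaxChars: SizeInt);
const
  Sample: UnicodeString = 'hello';
var
  N: SizeInt;
begin
  N := Length(Sample);
  if N > MaxChars then
    N := MaxChars;
  if N > 0 then
    Move(PWideChar(Sample)^, Dest^, N * SizeOf(WideChar));
  Dest[N] := #0;
end;

// Counts characters up to (not including) the first #0.
function WideLen(P: PWideChar): SizeInt;
begin
  Result := 0;
  while P[Result] <> #0 do
    Inc(Result);
end;

function GetSomeString: UnicodeString;
begin
  SetLength(Result, Capacity);                    // allocate the full buffer
  FakeApiWrite(@Result[1], Length(Result) - 1);   // leave room for the #0
  SetLength(Result, WideLen(PWideChar(Result)));  // shrink to the real length
end;

var
  S: UnicodeString;
begin
  S := GetSomeString;
  WriteLn(Length(S)); // prints 5 for the sample text 'hello'
end.
```
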
Filling Result directly has the advantage of being inherently thread-safe, but 
considering the performance penalty, I have moved to a shared buffer guarded by a lock:

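var
   Buffer: array [0..1024*1024-1] of WideChar;
   BufferLock: LongInt; // 0 = free, 1 = held

function GetSomeString(Index: Integer): UnicodeString;
begin
   // Acquire atomically: only the thread that flips 0 -> 1 proceeds.
   // A separate "check, then increment" would let two threads in at once.
   while InterlockedCompareExchange(BufferLock, 1, 0) <> 0 do
     Sleep(1);

   try
     CallToAnAPIThatWritesBackAWideString(@Buffer[0], Length(Buffer) - 1);
     Result := PWideChar(@Buffer[0]);
   finally
     InterlockedExchange(BufferLock, 0); // release the lock
   end;
end;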

This works, with an equivalent penalty for both methods, and is thread-safe 
(I believe).
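
Another way to serialize access to the shared buffer is a critical section from the RTL, which blocks waiting threads instead of polling with Sleep. The following is only a sketch under assumptions: FakeApiWrite is a hypothetical stand-in for the real API call, and BufferCS is a name introduced here for illustration. The demo itself runs on a single thread, so it shows the locking structure rather than proving anything about contention.

```pascal
program LockedBuffer;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cthreads; // installs a real thread manager on Unix targets
{$endif}

var
  Buffer: array [0..1024*1024-1] of WideChar;
  BufferCS: TRTLCriticalSection; // hypothetical name, guards Buffer

// Hypothetical stand-in for the real API: writes a null-terminated wide
// string into Dest, using at most MaxChars characters plus the terminator.
procedure FakeApiWrite(Dest: PWideChar; MaxChars: SizeInt);
const
  Sample: UnicodeString = 'hello';
var
  N: SizeInt;
begin
  N := Length(Sample);
  if N > MaxChars then
    N := MaxChars;
  if N > 0 then
    Move(PWideChar(Sample)^, Dest^, N * SizeOf(WideChar));
  Dest[N] := #0;
end;

function GetSomeString(Index: Integer): UnicodeString;
begin
  // Index is unused here; it is kept only to mirror the signature above.
  EnterCriticalSection(BufferCS);
  try
    FakeApiWrite(@Buffer[0], Length(Buffer) - 1);
    Result := PWideChar(@Buffer[0]); // copies up to the first #0
  finally
    LeaveCriticalSection(BufferCS);  // released even if the API call raises
  end;
end;

begin
  InitCriticalSection(BufferCS);
  try
    WriteLn(Length(GetSomeString(0))); // prints 5 for the sample text 'hello'
  finally
    DoneCriticalSection(BufferCS);
  end;
end.
```

The trade-off is the same as with the hand-written lock: all callers share one buffer, so calls are serialized, but no large allocation happens per call.
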

Should anyone have a better workaround, I'd be very pleased to hear 
about it as I can't move to the trunk version of FPC just yet.

Many thanks for the help

Regards


