[fpc-devel] Is calling the Windows Unicode APIs really faster than the ANSI API's?

Ivo Steinmann ivo_steinmann at gmx.net
Fri Sep 26 12:06:19 CEST 2008


Marco van de Voort wrote:
>  
>   
>> For many people Unicode is just "let's go UTF-8". It's far more than that 
>> and 100% supporting Unicode is even next to impossible.
>>     
>
> Correct, but that is what I'm suggesting. UTF-16 is not a cure all either,
> only at a first superficial glance. I'm btw not for UTF-8, but for working
> in the native encoding per platform.
>
>   
I guess that would be one of the best solutions: having a system Unicode
string type and then some specialized string types:

SysString
UTF8String
UTF16String
UTF32String



Anyway, I still think something like this would be nice ;) I already have
an implementation of such a system. I don't think it's the best solution
(there is no best solution), but it's not a bad one. Let's see what the
next Delphi version brings, but my code works like this:

type
  TMapFunction = function(const Dest: pointer; const Source: Pointer): integer;

  PEncodings = ^TEncodings;
  TEncodings = record
    signsize: integer; // size of one code unit in bytes: 1, 2 or 4
    encode: TMapFunction;  // encode a UCS32 string to this encoding
    decode: TMapFunction;  // decode this encoding to a UCS32 buffer
  end;

const
  MyOwnEncodings: TEncodings = (
    Foo: ....
    Bar: ....
  );
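To make the table idea concrete, here is a minimal sketch of one filled-in entry. It is not the implementation described above: Latin-1 is chosen only because its mapping is trivial, the names Latin1Encode, Latin1Decode and Latin1Encoding are made up for the example, and it assumes both buffers are zero-terminated and that the functions return the number of code units written.

```pascal
program EncodingTableSketch;
{$mode objfpc}{$H+}

type
  TMapFunction = function(const Dest: Pointer; const Source: Pointer): Integer;

  PEncodings = ^TEncodings;
  TEncodings = record
    signsize: Integer;     // bytes per code unit: 1, 2 or 4
    encode: TMapFunction;  // zero-terminated UCS32 buffer -> this encoding
    decode: TMapFunction;  // this encoding -> zero-terminated UCS32 buffer
  end;

// Hypothetical encoder: code points above $FF are replaced by '?'.
// Returns the number of bytes written, not counting the terminator.
function Latin1Encode(const Dest: Pointer; const Source: Pointer): Integer;
var
  Src: PLongWord;
  Dst: PByte;
begin
  Src := PLongWord(Source);
  Dst := PByte(Dest);
  Result := 0;
  while Src^ <> 0 do
  begin
    if Src^ <= $FF then
      Dst^ := Byte(Src^)
    else
      Dst^ := Ord('?');
    Inc(Src);
    Inc(Dst);
    Inc(Result);
  end;
  Dst^ := 0;
end;

// Hypothetical decoder: every Latin-1 byte maps to the same code point.
// Returns the number of code points written, not counting the terminator.
function Latin1Decode(const Dest: Pointer; const Source: Pointer): Integer;
var
  Src: PByte;
  Dst: PLongWord;
begin
  Src := PByte(Source);
  Dst := PLongWord(Dest);
  Result := 0;
  while Src^ <> 0 do
  begin
    Dst^ := Src^;
    Inc(Src);
    Inc(Dst);
    Inc(Result);
  end;
  Dst^ := 0;
end;

const
  Latin1Encoding: TEncodings = (
    signsize: 1;
    encode: @Latin1Encode;
    decode: @Latin1Decode
  );

var
  Source: array[0..3] of Byte = (Ord('a'), $E9, Ord('z'), 0);
  Wide: array[0..3] of LongWord;
  Back: array[0..3] of Byte;
  N: Integer;
begin
  // Go through the table, the way a generic string routine would.
  N := Latin1Encoding.decode(@Wide[0], @Source[0]);
  WriteLn('decoded code points: ', N);
  N := Latin1Encoding.encode(@Back[0], @Wide[0]);
  WriteLn('encoded bytes: ', N);
end.
```

The point of the table is that generic string code only ever calls encode and decode through the record, so adding an encoding means adding one more constant like Latin1Encoding.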

type
  SysString = UnicodeString[SystemEncoding]
  UTF8String = UnicodeString[UTF-8]
  UTF16String = UnicodeString[UTF-16]
  MyOwnString = UnicodeString[MyOwnEncodings]


Then you can assign any of the specialized string types to a
UnicodeString, but you can't change the encoding of that UnicodeString
afterwards (either it's not changeable at all, or it's locked):

TUnicodeStringRec = record
  Encoding: PEncodings;
  Locked: Boolean; // locked encoding: SetEncoding(S, someEncoding) is not possible
  CodeCount: Integer;   // number of signs
  RefCount: Integer;  // refcounter
  Length: Integer; // number of char
  FirstChar: Byte/Word/Longword;
end;


The encoding is always locked after you have assigned a specialized
string to a UnicodeString, e.g.:

S1: UTF8String;
S2: UnicodeString;

S1 := 'foobar';
S2 := S1;
SetEncoding(S2, UTF16);  // <<< exception
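The locking rule can be modelled in a few lines. This is only a sketch under assumptions: the string is reduced to a header with the two relevant fields, and TEncodingId, TStringHeader, AssignSpecialized and EEncodingLocked are hypothetical names invented for the example. Only SetEncoding and the Locked flag come from the description above.

```pascal
program LockedEncodingSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical stand-ins for the encodings named in the post.
  TEncodingId = (encUTF8, encUTF16, encUCS32);

  // Reduced model of the string header: only what locking needs.
  TStringHeader = record
    Encoding: TEncodingId;
    Locked: Boolean;
  end;

  EEncodingLocked = class(Exception);

// Models "S2 := S1" where S1 is a specialized string:
// the target takes over the encoding and becomes locked.
procedure AssignSpecialized(out Dest: TStringHeader; SourceEncoding: TEncodingId);
begin
  Dest.Encoding := SourceEncoding;
  Dest.Locked := True;
end;

// Refuses to touch a locked string. A real implementation would
// also transcode the character data when the change is allowed.
procedure SetEncoding(var S: TStringHeader; NewEncoding: TEncodingId);
begin
  if S.Locked then
    raise EEncodingLocked.Create('encoding is locked');
  S.Encoding := NewEncoding;
end;

var
  S2: TStringHeader;
begin
  AssignSpecialized(S2, encUTF8);
  try
    SetEncoding(S2, encUTF16);
    WriteLn('encoding changed');
  except
    on E: EEncodingLocked do
      WriteLn('rejected: ', E.Message);
  end;
end.
```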

For fast string processing, it's easy to convert a string to UCS32:

S1: UTF8String;
S2: UCS32String;
P: PUCS32Char;

S2 := S1;
for i := 0 to length(S2)  - 1 do
  S2[i] := 'X';
S1 := S2;

or

P := PUCS32Char(S2);
while P^ <> 0 do
begin
  P^ := 'X';
  Inc(P);
end;
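For comparison, roughly the same round trip can be written with types that Free Pascal already ships. This is a sketch assuming a reasonably recent FPC (3.x), where UCS4String is a dynamic array of UCS4Char that carries a trailing zero element, and where the conversion goes through UnicodeString rather than directly from UTF-8. It is not the UCS32String type proposed above.

```pascal
program Ucs4RoundTrip;
{$mode objfpc}{$H+}

var
  S1: UTF8String;
  U: UnicodeString;
  S2: UCS4String;  // array of UCS4Char, last element is the terminating 0
  i: Integer;
begin
  S1 := 'foobar';

  // UTF-8 -> UTF-16 -> UCS-4
  U := UTF8Decode(S1);
  S2 := UnicodeStringToUCS4String(U);

  // One array element per code point; skip the trailing terminator.
  for i := 0 to Length(S2) - 2 do
    S2[i] := Ord('X');

  // UCS-4 -> UTF-16 -> UTF-8
  S1 := UTF8Encode(UCS4StringToUnicodeString(S2));
  WriteLn(S1);
end.
```

Because every element is a whole code point, the loop never has to deal with multi-byte sequences or surrogate pairs, which is the speed argument made above.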


-Ivo Steinmann


