[fpc-devel] Is calling the Windows Unicode APIs really faster than the ANSI API's?
Ivo Steinmann
ivo_steinmann at gmx.net
Fri Sep 26 12:06:19 CEST 2008
Marco van de Voort wrote:
>
>
>> For many people Unicode is just "let's go UTF-8". It's far more than that
>> and 100% supporting Unicode is even next to impossible.
>>
>
> Correct, but that is what I'm suggesting. UTF-16 is not a cure all either,
> only at a first superficial glance. I'm btw not for UTF-8, but for working
> in the native encoding per platform.
>
>
I guess that would be one of the best solutions: a system Unicode
string type plus some specialized string types.
SysString
UTF8String
UTF16String
UTF32String
Anyway, I still think something like this would be nice ;) I already
have an implementation of such a system. I don't think it's the best
solution (there is no best solution), but it's not a bad one. Let's see
what the next Delphi version brings, but my code works like this:
type
  TMapFunction = function(const Dest: Pointer; const Source: Pointer): Integer;
  PEncodings = ^TEncodings;
  TEncodings = record
    signsize: Integer;    // 1, 2 or 4
    encode: TMapFunction; // encode a UCS32 string to this encoding
    decode: TMapFunction; // decode this encoding to a UCS32 buffer
  end;
const
  MyOwnEncodings: TEncodings = (
    Foo: ....
    Bar: ....
  );
type
  SysString   = UnicodeString[SystemEncoding];
  UTF8String  = UnicodeString[UTF-8];
  UTF16String = UnicodeString[UTF-16];
  MyOwnString = UnicodeString[MyOwnEncodings];
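To make the encoding table more concrete, here is a minimal Free Pascal sketch of one TEncodings entry for Latin-1. Several things in it are assumptions and not part of the original design: the names Latin1Encode, Latin1Decode and Latin1Encoding are hypothetical, both buffers are assumed to be zero-terminated, the return value is assumed to be the number of signs written, and code points above $FF are simply replaced by '?'.

```pascal
program EncodingTableSketch;
{$mode objfpc}{$H+}

type
  // Same shape as TMapFunction in the mail. The meaning of the result
  // (number of signs written) is an assumption of this sketch.
  TMapFunction = function(const Dest: Pointer; const Source: Pointer): Integer;

  TEncodings = record
    signsize: Integer;    // 1, 2 or 4
    encode: TMapFunction; // UCS32 -> this encoding
    decode: TMapFunction; // this encoding -> UCS32
  end;

// Hypothetical encoder: zero-terminated UCS32 input, Latin-1 bytes out.
function Latin1Encode(const Dest: Pointer; const Source: Pointer): Integer;
var
  Src: PLongWord;
  Dst: PByte;
begin
  Src := PLongWord(Source);
  Dst := PByte(Dest);
  Result := 0;
  while Src^ <> 0 do
  begin
    if Src^ <= $FF then
      Dst^ := Byte(Src^)
    else
      Dst^ := Ord('?');   // not representable in Latin-1
    Inc(Src);
    Inc(Dst);
    Inc(Result);
  end;
  Dst^ := 0;              // keep the output zero-terminated
end;

// Hypothetical decoder: zero-terminated Latin-1 bytes in, UCS32 out.
function Latin1Decode(const Dest: Pointer; const Source: Pointer): Integer;
var
  Src: PByte;
  Dst: PLongWord;
begin
  Src := PByte(Source);
  Dst := PLongWord(Dest);
  Result := 0;
  while Src^ <> 0 do
  begin
    Dst^ := Src^;         // Latin-1 maps 1:1 onto the first 256 code points
    Inc(Src);
    Inc(Dst);
    Inc(Result);
  end;
  Dst^ := 0;
end;

const
  Latin1Encoding: TEncodings = (
    signsize: 1;
    encode: @Latin1Encode;
    decode: @Latin1Decode
  );

var
  Ucs: array[0..3] of LongWord = (Ord('A'), $E9, $20AC, 0);
  Bytes: array[0..3] of Byte;
  Back: array[0..3] of LongWord;
  n: Integer;
begin
  n := Latin1Encoding.encode(@Bytes[0], @Ucs[0]);
  WriteLn('encoded signs: ', n);
  n := Latin1Encoding.decode(@Back[0], @Bytes[0]);
  WriteLn('decoded signs: ', n, ', last code point: ', Back[2]);
end.
```

Routing every conversion through UCS32 means each encoding only needs one encode/decode pair, at the cost of an intermediate buffer for every conversion.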
All specialized string types can then be assigned to UnicodeString, but
you can't change the encoding of a UnicodeString (either it's not
changeable at all, or it's locked). The underlying record:
type
  TUnicodeStringRec = record
    Encoding: PEncodings;
    Locked: Boolean;    // locked encoding: SetEncoding(S, someEncoding) is not possible
    CodeCount: Integer; // number of signs
    RefCount: Integer;  // refcounter
    Length: Integer;    // number of chars
    FirstChar: Byte/Word/Longword;
  end;
The encoding is always locked after you have assigned a specialized
string to a UnicodeString, e.g.:
  S1: UTF8String;
  S2: UnicodeString;

  S1 := 'foobar';
  S2 := S1;
  SetEncoding(S2, UTF16); // <<< exception
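The locking rule can be sketched in plain Free Pascal with an ordinary record. This is only an illustration of the Locked flag, not the proposed compiler-level string type: TSketchString, TEncodingId, AssignSpecialized and EEncodingLocked are hypothetical names, and no transcoding is actually done.

```pascal
program LockedEncodingSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  TEncodingId = (encUTF8, encUTF16, encUTF32); // hypothetical encoding tags

  TSketchString = record
    Encoding: TEncodingId;
    Locked: Boolean;       // True: SetEncoding is refused
    Data: RawByteString;   // raw payload, never transcoded in this sketch
  end;

  EEncodingLocked = class(Exception);

// Models "S2 := S1": the target takes over the encoding and becomes locked.
procedure AssignSpecialized(var Dest: TSketchString; const Source: TSketchString);
begin
  Dest := Source;
  Dest.Locked := True;
end;

// Changes the encoding tag unless the string is locked.
procedure SetEncoding(var S: TSketchString; NewEncoding: TEncodingId);
begin
  if S.Locked then
    raise EEncodingLocked.Create('encoding is locked');
  S.Encoding := NewEncoding; // a real implementation would convert Data here
end;

var
  S1, S2: TSketchString;
begin
  S1.Encoding := encUTF8;
  S1.Locked := False;
  S1.Data := 'foobar';

  AssignSpecialized(S2, S1);
  try
    SetEncoding(S2, encUTF16);
  except
    on E: EEncodingLocked do
      WriteLn('exception: ', E.Message);
  end;
end.
```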
For fast string processing, it's easy to convert a string to UCS32:
  S1: UTF8String;
  S2: UCS32String;
  P: PUCS32Char;

  S2 := S1;
  for i := 0 to Length(S2) - 1 do
    S2[i] := 'X';
  S1 := S2;
or
  P := PUCS32Char(S2);
  while P^ <> 0 do
  begin
    P^ := 'X';
    Inc(P);
  end;
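For comparison, a similar round trip can be written with routines that already exist in the Free Pascal System unit. This sketch assumes a recent FPC (3.x), where UCS4String is a zero-terminated dynamic array of UCS4Char, so the loop stops one element before the end. The implicit assignments S2 := S1 and S1 := S2 from the pseudocode above are replaced by explicit conversion calls, since the proposed UCS32String type does not exist.

```pascal
program Ucs4RoundTrip;
{$mode objfpc}{$H+}

var
  S1: UTF8String;
  S2: UCS4String; // array of UCS4Char, last element is a terminating 0
  i: Integer;
begin
  S1 := 'foobar';

  // UTF-8 -> UTF-16 -> UCS-4
  S2 := UnicodeStringToUCS4String(UTF8Decode(S1));

  // Fixed-width processing: one array element per code point.
  // Length(S2) - 2 skips the terminating zero.
  for i := 0 to Length(S2) - 2 do
    S2[i] := UCS4Char(Ord('X'));

  // UCS-4 -> UTF-16 -> UTF-8
  S1 := UTF8Encode(UCS4StringToUnicodeString(S2));
  WriteLn(S1); // prints XXXXXX
end.
```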
-Ivo Steinmann