[fpc-devel] Encoded AnsiString
Michael Van Canneyt
michael at freepascal.org
Sun Dec 29 17:24:43 CET 2013
On Sun, 29 Dec 2013, Hans-Peter Diettrich wrote:
> Michael Van Canneyt schrieb:
>>
>>
>> On Sun, 29 Dec 2013, Hans-Peter Diettrich wrote:
>>
>>> Inspired by the current Lazarus discussion I'd like to learn more about
>>> the current state of the implementation of the new AnsiStrings.
>>>
>>> In case nothing has been done yet, I'd suggest extending TAnsiRec with the new
>>> codePage and elemSize fields (words). These can be zero for now, so that
>>> the remaining codebase is not affected. Then it will be possible to play
>>> around with encoded strings, using the codePage field.
>>>
>>
>> All this was already done a long time ago in trunk.
>> We're way past that stage.
>
> I'm very confused; I haven't used FPC for a long time. I have to refresh my
> memory of all related procedures...
>
> How do I instruct fpcup to check out the trunk version? (Windows)
> I tried to add a parameter fpcURL=trunk to the shortcut, is this correct?
>
> How do I proceed (build, use in Lazarus...)?
> Any links appreciated :-)
No idea.
>
>> Current stage is the creation of a unicode RTL, where all base file/string
>> operations accept unicode strings. This is done too.
>>
>> Next step is creation of the unicode RTL, where "string" = "widestring".
>> This will be combined with the dotted unit filenames, to be Delphi 2010+
>> compatible.
>
> <sigh.sigh>
> How do I create source files for use with both versions?
What do you mean by this statement?
>> To allow people to choose, 2 RTLs will be created: one unicode
>> (string=widestring), one non-unicode (string=ansistring).
>>
>> This will result (probably) in 2 paths:
>> units/os-cpu
>> units/os-cpu-unicode
>> This is not decided yet.
>>
>> I planned the work in February/March.
>
> Thanks :-)
>
> Where can I jump in?
When I'm done I will release a version for testing to the public.
>>> A related question:
>>> Why is the string length set to zero in NewAnsiString, when the allocated
>>> Length is already known?
>>
>> Because the allocated memory length is not necessarily equal to the string
>> length.
>> If you have a string of length 50, setting the length to 25 will not
>> discard and reallocate the memory block, but merely set the character
>> length to 25.
>
> This means that the allocated length is stored somewhere else, in the memory
> block descriptor?
Yes.
>
> How can a user request a string of a specific allocation size?
You should not. But if you absolutely want that: Look up TAnsiRec and do
SetLength(S,AllocationLength-SizeOf(TAnsiRec));
Don't rely on this. Messing with internals is always a bad idea.
That is why TAnsiRec is an internal type, not exposed.
To prevent people (like you, seemingly) from messing with internals.
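If the goal is only to observe the behaviour, a minimal sketch that stays with public routines could look like this. It assumes a trunk compiler where StringCodePage and StringElementSize are available in the System unit. The program and variable names are illustrative. Whether the data pointer stays the same after shrinking depends on the RTL and the heap manager, so the program only prints what it observes.

```pascal
program ShrinkDemo;
{$mode objfpc}{$H+}

var
  S: AnsiString;
  Before, After: Pointer;
begin
  // Allocate a string of 50 characters and remember where its data lives.
  SetLength(S, 50);
  Before := Pointer(S);

  // Shrink the logical length; the RTL may keep the same memory block.
  SetLength(S, 25);
  After := Pointer(S);

  WriteLn('Length after shrinking: ', Length(S));
  WriteLn('Same memory block:      ', Before = After);

  // Public accessors for the header fields, no need to touch TAnsiRec.
  WriteLn('Code page:    ', StringCodePage(S));
  WriteLn('Element size: ', StringElementSize(S));
end.
```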
>
>
> Another one:
>
> I've heard that a mix of encodings converts the (concatenated) output
> (RawByteString?) to CP_ACP, with possible losses. Is this correct?
Define "output"?
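One way to make the question concrete is a small test program that reports the code page of each operand and of the result. This is only a sketch, assuming a trunk compiler with code page aware AnsiStrings. The type name TCP1252String is a hypothetical alias, and on some platforms a widestring manager unit may be needed for non-ASCII data. No result is claimed here; the program prints whatever the RTL reports.

```pascal
program MixDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical alias for a string with a fixed Windows-1252 code page.
  TCP1252String = type AnsiString(1252);

var
  A: TCP1252String;
  B: UTF8String;
  R: RawByteString;
begin
  // ASCII-only contents, so no character can be lost in a conversion.
  A := 'abc';
  B := 'def';

  // Concatenate two differently declared strings into a RawByteString.
  R := A + B;

  WriteLn('Code page of A: ', StringCodePage(A));
  WriteLn('Code page of B: ', StringCodePage(B));
  WriteLn('Code page of R: ', StringCodePage(R));
  WriteLn('Length of R:    ', Length(R));
end.
```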
Michael.