[fpc-devel] Encoded AnsiString

Michael Van Canneyt michael at freepascal.org
Sun Dec 29 17:24:43 CET 2013



On Sun, 29 Dec 2013, Hans-Peter Diettrich wrote:

> Michael Van Canneyt schrieb:
>> 
>> 
>> On Sun, 29 Dec 2013, Hans-Peter Diettrich wrote:
>> 
>>> Inspired by the current Lazarus discussion I'd like to learn more about 
>>> the current state of the implementation of the new AnsiStrings.
>>> 
>>> In case nothing has been done yet, I'd suggest extending TAnsiRec with the new 
>>> codePage and elemSize fields (words). These can be zero for now, so that 
>>> the remaining codebase is not affected. Then it will be possible to play 
>>> around with encoded strings, using the codePage field.
>>> 
>> 
>> All this was done a long time ago in trunk.
>> We're way past that stage.
>
> I'm very confused; I haven't used FPC for a long time. I have to refresh my 
> memory of all the related procedures...
>
> How do I instruct fpcup to check out the trunk version? (Windows)
> I tried to add a parameter fpcURL=trunk to the shortcut, is this correct?
>
> How do I proceed (build, use in Lazarus...)?
> Any links appreciated :-)

No idea.

>
>> Current stage is the creation of a unicode RTL, where all base file/string 
>> operations accept unicode strings. This is done too.
>> 
>> Next step is creation of the unicode RTL, where "string" = "widestring".
>> This will be combined with the dotted unit filenames, to be Delphi 2010+ 
>> compatible.
>
> <sigh.sigh>
> How do I create source files for use with both versions?

What do you mean by this statement ?

>> To allow people to choose, 2 RTLs will be created: one unicode 
>> (string=widestring), one non-unicode (string=ansistring).
>> 
>> This will result (probably) in 2 paths:
>> units/os-cpu
>> units/os-cpu-unicode
>> This is not decided yet.
>> 
>> I have planned this work for February/March.
>
> Thanks :-)
>
> Where can I jump in?

When I'm done I will release a version for testing to the public.

>>> A related question:
>>> Why is the string length set to zero in NewAnsiString, when the allocated 
>>> Length is already known?
>> 
>> Because the allocated memory length is not necessarily equal to the string 
>> length.
>> If you have a string of length 50, setting the length to 25 will not 
>> discard and reallocate the memory block, but merely set the character 
>> length to 25.
>
> This means that the allocated length is stored somewhere else, in the memory 
> block descriptor?

Yes.
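
To illustrate the difference between character length and allocated size, here is a minimal sketch that uses only the public SetLength and Length routines. The program and variable names are made up for illustration. Whether the memory block is kept or reallocated when the string shrinks is an internal detail of the RTL and the heap manager, so the sketch does not show it or rely on it.

```pascal
program LengthDemo;
{$mode objfpc}{$H+}

var
  S: AnsiString;  // hypothetical variable, for illustration only
begin
  // Build a string with a character length of 50.
  S := StringOfChar('x', 50);
  WriteLn(Length(S));  // prints 50

  // Shrink the character length to 25. Only the reported length is
  // observable here; what happens to the underlying memory block is
  // up to the RTL and must not be relied upon.
  SetLength(S, 25);
  WriteLn(Length(S));  // prints 25
end.
```

User code should only ever reason about Length(S); the allocated size stays the business of the memory manager.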

>
> How can a user request a string of a specific allocation size?

You should not. But if you absolutely want that: Look up TAnsiRec and do

SetLength(S,AllocationLength-SizeOf(TAnsiRec));

Don't rely on this. Messing with internals is always a bad idea.
That is why TAnsiRec is an internal type, not exposed. 
To prevent people (like you, seemingly) from messing with internals.

>
>
> Another one:
>
> I've heard that a mix of encodings converts the (concatenated) output 
> (RawByteString?) to CP_ACP, with possible losses. Is this correct?

Define "output" ?
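
One way to make the question concrete is to inspect the code page of each operand and of the result. The sketch below assumes a compiler with code-page-aware AnsiStrings (trunk at the time of this thread) and the StringCodePage function from the System unit. The type TCP1252String and all variable names are made up for illustration. No expected output is given, because the result can depend on the compiler version, the type of the destination variable and the system default code page.

```pascal
program MixedEncodings;
{$mode objfpc}{$H+}

type
  // Hypothetical alias for an AnsiString with a fixed code page.
  TCP1252String = type AnsiString(1252);

var
  U: UTF8String;
  W: TCP1252String;
  R: RawByteString;
begin
  U := 'abc';
  W := 'def';

  // Concatenate two strings that carry different code pages.
  R := U + W;

  // Print the code page attached to each operand and to the result.
  WriteLn('U: ', StringCodePage(U));
  WriteLn('W: ', StringCodePage(W));
  WriteLn('R: ', StringCodePage(R));
end.
```

With plain ASCII content nothing can be lost in conversion, so this only shows which code page the result ends up with, not whether a given conversion is lossy.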

Michael.


