[fpc-devel] new string - question on usage
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Tue Oct 11 08:52:33 CEST 2011
Martin wrote:
> just for how to do
>
> procedure foo(x: utf8string); begin end;
>
> var a: string; //ansistring, but contains already utf8
The encoding will be stored or converted when a string is assigned to
that variable. When the FPC implementation is finished, it should be
impossible to have strings stored with a wrong encoding.
> foo(a); // do not convert
Why not?
>>> And what happens if an app did read data from some external source
>>> (serial port) and then wants to declare what encoding it is?
>> http://docwiki.embarcadero.com/VCL/en/System.SetCodePage
>>
>>
> I hadn't seen that.
>
> That may help. Though not the best solution...
It does *not* help, because SetCodePage performs a string *conversion*
whenever it actually changes the encoding. Delphi at one point even
allowed in-place conversion between UTF-16 (CP 1200) and other
(byte-oriented) encodings, but later disallowed it again. Now a UTF-16
(Delphi default) string is *always* converted when it is passed to a
subroutine expecting a RawByteString argument.
> I can call it before calling the "foo" proc. But I must revert it
> afterwards, or at sometime later, the string will be translated, when it
> will be used in a normal string again (yet expected to keep being utf8..
IMO the only way to fix a wrong encoding is to use a TBytes (or similar)
buffer: copy the string content into it (without translation), then
read it back specifying the correct encoding.
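
A minimal sketch of that buffer round trip, assuming a compiler where
code-page-aware strings and TEncoding are already available (Delphi 2009
or later, or an FPC version that ships TEncoding in SysUtils). The
variable names are only illustrative:

```pascal
program FixEncoding;
{$mode objfpc}{$H+}
uses
  SysUtils;

var
  Raw: RawByteString;    // holds bytes whose declared encoding is wrong
  Buf: TBytes;           // neutral byte buffer, carries no encoding
  Fixed: UnicodeString;  // result, decoded with the correct encoding
begin
  // Two bytes that are really the UTF-8 form of U+00E9.
  SetLength(Raw, 2);
  Raw[1] := AnsiChar($C3);
  Raw[2] := AnsiChar($A9);

  // Copy the payload byte for byte; Move performs no translation.
  SetLength(Buf, Length(Raw));
  Move(Raw[1], Buf[0], Length(Raw));

  // Read it back, this time naming the encoding the bytes really have.
  Fixed := TEncoding.UTF8.GetString(Buf);
  WriteLn(Length(Fixed));
end.
```

The point of the detour is that the bytes never pass through an
assignment between differently typed strings, so nothing can convert
them along the way.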
> Yes, I know, what i want to do, is not what it was designed for.
> ultimately a huge update to the entire source will be needed... but now
> I need a temporary solution until then
You won't need a temporary solution once the new strings are fully
implemented in FPC. From then on you only have to take care when reading
strings from *external* sources, where you have to specify the correct
external encoding - see e.g.
http://docwiki.embarcadero.com/VCL/en/Classes.TStrings.LoadFromStream
with the added Encoding argument.
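
For illustration, a sketch of such a load. It assumes the Delphi-style
overload LoadFromStream(Stream, Encoding) is available in the RTL being
used; whether and when FPC provides it is an assumption here. The
in-memory stream merely stands in for the external source:

```pascal
program LoadWithEncoding;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

const
  // UTF-8 bytes for U+00E9, standing in for data from a file or port.
  Utf8Bytes: array[0..1] of Byte = ($C3, $A9);

var
  Stream: TMemoryStream;
  Lines: TStringList;
begin
  Stream := TMemoryStream.Create;
  Lines := TStringList.Create;
  try
    Stream.WriteBuffer(Utf8Bytes, SizeOf(Utf8Bytes));
    Stream.Position := 0;

    // State the external encoding at the point where the data enters.
    Lines.LoadFromStream(Stream, TEncoding.UTF8);
    WriteLn(Lines.Count);
  finally
    Lines.Free;
    Stream.Free;
  end;
end.
```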
When you want a variable to contain strings of a specific encoding, e.g.
UTF-8, you simply give it the appropriate type. I assume that a
UTF8String type will be declared like AnsiString<cpUTF8>, with
appropriate constants being declared for the standard codepages.
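
As a point of comparison, Delphi 2009 spells such a declaration with
parentheses, "type AnsiString(65001)". The sketch below assumes FPC ends
up accepting the same syntax, which is not settled at the time of
writing; the type names TUtf8Text and TLatin1Text are made up for the
example:

```pascal
program TypedStrings;
{$mode objfpc}{$H+}

type
  // Hypothetical names; the number is the Windows code page identifier.
  TUtf8Text   = type AnsiString(65001);  // UTF-8
  TLatin1Text = type AnsiString(28591);  // ISO 8859-1

var
  L: TLatin1Text;
  U: TUtf8Text;
begin
  L := 'abc';
  // The declared types differ, so the compiler is expected to insert
  // a conversion for this assignment.
  U := L;
  WriteLn(StringCodePage(U));
end.
```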
DoDi