[fpc-devel] assign constant text to widestring
fpc at mfriebe.de
Wed Oct 22 17:01:56 CEST 2008
Michael Schnell wrote:
> Hi Experts,
> When I want to simply assign a constant text "ö2" to a WideString I
> would think that I just write s := 'ö2'; . But I found that this does
> not work, but that it creates a WideString of length 3 that contains
> the three 8-Bit subcodes of the utf8-coded string "ö2", zero-extended
> to 16 Bits, each in one WideChar element. For me this is very
> surprising and incompatible with the same code (s := 'ö2'; ) used in a
> Turbo-Delphi program.
> Obviously - other than Turbo-Delphi that uses ANSIString here - a
> constant string gets UTF8String as its intermediate type. This might
> be a useful definition, but if that is done this way why does an
> assignment WideString := UTF8String not implicitly call UTF8Decode as
> a type conversion ? In my example it calls fpc_ansistr_to_widestr
> instead, just as if the UTF8String would be an ANSIString.
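To make the reported behaviour concrete, here is a small sketch of the byte arithmetic only. It is written in Python purely as an illustration and is not what the compiler does internally; the variable names are made up for this example. It takes the UTF-8 bytes of "ö2", zero-extends each byte to its own character (the behaviour described above), and compares that with a real UTF-8 decode.

```python
# Illustration only: Python stands in for the conversion described above.
text = "\u00f62"                      # the literal 'ö2' (U+00F6 followed by '2')
utf8_bytes = text.encode("utf-8")     # how a UTF-8 editor stores it on disk

# "Zero-extend" every byte into its own character, ignoring that the
# bytes are UTF-8. This mimics treating the literal as plain 8-bit text.
widened_units = list(utf8_bytes)
wrong = "".join(chr(u) for u in widened_units)

# A real UTF-8 decode combines the two bytes of 'ö' into one character.
right = utf8_bytes.decode("utf-8")

print(len(wrong), [hex(u) for u in widened_units])
print(len(right), [hex(ord(c)) for c in right])
```

The first result has three elements because "ö" occupies two bytes in UTF-8; the second has two elements, which is what one would expect in the WideString.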
I am not an expert, but here is what I believe I know:
This is the result of 2 (hidden) "features":
AFAIK the compiler reads the source as non-utf8 (latin or some 8 bit
encoding). This leads to other things too, like identifiers cannot
The string within the quotes is just a byte sequence to the compiler, and the
compiler does not know it to be utf8. From your description I take it
the compiler does translate those 3 "8bit chars" into some 16bit chars
(correctness of this translation based on the 8bit source encoding is
Lazarus uses UTF8 for everything, so it will save your string as utf8. If
your string was kept as ansistring, the compiler would treat it as
bytes, and pass it through, so any code wanting to see the utf8 would be
You can try to tell Lazarus to save your file as latin1. As long as all
your strings fit into latin1, this may work; if and only if the compiler
translates the latin1 into correct WideChars.
It will not work for anything not in utf8. AFAIK Lazarus currently
doesn't save in ucs2 (or any 16 bit encoding). But even if Lazarus did,
since the compiler wants 8bit encoding, your whole source would be broken.
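Why latin1 might work where utf8 does not: every latin1 byte value equals the Unicode code point of the same character, so zero-extending a latin1 byte to 16 bits happens to give the right character. The sketch below shows only that arithmetic, again in Python as an illustration and not as a statement about what the compiler does; `fits_latin1` is a hypothetical helper name made up for this example.

```python
# Illustration only: shows why zero-extension is harmless for Latin-1 text.
def fits_latin1(s: str) -> bool:
    """Return True if every character of s can be stored in Latin-1."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

text = "\u00f62"                         # 'ö2'
latin1_bytes = text.encode("latin-1")    # one byte per character

# Zero-extending each Latin-1 byte gives back the original characters,
# because Latin-1 byte values coincide with code points U+0000..U+00FF.
widened = "".join(chr(b) for b in latin1_bytes)

print(len(latin1_bytes), widened == text)
print(fits_latin1(text), fits_latin1("\u20ac"))   # the euro sign is outside Latin-1
```

Anything outside that range, such as the euro sign checked at the end, cannot be stored in a latin1 source file at all, which is the limitation mentioned above.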
Not much help, I know. Maybe someone else has more ideas / knowledge.
> Is there some compiler setting to change this ?