[fpc-devel] Unicode support (again)
Vincent Snijders
vsnijders at vodafonevast.nl
Tue Nov 11 15:26:30 CET 2008
Jonas Maebe wrote:
>
> On 10 Nov 2008, at 17:00, Vincent Snijders wrote:
>
>> procedure TForm1.Button1Click(Sender: TObject);
>> var
>> w: widestring;
>> i: integer;
>> begin
>> w := UTF8Decode('hallo äöü');
>> Edit1.Caption := UTF8Encode(w);
>
> Note that if the file has been saved using an UTF-8 BOM, then the
> compiler will at compile time create a widestring containing the UTF-16
> version of 'hallo äöü'. If you then pass this to a function expecting an
> ansistring (such as UTF8Decode), then the widestring manager will be
> used to decode that string and this decoded string will be passed to
> UTF8Decode. So then you'll pass an ansi-encoded string to UTF8Decode
> rather than an UTF-8-encoded string (unless ansi = utf-8 for the current
> execution).
>
Yes, that might be confusing. Therefore I don't recommend saving it with a
UTF-8 BOM or compiling with -Fcutf-8.
> It seems much more advisable to me to save the file with an UTF-8 BOM,
> or even better to add {$encoding utf-8} (and/or to pass -Fcutf-8 to the
> compiler) and then just use
>
> Edit1.Caption := UTF8Encode('hallo äöü');
An extra benefit of not adding the UTF-8 BOM is that no conversions are needed.
With a BOM, the compiler translates the UTF-8 string in the source to a UTF-16
string, which then has to be converted back to a UTF-8 encoded ansistring.
Leaving the BOM out saves one conversion at compile time and one at run time,
so you can simply write:
Edit1.Caption := 'hallo äöü';
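To illustrate the point, here is a minimal sketch of a console program. It assumes the source file is saved as UTF-8 without a BOM, has no codepage directive, and is compiled without -Fc. The program name and the variable are made up for the example. Under those assumptions the literal should end up in the ansistring as the same UTF-8 bytes that are in the file, so I would expect it to report 12 bytes rather than 9 characters.

```pascal
program Utf8LiteralDemo;
{$mode objfpc}{$H+}
{ Assumption: this file is saved as UTF-8 WITHOUT a BOM, contains no
  codepage directive, and is compiled without -Fc. }
var
  s: AnsiString;
begin
  { Under the assumption above, the bytes of the literal are taken
    from the source file as they are, without any conversion. }
  s := 'hallo äöü';
  { Length counts bytes, not characters: 6 ASCII bytes plus two bytes
    for each precomposed umlaut in UTF-8. }
  WriteLn('bytes: ', Length(s));
end.
```

If the same file is saved with a BOM or compiled with -Fcutf-8, the constant is handled as a widestring instead and the result depends on the widestring manager, which is the situation Jonas describes above.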
Is there an explicit way to tell the compiler not to convert widestring string
constants, even if the file contains a UTF-8 BOM? The UTF-8 BOM might be useful
if you want to edit the file with another text editor.
Vincent