[fpc-pascal] Re: Real length of WideString

Felipe Monteiro de Carvalho felipemonteiro.carvalho at gmail.com
Thu Oct 21 08:15:36 CEST 2010


On Thu, Oct 21, 2010 at 2:28 AM, Zaher Dirkey <parmaja at gmail.com> wrote:
> var
>   ws: widestring;
> begin
>   ws:= 'زاهر';
>   Label3.Caption := inttostr(length(ws));
> //= 8

Here the compiler assumes that your source code is encoded in the
default system encoding and then converts your string to UTF-16.
Probably your system encoding is not UTF-8, so your string is
corrupted by a wrong conversion. A length of 8 seems to confirm this:
the four letters take two bytes each in UTF-8, and each of those 8
bytes was converted as if it were a separate character.

You could tell the compiler that your source code is in UTF-8, which
should allow the code above to work, but then it will break all UTF-8
ansistrings. There is no real solution for this issue while FPC lacks
full Unicode support. In the meantime I recommend using your second
version of the code.
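
For reference, a minimal sketch of that first option, assuming the file
really is saved as UTF-8 and that you use the {$codepage utf8} directive
(the -Fcutf8 command-line switch should have the same effect). The
program name is made up, and it is a console program rather than your
form code:

```pascal
program CodepageDemo;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}
{$codepage utf8}       // tell the compiler this source file is UTF-8

var
  ws: WideString;
begin
  // With the directive above, the literal is read as UTF-8 and
  // converted to UTF-16 by the compiler.
  ws := 'زاهر';
  WriteLn(Length(ws));  // expected to be 4 if the file is saved as UTF-8
end.
```

Keep in mind the drawback mentioned above for UTF-8 ansistrings in the
same unit.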

> It is different with
>
>   ws:= UTF8Decode('زاهر');
>   Label3.Caption := inttostr(length(ws));
> //= 4

This is more like the standard Lazarus way of doing things. The string
is encoded in the file as UTF-8 and assigned to an ansistring. Then
UTF8Decode converts this to UTF-16. A correct conversion takes place,
because you didn't let the compiler decide for you what to convert,
but you decided yourself.
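
To make the two lengths visible, here is a small sketch that does not
depend on how the source file is saved. It assumes a plain console
program (the name is made up) and writes the UTF-8 bytes of your word
explicitly instead of using a literal:

```pascal
program Utf8DecodeDemo;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}

var
  s: AnsiString;
  ws: WideString;
begin
  // The UTF-8 bytes of the four-letter word, two bytes per letter
  s := #$D8#$B2#$D8#$A7#$D9#$87#$D8#$B1;
  WriteLn(Length(s));   // 8: the ansistring length counts bytes

  // Explicit conversion from UTF-8 to UTF-16
  ws := UTF8Decode(s);
  WriteLn(Length(ws));  // 4: one UTF-16 code unit per letter
end.
```

The 8 here is the same number you got in your first version, where
every byte ended up as its own character.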

-- 
Felipe Monteiro de Carvalho


