[fpc-pascal] Re: Real length of WideString
Felipe Monteiro de Carvalho
felipemonteiro.carvalho at gmail.com
Thu Oct 21 08:15:36 CEST 2010
On Thu, Oct 21, 2010 at 2:28 AM, Zaher Dirkey <parmaja at gmail.com> wrote:
> var
> ws: widestring;
> begin
> ws:= 'زاهر';
> Label3.Caption := inttostr(length(ws));
> //= 8
Here the compiler assumes that your source code is encoded in the
default system encoding and converts the string literal from that
encoding to UTF-16. Your system encoding is probably not UTF-8, so the
conversion is wrong and the string is corrupted. The length of 8 seems
to confirm this: the four letters occupy 8 bytes in UTF-8, and each
byte was apparently converted as a separate character.
You could tell the compiler that your source code is in UTF-8, which
should allow the code above to work, but then it will break all UTF-8
ansistrings. There is no real solution for this issue while FPC lacks
full Unicode support. In the meantime I recommend using your second
version of the code.
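For reference, one way to declare the source encoding is the
{$codepage utf8} directive (the -Fcutf8 command-line option does the
same). The following is only a minimal sketch, assuming the file really
is saved as UTF-8; the program name is made up for illustration.

```pascal
program CodepageSketch;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: this source file is saved as UTF-8

var
  ws: WideString;
begin
  // With the directive in place, the compiler decodes the literal
  // from UTF-8 instead of from the default system encoding.
  ws := 'زاهر';
  WriteLn(Length(ws));  // counts UTF-16 code units
end.
```

The caveat above still applies: the directive also changes how
literals assigned to ansistrings in the same file are treated.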
> It is different with
>
> ws:= UTF8Decode('زاهر');
> Label3.Caption := inttostr(length(ws));
> //= 4
This is more like the standard Lazarus way of doing things. The string
is encoded in the file as UTF-8 and assigned to an ansistring without
any conversion. Then UTF8Decode converts it to UTF-16. A correct
conversion takes place because you did not let the compiler decide
what to convert; you decided yourself.
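The difference between the byte count and the decoded length can be
checked with a small console program. This is a sketch, not taken from
your code: the program name is made up, and the literal is written as
UTF-8 byte escapes so the result does not depend on how the file is
saved.

```pascal
program Utf8DecodeSketch;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}

var
  s: AnsiString;
  ws: WideString;
begin
  // UTF-8 bytes of the four letters (U+0632, U+0627, U+0647, U+0631),
  // two bytes per letter.
  s := #$D8#$B2#$D8#$A7#$D9#$87#$D8#$B1;
  ws := UTF8Decode(s);   // explicit UTF-8 to UTF-16 conversion
  WriteLn(Length(s));    // 8: number of bytes
  WriteLn(Length(ws));   // 4: number of UTF-16 code units
end.
```

Length counts bytes for the ansistring and UTF-16 code units for the
widestring, which is why the same text gives 8 in one case and 4 in the
other.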
--
Felipe Monteiro de Carvalho