[fpc-pascal] Re: I do not understand UTF8 in Windows
Michael Van Canneyt
michael at freepascal.org
Sat Feb 20 15:21:51 CET 2010
On Sat, 20 Feb 2010, Tomas Hajny wrote:
> On Sat, February 20, 2010 01:15, JoshyFun wrote:
>> Hello Tomas,
>>
>> Friday, February 19, 2010, 11:55:39 PM, you wrote:
>>
>> TH> No, this can't work that way, otherwise output of any accented
>> TH> character in one of the Windows codepages would result in the same
>> TH> error.
>>
>> Tested the "wrong" return of stdout:
>>
>> code page UTF8 - 65001 en Windows
>> Length of string: 7
>> camión -> Returned written: 6
>>
>> Source code:
>> -------------------------------------
>> uses classes,windows;
>> var
>> s: ansistring;
>> OutputStream: TStream;
>> Begin
>> Writeln('code page UTF8 - 65001 en Windows');
>> OutputStream := THandleStream.Create(GetStdHandle(STD_OUTPUT_HANDLE));
>> s:='cami'+#$C3+#$B3+'n'; //camión
>> writeln('Length of string: ',Length(s));
>> writeln(' -> Returned written: ',OutputStream.write(s[1],Length(s)));
>> OutputStream.free;
>> End.
>
> OK, this seems to be the problem. The underlying Win32 API (WriteFile) is
> requested to write 7 bytes to a file. However those 7 bytes correspond to
> only 6 characters in UTF-8, and the Win32 API (apparently) returns the
> number of written _characters_ rather than the number of written _bytes_.
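To make the 7-versus-6 figures concrete, here is a minimal sketch that counts both bytes and UTF-8 code points for the same string. The helper CountUtf8CodePoints is a hypothetical name introduced only for this illustration, and it assumes well-formed UTF-8 input. For the test string it should report 7 bytes and 6 code points, matching the numbers in the test above.

```pascal
program Utf8Count;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): counts UTF-8 code points
// by skipping continuation bytes, i.e. bytes of the form 10xxxxxx.
// Assumes the input is well-formed UTF-8.
function CountUtf8CodePoints(const s: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  s: AnsiString;
begin
  s := 'cami' + #$C3 + #$B3 + 'n';  // "camión" encoded as UTF-8
  WriteLn('Bytes:       ', Length(s));
  WriteLn('Code points: ', CountUtf8CodePoints(s));
end.
```

Length(s) on an AnsiString counts bytes, which is also what is passed to the stream's Write call, so any return value other than 7 comes from below the FPC level.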
I fail to see how this can be an FPC problem. See
http://msdn.microsoft.com/en-us/library/aa365747(VS.85).aspx
and
http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx
for an explanation. The documentation states clearly that the number of
bytes written is returned. If the call does return the number of characters
instead, then that is a bug in the Microsoft call, not in FPC.
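One way to check this independently of the FPC stream classes would be to call WriteFile directly and compare the requested byte count with the count the API reports. The following is only a sketch under some assumptions: it is Windows-only, it assumes the console has been switched to code page 65001 beforehand (for example with chcp 65001), and the program and variable names are made up for the illustration. I have not listed an expected output, since the reported count is exactly what is in question.

```pascal
program WriteFileCheck;
{$mode objfpc}{$H+}

uses
  Windows;

var
  s: AnsiString;
  h: THandle;
  written: DWORD;
begin
  s := 'cami' + #$C3 + #$B3 + 'n';  // 7 bytes of UTF-8
  h := GetStdHandle(STD_OUTPUT_HANDLE);
  written := 0;
  // Call the Win32 API directly, bypassing THandleStream.
  if WriteFile(h, @s[1], DWORD(Length(s)), @written, nil) then
  begin
    WriteLn;
    WriteLn('Requested: ', Length(s), ' bytes, reported: ', written);
  end
  else
    WriteLn('WriteFile failed, error ', GetLastError);
end.
```

If the two numbers differ here as well, the discrepancy comes from the Win32 call itself and not from anything FPC does on top of it.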
Michael.