[fpc-pascal] Problem with string conversion
Felipe Monteiro de Carvalho
felipemonteiro.carvalho at gmail.com
Fri Oct 20 18:42:30 CEST 2006
On 10/20/06, Vincent Snijders <vsnijders at quicknet.nl> wrote:
> This should be
> WideText := GetMem(Size*2);
> because you get the number of characters, and the number of bytes is 2 * the
> number of characters.
Thanks, it works, but I still have doubts.
1) On Linux I will need the cwstring unit, right? This was a UTF-8
test to be used in fpGUI, and possibly the LCL. So can't we just add
cwstring to another unit instead of making it the first unit of the program?
2) Does this work if my string contains characters above
#FFFF? I mean, it seems that we assume each character takes
2 bytes, but this may not be true.
3) Shouldn't we allocate Size * 2 + 2? I mean, we did not allocate
space for the null terminator.
4) Here is the function in action:
procedure TGDICanvas.DoTextOut(const APosition: TPoint; const AText: String);
var
  UnicodeEnabledOS: Boolean;
  WideText: PWideChar;
  AnsiText: string;
  Size: Integer;
begin
  UnicodeEnabledOS := True;
  NeedFont(True);
  if UnicodeEnabledOS then
  begin
    Size := Utf8ToUnicode(nil, PChar(AText), 0);
    WideText := GetMem(Size * 2);
    Utf8ToUnicode(WideText, PChar(AText), Size);
    dynWindows.TextOutW(Handle, APosition.x, APosition.y, WideText, Size - 1);
    FreeMem(WideText);
  end
  else
  begin
    AnsiText := Utf8ToAnsi(AText);
    Windows.TextOut(Handle, APosition.x, APosition.y, PChar(AnsiText),
      Length(AnsiText));
  end;
end;
Notice the Size - 1 in the TextOutW call. If I use Size instead of
Size - 1, a wrong character is displayed at the end of my string,
even if I clear the buffer by filling it with zeroes before passing it to
the conversion routine. Why is that? Does Size already count the
null terminator?
Umm, I think that Utf8ToUnicode needs to be documented. Even some
comments in the code would help; currently there are no comments at
all.
5) When I try to convert strings that contain line endings, it doesn't
seem to work. What happens to line-ending marks in UTF-16? I mean, if
we are on Linux, a UTF-16 line-ending marker cannot be just 1 byte,
can it?
If I convert a string with a line ending and pass that to ExtTextOutW,
a wrong character appears in place of the line-ending marker.
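To make the arithmetic behind questions 2) and 3) concrete, here is a minimal sketch. It does not use Utf8ToUnicode at all; Utf16UnitsNeeded is a hypothetical helper, not part of the RTL, and it assumes the input is well-formed UTF-8. It only illustrates that a character above #FFFF takes two 16-bit units in UTF-16 (a surrogate pair), and that a null-terminated buffer needs room for one extra unit.

```pascal
program Utf16SizeSketch;

{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): counts how many UTF-16 code
// units a well-formed UTF-8 string needs, NOT counting a null terminator.
function Utf16UnitsNeeded(const S: AnsiString): Integer;
var
  I: Integer;
  B: Byte;
begin
  Result := 0;
  for I := 1 to Length(S) do
  begin
    B := Ord(S[I]);
    if (B and $C0) = $80 then
      Continue;          // continuation byte, already counted with its lead byte
    if B >= $F0 then
      Inc(Result, 2)     // 4-byte sequence: above #FFFF, needs a surrogate pair
    else
      Inc(Result);       // 1- to 3-byte sequence: a single UTF-16 code unit
  end;
end;

var
  Units: Integer;
begin
  // 'a' (1 byte), U+00E9 (2 bytes), U+1D11E (4 bytes), written as raw UTF-8 bytes
  Units := Utf16UnitsNeeded('a'#$C3#$A9#$F0#$9D#$84#$9E);
  WriteLn('UTF-16 code units: ', Units);
  // One extra unit for the null terminator, two bytes per unit
  WriteLn('bytes incl. terminator: ', (Units + 1) * SizeOf(WideChar));
end.
```

For this input the program reports 4 code units for 3 characters, and 10 bytes once the terminator is included. Whether the count returned by Utf8ToUnicode already includes that terminator is exactly what question 4) asks, so the sketch makes no assumption about it.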
thanks,
--
Felipe Monteiro de Carvalho