[fpc-pascal] Funny things about utf-8 strings on mac
Felipe Monteiro de Carvalho
felipemonteiro.carvalho at gmail.com
Tue Jun 12 09:28:06 CEST 2007
Hi,
I edited my source code with TextWrangler (a Macintosh text editor),
setting the encoding to UTF-8, and when I opened it in Lazarus it
showed the beginning of the file like this:
Ôªøunit mainform;
Notice the first 3 funny characters (in Lazarus I actually see
different characters, but they changed on copy+paste). FPC didn't seem
to care about them, and the file compiled without problems, so I
ignored them.
But then UTF-8 strings stopped working.
This, for example:
procedure TForm1.FormCreate(Sender: TObject);
var
  MyStr: string;
  i: Integer;
begin
  MyStr := 'Texto ł ñ ø ß á';
  WriteLn('[TForm1.FormCreate] Printing string values');
  WriteLn('Length: ', Length(MyStr));
  for i := 1 to Length(MyStr) do
    Write(IntToHex(Integer(MyStr[i]), 2) + ' ');
  WriteLn('');
  Self.Caption := MyStr;
  Label1.Caption := 'átomo tômo não';
  Button1.Caption := 'łñø˘ðßßăŏ';
end;
would produce nonsense output, like this:
[TForm1.FormCreate] Printing string values
Length: 15
54 65 78 74 6F 20 20 20 20 20 20 20 20 20 20
However, if I remove the 3 funny characters, everything works normally
again: I see my UTF-8 characters on screen, and the text output
is:
[TForm1.FormCreate] Printing string values
Length: 20
54 65 78 74 6F 20 C5 82 20 C3 B1 20 C3 B8 20 C3 9F 20 C3 A1
Does anyone know what those funny characters are (I suppose some kind
of encoding setting), and why they make UTF-8 strings suddenly stop
working?
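One thing that can be checked: in the Mac OS Roman character set, Ô, ª
and ø are the bytes EF, BB and BF, which is the same sequence a UTF-8
byte order mark uses. Below is a minimal sketch that tests whether a
file starts with those three bytes. The program name CheckBom and the
helper HasUtf8Bom are made up for illustration, and the sketch assumes
the characters pasted above are a faithful Mac OS Roman view of the real
bytes. It says nothing about why the compiler then treats the string
literals differently.

```pascal
program CheckBom;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: returns True when the file starts with the
// bytes EF BB BF (the sequence used by a UTF-8 byte order mark).
function HasUtf8Bom(const FileName: string): Boolean;
var
  Stream: TFileStream;
  Head: array[0..2] of Byte;
begin
  Result := False;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    // Only compare when the file has at least three bytes.
    if Stream.Read(Head, SizeOf(Head)) = SizeOf(Head) then
      Result := (Head[0] = $EF) and (Head[1] = $BB) and (Head[2] = $BF);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: checkbom <file>');
    Halt(1);
  end;
  if HasUtf8Bom(ParamStr(1)) then
    WriteLn('File starts with EF BB BF')
  else
    WriteLn('File does not start with EF BB BF');
end.
```

Running it on the unit file before and after removing the 3 characters
should show whether they really are those bytes.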
thanks,
--
Felipe Monteiro de Carvalho