[fpc-pascal] Funny things about utf-8 strings on mac
Jonas Maebe
jonas.maebe at elis.ugent.be
Tue Jun 12 09:46:14 CEST 2007
On 12 jun 2007, at 09:28, Felipe Monteiro de Carvalho wrote:
> I edited my source code with TextWrangler (a macintosh text editor),
> setting the encoding to utf-8, and when I opened with Lazarus it would
> show the beginning of the file like this:
>
> Ôªøunit mainform;
>
> Notice the first 3 funny characters (actually on lazarus I see
> different characters, but they changed on copy+paste).
They are the standard marker that identifies a file as UTF-8 (the byte
order mark, bytes EF BB BF). This is not Mac-specific in any way; it is
part of the Unicode standard.
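
As a small illustration (in Python, which is not part of the original
exchange): the marker is U+FEFF encoded as UTF-8. A viewer that assumes a
single-byte legacy encoding shows those three bytes as three stray
characters, and the Mac OS Roman rendering matches the text quoted above.
The file contents below are made up for the demonstration.

```python
import codecs

# The UTF-8 marker (byte order mark) is U+FEFF encoded as UTF-8.
bom = "\ufeff".encode("utf-8")
assert bom == codecs.BOM_UTF8 == b"\xef\xbb\xbf"

# Hypothetical file contents: the marker followed by the first source line.
data = bom + b"unit mainform;"

as_mac_roman = data.decode("mac_roman")  # how a Mac OS Roman viewer shows it
as_latin1 = data.decode("latin-1")       # how an ISO 8859-1 viewer shows it
as_utf8 = data.decode("utf-8-sig")       # UTF-8 decoding that strips the marker

print(as_mac_roman)
print(as_latin1)
print(as_utf8)
```

The different results for Mac OS Roman and ISO 8859-1 would also be
consistent with the characters looking different in Lazarus and after
copy+paste, though that is only a guess.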
> (I suppose some kind
> of encoding setting), and why they make utf-8 strings suddenly stop
> working?
You said things did initially work with the UTF-8 marker in place.
The default code page used by FPC is ISO 8859-1. However, the scanner
detects the UTF-8 marker if it is present and then switches the code
page to UTF-8. You can also set the code page to UTF-8 manually using
{$codepage utf-8}.
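
The detection described above can be sketched roughly as follows. This is
a Python illustration of the idea only, not FPC's actual scanner code; the
names decode_source and DEFAULT_ENCODING are hypothetical, and the
{$codepage} directive is left out of the sketch.

```python
import codecs

DEFAULT_ENCODING = "iso-8859-1"  # the default code page named above


def decode_source(raw: bytes) -> str:
    """Decode source bytes, switching to UTF-8 only if the marker is present."""
    if raw.startswith(codecs.BOM_UTF8):
        # Marker found: drop its three bytes and decode the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # No marker: fall back to the default single-byte code page.
    return raw.decode(DEFAULT_ENCODING)


# Hypothetical source line containing one non-ASCII character.
text = "s := 'é';"
with_marker = codecs.BOM_UTF8 + text.encode("utf-8")
without_marker = text.encode("utf-8")

print(decode_source(with_marker))     # decoded as UTF-8
print(decode_source(without_marker))  # UTF-8 bytes misread as ISO 8859-1
```

Without the marker, the two bytes that encode the accented letter are read
as two separate ISO 8859-1 characters, which is the kind of breakage a
mangled or missing marker would be expected to cause.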
Perhaps the UTF-8 marker got mangled somehow, by Lazarus or something
else. I don't know why it worked again afterwards when you removed the
marker.
Jonas