[fpc-pascal] Funny things about utf-8 strings on mac

Jonas Maebe jonas.maebe at elis.ugent.be
Tue Jun 12 10:08:37 CEST 2007


On 12 jun 2007, at 10:00, Felipe Monteiro de Carvalho wrote:

>> The default code page used by FPC is 8859-1. However, the scanner
>> detects the UTF-8 marker if present, and when it finds it then it
>> switches the code page to UTF-8. You can also set the code page
>> manually to UTF-8 using {$codepage utf-8}.
>
> Why does the codepage matter in this case? I would imagine that FPC
> just reads my string as a bunch of bytes... fpc doesn't need to
> perform any operations with it.

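The compiler internally stores such strings as widestrings, so the
bytes in the source file have to be converted, and that conversion
depends on the code page. I don't know the details of what the scanner
does exactly with UTF-8 and why it does so, but there is quite a bit of
UTF-8-specific code in the scanner.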
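
To illustrate why the code page matters once a literal is converted
rather than copied byte for byte, here is a minimal sketch in Python.
It is not FPC code and does not claim to mirror what the scanner does
internally; the function name and the sample literal are made up for
the example. It decodes the same source bytes once as ISO 8859-1 and
once as UTF-8:

```python
def decode_literal(source_bytes: bytes, codepage: str) -> str:
    """Convert the raw bytes of a string literal to Unicode text,
    interpreting them according to the given source code page.
    (Hypothetical helper, for illustration only.)"""
    return source_bytes.decode(codepage)


# The bytes an editor writes to disk for the one-character literal
# U+00E9 (e with acute accent) when the file is saved as UTF-8.
raw = "\u00e9".encode("utf-8")  # two bytes: 0xC3 0xA9

# Interpreted with the 8859-1 default: each byte becomes its own character.
as_latin1 = decode_literal(raw, "iso-8859-1")

# Interpreted as UTF-8 (what a UTF-8 marker or {$codepage utf-8} asks for).
as_utf8 = decode_literal(raw, "utf-8")

print(len(raw), len(as_latin1), len(as_utf8))  # prints: 2 2 1
```

With the 8859-1 interpretation the single character turns into two
(U+00C3 followed by U+00A9), which is the kind of mismatch that shows
up when a UTF-8 source is read with the wrong code page.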


Jonas


