[fpc-devel] UTF-8 string literals

Juha Manninen juha.manninen62 at gmail.com
Fri May 5 16:03:23 CEST 2017


On Fri, May 5, 2017 at 2:53 PM, Mattias Gaertner
<nc-gaertnma at netcologne.de> wrote:
> 1. When using a character outside BMP FPC stops with:
> Error: UTF-8 code greater than 65535 found
> For example:
> const Eyes = '👀';

I am copying a related exchange between Sven Barth and me from the
Lazarus list, because it belongs here:

On Fri, May 5, 2017 at 3:56 PM, Sven Barth via Lazarus
<lazarus at lists.lazarus-ide.org> wrote:
> That is mainly due to the compiler not supporting surrogate pairs for the
> UTF-8 -> UTF-16 conversion. If it would support them, then there wouldn't be
> a problem anymore...

That is a serious bug. Getting codepoints right is the absolute
minimum requirement for Unicode support. Surrogate pairs are to
UTF-16 what multi-byte sequences are to UTF-8: the way to encode a
codepoint that does not fit into a single code unit.
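
For the Eyes example above (U+1F440), the expected UTF-16 result is
plain arithmetic. A minimal sketch of it follows. It is only an
illustration, not the compiler's actual code, and the name
ToSurrogatePair is made up for this mail:

```pascal
program SurrogateSketch;
{$mode objfpc}

// Hypothetical helper (not part of the FPC RTL or compiler):
// splits a codepoint above U+FFFF into a UTF-16 surrogate pair.
// Assumes CodePoint is in the range $10000..$10FFFF.
procedure ToSurrogatePair(CodePoint: Cardinal; out HighSur, LowSur: Word);
var
  V: Cardinal;
begin
  V := CodePoint - $10000;               // 20 bits remain
  HighSur := Word($D800 + (V shr 10));   // upper 10 bits
  LowSur  := Word($DC00 + (V and $3FF)); // lower 10 bits
end;

var
  H, L: Word;
begin
  ToSurrogatePair($1F440, H, L);
  // Prints: D83D DC40
  WriteLn(HexStr(LongInt(H), 4), ' ', HexStr(LongInt(L), 4));
end.
```

So a correct UTF-8 -> UTF-16 conversion of that literal should give
the two code units $D83D $DC40 instead of an error.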

Now I understand this was not caused by our UTF-8 run-time switch
"hack". It is a plain bug in FPC.
Is there a plan to fix it?

Juha


