[fpc-devel] UTF-8 string literals

Sven Barth pascaldragon at googlemail.com
Sun May 7 12:13:18 CEST 2017


On 05.05.2017 16:08, "Sven Barth" <pascaldragon at googlemail.com> wrote:
>
> On 05.05.2017 16:03, "Juha Manninen" <juha.manninen62 at gmail.com> wrote:
> >
> > On Fri, May 5, 2017 at 2:53 PM, Mattias Gaertner
> > <nc-gaertnma at netcologne.de> wrote:
> > > 1. When using a character outside BMP FPC stops with:
> > > Error: UTF-8 code greater than 65535 found
> > > For example:
> > > const Eyes = '👀';
> >
> > I am copying a related post from the Lazarus list, by myself and Sven
> > Barth. It belongs here:
> >
> > On Fri, May 5, 2017 at 3:56 PM, Sven Barth via Lazarus
> > <lazarus at lists.lazarus-ide.org> wrote:
> > > That is mainly due to the compiler not supporting surrogate pairs for
> > > the UTF-8 -> UTF-16 conversion. If it would support them, then there
> > > wouldn't be a problem anymore...
> >
> > That is a serious bug. Getting codepoints right is the absolute
> > minimum requirement for Unicode support. Surrogate pairs are the
> > UTF-16 equivalent of multi-byte codepoints in UTF-8.
> >
> > Now I understand this was not caused by our UTF-8 run-time switch
> > "hack". It is a plain bug in FPC.
> > Is there a plan to fix it?
>
> Now it is fixed :D (revision 36116; maybe we should merge that to fixes
> once I or someone else tested a big endian target)

Okay, it works correctly on big endian targets as well (and Mac OS X 10.4
even has valid characters for the console to test with :D ). Thus this
change could be merged to 3.0.3.
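
For reference, the surrogate pair step discussed above can be sketched as
follows. This is only an illustration of the standard UTF-16 arithmetic, not
the code from revision 36116; ToSurrogates is a hypothetical helper name and
not an FPC RTL routine.

```pascal
program SurrogateSketch;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper (not part of the FPC RTL): splits a code point
// above U+FFFF into a UTF-16 high/low surrogate pair.
// Assumes CodePoint is in the range $10000..$10FFFF.
procedure ToSurrogates(CodePoint: Cardinal; out HighSur, LowSur: Word);
var
  V: Cardinal;
begin
  V := CodePoint - $10000;                 // leaves a 20-bit value
  HighSur := Word($D800 or (V shr 10));    // upper 10 bits
  LowSur  := Word($DC00 or (V and $3FF));  // lower 10 bits
end;

var
  HighSur, LowSur: Word;
begin
  // U+1F440 is the EYES character from the example quoted above.
  ToSurrogates($1F440, HighSur, LowSur);
  WriteLn(IntToHex(HighSur, 4), ' ', IntToHex(LowSur, 4)); // prints "D83D DC40"
end.
```

A code point at or below U+FFFF needs no such split, which is why only
characters outside the BMP triggered the error.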

Regards,
Sven