[fpc-devel] UTF-8 string literals

Mattias Gaertner nc-gaertnma at netcologne.de
Fri May 5 13:53:35 CEST 2017


Hi,

AFAIK FPC stores UTF-8 string literals (-Fcutf8) as widestrings
instead of UTF8String. Please correct me if I'm wrong.

This has several side effects:

1. When a literal contains a character outside the BMP, FPC stops with:
Error: UTF-8 code greater than 65535 found
For example:
const Eyes = '👀';

2. Assigning a UTF-8 literal to a UTF8String requires a
widestringmanager.
For example, characters outside ISO-8859-1 are mangled:
var u: UTF8String = 'äöüالعَرَبِيَّة';

3. PChar on a string literal does not work as expected. You get the
bytes of a widestring instead.
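
To make points 1 and 3 concrete, here is a small sketch of the two
encodings involved. It is written in Python, not Pascal, and only
illustrates the Unicode arithmetic; it assumes nothing about how FPC
stores literals internally. All variable names are made up for the
example.

```python
# Illustration only: plain Unicode encoding arithmetic, not FPC behaviour.

eyes = "\U0001F440"  # the character from the Eyes example in point 1

# Point 1: the code point is above 65535 ($FFFF), i.e. outside the BMP,
# so it does not fit into a single 16-bit code unit.
code_point = ord(eyes)
outside_bmp = code_point > 0xFFFF

# In UTF-16 it needs a surrogate pair (two 16-bit units).
utf16_bytes = eyes.encode("utf-16-le")
utf16_units = [
    int.from_bytes(utf16_bytes[i:i + 2], "little")
    for i in range(0, len(utf16_bytes), 2)
]

# In UTF-8 the same character is four bytes.
utf8_bytes = eyes.encode("utf-8")

# Point 3: even for a BMP character the byte sequences differ, so code
# that expects UTF-8 bytes behind a pointer sees something else if the
# data is actually 16-bit units.
a_utf8 = "\u00e4".encode("utf-8")       # a-umlaut as UTF-8
a_utf16 = "\u00e4".encode("utf-16-le")  # a-umlaut as UTF-16, little endian

print(hex(code_point), outside_bmp)
print([hex(u) for u in utf16_units])
print(utf8_bytes.hex(), a_utf8.hex(), a_utf16.hex())
```

The surrogate pair is the reason a store that keeps one 16-bit value
per character cannot represent the Eyes literal, while a byte-oriented
UTF-8 store can.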


What would happen if FPC were extended to store UTF-8
literals as UTF8String?
What would the disadvantages be?


Mattias


