[fpc-pascal] Unicode chars losing information
Graeme Geldenhuys
mailinglists at geldenhuys.co.uk
Tue Mar 9 01:18:26 CET 2021
On 08/03/2021 7:49 pm, Jonas Maebe via fpc-pascal wrote:
> It's not possible to safely use unicodestring without
> knowing how 16bit unicode works. The compiler can't solve that.
I disagree. Java does just that! The issue is the assumption of using
array indexing into a string. I guess developers should stop doing
that.
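To illustrate the indexing problem, here is a minimal Java sketch. The class and method names (IterationDemo, byIndex, byCodePoint) are made up for this example; charAt, length and codePoints are standard java.lang.String methods (codePoints needs Java 8 or later). Note that Java's own charAt also returns a single UTF-16 code unit, so the index-based loop splits a surrogate pair, while the code-point iteration does not.

```java
import java.util.ArrayList;
import java.util.List;

class IterationDemo {
    // Index-based access: yields one entry per UTF-16 code unit,
    // so a character outside the BMP shows up as two surrogates.
    static List<String> byIndex(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            out.add(Integer.toHexString(s.charAt(i)));
        }
        return out;
    }

    // Code-point iteration: yields one entry per Unicode code point,
    // with surrogate pairs already combined.
    static List<String> byCodePoint(String s) {
        List<String> out = new ArrayList<>();
        s.codePoints().forEach(cp -> out.add(Integer.toHexString(cp)));
        return out;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by U+1F600
        System.out.println(byIndex(s));     // [61, d83d, de00]
        System.out.println(byCodePoint(s)); // [61, 1f600]
    }
}
```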
The important point is:
But developers should be able to use Unicode strings without needing
to know the ins and outs of Unicode and UTF-16 encoding. At least
that's what's possible with Java and other languages.
FPC needs to introduce class helpers or something with methods like
MyUnicodeString.CharAt(x) and if the char at position x is a
surrogate, then return the surrogate. Implicitly include whatever is
needed to make that work. Other helper methods could return
the Byte or CodePoint at position x - depending on what the developer
wants. Naming these methods in a logical way is key, as they become
self-documenting. No need for 10 web pages explaining how to work with
a [unicode] string.
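As a rough sketch of what such self-documenting helpers could look like, here is the idea written in Java, since that is the language used for comparison above. Everything here is an assumption made for illustration: the wrapper class UnicodeText and its method names are hypothetical, indexes are 0-based, and byteAt assumes the caller wants UTF-8 bytes. It is not an existing FPC or Java API. Only the calls it delegates to (codePointCount, offsetByCodePoints, codePointAt, Character.toChars, getBytes) are standard Java.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: all positions are counted in code points,
// so callers never see half of a surrogate pair.
final class UnicodeText {
    private final String value;

    UnicodeText(String value) {
        this.value = value;
    }

    // Length in code points, not UTF-16 code units.
    int length() {
        return value.codePointCount(0, value.length());
    }

    // Code point at code-point position x (0-based).
    int codePointAt(int x) {
        int unitOffset = value.offsetByCodePoints(0, x);
        return value.codePointAt(unitOffset);
    }

    // Whole character at code-point position x; a surrogate pair
    // is returned together as a two-unit String.
    String charAt(int x) {
        return new String(Character.toChars(codePointAt(x)));
    }

    // Byte at byte position x of the UTF-8 encoding (assumed encoding).
    // Re-encodes on every call; fine for a sketch, wasteful in real code.
    byte byteAt(int x) {
        return value.getBytes(StandardCharsets.UTF_8)[x];
    }
}
```

With a value of "a", U+1F600, "b", length() gives 3 and charAt(1) gives the complete pair, where plain indexing would have returned only the high surrogate.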
FPC (and Delphi) really need to get with the times.
Regards,
Graeme