[fpc-devel] FPC 2.3.1 seems a mixed mess with Unicode support
Jonas Maebe
jonas.maebe at elis.ugent.be
Wed Sep 16 11:56:57 CEST 2009
On 16 Sep 2009, at 11:44, Michael Schnell wrote:
> Jonas Maebe wrote:
>>
>> Analysing strings by hand is not a very smart thing to do with unicode
>> strings.
>
> How should it be avoided if I want to react to user input or to a
> string read from a file?
Don't analyse them character by character, but use standard functions
to compare them. Any unicode support library worth its salt will offer
you many different ways to compare strings, because depending on the
context you may need different ways:
a) the locale may matter (e.g., depending on whether "." means
"decimal point" or "thousands separator", a comparison result may be
different)
b) you have many different ways to order (unicode) strings. E.g.,
these are the options that Apple's CFString comparison offers:
<http://developer.apple.com/mac/library/documentation/CoreFoundation/Reference/CFStringRef/Reference/reference.html#//apple_ref/doc/constant_group/String_Comparison_Flags>
(note that not all of those flags are about regular comparisons,
and some of them are just for performance reasons). See in particular
flags such as kCFCompareNonliteral, kCFCompareWidthInsensitive and
kCFCompareLocalized.
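To make point b) concrete, here is a minimal sketch of how the "same" two strings can compare equal or unequal depending on the kind of comparison. It uses Python's standard unicodedata module purely as a stand-in; the function names are made up for illustration, and the mapping to the CFString flags is only a rough analogy (canonical normalisation is similar in spirit to kCFCompareNonliteral, compatibility normalisation to kCFCompareWidthInsensitive), not a description of how CFString implements them.

```python
import unicodedata

# Hypothetical helper names, chosen for this illustration only.

def equal_literal(a: str, b: str) -> bool:
    """Code point by code point comparison (what a naive loop would do)."""
    return a == b

def equal_canonical(a: str, b: str) -> bool:
    """Treat canonically equivalent sequences as equal (NFC normalisation)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def equal_compat(a: str, b: str) -> bool:
    """Also ignore compatibility differences such as full-width forms (NFKC)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent
fullwidth_a = "\uff21"  # full-width Latin capital letter A

print(equal_literal(composed, decomposed))    # False
print(equal_canonical(composed, decomposed))  # True
print(equal_canonical(fullwidth_a, "A"))      # False
print(equal_compat(fullwidth_a, "A"))         # True
```

None of these is locale-aware; something like kCFCompareLocalized additionally requires collation rules for the locale in question, which this sketch does not attempt.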
This indeed causes problems with Pascal's generic comparison
operators. I guess we will either have to define a particular
behaviour for them (presumably whatever CodeGear chose), add some
global variable that you can set to influence the behaviour, or tell
people to use CompareText() and friends (and probably add variants
with various options).
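As a rough illustration of the last option (an explicit comparison routine with options, as opposed to an operator with one fixed behaviour), consider the sketch below. It is written in Python rather than Pascal, compare_text and its parameters are hypothetical names invented for this example, and it is not a proposal for what the FPC RTL should look like.

```python
import unicodedata

def compare_text(a: str, b: str, *, ignore_case: bool = False,
                 nonliteral: bool = False) -> int:
    """Hypothetical comparison with options; returns -1, 0 or 1.

    Without options it orders by code point, like a plain operator.
    """
    def key(s: str) -> str:
        if ignore_case:
            s = s.casefold()                     # Unicode case folding
        if nonliteral:
            s = unicodedata.normalize("NFC", s)  # canonical equivalence
        return s

    ka, kb = key(a), key(b)
    return (ka > kb) - (ka < kb)

# The bare operator has exactly one behaviour: code point order.
print("Zebra" < "apple")                                         # True
print(compare_text("Zebra", "apple"))                            # -1
print(compare_text("Zebra", "apple", ignore_case=True))          # 1
print(compare_text("e\u0301", "\u00e9"))                         # -1
print(compare_text("e\u0301", "\u00e9", nonliteral=True))        # 0
```

The point is only that the caller states which notion of "equal" or "less than" is wanted; a real implementation would also need locale-dependent collation, which is left out here.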
The upside of these complications (which have always existed, although
most people simply ignored them, so their programs only worked with one
or two locales and/or encodings) is that if you deal with them properly
in the context of unicode, your code will probably automatically
behave "correctly" with many locales/scripts.
Jonas