[fpc-devel] Unicodestring branch, please test and help fixing

Martin Friebe fpc at mfriebe.de
Fri Sep 12 00:44:14 CEST 2008


listmember wrote:
>> Actually, UTF-8 can contain bidi info, it's indeed a matter of the
>> renderer.
> And, how do you propose doing a case-insensitive search in a given 
> text that contains multiple languages?
I assume you are speaking of multiple collations in one string?
IMHO you can't. But you could use a TStringList.

I also do not know of other apps that could do this (and it may not be 
possible). Look around. Databases, for example: AFAIK the most you can do 
is define a collation per column.

And how would you sort the following example, with mixed collation? Take 
the various German collations. ae can be used as a substitution for 
a-umlaut.

In some collations it sorts as ae (between ad and af), in others it sorts 
as "a-umlaut" (immediately behind "a"):
1)   a, ab, ae
2)   a, ae, ab
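
A minimal sketch of the two orderings above, in Python. The two key 
functions (key_letters, key_umlaut) are made up for illustration and are 
not a real implementation of any German collation; the second one assumes 
that a-umlaut gets the same primary weight as "a" and is only told apart 
by a secondary weight.

```python
# Hypothetical sort keys, only to illustrate orderings 1) and 2).
# Neither is a real implementation of a German collation.

def key_letters(s):
    # "ae" is just the letters a and e, compared one by one.
    return ([ord(c) for c in s], [0] * len(s))

def key_umlaut(s):
    # "ae" stands for a-umlaut: same primary weight as "a" (assumption),
    # distinguished only by a secondary weight.
    primary, secondary = [], []
    i = 0
    while i < len(s):
        if s[i:i + 2] == "ae":
            primary.append(ord("a"))
            secondary.append(1)
            i += 2
        else:
            primary.append(ord(s[i]))
            secondary.append(0)
            i += 1
    return (primary, secondary)

words = ["ab", "ae", "a"]
order_1 = sorted(words, key=key_letters)  # ['a', 'ab', 'ae']
order_2 = sorted(words, key=key_umlaut)   # ['a', 'ae', 'ab']
print(order_1)
print(order_2)
```

The same three strings, two keys, two different results. Nothing in the 
strings themselves says which key is the right one.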

How would you sort data where one source is of one collation, the other 
source of another (or even worse, the collation changes halfway through)? 
It is impossible by definition.
Taking the two strings "ab" and "ae" above, each of them can come first 
when sorted, depending on the collation, so if more than one collation is 
involved the result is undefined.

I even think that collation is not part of the string. It does not 
change the meaning of the string. It is only used in specific 
operations, and then it must be one collation for both strings. So if 
each of the strings had its own collation, that would cause an issue.
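
To illustrate, here is a rough Python sketch in which the collation is an 
argument of the compare operation rather than a property of either string. 
The names ORDERINGS and compare are hypothetical, and the two orderings are 
simply copied from lists 1) and 2) above.

```python
# Hypothetical orderings, copied from lists 1) and 2) above.
ORDERINGS = {
    "collation_1": ["a", "ab", "ae"],
    "collation_2": ["a", "ae", "ab"],
}

def compare(x, y, collation):
    # The collation belongs to the operation, not to x or y.
    rank = ORDERINGS[collation].index
    return (rank(x) > rank(y)) - (rank(x) < rank(y))

first = compare("ab", "ae", "collation_1")   # -1: "ab" sorts first
second = compare("ab", "ae", "collation_2")  # 1: "ae" sorts first
print(first, second)
```

If "ab" carried collation_1 and "ae" carried collation_2, there would be no 
single answer to give: the two calls disagree.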

Martin


