[fpc-pascal] Case insensitive comparison of strings with non-ascii characters

theo xpde at theo.ch
Sun Jul 26 09:43:06 CEST 2009


> lowercasemapping(a)=lowercasemapping(b)
>
> is the same as:
>
> IsSameText(a,b)
>
> is wrong at unicode levels.
>
>   
@JoshyFun

It depends on what you expect from such a function and what sort of
input data you have to deal with.
I wouldn't say it is wrong. It is not accurate for every language and
every Unicode detail, but it's fast.
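
To illustrate the trade-off, here is a minimal sketch of such a fast
comparison. It is not code from this thread: SimpleLower and
SameTextSimple are hypothetical names, and the mapping table is
deliberately tiny (ASCII plus the Latin-1 capitals, which sit exactly
$20 below their lowercase forms).

```pascal
program FastCompare;
{$mode objfpc}{$H+}

// Hypothetical helper: one-to-one lowercase mapping for a small range.
// Covers A..Z and U+00C0..U+00DE (skipping U+00D7, the multiplication sign).
function SimpleLower(C: WideChar): WideChar;
begin
  case Ord(C) of
    $41..$5A, $C0..$D6, $D8..$DE:
      Result := WideChar(Ord(C) + $20);
  else
    Result := C;
  end;
end;

// Hypothetical helper: compares code unit by code unit after mapping.
// Fast, but it can never match strings of different length,
// so 'STRASSE' and 'stra' + U+00DF + 'e' always compare as different.
function SameTextSimple(const A, B: UnicodeString): Boolean;
var
  I: Integer;
begin
  Result := Length(A) = Length(B);
  if not Result then
    Exit;
  for I := 1 to Length(A) do
    if SimpleLower(A[I]) <> SimpleLower(B[I]) then
      Exit(False);
end;

var
  Big, Small: UnicodeString;
begin
  // 'K' + U+00D6 + 'LN' versus 'k' + U+00F6 + 'ln'
  Big := UnicodeString('K') + WideChar($D6) + UnicodeString('LN');
  Small := UnicodeString('k') + WideChar($F6) + UnicodeString('ln');
  WriteLn(SameTextSimple(Big, Small));
end.
```

A one-to-one table like this is cheap, but anything that changes the
length of the string (such as "ß" versus "SS") is out of its reach.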

In your strict sense, AnsiCompareText doesn't work either.

Even Swiss German de_CH (my language) differs from de_DE.
For example, if a Swiss German user is looking for the word "schließlich"
(finally) in a German text, he will type "schliesslich" in the search box,
because the letter "ß" does not exist on Swiss German keyboards.

Does AnsiCompareText report these strings as equal? No. The same goes
for 'ö' -> 'oe' or 'à' -> 'A'.
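
A search box that should accept "schliesslich" for "schließlich" needs
an extra folding step before the comparison. The following is only a
sketch under assumptions: the strings are UTF-8, SearchKey is a
hypothetical helper, and the two replacements stand in for whatever
rules a given language really needs.

```pascal
program LooseMatch;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // UTF-8 byte sequences, spelled out so the source file encoding
  // does not matter.
  SharpS = #$C3#$9F;     // U+00DF, sharp s
  SmallOUml = #$C3#$B6;  // U+00F6, o with diaeresis

// Hypothetical helper: folds a UTF-8 string into a loose search key.
// Assumption: only these two expansions are wanted; a real table
// would be language specific and much longer.
function SearchKey(const S: string): string;
begin
  Result := StringReplace(S, SharpS, 'ss', [rfReplaceAll]);
  Result := StringReplace(Result, SmallOUml, 'oe', [rfReplaceAll]);
  // SysUtils.LowerCase only touches A..Z, so the remaining UTF-8
  // bytes are left alone.
  Result := LowerCase(Result);
end;

begin
  WriteLn(SearchKey('schlie' + SharpS + 'lich') = SearchKey('Schliesslich'));
end.
```

Comparing the folded keys instead of the raw strings makes both
spellings match, at the price of also matching things a strict
comparison would keep apart.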


> Write unicode functions in UTF8 is almost non-sense, most unicode
> operations are not like we are used in the ANSI world

The latter is certainly true, but I don't understand what it has to do
with UTF-8 or UTF-16.

> The code is not optimized but if somebody wants to use them please ask

Yes please!

Regards Theo





