[fpc-pascal] Case insensitive comparison of strings with non-ascii characters
JoshyFun
joshyfun at gmail.com
Sun Jul 26 16:23:10 CEST 2009
Hello FPC-Pascal,
Sunday, July 26, 2009, 9:43:06 AM, you wrote:
t> In your strict sense, AnsiCompareText didn't work either.
Yes, the Ansi functions also fail for some common languages. But we are
used to "simulating" a CompareText with lowercase(a)=lowercase(b), where
the same character may fold to different representations.
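To make that "simulated" comparison concrete, here is a minimal sketch. SameTextSimple is a hypothetical name used only for illustration, and the sketch assumes the plain SysUtils LowerCase, which only touches the ASCII letters 'A'..'Z'. The non-ASCII test strings are written as explicit UTF-8 bytes so the source file encoding does not matter.

```pascal
program FoldCompare;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: the usual "simulated" CompareText.
// LowerCase only converts 'A'..'Z', so bytes >= $80 pass through unchanged.
function SameTextSimple(const A, B: AnsiString): Boolean;
begin
  Result := LowerCase(A) = LowerCase(B);
end;

begin
  // Plain ASCII: the simulation works.
  WriteLn(SameTextSimple('Pascal', 'PASCAL'));   // TRUE
  // U+00D6 (bytes C3 96) against U+00F6 (bytes C3 B6), both UTF-8 encoded:
  // the bytes differ and are never folded, so the comparison fails.
  WriteLn(SameTextSimple(#$C3#$96, #$C3#$B6));   // FALSE
end.
```

Swapping in AnsiLowerCase would not settle it either, since its result depends on the installed string manager and locale.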
t> Does AnsiCompareText report these strings as equal? No. Same with 'ö' ->
'oe' or 'à'->'A'
Yeah, you are completely right.
>> Write unicode functions in UTF8 is almost non-sense, most unicode
t> operations are not like we are used in the ANSI world
t> The latter is certainly true, but I don't understand what it has to do
t> with UTF-8 or UTF-16.
Because Unicode operations often need to scan forward and backward,
rescan, make several passes, etc., processing the string in native UTF-8
wastes CPU rather than saving it, except for some trivial operations. It is
usually faster to convert the string to UTF-16 or UTF-32 once and perform
all the operations there than to encode and decode UTF-8 constantly.
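As a minimal sketch of that approach (an illustration only; DecodeUtf8 and TCodePoints are hypothetical names, and the decoder assumes well-formed UTF-8 and does no validation), the string is decoded once into an array of code points, and every later pass can then index characters directly instead of re-walking the bytes:

```pascal
program Utf8Once;
{$mode objfpc}{$H+}

type
  TCodePoints = array of Cardinal;

// Hypothetical helper: decodes well-formed UTF-8 into code points in one pass.
// Malformed input is NOT detected; this is only a sketch.
function DecodeUtf8(const S: AnsiString): TCodePoints;
var
  I, N, Extra: Integer;
  B: Byte;
  CP: Cardinal;
begin
  Result := nil;
  SetLength(Result, Length(S));   // upper bound: one code point per byte
  N := 0;
  I := 1;
  while I <= Length(S) do
  begin
    B := Ord(S[I]);
    // The lead byte tells how many continuation bytes follow.
    if B < $80 then begin CP := B; Extra := 0; end
    else if (B and $E0) = $C0 then begin CP := B and $1F; Extra := 1; end
    else if (B and $F0) = $E0 then begin CP := B and $0F; Extra := 2; end
    else begin CP := B and $07; Extra := 3; end;
    Inc(I);
    // Each continuation byte contributes its low 6 bits.
    while (Extra > 0) and (I <= Length(S)) do
    begin
      CP := (CP shl 6) or Cardinal(Ord(S[I]) and $3F);
      Inc(I);
      Dec(Extra);
    end;
    Result[N] := CP;
    Inc(N);
  end;
  SetLength(Result, N);           // trim to the real number of code points
end;

var
  CPs: TCodePoints;
begin
  // 'a', U+00F6, 'b' written as explicit UTF-8 bytes (4 bytes in total)
  CPs := DecodeUtf8('a'#$C3#$B6'b');
  WriteLn(Length(CPs));        // 3
  WriteLn(CPs[1] = $F6);       // TRUE: direct index, no rescan of the bytes
end.
```

Once the array exists, scanning backward or looking ahead is just index arithmetic; in UTF-8 each of those steps has to find the character boundaries again.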
>> The code is not optimized but if somebody wants to use them please ask
t> Yes please!
I'll try to make it compile :) The code is mostly experimental, so
some functions are implemented simply to work and return a value, not
to return it fast.
I'll post a link on the list as soon as I can confirm that it at least
compiles.
--
Best regards,
JoshyFun