[fpc-devel] Peculiar string comparison behaviour ("0-0" and "-00" are same)

Denis Kozlov dezlov at gmail.com
Sun Aug 15 22:59:57 CEST 2021


Hello,

I have encountered a peculiar behaviour in the FPC string comparison 
functions (UnicodeSameStr, UnicodeSameText, UnicodeCompareStr, and the 
Wide* variants). Basically, on some systems the strings "0-0" and "-00" 
are considered to be the same.

FPC string comparison functions use the WideStringManager, which I 
believe calls Win32CompareUnicodeString/Win32CompareWideString on 
Windows, which themselves call CompareStringW function (WinAPI). I was 
already aware of some Unicode equality rules for various locales, like 
"sharp s" (U+00DF) and "ss", but "0-0" and "-00" took me by surprise. As 
I researched more, it became apparent that the "-" (dash, minus, hyphen) 
symbol can be completely ignored by the CompareStringW function. 
Stranger yet, it affects only some systems, despite having configured 
the same locale and region.
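
A minimal sketch for checking this on a given machine, assuming a 
Windows target and the CompareStringW declaration from the FPC Windows 
unit. The program name and variable names are only illustrative. It 
prints the result of the RTL functions next to a direct WinAPI call, so 
the two routes can be compared. No expected output is given, because 
the result differs between systems.

```pascal
program HyphenCompare;
{$mode objfpc}{$H+}

uses
  Windows, SysUtils;

var
  A, B: UnicodeString;
  R: LongInt;
begin
  A := '0-0';
  B := '-00';

  // RTL route: as far as I understand, this goes through the
  // installed WideStringManager.
  WriteLn('UnicodeCompareStr: ', UnicodeCompareStr(A, B));
  WriteLn('UnicodeSameStr:    ', UnicodeSameStr(A, B));

  // Direct WinAPI route with no comparison flags.
  // Return values: 0 = failure, 1 = less, 2 = equal, 3 = greater.
  R := CompareStringW(LOCALE_USER_DEFAULT, 0,
                      PWideChar(A), Length(A),
                      PWideChar(B), Length(B));
  WriteLn('CompareStringW:    ', R);
end.
```

If the RTL really ends up in CompareStringW, UnicodeCompareStr should 
report 0 exactly when the direct call returns 2.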

I found several relevant articles which discuss the peculiarities of 
the CompareString function:
http://archives.miloush.net/michkap/archive/2005/05/05/414845.html
http://archives.miloush.net/michkap/archive/2007/09/20/5008305.html

Question 1:
Is it OK for those FPC functions to treat strings like "0-0" and "-00" 
as the same?

Question 2:
Can including the SORT_STRINGSORT flag in the CompareStringW call fix 
this peculiar behaviour, and should FPC include it?
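
A rough sketch for trying the flag directly, again assuming a Windows 
target. STRINGSORT_FLAG and Cmp are hypothetical names introduced here; 
the constant carries the SORT_STRINGSORT value from the Windows SDK 
headers and is declared locally so the sketch does not depend on the 
Windows unit exporting it. Whether the two results differ is exactly 
what needs to be checked on an affected system.

```pascal
program HyphenStringSort;
{$mode objfpc}{$H+}

uses
  Windows;

const
  // SORT_STRINGSORT as defined in the Windows SDK headers.
  STRINGSORT_FLAG = $00001000;

// Thin wrapper around CompareStringW for the user default locale.
// Returns 0 = failure, 1 = less, 2 = equal, 3 = greater.
function Cmp(const A, B: UnicodeString; Flags: DWORD): LongInt;
begin
  Result := CompareStringW(LOCALE_USER_DEFAULT, Flags,
                           PWideChar(A), Length(A),
                           PWideChar(B), Length(B));
end;

begin
  // Default behaviour (word sort).
  WriteLn('flags = 0:       ', Cmp('0-0', '-00', 0));
  // Same comparison with the string sort flag set.
  WriteLn('SORT_STRINGSORT: ', Cmp('0-0', '-00', STRINGSORT_FLAG));
end.
```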

Regards,
Denis


