[fpc-devel] Memory consumed by strings

JoshyFun joshyfun at gmail.com
Sun Nov 23 16:51:04 CET 2008


Hello Daniël,

Sunday, November 23, 2008, 1:49:32 PM, you wrote:


DM> I am aware of that, but the combining cedilla is not in the "easy to
DM> process range" of UTF-8. In other words, you cannot do
DM> "if char[i]=combining_cedille" in UTF-8.

DM> Instead, in UTF-8 you need to make sure the string has enough
DM> characters left, and then compare multiple characters. Heck, you even
DM> need to take care of the fact that the combining cedilla can be
DM> encoded in 2, 3 or 4 bytes.
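
To make the multi-byte comparison concrete, here is a minimal sketch in
Python (chosen only for brevity, it is not FPC code; the helper name
find_combining_cedilla is hypothetical). It scans raw UTF-8 bytes for
U+0327 COMBINING CEDILLA with a bounds check before each two-byte
comparison. Note that in valid UTF-8 this particular code point always
encodes as exactly two bytes, 0xCC 0xA7; other combining marks can need
three or four.

```python
# U+0327 COMBINING CEDILLA always encodes as these two bytes in valid UTF-8.
COMBINING_CEDILLA_UTF8 = "\u0327".encode("utf-8")  # b"\xcc\xa7"


def find_combining_cedilla(data: bytes) -> int:
    """Return the byte offset of the first U+0327 in UTF-8 data, or -1.

    Hypothetical helper for illustration only.
    """
    i = 0
    # Stop one byte early so that data[i + 1] is always in range.
    while i + 1 < len(data):
        if data[i] == 0xCC and data[i + 1] == 0xA7:
            return i
        i += 1
    return -1


decomposed = "c\u0327a".encode("utf-8")   # 'c' + combining cedilla + 'a'
precomposed = "\u00e7a".encode("utf-8")   # precomposed U+00E7 + 'a'
print(find_combining_cedilla(decomposed))
print(find_combining_cedilla(precomposed))
```

Because 0xCC can only appear as a lead byte in valid UTF-8, a plain
byte-wise scan cannot match in the middle of another character.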

Combined and uncombined strings are different things for different
tasks. The only thing they have in common is the same visual
representation. A Unicode function such as "CharAt" applied to an
uncombined string must never report the combined character as its
result. Some functions are designed to work on uncombined strings and
others on combined ones, because some things cannot be done in one of
the two forms.
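
A small sketch of the distinction, in Python for brevity rather than
FPC. It assumes that "combined" corresponds to Unicode normalization
form NFC and "uncombined" to NFD; char_at is a hypothetical stand-in
for the "CharAt" function, indexing by code point without normalizing
anything.

```python
import unicodedata


def char_at(s: str, index: int) -> str:
    """Hypothetical CharAt: return the code point at index, unnormalized."""
    return s[index]


# Combined form: the c with cedilla is the single code point U+00E7.
combined = unicodedata.normalize("NFC", "fac\u0327ade")
# Uncombined form: 'c' followed by U+0327 COMBINING CEDILLA.
uncombined = unicodedata.normalize("NFD", combined)

# Same visual text, different code point sequences.
print(len(combined), len(uncombined))

# On the uncombined string, CharAt yields the base letter and the mark
# separately, never the precomposed character.
print(repr(char_at(uncombined, 2)), repr(char_at(uncombined, 3)))
print(repr(char_at(combined, 2)))
```

A function that must see the cedilla as a separate mark would work on
the uncombined string; one that counts user-visible letters by code
point is closer to correct on the combined one.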

-- 
Best regards,
 JoshyFun



