[fpc-devel] Unicode support (yet again)

Flávio Etrusco flavio.etrusco at gmail.com
Sun Sep 18 13:57:48 CEST 2011


On Sun, Sep 18, 2011 at 6:50 AM, Marco van de Voort <marcov at stack.nl> wrote:
> In our previous episode, Flávio Etrusco said:
>>
>> That's somewhat what I was thinking. Actually something like
>>
>>   UnicodeString = object
>>   (...)
> Such ability is not unique for an object. One can also do something like
> that with a native type.
>


Of course. That wasn't meant as a real implementation; I just decided
to write some code instead of explaining in words.
Basically, my point to the people discussing this endlessly without any
data or observations was that FPC already provides most of the tools
needed to build a non-native implementation and gather real, practical
data.

> It was discussed and rejected.
>  The trouble is that it is not that easy, consider the first thing a
> long time pascal user will do is fix his existing code which has many
> constructs that loop over a string:
>
> setlength(s2,length(s1));
> for i:=1 to length(s1) do
>  s2[i]:=s1[i];
>
> Now, to return codepoint[i], you need to parse all codepoints before [i].
>
> So instead of O(n) this loop suddenly becomes O(n^2)....

I hope then that either I'm wrong or that you change your mind ;-)
IMHO what must be changed is the way to deal with strings.
I must assume from this concern that you're talking about a
directive to make the String keyword instantiate a UnicodeString?
Also, IMVHO, in that compiler mode the code just needs to work, not be
fast, and the user code can then be updated/fixed.
One obvious way to mitigate this would be to cache the last
CodePoint->Char position in the string record, so that at least the
most common case (sequential indexing) is covered.
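
To make the idea concrete, here is a minimal sketch of such a cache,
assuming the string holds well-formed UTF-8 bytes. TUtf8Str, SeqLen,
MakeUtf8Str and CodePointAt are hypothetical names made up for this
illustration, not existing FPC types or RTL routines, and a real
implementation would also need validation and cache invalidation on
writes.

```pascal
program CachedCodePointIndex;
{$mode objfpc}{$H+}

type
  { Hypothetical wrapper (not an FPC type): UTF-8 bytes plus a one-entry cache. }
  TUtf8Str = record
    Data: AnsiString;     // UTF-8 encoded bytes
    LastIndex: SizeInt;   // code point index of the cached position (0 = none)
    LastOffset: SizeInt;  // byte offset in Data where that code point starts
  end;

{ Byte length of the UTF-8 sequence starting with lead byte B.
  Assumes well-formed input; no validation is attempted. }
function SeqLen(B: Byte): SizeInt;
begin
  if B < $80 then Result := 1
  else if B < $E0 then Result := 2
  else if B < $F0 then Result := 3
  else Result := 4;
end;

function MakeUtf8Str(const Bytes: AnsiString): TUtf8Str;
begin
  Result.Data := Bytes;
  Result.LastIndex := 0;
  Result.LastOffset := 0;
end;

{ Returns the bytes of the Index-th code point (1-based), or '' if out of range. }
function CodePointAt(var S: TUtf8Str; Index: SizeInt): AnsiString;
var
  I, Ofs: SizeInt;
begin
  Result := '';
  if Index < 1 then Exit;
  if (S.LastIndex > 0) and (Index >= S.LastIndex) then
  begin
    // Resume from the cached position instead of rescanning from byte 1.
    I := S.LastIndex;
    Ofs := S.LastOffset;
  end
  else
  begin
    // No usable cache entry (first access or backward jump): scan from the start.
    I := 1;
    Ofs := 1;
  end;
  while (I < Index) and (Ofs <= Length(S.Data)) do
  begin
    Inc(Ofs, SeqLen(Ord(S.Data[Ofs])));
    Inc(I);
  end;
  if Ofs > Length(S.Data) then Exit;
  // Remember where this code point starts for the next call.
  S.LastIndex := I;
  S.LastOffset := Ofs;
  Result := Copy(S.Data, Ofs, SeqLen(Ord(S.Data[Ofs])));
end;

var
  S: TUtf8Str;
  I: SizeInt;
  CP: AnsiString;
begin
  // "Flávio" with the a-acute spelled out as its two UTF-8 bytes, so the
  // example does not depend on the source file encoding.
  S := MakeUtf8Str('Fl'#$C3#$A1'vio');
  I := 1;
  repeat
    CP := CodePointAt(S, I);
    if CP <> '' then
      WriteLn(I, ': ', Length(CP), ' byte(s)');
    Inc(I);
  until CP = '';
end.
```

With this, a "for i:=1 to length" style loop advances one code point
per call, so the whole loop stays linear. Random or backward access
still falls back to a scan from the start.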

Best regards,
Flávio

PS. Sorry for the double post, Marco.


