[fpc-devel] Unicode support (yet again)

DaWorm daworm at gmail.com
Sun Sep 18 18:49:44 CEST 2011


On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth
<pascaldragon at googlemail.com> wrote:
> On 18.09.2011 17:48, DaWorm wrote:

But isn't it O(n^2) only when actually using unicode strings?
Wouldn't you also be able to do something like String.Encoding := Ansi,
so that all String[i] accesses in a loop would then be O(n) + x (where x
is the overhead of the run-time check that it is safe to just use a
memory offset, presumably fairly small)? Of course it would be up to the
user to choose to re-encode some string he got from the RTL or FCL that
way and to understand the consequences.
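
To make the cost difference concrete, here is a rough sketch. It is in
Python rather than Pascal, purely as an illustration, and the function
names utf8_char_at and ansi_char_at are made up for this example; they
are not RTL or FCL routines. It assumes the "unicode string" is stored as
UTF-8, so finding character i means scanning from the start, while the
single-byte case is just a bounds check plus an offset.

```python
def utf8_char_at(buf: bytes, index: int) -> str:
    """Return the index-th code point (0-based) of a UTF-8 buffer.

    The buffer has to be scanned from the start, so one access costs
    O(index) and a loop over every index costs O(n^2) overall.
    """
    count = -1
    start = None
    for pos, byte in enumerate(buf):
        # Bytes of the form 10xxxxxx are continuation bytes; anything
        # else starts a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return buf[start:pos].decode("utf-8")
    if start is not None:
        # The requested code point was the last one in the buffer.
        return buf[start:].decode("utf-8")
    raise IndexError(index)


def ansi_char_at(buf: bytes, index: int) -> str:
    """Single-byte encoding (Latin-1 assumed here): one bounds check
    plus a memory offset, so one access is O(1) and a loop is O(n)."""
    return chr(buf[index])


text = "naïve C"
utf8 = text.encode("utf-8")     # 8 bytes for 7 characters
ansi = text.encode("latin-1")   # 7 bytes for 7 characters
```

With these definitions, utf8_char_at(utf8, 6) and ansi_char_at(ansi, 6)
both give 'C', but only the second gets there without walking the
buffer.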

What assumptions is the typical String[i] user going to make about
what is returned?  There will be the types that are seeing if the
fifth character is a 'C' or something like that, and for those there
probably isn't too much that is going to go wrong, they might have to
switch to "C" instead, or the compiler can make the 'C' literal a
"unicode char which is really a string" conversion at compile time.
There may be the ones that want to turn a 'C' into a 'c' by flipping
the 6th bit, and that will indeed break, and in a Unicode world,
perhaps that should break, forcing using LowerCase as needed.  And
there are those (such as myself) who often use strings as buffers for
things like serial comms.  That code will totally break if I were to
try to use a unicode string buffer, but a simple addition of
String.Encoding := ANSI or RawByteString or ShortString in the first
line would fix that, or I could bite the bullet and recode that quick
and dirty code the right way.  My point is that trying to keep the bad
habits of a single byte string world in a unicode world is
counterproductive.  They aren't the same, and all attempts to make
them the same just cause more problems than they solve.
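
Two of those habits can be sketched as follows. Again this is Python,
only as an illustration; flip_case_bit and is_valid_utf8 are
hypothetical helper names, and the four-byte frame is invented for the
example, not taken from any real protocol.

```python
def flip_case_bit(ch: str) -> str:
    """The single-byte habit: toggle bit 0x20 (the 6th bit, counting
    from 1) to switch between upper and lower case."""
    return chr(ord(ch) ^ 0x20)


def is_valid_utf8(buf: bytes) -> bool:
    """Report whether a byte buffer can be read as UTF-8 text."""
    try:
        buf.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The bit trick happens to work for ASCII letters ...
ascii_ok = flip_case_bit("C") == "c"

# ... but not for Greek capital omega (U+03A9), whose lowercase form
# (U+03C9) is not 0x20 away. A real lowercase routine is needed.
omega_flipped = flip_case_bit("\u03a9")
omega_lowered = "\u03a9".lower()

# A made-up serial frame: STX, two payload bytes, ETX. As a raw byte
# buffer it indexes fine, but 0xC3 followed by 0x28 is not valid UTF-8,
# so treating the buffer as unicode text breaks.
frame = bytes([0x02, 0xC3, 0x28, 0x03])
second_byte = frame[1]
frame_is_text = is_valid_utf8(frame)
```

The point of the sketch is only that the byte buffer and the text
string want different types: the frame is fine as bytes and broken as
text, and the case flip is fine for 'C' and wrong for the omega.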

As for the RTL and FCL, presumably they wouldn't be doing any of this
String[i] stuff in the first place, would they? So they aren't going to
suffer that speed penalty.  Just because one type of code is slow,
doesn't mean everything is slow.

Jeff.


