[fpc-devel] Unicode support in RTL - Roadmap

Jeff Wormsley daworm10 at comcast.net
Mon Nov 24 14:55:42 CET 2008


Martin Friebe wrote:
> I must agree with the "FPC can not do it all automatically" line (as 
> much as I regret it, and admit the beauty there would be, if fpc could).
>
> What I mean is:
>
> 1) Any Application/Program that currently compiles and works (using 
> non-utf8, never mind if ascii or ansi) will keep working, if compiled 
> using *non* utf8 mode.
This is reasonable.  It also implies that perhaps what everyone is 
trying to do is impossible.  With plain strings, or Ansi strings, we 
have code that works today.  If you change any of those to UTF*, then 
code that uses things such as SetLength, Length, stringvar[index], 
copy(string, index, count), pos etc. cannot work 100% reliably.  You 
don't know what the programmer wants when he says stringvar[3].  Does he 
mean the third character in the string?  Or the third byte in the memory 
array represented by the string (perhaps he was using a string as a 
buffer)?  If you assume one or the other, then once one element of a 
string no longer equals one byte, you'll be wrong half of the time, no 
matter which UTF type you are using or what locale you are in.

It almost seems to me that if you want to use UTF strings as the 
default, you should either throw errors or at least stern warnings on 
any use of Length, SetLength, stringvar[index] et al. and force any 
code using them to be rewritten with UTF variants.  It would make more 
sense to knowingly declare all code using such constructs broken in a 
Unicode environment than to leave it to chance that the way the code 
is now interpreted is the way the coder originally intended.
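
To make the stringvar[3] ambiguity concrete, here is a small sketch.  It 
is in Python rather than Pascal, purely as a neutral illustration, and 
it says nothing about what FPC does or will do.  The sample word and the 
helper names (third_character, third_byte) are made up for this 
example.  Note that Python indexes from 0, so Pascal's stringvar[3] 
corresponds to index 2 here.

```python
# Illustration only: the same text seen as characters and as UTF-8 bytes.
# All names below are hypothetical, invented for this sketch.

text = "na\u00efve"          # five characters: n, a, i-with-diaeresis, v, e
raw = text.encode("utf-8")   # the accented letter needs two bytes in UTF-8


def third_character(s: str) -> str:
    """What a programmer thinking in characters expects from stringvar[3]."""
    return s[2]


def third_byte(b: bytes) -> int:
    """What a programmer using the string as a byte buffer expects."""
    return b[2]


char_count = len(text)   # length counted in characters
byte_count = len(raw)    # length counted in bytes

print(char_count, byte_count)
print(repr(third_character(text)), hex(third_byte(raw)))
```

The two lengths differ, and the "third element" is either a whole 
accented letter or only the first half of its two-byte encoding.  Code 
written against one meaning gets the other silently if the underlying 
type changes.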

I know much of my code would break just using AnsiString as opposed to 
the original counted string.  For me, *any* UTF* version discussed here 
would break it even more.

I don't have any need for Unicode, so feel free to ignore anything I 
say.  But I don't want my code breaking in unpredictable ways, either, 
because the underlying string types change on me behind my back (i.e., 
in the RTL/FCL).

Jeff.
--
I haven't smoked for 2 years, 3 months and 1 week, saving $3,736.95 and 
not smoking 24,913.01 cigarettes.


