[fpc-devel] Unicode RTL

Daniël Mantione daniel.mantione at freepascal.org
Thu Nov 17 12:42:14 CET 2005



On Thu, 17 Nov 2005, Mattias Gaertner wrote:

> On Thu, 17 Nov 2005 11:31:45 +0100
> "Dr. Karl-Michael Schindler" <schindler at physik.uni-halle.de> wrote:
> 
> > Hi
> > 
> > Following this discussion, I want to throw in my 2 cents as well:
> >   In the really long term (like 5 or 10 years from now), the solution
> > should be as "clean" as possible, with as few awkward parts caused by
> > backward compatibility as possible. This favors a more separate solution
> > with a compatibility layer. Sure enough, this means more work to set
> > it up and maintain it. Therefore, it will take longer to have it
> > running, but in the end it will prevent the situation I'd like to
> > call the A20 gate situation.
> 
> Who knows the future?
> Maybe in 10 years 16bit non multi 'byte' encoding is sufficient for all
> remaining languages.
> Or maybe 32bit encodings will become the default.

Deciding between UCS-2 and UTF-16 could be left up to the
implementers of such a runtime library; all special UTF-16 positions (the
surrogates) are unallocated in UCS-2, so the two do not conflict. IMHO the
advantage of UCS-2 is that you can use the [] array index safely. Or you can
indeed decide that you want a super wide string... Variable-length strings
have their problems though; I've got some problems in my collection that
would be quite hard to solve in a Unicode-proof way with UTF-8 strings.
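
To make the indexing argument concrete, here is a minimal sketch in Python (chosen only for illustration; it is not FPC code, and the helper names utf16_units and is_surrogate are hypothetical). It shows that a plane 0 character occupies one 16-bit unit, while a character beyond plane 0 occupies two units that both fall in the surrogate range, which is why [] indexing by unit is only safe as long as no surrogates occur.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def is_surrogate(unit):
    """True if a 16-bit unit lies in the surrogate range D800..DFFF."""
    return 0xD800 <= unit <= 0xDFFF

bmp_char = "\u4e2d"        # U+4E2D, a plane 0 character
supp_char = "\U0002A6A5"   # U+2A6A5, a plane 2 character

bmp_units = utf16_units(bmp_char)    # one unit, not a surrogate
supp_units = utf16_units(supp_char)  # two units, both surrogates

# The same two characters in UTF-8 take 3 and 4 bytes respectively,
# so byte indexing into UTF-8 is never a character index for such text.
bmp_utf8_len = len(bmp_char.encode("utf-8"))
supp_utf8_len = len(supp_char.encode("utf-8"))
```

With pure plane 0 text, unit index i is also character index i; once a surrogate pair appears, every later index is shifted by one unit.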

I checked the situation a bit. There is one living language that has
symbols allocated beyond plane 0, and that is, not surprisingly, Chinese.
These are roughly 40000 characters allocated in plane 2, archaic ones though.
Dead languages are allocated in plane 1.

You can decide to support them, or not. This is also up to font
designers; try whether you can render &#173733; (U+2A6A5) in your browser.
Mine did not in several fonts.
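
As a small worked check of the plane numbers above, the following Python sketch (illustrative only; plane_of and fits_ucs2 are hypothetical helper names) derives the plane from a code point by dropping its low 16 bits, and tests whether a code point can be stored in a single UCS-2 unit.

```python
def plane_of(code_point):
    """Return the Unicode plane (0..16) that a code point belongs to."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (code_point,))
    return code_point >> 16

def fits_ucs2(code_point):
    """True if the code point is in plane 0 and is not a surrogate position."""
    in_plane_0 = plane_of(code_point) == 0
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return in_plane_0 and not is_surrogate

test_char = 173733                 # the character from the mail, U+2A6A5
test_plane = plane_of(test_char)   # plane 2, so UCS-2 cannot hold it
```

Anything for which fits_ucs2 returns False needs either a surrogate pair (UTF-16), a multi-byte sequence (UTF-8), or a 32-bit string type.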

> Speaking for lazarus: we want to support the whole unicode and UTF8 is the
> easiest to achieve that.

Fair enough; there is nothing wrong with that. What you need to consider 
though is the following:

* How many Delphi apps in the wild are Unicode proof?
* How many apps would be Unicode proof if the libraries used
  widestrings?

That is IMHO what is going on in the minds of the people who are asking for
this.

So, I can tell Juras B. to go whining somewhere else and not bother, but I
cannot deny the validity of his points.

Daniël

