[fpc-pascal] Unicode file routines proposal

Martin Schreiber fpmse at bluewin.ch
Tue Jul 1 11:08:47 CEST 2008


On Tuesday 01 July 2008 10.35:00 Mattias Gaertner wrote:
> > A good example is text layout calculation where it is necessary to
> > iterate over characters (glyphs) over and over again.
>
> Text layout nowadays needs to consider font widths and Unicode specials.
> Iterating from character to character should be hardly measurable
> compared to this. For example, synedit does not yet care much about font
> widths and Unicode specials, and the UTF-8 stepping is negligible.
>
I did it with UTF-8 and UCS-2; believe me, it was not negligible.
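
As a rough illustration of where the difference comes from, here is a
minimal sketch in C. It is not code from any toolkit; the function names
are hypothetical and the UTF-8 input is assumed to be well formed. Finding
the n-th character in UTF-8 means scanning bytes from the start, while in
UCS-2 it is a plain array access.

```c
#include <stddef.h>
#include <stdint.h>

/* Count code points in a UTF-8 buffer. Every byte is inspected;
   continuation bytes (bit pattern 10xxxxxx) are not counted.
   Assumption: the input is well-formed UTF-8. */
size_t utf8_count(const unsigned char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            n++;
    return n;
}

/* Byte offset of code point number `index` in a UTF-8 buffer.
   This is a linear scan: the cost grows with `index`. */
size_t utf8_offset(const unsigned char *s, size_t len, size_t index)
{
    size_t i = 0;
    while (i < len && index > 0) {
        i++;                                   /* step over the lead byte */
        while (i < len && (s[i] & 0xC0) == 0x80)
            i++;                               /* skip continuation bytes */
        index--;
    }
    return i;
}

/* UCS-2: physical index equals code point index, so access is O(1). */
uint16_t ucs2_at(const uint16_t *s, size_t index)
{
    return s[index];
}
```

A layout routine that walks back and forth over the same text pays the
scan in utf8_offset on every random access, unless it caches offsets.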

> > I think the best compromise for a GUI framework is reference-counted
> > widestrings where normally physical index = code point index. If one
> > needs characters which are not in the base plane, one must use
> > surrogate pairs and more complicated and slower processing. I assume
> > this will be seldom used.
>
> It depends on whether your code should solve a special problem or
> whether you write a library that should work for all. The RTL and FCL
> should work for all. So they must support UTF-16 and cannot use a
> limited widestring.
>
That's why I wrote "for a GUI framework". There we always have the option
of accessing the OS through optimized routines, independent of the RTL and
FCL, and of providing optimized string-handling routines for the chosen
internal string representation. What the toolkit user needs is automatic
conversion from the GUI framework's internal string type to the system
encoding. That already exists for widestrings.
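
To make the widestring compromise concrete, here is a minimal sketch in C
of stepping through 16-bit code units with a fast path for base-plane
characters. The function name utf16_next is hypothetical, and returning an
unpaired surrogate unchanged is an assumption of this sketch, not a rule
taken from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the code point that starts at s[*i] and advance *i past it.
   Precondition: *i < len.
   Base-plane characters occupy one unit. A high surrogate followed by
   a low surrogate is combined into one code point above U+FFFF.
   Assumption: an unpaired surrogate is returned as is. */
uint32_t utf16_next(const uint16_t *s, size_t len, size_t *i)
{
    uint16_t u = s[(*i)++];
    if (u >= 0xD800 && u <= 0xDBFF && *i < len) {
        uint16_t lo = s[*i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            (*i)++;
            return 0x10000u + (((uint32_t)(u - 0xD800) << 10)
                               | (uint32_t)(lo - 0xDC00));
        }
    }
    return u;
}
```

As long as the text stays in the base plane, the surrogate branch is never
taken and each step costs one comparison on top of the array access.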

Martin


