[fpc-pascal] Unicode file routines proposal
Marco van de Voort
marcov at stack.nl
Tue Jul 1 09:23:52 CEST 2008
> On Mon, Jun 30, 2008 at 11:35 AM, Marco van de Voort <marcov at stack.nl> wrote:
> > borders?
>
> Gtk can load XML files, somewhat equivalent to our LFMs. They use
> UTF-8 everywhere.
GTK remains Unix-centric even on other systems. It doesn't have a firm footing
in both the Unix and the Windows world, as we do. I can't judge the wxWidgets
situation, since I know nobody who uses it.
> Java is cross-platform and uses UTF-16 everywhere.
Java has to emulate everything from the outside world anyway (read: put up a
barrier), and not having to do that is one of our fortes.
> multiple encodings:
>
> * More complex
> * Innovative solution, no known example of an implementation of this
> system exists = uncertainty if it works at all, or if it is convenient
> for developers
> * Depends on a not yet implemented string type
That needs to be done anyway, since widestring on Windows is the COM string
type, and that must also be retained. So it is about adding one string type
versus two, and the work will be huge with UTF-16 too. To make it worthwhile,
the best solution should be sought, not the quickest.
> * Potentially will have a higher performance than a single encoding
> system, but only if you use this new special string type
Certainly. Can you imagine loading a non-trivial file into a TStringList and
saving it again, with the heaps of conversions that involves?
Moreover, some important reasons are missing:
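
To make the conversion cost concrete, here is a minimal sketch (in Python, for brevity) that counts the conversions in a load/save round trip. The function names, the UTF-8 file encoding and the UTF-16 internal encoding are assumptions chosen for illustration; this is not the actual TStringList code.

```python
# Hypothetical illustration: count conversions when a list of lines is
# loaded into a container and saved again.

def roundtrip_single_encoding(raw_lines, file_encoding="utf-8",
                              internal_encoding="utf-16-le"):
    """Store every line in one fixed internal encoding."""
    conversions = 0
    stored = []
    for raw in raw_lines:
        # Convert on load: file encoding -> internal encoding.
        stored.append(raw.decode(file_encoding).encode(internal_encoding))
        conversions += 1
    saved = []
    for item in stored:
        # Convert on save: internal encoding -> file encoding.
        saved.append(item.decode(internal_encoding).encode(file_encoding))
        conversions += 1
    return saved, conversions


def roundtrip_native_encoding(raw_lines):
    """Store every line in the encoding it arrived in."""
    stored = list(raw_lines)   # no conversion on load
    saved = list(stored)       # no conversion on save
    return saved, 0


lines = ["héllo".encode("utf-8"), "wörld".encode("utf-8")]
saved_single, count_single = roundtrip_single_encoding(lines)
saved_native, count_native = roundtrip_native_encoding(lines)
print(count_single, count_native)
```

Both variants write back identical bytes, but the single-encoding variant converts every line twice.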
* Being able to declare the outside world in the right encoding, without
manually inserting conversions in each header.
* Does not make one of the two core platforms (Unix/Windows) effectively
second rate.
* Can be done in phases, IOW with lots of conversions in the beginning, but
with more and more routines ready in the right encoding later on.
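
As a rough sketch of what these points could look like in practice, the Python snippet below models a string that carries its encoding with it, so a conversion happens only when an API declared in a different encoding is called. `TaggedString`, `posix_style_call` and `windows_style_call` are hypothetical names invented for this illustration; they are not an existing or planned FPC type or routine.

```python
class TaggedString:
    """Hypothetical string that remembers the encoding of its payload."""

    def __init__(self, data: bytes, encoding: str):
        self.data = data
        self.encoding = encoding
        self.conversions = 0  # how many real conversions were needed

    def as_encoding(self, target: str) -> bytes:
        # Same encoding: hand over the bytes untouched.
        if target == self.encoding:
            return self.data
        # Different encoding: convert once, at the API boundary.
        self.conversions += 1
        return self.data.decode(self.encoding).encode(target)


def posix_style_call(name: TaggedString) -> bytes:
    """Stand-in for an API declared as taking UTF-8 (assumption)."""
    return name.as_encoding("utf-8")


def windows_style_call(name: TaggedString) -> bytes:
    """Stand-in for an API declared as taking UTF-16 (assumption)."""
    return name.as_encoding("utf-16-le")


name = TaggedString("résumé.txt".encode("utf-8"), "utf-8")
posix_bytes = posix_style_call(name)      # no conversion
windows_bytes = windows_style_call(name)  # one conversion
print(name.conversions)
```

The header only states the encoding it expects; the conversion, if any, is decided in one place instead of being written by hand at every call site.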
> Single encoding:
>
> * Simple, proved solution
Simple solution, complex implementation (needs conversions everywhere).
> * Does not need any new string type, can start being implemented immediately
It does. And you can start writing UTF-16 routines anyway.
> * Potentially has a lower performance due to string conversions.