[fpc-pascal] Unicode file routines proposal

Marco van de Voort marcov at stack.nl
Tue Jul 1 14:49:54 CEST 2008


> On Tue, Jul 1, 2008 at 9:21 AM, Marco van de Voort <marcov at stack.nl> wrote:
> > Well, euh, the main reason is that euh, most programs and data on the system use
> > the system encoding?
> 
> So you are saying that FPC should privilege platform-specific software
> development to cross-platform software development?

No, we should privilege cross-platform software development over a portable emulation
of a third platform (the Java principle).

> This is in the inverse direction of all other cross-platform development
> platforms in existence.

I only have the FPC and Lazarus requirements on the table here. I don't care
about the others. They have other starting points (being very Unix- or
Windows-centric, or working with portable sandboxes).
 
> If you are writing cross-platform software you will wish to avoid as
> much as possible the system routines, and a known encoding is good.
> 
> Florian's proposal shines here. You get the string with no conversion
> and a marker for the encoding, so you can convert it to whatever you
> want easily.

And in my case you specify that you want UTF-16 by making the parameter
"utfstring16", and the compiler inserts a conversion for you if somebody calls
it with a utfstring8. No manual runtime check is necessary.
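
A minimal sketch of that idea, assuming FPC 3.0 or later. The existing
UnicodeString and UTF8String types stand in for the proposed "utfstring16"
and "utfstring8" (the proposal names are not real types), and ShowName is a
hypothetical routine made up for illustration:

```pascal
program ConvDemo;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring;  // installs a widestring manager on Unix-like systems
{$endif}

// Hypothetical routine: it states that it wants UTF-16 purely through
// the type of its parameter.
procedure ShowName(const S: UnicodeString);
begin
  WriteLn('UTF-16 code units received: ', Length(S));
end;

var
  U8: UTF8String;
begin
  U8 := 'file.txt';
  // The argument is UTF-8 but the parameter is UTF-16, so the compiler
  // inserts the UTF-8 to UTF-16 conversion at this call site (assumption:
  // FPC 3.0+, where UTF8String carries its own code page).
  ShowName(U8);
end.
```

The caller never inspects an encoding marker here; the declared parameter
type alone decides whether a conversion happens.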
 
> But it doesn't solve the TStringList problem, because there you have
> no parameters to know the encoding of the file being loaded.

No, there is no solution for that except making the string type really fat,
which is not our way.
 
> > Then I'd say you convert. But that is the point. The need for conversion should be
> > the exception (different from the default system encoding), not the rule.
> 
> I think there should be no conversion at all (unless explicitly asked)
> in the contents of the stringlist.

Well, that would mean the TStringList is a blind store without any methods. It
isn't, since any operation requires knowledge of what is inside.

> >> In my system I propose that simply a TWideStringList be implemented,
> >> so both ways of storing data are available everywhere.
> >
> > But I don't have a UTF-8 type in your system to operate on.
> 
> How do you know what I want to do with the data?

Does it matter? I just want to be able to tailor to the most common
scenarios. See my other message, which restates the proposal in simpler terms.

> Or save them back to another file? (or any operations which don't involve
> system routines which need a specific string encoding)

You've really lost me now. I think you are still confusing general
Unicode strings with unicodifying a few routines that take filenames.


