[fpc-pascal] Unicode file routines proposal

Felipe Monteiro de Carvalho felipemonteiro.carvalho at gmail.com
Tue Jul 1 14:11:16 CEST 2008


On Tue, Jul 1, 2008 at 9:02 AM, Marco van de Voort <marcov at stack.nl> wrote:
> A solution for unicode should be for everything, not just for UIs and
> filenames. I should be able to carry data within it also, because otherwise
> we are having this discussion next week again if Joost needs unicode for DB
> related issues etc.

Ok, but how do you know that everyone wants to store data in the
"system" encoding?

What if I want to store data using ansistring in Windows because my
file is UTF-8?

In my proposal a TWideStringList would simply be implemented as well,
so both ways of storing data are available everywhere.

> How? I can't express the foreign encoding because I have no type for it. I
> only have ansistring that can mean pretty much everything, and that
> constitutes no compiletime safety.

ansistrings don't mean everything. They mean either ISO or utf-8. They
can never hold a utf-16 string (or at least there are no routines to
cover this case).
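
To make the point about untyped byte strings concrete, here is a minimal
sketch. It is written in Python rather than Pascal, and its bytes value
is only an analogy for an ansistring (a byte sequence with no encoding
tag); none of the names below are FPC or Lazarus API.

```python
# Analogy only: Python `bytes` stands in for a Pascal ansistring,
# i.e. a run of bytes that carries no information about its encoding.
raw = "\u00e9".encode("utf-8")        # e-acute stored as UTF-8: bytes C3 A9

# The same two bytes, interpreted under the two encodings discussed here.
as_utf8 = raw.decode("utf-8")         # one character: e-acute
as_latin1 = raw.decode("iso-8859-1")  # two characters: A-tilde, copyright sign

# ascii() keeps the output independent of the terminal encoding.
print(len(raw), ascii(as_utf8), ascii(as_latin1))
```

Nothing in the value itself says which reading is the intended one; that
knowledge has to live in the surrounding code or convention.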

>> I bet you would convert automatically from whatever to ansi when going
>> to an ansistring, but Lazarus uses utf-8 in ansistrings.
>
> But that is lazarus specific.

Isn't Lazarus by far the largest project using Free Pascal?

> Because the decision to put utf-8 in ansistrings is too fundamentally flawed
> to implement such a thing, since it is perfectly legal for an ansistring
> not to contain utf8

We concluded that utf-8 in ansistrings is a very convenient solution
for us which works very well today. It provided a smooth migration
path and keeps the vast majority of code working.

We may some day migrate to a possible utf8string type when it gets implemented.

-- 
Felipe Monteiro de Carvalho


