Summary on Re: [fpc-pascal] Unicode file routines proposal

Marco van de Voort marcov at stack.nl
Tue Jul 1 14:58:23 CEST 2008


> On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort <marcov at stack.nl> wrote:
> > My proposal is having both a UTF8string and a UTF16string, on all platforms that support
> > unicode. So I don't get this remark.
> 
> Unless I understood your proposal wrong it involves a TMarcoString
> which will be declared like this:
> 
> {$ifdef Linux}
>   If SystemEncoding = utf-8 then TMarcoString = Utf8String
>   else TMarcoString = ansistring;
> {$endif}
> {$ifdef Windows}
>   If WindowsNT then TMarcoString = utf16string
>   else TMarcoString = ansistring;
> {$endif}
> {$ifdef Darwin}
>   TMarcoString = utf8string;
> {$endif}
> 
> Just how do you implement a string routine with TMarcoString? Fill it
> with ifdefs?

No. Just utf8string and utf16string, with tutf16string aliased to the
identifier that Tiburon uses for it.

But on Linux the RTL is mostly utf-8, and on Delphi the RTL is mostly
utf-16. And if you pass the utf-16 filename that you got from the Lazarus
.dfm (that is apparently already set to remain utf-16) to a filename routine,
a conversion will automatically happen.

People who make an FPC distro can decide whether their Linux version contains
a full complement of overloaded utf16 routines or not. If not, more conversions
will happen if you have pure utf-16 code, but that can be worthwhile for
embedded/minimalist distributions if those ever emerge.

And only the few special cases with var are a problem, as you correctly
pointed out, and these few cases can be fixed for the RTL by overloading
them also for utf16.
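
To illustrate the var case: a var parameter must match the type of the
actual argument exactly, so no implicit conversion can bridge the two string
types there. A minimal sketch, assuming a current FPC where UTF8String and
UnicodeString (UTF-16) stand in for the utf8string and utf16string discussed
here. AddTrailingSlash is a hypothetical routine, not an actual RTL name.

```pascal
program VarOverloadSketch;
{$mode objfpc}{$H+}

{ Hypothetical routine that modifies its argument in place.
  UTF8String / UnicodeString are stand-ins for utf8string / utf16string. }
procedure AddTrailingSlash(var S: UTF8String); overload;
begin
  if (Length(S) = 0) or (S[Length(S)] <> '/') then
    S := S + '/';
end;

{ Second overload, so that utf-16 callers can pass their variable directly. }
procedure AddTrailingSlash(var S: UnicodeString); overload;
begin
  if (Length(S) = 0) or (S[Length(S)] <> '/') then
    S := S + '/';
end;

var
  A: UTF8String;
  B: UnicodeString;
begin
  A := '/tmp';
  B := '/tmp';
  AddTrailingSlash(A);
  { Without the UnicodeString overload this call would be rejected,
    because a var parameter cannot take a converted temporary. }
  AddTrailingSlash(B);
  WriteLn(A);
  WriteLn(B);
end.
```

With const or value parameters the compiler can insert the conversion
itself, which is why only the var cases need the extra overload.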

> > It is just that on unix, the fileroutines will be defined as utf8string
> So you are going to convert in non utf8 unix?

Maybe I should have said "in the native encoding" then. So if it's a
utf-16 unix it will be utf-16. In principle at least. We will have to see
how this fares with the shared character of the unix rtl.

Note that both are also possible, e.g. the most used string routines (like
extractfilename etc.) can simply be overloaded to support both.
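
As a rough sketch of what "overloaded to support both" could look like,
assuming a current FPC where UTF8String and UnicodeString (UTF-16) play the
roles of utf8string and utf16string. NameOnly is a hypothetical stand-in for
a routine like extractfilename, not the RTL implementation.

```pascal
program OverloadBothSketch;
{$mode objfpc}{$H+}

{ Hypothetical helper: returns the part of Path after the last separator. }
function NameOnly(const Path: UTF8String): UTF8String; overload;
var
  i: Integer;
begin
  i := Length(Path);
  { '/' and '\' are single bytes in UTF-8 and never appear inside a
    multi-byte sequence, so scanning byte by byte is safe here. }
  while (i > 0) and not (Path[i] in ['/', '\']) do
    Dec(i);
  Result := Copy(Path, i + 1, Length(Path) - i);
end;

{ Same logic on UTF-16 code units, so no conversion is needed. }
function NameOnly(const Path: UnicodeString): UnicodeString; overload;
var
  i: Integer;
begin
  i := Length(Path);
  while (i > 0) and (Path[i] <> '/') and (Path[i] <> '\') do
    Dec(i);
  Result := Copy(Path, i + 1, Length(Path) - i);
end;

var
  P8: UTF8String;
  P16: UnicodeString;
begin
  P8 := '/home/user/readme.txt';
  P16 := '/home/user/readme.txt';
  WriteLn(NameOnly(P8));   { resolves to the UTF8String overload }
  WriteLn(NameOnly(P16));  { resolves to the UnicodeString overload }
end.
```

Each caller stays in its own encoding, and a distro that leaves out the
second overload only pays with an extra conversion at the call site.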



