[fpc-devel] RTL and Unicode filenames operations.

Martin Schreiber fpmse at bluewin.ch
Sun Mar 21 17:29:16 CET 2010


On Sunday 21 March 2010 16:17:49 dmitry boyarintsev wrote:
> On Tue, Mar 21, 2010 at 2:01 PM, Marco van de Voort
>
> <http://bugs.freepascal.org/view.php?id=15795> wrote:
> > I'll reiterate my opinion that first a decision about what the working
> > stringtype of the RTL will be. IMHO there is no decent solution till
> > there is a real utf-8 type (read cpnewstr)
>
> The whole reason for the package, is not to wait for cpnewstr being
> implemented. That's the point that package's interface uses UnicodeString,
> rather than ambiguous "String" type (that's currently system dependent).
>
> > Note that the recent update does not tackle my main gripe at all, the
> > main units interface is still UTF-16 on Unix, it just translates it to
> > UTF-8 in the backend.
>
> Yes, it does so and i see no problems here. The package also converts
> UTF-16 to ansi encoding for Win9x.
> Converting string from UTF16 to UTF8 (or any other encoding) is not
> much time penalty comparing to the time of the file operation itself.

Agreed. The MSEgui file utils do the same with good results. If I understand
correctly, Marco insists that UTF-8 should be used for file operations on
Linux because it is the "native" file name encoding there. However, there are
still Linux installations and partitions that use encodings other than UTF-8,
and AFAIK Linux file names are just arrays of bytes without any defined
encoding. Invalid UTF-8 sequences are therefore possible and must be filtered
and converted to valid sequences, whether the API uses UTF-8 or UTF-16.
I don't know how that can be achieved, BTW.
On Windows and Mac, UTF-16 is the native file name encoding, so I don't
understand why the API can't be standardised on UTF-16.
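
To illustrate the invalid-sequence problem, here is a minimal sketch in
Python (not FPC or MSEgui code). It assumes a hypothetical file name stored
as raw bytes in Latin-1, and shows two possible ways of turning it into text:
a lossy one that substitutes U+FFFD, and a reversible one that uses Python's
"surrogateescape" error handler. The helper names are made up for this
example.

```python
# Hypothetical file name as raw bytes: Latin-1 "e acute" (0xE9) is not
# a valid UTF-8 sequence on its own.
raw_name = b"caf\xe9.txt"


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_text_lossy(name: bytes) -> str:
    """Replace each undecodable byte with U+FFFD; the original is lost."""
    return name.decode("utf-8", errors="replace")


def to_text_roundtrip(name: bytes) -> str:
    """Map each undecodable byte to a lone surrogate (U+DC80..U+DCFF)."""
    return name.decode("utf-8", errors="surrogateescape")


def from_text_roundtrip(text: str) -> bytes:
    """Inverse of to_text_roundtrip: restore the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


lossy = to_text_lossy(raw_name)          # valid text, but not reversible
escaped = to_text_roundtrip(raw_name)    # reversible, but not valid Unicode
restored = from_text_roundtrip(escaped)  # identical to raw_name again

# The lossy form is well-formed and can be re-encoded as UTF-16 without error.
lossy_utf16 = lossy.encode("utf-16-le")
```

The trade-off is the one described above: the lossy form is valid in both
UTF-8 and UTF-16 but can no longer name the original file, while the
reversible form keeps the bytes at the cost of carrying code units that are
not valid Unicode.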

Martin


