[fpc-devel] Unicode support (yet again)
Mattias Gaertner
nc-gaertnma at netcologne.de
Thu Sep 15 18:01:48 CEST 2011
On Thu, 15 Sep 2011 17:21:47 +0200
Felipe Monteiro de Carvalho <felipemonteiro.carvalho at gmail.com> wrote:
> On Thu, Sep 15, 2011 at 11:15 AM, Graeme Geldenhuys
> <graemeg.lists at gmail.com> wrote:
> > in fpGUI:
> > UTF8Copy(...)
> > UTF8CharAtByte(...)
> >
> > in fpGUI:
> > Length(...) result is in bytes
> > UTF8Length(...) result is in "characters"
>
> Well, here we see why I started this thread. fpGUI and the LCL are
> implementing things which should be on the RTL ...
>
> But the RTL is not UTF-8 friendly and the result is things like these,
> or the worse variants:
>
> // file operations
> function FileExistsUTF8(const Filename: string): boolean;
> function FileAgeUTF8(const FileName: string): Longint;
> function DirectoryExistsUTF8(const Directory: string): Boolean;
> function ExpandFileNameUTF8(const FileName: string): string;
> function ExpandUNCFileNameUTF8(const FileName: string): string;
> function ExtractShortPathNameUTF8(Const FileName : String) : String;
> function FindFirstUTF8(const Path: string; Attr: Longint; out Rslt: TSearchRec): Longint;
> function FindNextUTF8(var Rslt: TSearchRec): Longint;
> procedure FindCloseUTF8(var F: TSearchrec);
> function FileSetDateUTF8(const FileName: String; Age: Longint): Longint;
> function FileGetAttrUTF8(const FileName: String): Longint;
> function FileSetAttrUTF8(const Filename: String; Attr: longint): Longint;
> function DeleteFileUTF8(const FileName: String): Boolean;
> function RenameFileUTF8(const OldName, NewName: String): Boolean;
> function FileSearchUTF8(const Name, DirList : String; ImplicitCurrentDir : Boolean = True): String;
> function FileIsReadOnlyUTF8(const FileName: String): Boolean;
> function GetCurrentDirUTF8: String;
> function SetCurrentDirUTF8(const NewDir: String): Boolean;
> function CreateDirUTF8(const NewDir: String): Boolean;
> function RemoveDirUTF8(const Dir: String): Boolean;
> function ForceDirectoriesUTF8(const Dir: string): Boolean;
>
> Lazarus is literally being forced to implement its own RTL...
Same for MSEGui and fpGUI.
I guess all three are of the opinion that a cross-platform library is
easier to use and maintain with only one string type across all platforms.
The FPC team, on the other hand, has to solve an impossible puzzle:
compatibility with old code requires AnsiStrings in the system
encoding, compatibility with new Delphi requires UTF-16 under Windows,
Linux requires UTF-8 byte encoding, and automatic conversion between
these will slow down all uses of TStrings.
Call me a pessimist, but my guess is, whatever the FPC team does, the
three libraries will still provide their own functions.
> With the currently planned Unicode RTL it will just get worse, we will
> then need to either migrate to UnicodeString (a lot of work, no
> benefit at all) or give up using the RTL at all and simply finish
> rolling our own set of functions. We will even need to roll our own
> TStringsUTF8, TFileStreamUTF8, etc...
It seems to me that, for applications using UTF-8 AnsiStrings, all of
the above functions and classes could be handled by a special
widestring manager plus a file function manager, right?
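A rough sketch of what I mean by a file function manager, just to make
the idea concrete. The record, the hook types and the Utf8* functions
are hypothetical names made up for this mail; nothing like them exists
in the RTL. The only RTL calls used are SysUtils.FileExists,
SysUtils.DeleteFile and Utf8ToAnsi, and I am assuming the application
keeps UTF-8 bytes in ordinary AnsiStrings:

```pascal
program FileFuncManagerSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical hook types; not part of the FPC RTL.
  TFileExistsHook = function(const FileName: string): Boolean;
  TDeleteFileHook = function(const FileName: string): Boolean;

  // Hypothetical "file function manager": one record of replaceable
  // hooks, in the spirit of the existing widestring manager.
  TFileFunctionManager = record
    FileExistsHook: TFileExistsHook;
    DeleteFileHook: TDeleteFileHook;
  end;

// Hooks for callers that hold UTF-8 in AnsiStrings: convert to the
// system encoding at the boundary, then delegate to SysUtils.
function Utf8FileExists(const FileName: string): Boolean;
begin
  Result := SysUtils.FileExists(Utf8ToAnsi(FileName));
end;

function Utf8DeleteFile(const FileName: string): Boolean;
begin
  Result := SysUtils.DeleteFile(Utf8ToAnsi(FileName));
end;

var
  Manager: TFileFunctionManager;
begin
  // Install the UTF-8 aware hooks once at startup.
  Manager.FileExistsHook := @Utf8FileExists;
  Manager.DeleteFileHook := @Utf8DeleteFile;

  // Callers go through the manager instead of a *UTF8 function.
  WriteLn(Manager.FileExistsHook('no-such-file.txt'));
end.
```

The point is that the conversion lives in one replaceable place instead
of in a parallel set of *UTF8 functions. Note that Utf8ToAnsi only
converts non-ASCII characters properly if a suitable widestring manager
is installed (under Linux, for example, via the cwstring unit), and
names that cannot be represented in the system encoding would still be
lost, so this is an illustration and not a complete solution.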
Mattias