[fpc-devel] Unicode support (yet again)
Luiz Americo Pereira Camara
luizmed at oi.com.br
Fri Sep 16 04:49:45 CEST 2011
On 15/9/2011 23:11, Luiz Americo Pereira Camara wrote:
> On 15/9/2011 12:21, Felipe Monteiro de Carvalho wrote:
>>
>> Lazarus is literally being forced to implement its own RTL...
>>
>> With the currently planned Unicode RTL it will just get worse, we will
>> then need to either migrate to UnicodeString
>
> No.
>
> Lazarus can continue to use UTF-8.
>
> There will just be an implicit conversion when calling those functions.
> The overhead is minimal.
>
> Lazarus already does the (explicit) conversion for some functions
> with no problems at all. In fact, the code will be clearer.
Take the example of FileExists.

The current LCL implementation: the UTF-8 -> UTF-16 conversion is
explicit and requires auxiliary code:

  // Wide-API backend: decodes the UTF-8 name, then calls the W function.
  function FileGetAttrWide(const FileName: String): Longint;
  begin
    Result:=Integer(Windows.GetFileAttributesW(PWideChar(UTF8Decode(FileName))));
  end;

  // Function pointer that selects the backend.
  var
    FileGetAttr_ : function (const FileName: String): Longint =
      @FileGetAttrWide;

  function FileGetAttrUTF8(const FileName: String): Longint;
  begin
    Result:=FileGetAttr_(FileName);
  end;

  function FileExistsUTF8(const Filename: string): boolean;
  var
    Attr: Longint;
  begin
    Attr:=FileGetAttrUTF8(FileName);
    if Attr <> -1 then
      Result:= (Attr and FILE_ATTRIBUTE_DIRECTORY) = 0
    else
      Result:=False;
  end;

The pure UTF-16 RTL: one implicit conversion is done at the call site,
and the code is clean and direct:

  function FileExists(const Filename: unicodestring): boolean;
  var
    Attr: Longint;
  begin
    Attr:=FileGetAttr(FileName);
    if Attr <> -1 then
      Result:= (Attr and FILE_ATTRIBUTE_DIRECTORY) = 0
    else
      Result:=False;
  end;

  function FileGetAttr(const FileName: unicodeString): Longint;
  begin
    Result:=Integer(Windows.GetFileAttributesW(PWideChar(FileName)));
  end;

The duplicated UTF-8/UTF-16 RTL: a conversion is still necessary, but it
is done internally and explicitly:

  function FileExists(const Filename: utf8string): boolean;
  var
    Attr: Longint;
  begin
    Attr:=FileGetAttr(FileName);
    if Attr <> -1 then
      Result:= (Attr and FILE_ATTRIBUTE_DIRECTORY) = 0
    else
      Result:=False;
  end;

  function FileGetAttr(const FileName: utf8String): Longint;
  begin
    Result:=Integer(Windows.GetFileAttributesW(PWideChar(UTF8Decode(FileName))));
  end;

So:
- There will be a conversion anyway, as there is today. In fact, the
current code is worse because of the temporary variables created by the
chained typecasts/function calls.
- The duplicated UTF-8/UTF-16 RTL only adds extra code with no
performance gain.
- The fpc core/Marco proposal will help Lazarus have clearer/smaller
code (at least regarding those RTL functions; TStrings etc. is another
story, and that is where the real problem resides).
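
For illustration, the call site in the pure UTF-16 case could look like the minimal sketch below. NameLength is a hypothetical stand-in for an RTL routine declared with a UnicodeString parameter, and the file name is made up; neither comes from the RTL or the LCL. The sketch only shows that the caller keeps its UTF8String and writes no UTF8Decode. Whether non-ASCII text is converted correctly depends on the compiler version and the installed widestring manager, which this sketch does not cover.

```pascal
program ImplicitConversionSketch;
{$mode objfpc}{$H+}

// Hypothetical stand-in for an RTL routine that takes a UnicodeString,
// as FileExists does in the pure UTF-16 RTL case.
function NameLength(const FileName: UnicodeString): Integer;
begin
  Result := Length(FileName);  // counts UTF-16 code units
end;

var
  Utf8Name: UTF8String;  // the caller keeps its strings as UTF-8
begin
  Utf8Name := 'example.txt';  // hypothetical file name (ASCII only)

  // No UTF8Decode here: the compiler inserts the string conversion
  // because the argument and parameter string types differ.
  WriteLn(NameLength(Utf8Name));
end.
```

The compiler may report the implicit string conversion as a warning or hint at the call; that is the same conversion the LCL code currently spells out by hand.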
Luiz
>
>
> A conversion when calling those functions is necessary anyway because:
>
> - if LCL changes to UnicodeString, the conversion will be needed on
> Unix (UTF-16 -> UTF-8)
> - creating its own UTF-8 RTL functions will do the conversion internally
> anyway
>
> So there is no problem at all with Marco's proposal.
>
> Luiz
>