[fpc-devel] Unicode support (yet again)

Graeme Geldenhuys graemeg.lists at gmail.com
Wed Sep 14 08:40:31 CEST 2011


On 14/09/2011 03:56, Luiz Americo Pereira Camara wrote:
> 
> I propose that the above behavior be implemented as a type named RTLString

The Object Pascal language already has enough damn string types. I
really don't think we should be adding fuel to the fire, by adding yet
more string types!


> So the RTL under unix will have functions compiled with UTF8 strings 
> giving no overhead interacting with native API
> The RTL under Windows will have compiled functions with UTF16 strings 
> giving no overhead with native API

That's exactly what I said.


> If a program passes a UnicodeString to a RTL function under Windows no 
> conversion is made
> When this same program is compiled under unix the UnicodeString should 
> be converted to UTF8 automatically using the encoding info of the string

No, why must unix environments take a performance hit?? This is not
needed if UnicodeString is really what the name suggests: any Unicode
string type. The Unicode standard defines UTF-8, UTF-16 and UTF-32, so
UnicodeString should really be any of those encodings - living up to
its name.

If FPC has true unicode support, then all functions should work correctly
with just the UnicodeString type. That type's encoding is based on the
native encoding of each platform. NO performance hit required.


I'd even be happier if UnicodeString was dropped too, and String becomes
unicode enabled. One less string type to worry about.

String could be defined as follows... [ignore the syntax]

IFDEF unix
  String = String(utf8);
ENDIF
IFDEF windows
  String = String(utf16);
ENDIF
IFDEF OldDelphi
  String = AnsiString;  //  or if some String(xxx) could be used
ENDIF


If you wanted your project to use some other specific encoding, you
could simply define your own string type and use that. The various
string types know what encoding they are in, so auto-conversion is
possible too (with possible data loss in the unicode -> ansi case).
e.g.:
type
  { say I want to use UTF-32 in my apps for some reason }
  TfpgString = String(utf32);

var
   s: String;  //  as defined above - could be utf8, utf16 etc..
   m: TfpgString;
   a: AnsiString;
begin
  m := 'Hello world!';
  s := m;  // automatic conversion happens here
  a := s;  // auto conversion, with data loss (compiler warning)
end;
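
For comparison, here is a rough sketch of how close one can get with
declarations that actually compile. It assumes a compiler that supports
code-page-aware AnsiString declarations (the Delphi 2009 style
"type AnsiString(codepage)" syntax, which FPC only gained in releases after
this discussion). TNativeString and TLatin1String are hypothetical names
made up for the illustration, and the String(utf8) syntax proposed above is
not used because it does not exist.

```pascal
program EncodingSketch;
{$mode objfpc}{$H+}

type
  { Hypothetical alias: one "native" string type per platform. }
{$IFDEF WINDOWS}
  TNativeString = UnicodeString;              // UTF-16, matches the Windows API
{$ELSE}
  TNativeString = type AnsiString(CP_UTF8);   // UTF-8, matches typical unix APIs
{$ENDIF}
  { Hypothetical application-specific type: ISO-8859-1 (code page 28591). }
  TLatin1String = type AnsiString(28591);

var
  n: TNativeString;
  a: TLatin1String;
begin
  n := 'Hello world!';
  a := n;      // converted to the target code page on assignment;
               // characters outside ISO-8859-1 would be lost here
  WriteLn(a);
end.
```

Note that on unix targets a widestring manager (for example the cwstring
unit) is normally needed before non-ASCII text converts correctly; the
plain ASCII literal used here does not depend on it.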


Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/



