[fpc-devel] Unicode and Lazarus

Graeme Geldenhuys graemeg.lists at gmail.com
Thu Nov 20 11:35:03 CET 2008

On Thu, Nov 20, 2008 at 12:18 PM, Felipe Monteiro de Carvalho
<felipemonteiro.carvalho at gmail.com> wrote:
> using Lazarus uses fpc), it would be interesting if we actually work
> more or less in the same direction to provide a good unicode solution,
> instead of each part ignoring what the other is doing. And also

I fully agree.

> So, what kind of support could be implemented in Free Pascal to
> improve things for Lazarus and its users?
> Maybe a real UTF8String?

My first take would be the following real string types: AnsiString,
ShortString, UTF8String and UTF16String.

Then a compiler directive that decides what the String type actually
represents. Just like the $H directive says String = AnsiString or
String = ShortString. So the new compiler directive can toggle between
any of the 4 string types.

This way Lazarus and fpGUI can toggle String to actually mean the real
UTF8String. MSEgui could toggle String to actually be a UTF16String.

I don't see the point or need for a UCS2String type as suggested by
others. UTF16String will cover UCS2 support and more.

As an optional extra, to prevent unneeded automatic conversion, I
would suggest Linux and most other unix variants default String =
UTF8String and Windows and WinCE default String = UTF16String.

Obviously with the compiler directive, the developer can override the
default behaviour.

As for loading files: in 99.9% of cases files are in ANSI or UTF8
encoding, and UTF8 being fully backward compatible with 7-bit ASCII
(the subset the ANSI code pages share) makes this a good thing. So
under Windows, TStringList.LoadFromFile() would default to UTF8
loading (or auto-detect if possible). Once the content is loaded,
convert it to the platform's default encoding: UTF16 for Windows, or
leave it as is (UTF8) for most Unix variants.
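
A rough sketch of that loading strategy, written in Python rather than
Pascal only so that it runs as-is. Everything named here is an
assumption for illustration and not existing FPC or Lazarus API: the
function names are hypothetical, cp1252 stands in for "the ANSI code
page", and the `override` argument plays the role of the proposed
compiler directive.

```python
import codecs
import sys
from typing import Optional


def platform_default_encoding(override: Optional[str] = None,
                              platform: str = sys.platform) -> str:
    """Pick the in-memory encoding; `override` mimics the proposed directive."""
    if override is not None:
        return override
    # Assumed defaults from the proposal: UTF-16 on Windows, UTF-8 elsewhere.
    return "utf-16-le" if platform.startswith("win") else "utf-8"


def decode_file_bytes(raw: bytes, ansi_codepage: str = "cp1252") -> str:
    """Try UTF-8 first (auto-detect), fall back to an assumed ANSI code page."""
    if raw.startswith(codecs.BOM_UTF8):
        # A UTF-8 byte order mark is an unambiguous hint; strip it.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    try:
        # Pure ASCII and valid UTF-8 both succeed here.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat the bytes as ANSI text.
        return raw.decode(ansi_codepage, errors="replace")


def load_as_platform_string(raw: bytes,
                            override: Optional[str] = None,
                            platform: str = sys.platform) -> bytes:
    """Decode file content, then re-encode it in the platform default."""
    text = decode_file_bytes(raw)
    return text.encode(platform_default_encoding(override, platform))
```

Note that the fallback only triggers when the bytes are not valid
UTF8; ANSI text that happens to form valid UTF8 sequences would be
misread, which is why this is auto-detection "if possible" and not a
guarantee.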

  - Graeme -

fpGUI - a cross-platform Free Pascal GUI toolkit
