[fpc-pascal] Unicode support

Paul Ishenin paul.ishenin at gmail.com
Fri May 18 03:12:20 CEST 2012


17.05.12 2:51, Andrew Brunner wrote:
> I wanted to ask what the present state of unicode support is now.

The compiler supports AnsiStrings that carry code page information. It 
converts implicitly between AnsiStrings with different code pages, and 
between AnsiStrings and the other string types. RawByteString and 
UTF8String are both supported by the compiler too.
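
As a rough illustration of the implicit conversion, here is a minimal 
sketch. It assumes a compiler version that already has code page aware 
AnsiStrings, a source file saved as UTF-8, and the cwstring unit as the 
conversion backend on Unix. The type name TCP1251String and the program 
name are made up for the example.

```pascal
program cpdemo;
{$mode objfpc}{$H+}
{$codepage UTF8}            // assumption: this source file is saved as UTF-8

{$ifdef unix}
uses
  cwstring;                 // installs a widestring manager for conversions on Unix
{$endif}

type
  // Hypothetical name: an AnsiString whose declared code page is Windows-1251.
  TCP1251String = type AnsiString(1251);

var
  u: UTF8String;
  c: TCP1251String;
  r: RawByteString;
begin
  u := 'Привет';            // stored as UTF-8 bytes
  c := u;                   // implicit conversion from UTF-8 to code page 1251
  r := c;                   // RawByteString: bytes and code page are kept, no conversion
  WriteLn('UTF-8 length in bytes:  ', Length(u));
  WriteLn('CP1251 length in bytes: ', Length(c));
  WriteLn('code page seen via r:   ', StringCodePage(r));
end.
```

The two lengths should differ when the conversion succeeds, because the 
Cyrillic letters take two bytes each in UTF-8 and one byte each in code 
page 1251.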

For 2-byte (UTF-16) encodings the compiler has the WideString and 
UnicodeString types.
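
A small sketch of moving between UnicodeString and UTF-8, using the 
UTF8Encode and UTF8Decode functions from the System unit. The program 
name is made up, and the euro sign is built from its code point so the 
example does not depend on the source file encoding.

```pascal
program widedemo;
{$mode objfpc}{$H+}

var
  u: UnicodeString;
  s: UTF8String;
begin
  // 'abc' followed by U+20AC (euro sign), which is a single UTF-16 code unit.
  u := UnicodeString('abc') + WideChar($20AC);
  s := UTF8Encode(u);              // UTF-16 to UTF-8

  WriteLn(Length(u));              // counts UTF-16 code units: 4
  WriteLn(Length(s));              // counts bytes: 3 for 'abc' + 3 for the euro sign = 6
  WriteLn(UTF8Decode(s) = u);      // round trip back to UTF-16 compares equal
end.
```

Note that Length counts code units (2-byte units for UnicodeString, 
bytes for UTF8String), not user-visible characters.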

The RTL has only basic support for code page aware strings and the 
UnicodeString type. Most of the code still works with AnsiString.

> I'm running into problems with some various strings.  Sometimes when a
> string contains a unicode character postgresql won't allow the
> insert/update.  Also, some MP3 tags contain UTF16, UTF16BE and I really
> don't know of the best practice on how to handle multi-byte characters
> in code.

It is difficult to say what is wrong without seeing the code.

> Anyone want to comment on direction of FPC for Unicode... Ie I think the
> string field should be able to re-map to UTF8.  Is that something that
> can be done?

Which string field? Of which class? If you mean the string type itself, 
you can use {$H+}, {$codepage UTF-8} and 
SetMultiByteConversionCodePage(CP_UTF8) to make the string type behave 
as a UTF-8 string in your project.
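
Put together, that setup could look roughly like the sketch below. It 
assumes the source file is saved as UTF-8 and, on Unix, that the 
cwstring unit is used so that conversions have a backend. The program 
name and the sample text are only illustrative.

```pascal
program utf8default;
{$mode objfpc}
{$H+}                       // "string" means AnsiString
{$codepage UTF8}            // assumption: this source file is saved as UTF-8

{$ifdef unix}
uses
  cwstring;                 // installs a widestring manager for conversions on Unix
{$endif}

var
  s: string;
  w: UnicodeString;
begin
  // Treat strings without an explicit code page as UTF-8 when converting.
  SetMultiByteConversionCodePage(CP_UTF8);

  s := 'žluťoučký';         // non-ASCII literal assigned to a plain string
  w := s;                   // implicit conversion to UTF-16

  WriteLn('bytes in s:             ', Length(s));
  WriteLn('UTF-16 code units in w: ', Length(w));
end.
```

Calling SetMultiByteConversionCodePage as the first statement matters 
here, because it affects conversions that happen afterwards.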

> What can I do to support unicode in the serving of
> pages/files/documents/music without having to have encoding aware code.

What you probably want is a UnicodeString-based RTL, which is not 
available at the moment.

Best regards,
Paul Ishenin


