[fpc-pascal] RTL and Unicode Strings

Graeme Geldenhuys mailinglists at geldenhuys.co.uk
Wed May 11 11:18:30 CEST 2016


On 2016-05-11 09:21, Jonas Maebe wrote:
> In other cases, like LacaK said, you will have to read the data as plain 
> bytes into e.g. a RawByteString and next use 
> http://www.freepascal.org/docs-html/rtl/system/setcodepage.html (with 
> the last parameter set to "false") to afterwards specify the code page 
> this data has.
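
For reference, the relabelling step described above might look roughly
like this. It is only a minimal sketch, assuming FPC 3.x; the program
name, the variable name and the two sample bytes are made up for
illustration and are not taken from any RTL or SqlDB code.

```pascal
program TagBytes;
{$mode objfpc}{$H+}

var
  Raw: RawByteString;
begin
  // Two bytes as they might arrive from a file or socket:
  // the UTF-8 encoding of U+00E9 (hypothetical sample data).
  Raw := #$C3#$A9;
  WriteLn('code page before: ', StringCodePage(Raw));

  // Last parameter (Convert) = False: the payload bytes are left
  // untouched, only the code page recorded for the string is set.
  SetCodePage(Raw, CP_UTF8, False);

  WriteLn('code page after:  ', StringCodePage(Raw));
  WriteLn('length in bytes:  ', Length(Raw));
end.
```

The values printed for the code page depend on the system's default
code page, so none are listed here; the byte length is not changed by
the call.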

But this is where I'm getting a bit confused too.

The RTL and FCL use the String data type predominantly,
  e.g. TField.AsString: String.

The RTL and FCL use String (AnsiString) with its encoding left at the
default, i.e. the system code page (CP_ACP).

In my application I enable unicodestring mode. So I'm reading data from
a Firebird database. The data is stored as UTF-8 in a VarChar field. The
DB connection is set up as UTF-8. Now let's assume my FreeBSD box is set
up with a default encoding of Latin-1.

So I read the UTF-8 data from the database, and somewhere inside the
SqlDB code it gets assigned to a TField's String property, i.e. a
UTF-8 -> Latin-1 conversion.

Then I read the field value into my application, i.e. a Latin-1 ->
UTF-16 conversion.

The problem, as I see it, is that I have already lost data when SqlDB
converted it to Latin-1. Or am I misunderstanding the problem?
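
To make the concern concrete, here is a minimal sketch, assuming FPC 3.x
(where UTF8Decode accepts a RawByteString). It does not reproduce what
SqlDB does; it only checks which UTF-8 input could survive a detour
through Latin-1 at all. FitsLatin1 is a hypothetical helper written for
this example, and the two byte sequences are sample data.

```pascal
program Latin1Check;
{$mode objfpc}{$H+}

// Hypothetical helper: True when every UTF-16 code unit of S lies in
// U+0000..U+00FF, the only range ISO-8859-1 (Latin-1) can represent.
function FitsLatin1(const S: UnicodeString): Boolean;
var
  I: Integer;
begin
  Result := True;
  for I := 1 to Length(S) do
    if Ord(S[I]) > $FF then
      Exit(False);
end;

var
  Utf8Bytes: RawByteString;
begin
  // U+00E9 (e with acute) as UTF-8: has a Latin-1 equivalent.
  Utf8Bytes := #$C3#$A9;
  WriteLn('U+00E9 fits Latin-1: ', FitsLatin1(UTF8Decode(Utf8Bytes)));

  // U+017D (Z with caron) as UTF-8: has no Latin-1 equivalent, so a
  // UTF-8 -> Latin-1 conversion cannot keep it.
  Utf8Bytes := #$C5#$BD;
  WriteLn('U+017D fits Latin-1: ', FitsLatin1(UTF8Decode(Utf8Bytes)));
end.
```

Any character for which the check fails is one that a later
Latin-1 -> UTF-16 conversion cannot bring back.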

I checked the FPC 3.x db.pas unit. It uses {$mode objfpc}{$H+}; it
doesn't use UnicodeString, and neither does it use RawByteString. So a
text encoding conversion to AnsiString(Latin-1) [based on my example] is
going to happen, right?

Regards,
  Graeme

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp


