[fpc-devel] TField.AsString and Databases with UTF-8 charset

Michael Van Canneyt michael at freepascal.org
Fri Jul 24 16:57:01 CEST 2009

On Fri, 24 Jul 2009, Graeme Geldenhuys wrote:

> Michael Van Canneyt wrote:
>> Which field should it use according to you then ?
> "f.rdb$character_length" to report TField.Size and TParam.Size
> See below...
>>> So SqlDB with Firebird is in fact wrong when it returns Size = 8
>>> for a Char(2) with UTF8 charset enabled.
>> Yes, but assume that a size of 2 is returned. This means a buffer of
>> 2 bytes (in ansistring byte=character) will be reserved for the data.
> OK Michael, you are confusing what TField.Size means. You also don't seem to 
> take into account TField.DataSize. See the following URL.
> http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/DB_TField_DataSize.html
> TField.Size and TParam.Size report back the x number of "characters" 
> irrespective of what character set is being used. This value should be the 
> same as the Char(x) type definition.

Good point.

>> So SQLDB "agrees with firebird" and reserves 8 bytes because that is
>> the max what can be returned.
> Why to use the TField.DataSize to reserve the correct about of bytes.

-> This seems like a good solution. We'll have to look at this.

>> The problem is deeper than you see, and is not related to SQLDb, but
>> to the implicit assumption in TBufDataset that for TStringField, 1
>> char = 1 byte:
> I think it's more a case of TField.DataSize not being taken into account, and 
> always assumes TField.Size and TField.DataSize are the same for Char(x) field 
> definitions.

That's currently exactly so, and is what needs to be fixed, however, this
is at a deeper level as SQLDB.

>> As a consequence, my prediction is that, because it reports a size in
>> characters, the postgres implementation will suffer of buffer
>> overflows as soon as strange (=multibyte) unicode characters are
> Just did a test. PostgreSQL reports back the correct TField.Size, but 
> somewhere the content is being clipped. I ran this through a modified tiOPF 
> with SqlDB_PG persistence layer.

If it's clipped, that means the copy operation to the dataset buffer takes 
care of the overflow.


More information about the fpc-devel mailing list