[fpc-devel] TField.AsString and Databases with UTF-8 charset

Michael Van Canneyt michael at freepascal.org
Fri Jul 24 16:57:01 CEST 2009



On Fri, 24 Jul 2009, Graeme Geldenhuys wrote:

> Michael Van Canneyt wrote:
>> 
>> Which field should it use according to you then ?
>
> "f.rdb$character_length" to report TField.Size and TParam.Size
> See below...
>
>
>>> So SqlDB with Firebird is in fact wrong when it returns Size = 8
>>> for a Char(2) with UTF8 charset enabled.
>> 
>> Yes, but assume that a size of 2 is returned. This means a buffer of
>> 2 bytes (in ansistring byte=character) will be reserved for the data.
>
>
> OK Michael, you are confusing what TField.Size means. You also don't seem to 
> take into account TField.DataSize. See the following URL.
>
> http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/DB_TField_DataSize.html
>
> TField.Size and TParam.Size report back the x number of "characters" 
> irrespective of what character set is being used. This value should be the 
> same as the Char(x) type definition.

Good point.

>> So SQLDB "agrees with firebird" and reserves 8 bytes because that is
>> the maximum that can be returned.
>
> Why not use TField.DataSize to reserve the correct amount of bytes?

This seems like a good solution. We'll have to look at this.
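
To make the distinction concrete, here is a minimal sketch of the arithmetic,
assuming a UTF8 column needs at most 4 bytes per character (which matches the
Size = 8 reported for Char(2) above). BytesToReserve is a hypothetical helper
for illustration only; it is not part of SQLDB or TBufDataset.

```pascal
program SizeVsDataSize;

{$mode objfpc}{$H+}

const
  // Assumption: at most 4 bytes per character for a UTF8 column,
  // consistent with the 8 bytes reported for Char(2) in this thread.
  MaxBytesPerUtf8Char = 4;

// Hypothetical helper, not an existing SQLDB/TBufDataset routine:
// bytes a buffer must hold for a Char(CharCount) column.
function BytesToReserve(CharCount, MaxBytesPerChar: Integer): Integer;
begin
  Result := CharCount * MaxBytesPerChar;
end;

begin
  // Size stays 2 (characters) in both cases; only the byte count differs.
  WriteLn('Char(2), single-byte charset: ', BytesToReserve(2, 1), ' bytes');
  WriteLn('Char(2), UTF8 charset:        ',
          BytesToReserve(2, MaxBytesPerUtf8Char), ' bytes');
end.
```

The idea would be that Size keeps reporting the character count, while the
byte count is what gets used when the record buffer is allocated.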

>
>> The problem is deeper than you see, and is not related to SQLDb, but
>> to the implicit assumption in TBufDataset that for TStringField, 1
>> char = 1 byte:
>
> I think it's more a case of TField.DataSize not being taken into account, and 
> the code always assuming TField.Size and TField.DataSize are the same for 
> Char(x) field definitions.

That's currently exactly so, and is what needs to be fixed; however, this
is at a deeper level than SQLDB.

>> As a consequence, my prediction is that, because it reports a size in
>> characters, the postgres implementation will suffer from buffer
>> overflows as soon as strange (=multibyte) unicode characters are
>
> Just did a test. PostgreSQL reports back the correct TField.Size, but 
> somewhere the content is being clipped. I ran this through a modified tiOPF 
> with SqlDB_PG persistence layer.

If it's clipped, that means the copy operation to the dataset buffer takes 
care of the overflow.
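
A rough sketch of what such clipping would look like, assuming the buffer is
sized from the character count and the copy is length-checked in bytes.
CopyClipped is a hypothetical stand-in, not the actual dataset code.

```pascal
program ClipDemo;

{$mode objfpc}{$H+}

// Hypothetical helper: copies at most BufSize bytes, the way a
// length-checked copy into a fixed-size record buffer would.
function CopyClipped(const Src: AnsiString; BufSize: Integer): AnsiString;
begin
  if Length(Src) > BufSize then
    Result := Copy(Src, 1, BufSize)
  else
    Result := Src;
end;

var
  Value, Stored: AnsiString;
begin
  // Two characters, five bytes in UTF-8:
  // U+00E9 (2 bytes) followed by U+20AC (3 bytes).
  Value := #$C3#$A9#$E2#$82#$AC;
  // Buffer sized from a character count of 2, treated as 2 bytes.
  Stored := CopyClipped(Value, 2);
  // Length on an AnsiString counts bytes, not characters.
  WriteLn('bytes in value:  ', Length(Value));
  WriteLn('bytes in buffer: ', Length(Stored));
end.
```

No overflow occurs in this sketch, but the second character is lost, which
would look like clipped content to the application.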

Michael.


