[fpc-devel] TField.AsString and Databases with UTF-8 charset
Michael Van Canneyt
michael at freepascal.org
Fri Jul 24 16:57:01 CEST 2009
On Fri, 24 Jul 2009, Graeme Geldenhuys wrote:
> Michael Van Canneyt wrote:
>>
>> Which field should it use according to you then ?
>
> "f.rdb$character_length" to report TField.Size and TParam.Size
> See below...
>
>
>>> So SqlDB with Firebird is in fact wrong when it returns Size = 8
>>> for a Char(2) with UTF8 charset enabled.
>>
>> Yes, but assume that a size of 2 is returned. This means a buffer of
>> 2 bytes (in ansistring byte=character) will be reserved for the data.
>
>
> OK Michael, you are confusing what TField.Size means. You also don't seem to
> take into account TField.DataSize. See the following URL.
>
> http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/DB_TField_DataSize.html
>
> TField.Size and TParam.Size report back the x number of "characters"
> irrespective of what character set is being used. This value should be the
> same as the Char(x) type definition.
Good point.
>> So SQLDB "agrees with firebird" and reserves 8 bytes because that is
>> the max what can be returned.
>
> Why to use the TField.DataSize to reserve the correct about of bytes.
-> This seems like a good solution. We'll have to look at this.
>
>> The problem is deeper than you see, and is not related to SQLDb, but
>> to the implicit assumption in TBufDataset that for TStringField, 1
>> char = 1 byte:
>
> I think it's more a case of TField.DataSize not being taken into account, and
> always assumes TField.Size and TField.DataSize are the same for Char(x) field
> definitions.
That's currently exactly so, and is what needs to be fixed, however, this
is at a deeper level as SQLDB.
>> As a consequence, my prediction is that, because it reports a size in
>> characters, the postgres implementation will suffer of buffer
>> overflows as soon as strange (=multibyte) unicode characters are
>
> Just did a test. PostgreSQL reports back the correct TField.Size, but
> somewhere the content is being clipped. I ran this through a modified tiOPF
> with SqlDB_PG persistence layer.
If it's clipped, that means the copy operation to the dataset buffer takes
care of the overflow.
Michael.
More information about the fpc-devel
mailing list