[fpc-devel] TField.AsString and Databases with UTF-8 charset
Graeme Geldenhuys
graemeg at opensoft.homeip.net
Fri Jul 24 12:49:51 CEST 2009
Michael Van Canneyt wrote:
>
>> That way, SqlDB can return copy(fieldvaluestring, 0, character_len)
>> as the actual field text value, which trims off the padding of
>> spaces.
>
> If you look carefully, you'll see that the padding of spaces happens
> in code in the case of a CHAR field. Maybe we should do something
> about that.
I'm fine with padding for CHAR() field types if the content is shorter
than the Char() length.
e.g. a Char(5) field definition should always return 5 characters,
irrespective of whether you insert only 2 characters of data.
The UTF8 implementation is greatly flawed in Firebird. It does not
adhere to the max character length as defined by Char(x).
The UTF-8 value of "en" IS "en" because UTF-8 is a variable byte length
encoding. Plus the first 128 characters (the ASCII range) in UTF-8
only take up 1 byte per character. The UTF-8 encoded string of "en" is
NOT "en " like Firebird is returning!
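To make the byte-versus-character distinction concrete, here is a minimal
sketch. Utf8CharCount is a hypothetical helper written for this post, not
something from SqlDB or the RTL. It simply skips UTF-8 continuation bytes.

```pascal
program Utf8LengthDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: counts UTF-8 code points by skipping
// continuation bytes (those of the form 10xxxxxx).
function Utf8CharCount(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  Plain, Accented: AnsiString;
begin
  Plain := 'en';
  // 'e' followed by n-tilde (U+00F1), which is the two bytes C3 B1 in UTF-8
  Accented := 'e'#$C3#$B1;
  WriteLn(Length(Plain), ' bytes, ', Utf8CharCount(Plain), ' characters');
  WriteLn(Length(Accented), ' bytes, ', Utf8CharCount(Accented), ' characters');
end.
```

Both strings are 2 characters long, but Length() reports 2 bytes for the
first and 3 for the second, which is why a byte count cannot stand in for
the declared Char(x) length.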
In Summary:
-------------
There are two problems here.
1) SqlDB and Firebird are reporting the wrong TParam.Size and
TField.Size results. SqlDB is using byte length instead of character length.
2) The UTF-8 implementation of Firebird is seriously flawed. Firebird
treats UTF-8 as if it were a fixed-width encoding, returns rubbish
results from a Char(x) field, and breaks the DDL rule of what the maximum
character length is. Using the metadata, SqlDB *can* fix this by using
something like: Copy(fieldvalue, 1, MaxCharacterLength), as long as the
length is counted in characters and not in bytes.
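A rough sketch of that trimming idea, under some assumptions: Utf8LeftChars
is a hypothetical helper (it does not exist in SqlDB), the maximum character
length is assumed to be available from the metadata, and the padded value is
my guess at what comes back for a Char(2) UTF8 column.

```pascal
program Utf8CopyDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of SqlDB: returns the first MaxChars
// UTF-8 characters of S, measuring in code points rather than bytes.
function Utf8LeftChars(const S: AnsiString; MaxChars: Integer): AnsiString;
var
  I, Count: Integer;
begin
  Count := 0;
  I := 1;
  while I <= Length(S) do
  begin
    // A byte that is not a continuation byte (10xxxxxx) starts a character.
    if (Ord(S[I]) and $C0) <> $80 then
    begin
      if Count = MaxChars then
        Break;
      Inc(Count);
    end;
    Inc(I);
  end;
  // I now points at the first byte past the last character we keep.
  Result := Copy(S, 1, I - 1);
end;

var
  Raw: AnsiString;
begin
  // Assumed shape of the raw value: "en" padded with spaces up to the
  // byte size of the buffer (8 bytes is an assumption for Char(2)).
  Raw := 'en' + StringOfChar(' ', 6);
  WriteLn('[', Utf8LeftChars(Raw, 2), ']');
end.
```

A plain Copy() with a byte count gives the same result for pure ASCII data,
but it could cut a multi-byte character in half, so the helper walks the
string by character boundaries instead.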
Regards,
- Graeme -
--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://opensoft.homeip.net/fpgui/