[fpc-devel] TField.AsString and Databases with UTF-8 charset
graemeg at opensoft.homeip.net
Fri Jul 24 12:49:51 CEST 2009
Michael Van Canneyt wrote:
>> That way, SqlDB can return copy(fieldvaluestring, 0, character_len)
>> as the actual field text value, which trims off the padding of
> If you look carefully, you'll see that the padding of spaces happens
> in code in the case of a CHAR field. Maybe we should do something
> about that.
I'm fine with padding for CHAR() field types if the content is shorter than
the Char() length.
e.g. a Char(5) field definition should always return 5 characters,
irrespective of whether you insert only 2 characters of data.
The UTF-8 implementation in Firebird is greatly flawed. It does not
adhere to the max character length as defined by Char(x).
The UTF-8 value of "en" IS "en" because UTF-8 is a variable byte length
encoding. Plus the first 128 characters in UTF-8 (the ASCII range)
only take up 1 byte per character. The UTF-8 encoded string of "en" is
NOT "en " like Firebird is returning!
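To make the byte vs. character distinction concrete, here is a minimal sketch that counts both for a plain ASCII string and for one with a two-byte character. Utf8CharCount is a hypothetical helper written for this illustration (it is not an RTL or SqlDB routine) and it assumes the input is well-formed UTF-8.

```pascal
program Utf8LengthDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: counts UTF-8 code points by skipping
// continuation bytes (bit pattern 10xxxxxx).
// Assumes S holds well-formed UTF-8.
function Utf8CharCount(const S: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  Ascii, Accented: AnsiString;
begin
  Ascii := 'en';
  // "cafe" with an acute accent: U+00E9 is encoded as the bytes C3 A9.
  Accented := 'caf' + Chr($C3) + Chr($A9);

  // Length() reports bytes; Utf8CharCount reports characters.
  WriteLn(Length(Ascii), ' bytes, ', Utf8CharCount(Ascii), ' characters');
  WriteLn(Length(Accented), ' bytes, ', Utf8CharCount(Accented), ' characters');
end.
```

For "en" both counts are 2, so nothing in the encoding itself calls for extra bytes; only the accented string needs more bytes (5) than characters (4).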
There are two problems here.
1) SqlDB and Firebird are reporting the wrong TParam.Size and
TField.Size results. SqlDB is using byte length instead of character length.
2) The UTF-8 implementation of Firebird is seriously flawed. Firebird
acts as if UTF-8 is a fixed byte length encoding, returns rubbish
results from a Char(x) field, and breaks the DDL rule of what the maximum
character length is. Using the metadata, SqlDB *can* fix this by using
something like: copy(fieldvalue, 1, MaxCharacterLength)
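A rough sketch of that kind of trimming follows. It is only an illustration, not SqlDB code: Utf8Left is a hypothetical helper, and the sample value assumes a Char(5) UTF-8 field arrives padded to 20 bytes (4 bytes reserved per character). Since Copy() on an AnsiString counts bytes, the helper walks the string and cuts after MaxChars code points, so it also behaves when the data contains multi-byte characters.

```pascal
program Utf8TrimDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of SqlDB: returns the first MaxChars
// UTF-8 code points of S. Assumes S holds well-formed UTF-8.
function Utf8Left(const S: AnsiString; MaxChars: Integer): AnsiString;
var
  i, Count: Integer;
begin
  Count := 0;
  i := 1;
  while i <= Length(S) do
  begin
    // A byte that is not a continuation byte (10xxxxxx) starts a new code point.
    if (Ord(S[i]) and $C0) <> $80 then
    begin
      if Count = MaxChars then
        Break;  // S[i] starts the first code point beyond the limit
      Inc(Count);
    end;
    Inc(i);
  end;
  // i - 1 is the byte length of the first Count code points.
  Result := Copy(S, 1, i - 1);
end;

var
  Raw: AnsiString;
begin
  // Assumed wire value for a Char(5) field holding "en":
  // 2 data bytes plus 18 padding spaces = 20 bytes.
  Raw := 'en' + StringOfChar(' ', 18);
  // Keeps "en" plus three spaces, i.e. exactly 5 characters.
  WriteLn('[', Utf8Left(Raw, 5), ']');
end.
```

The result is still padded to the declared Char(5) length, which matches the padding behaviour I said I'm fine with above; only the surplus bytes beyond the character length are dropped.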
- Graeme -
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal