[fpc-devel] TField.AsString and Databases with UTF-8 charset
Michael Van Canneyt
michael at freepascal.org
Tue Jul 21 16:16:15 CEST 2009
On Tue, 21 Jul 2009, Graeme Geldenhuys wrote:
> Hi,
>
> I working on a Firebird database that has a default Charset = UTF8. I'm using
> the SqlDB package via tiOPF to save and read my data from the database.
>
> I have a table with a field defined as Char(2) and I have set it with a value
> "en" which represents a Country Code.
>
> When I read back that value, by TCountry unit tests fail with the following
> message.
>
> ==========================
> 1) textrunner.SQL Database tests.TTestCountry.TestCountry_ReadList:
> ETestFailure
> at $0807F9B4
> "Check #1: Failed on ID
> Expected:
> "en"
> But was:
> "en "
> ==========================
>
>
> Further investigation revealed that when I read back a Char(2) field via
> TField.AsString the resulting string is 8 bytes wide. I can see where this
> value comes from - Firebird obviously uses 4 bytes per UTF-8 character,
> because that is the largest amount of bytes a UTF-8 character can use. Not
> very efficient, but that is what Firebird 2.1 does when the Charset = UTF8.
>
> Now I was wondering how to resolve this.
>
> 1) Switch all my Char() fields to VarChar() which resolves the problem.
>
> 2) Trim(lField.AsString) before I store it in each TCountry.Code property.
>
> 3) SqlDB does the TField.AsString trimming for any Char() fields.
>
>
> Is option #3 viable? This will resolve any similar issue to all future
> developers using Firebird with Char() type fields and Charset=UTF8.
You may never trim these fields;
The database contains always the declared number of characters: the value
is trimmed with a space till the number of declared characters is reached.
TField.AsString returns what is in the database, which seems to be 8 characters
in your case.
> Or should I opt for options #1 or #2 instead. Option #1 is probably the least
> amount of effort because the project I am working on is new, so there is no
> existing database that need to be converted. Problem being that another
> developer down the line might add a new field of type Char() and then we sit
> with the same problem again.
>
> What is your thoughts on this. As far as I know MSEgui does automatic
> trimming of spaces in the TField, so should this maybe be done in SqlDB
> (Interbase/Firebird) as well?
No, because it is fundamentally wrong.
The question should be:
Why does Firebird report 8 characters to your application ?
Michael.
More information about the fpc-devel
mailing list