[fpc-devel] TField.AsString and Databases with UTF-8 charset

Michael Van Canneyt michael at freepascal.org
Tue Jul 21 16:16:15 CEST 2009



On Tue, 21 Jul 2009, Graeme Geldenhuys wrote:

> Hi,
>
> I working on a Firebird database that has a default Charset = UTF8. I'm using 
> the SqlDB package via tiOPF to save and read my data from the database.
>
> I have a table with a field defined as Char(2) and I have set it with a value 
> "en" which represents a Country Code.
>
> When I read back that value, by TCountry unit tests fail with the following 
> message.
>
> ==========================
>  1) textrunner.SQL Database tests.TTestCountry.TestCountry_ReadList: 
> ETestFailure
>     at $0807F9B4
>      "Check #1: Failed on ID
> Expected:
> "en"
> But was:
> "en      "
> ==========================
>
>
> Further investigation revealed that when I read back a Char(2) field via 
> TField.AsString the resulting string is 8 bytes wide. I can see where this 
> value comes from - Firebird obviously uses 4 bytes per UTF-8 character, 
> because that is the largest amount of bytes a UTF-8 character can use. Not 
> very efficient, but that is what Firebird 2.1 does when the Charset = UTF8.
>
> Now I was wondering how to resolve this.
>
> 1) Switch all my Char() fields to VarChar() which resolves the problem.
>
> 2) Trim(lField.AsString) before I store it in each TCountry.Code property.
>
> 3) SqlDB does the TField.AsString trimming for any Char() fields.
>
>
> Is option #3 viable?  This will resolve any similar issue to all future 
> developers using Firebird with Char() type fields and Charset=UTF8.

You may never trim these fields; 
The database contains always the declared number of characters: the value
is trimmed with a space till the number of declared characters is reached.
TField.AsString returns what is in the database, which seems to be 8 characters 
in your case.

> Or should I opt for options #1 or #2 instead. Option #1 is probably the least 
> amount of effort because the project I am working on is new, so there is no 
> existing database that need to be converted. Problem being that another 
> developer down the line might add a new field of type Char() and then we sit 
> with the same problem again.
>
> What is your thoughts on this.  As far as I know MSEgui does automatic 
> trimming of spaces in the TField, so should this maybe be done in SqlDB 
> (Interbase/Firebird) as well?

No, because it is fundamentally wrong.

The question should be: 
Why does Firebird report 8 characters to your application ?

Michael.



More information about the fpc-devel mailing list