[fpc-devel] TStringField, String and UnicodeString and UTF8String

LacaK lacak at zoznam.sk
Thu Jan 13 09:49:28 CET 2011

Joost van der Sluis wrote:
> On Wed, 2011-01-12 at 14:59 +0100, LacaK wrote:
>>> No. It is mandatory that you send/receive UTF8 to/from GUI LCL
>>> elements. 
>> As LCL elements use the TStringField.Text property, this property
>> should return UTF8String (not AnsiString in the ANSI code page), right?
>> If so, TStringField must also store its data internally in some Unicode
>> format (so as not to lose any characters), right?
>> So it can be UTF-8, UTF-16 or UTF-32 ... and in all cases we must allocate
>> space for 4*[max. number of characters in field], right?
>> So in what encoding is string data currently stored in TStringField?
> The encoding you've specified. In the connection-string or some other
> database-server dependent setting.
But then there is a problem with the buffer size allocated for
TStringField (ftString), isn't there?
Please see the bug report: http://bugs.freepascal.org/view.php?id=17376
It describes the situation with SQLite (TSQLite3Connection), which
returns UTF-8 strings, so there is no problem with the encoding.
The problem is that for char(n) and varchar(n) fields a TStringField
is created with Size=n, and the space allocated in the record buffer
is only Size+1, where n is the number of characters (not bytes). So data
is truncated when a UTF-8 encoded string is written into the record buffer.
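
To make the truncation concrete, here is a small sketch in Python (not
FPC code). The helper write_to_fixed_buffer is hypothetical and only
mimics copying an encoded string into a buffer of a fixed number of
bytes; it is not how fcl-db is actually implemented.

```python
def write_to_fixed_buffer(text: str, size_in_bytes: int) -> bytes:
    """Encode text as UTF-8 and cut it to the buffer's byte capacity.

    Hypothetical stand-in for copying a string into a record buffer
    that was sized by character count rather than by byte count.
    """
    return text.encode("utf-8")[:size_in_bytes]


value = "žltý kôň"            # 8 characters, so it fits a varchar(8)
n = 8                         # Size as reported for the field

encoded = value.encode("utf-8")
stored = write_to_fixed_buffer(value, n)
# errors="ignore" drops a trailing partial character, if the cut hit one
recovered = stored.decode("utf-8", errors="ignore")

print(len(value), "characters,", len(encoded), "bytes in UTF-8")
print("read back from buffer:", repr(recovered))
```

The value is a legal 8-character string for the column, but its UTF-8
form needs more than 8 bytes, so the tail is lost.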
So IMHO we need to:
1. allocate space in the record buffer of size 4*TFieldDef.Size+1 (and so on)
2. redefine the meaning of the Size property (as a number of bytes, not
characters) and create fielddefs with Size*4
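
The factor of 4 comes from the fact that UTF-8 needs at most 4 bytes per
Unicode code point. A rough sketch of that worst-case sizing in Python
(the function names are made up for illustration, and "character" is
taken to mean one code point):

```python
MAX_UTF8_BYTES_PER_CHAR = 4   # upper bound for one code point in UTF-8


def worst_case_buffer_size(size_in_chars: int) -> int:
    """Bytes needed for size_in_chars code points plus one terminator byte."""
    return MAX_UTF8_BYTES_PER_CHAR * size_in_chars + 1


def fits(text: str, size_in_chars: int) -> bool:
    """True if the UTF-8 form of text, plus terminator, fits the buffer."""
    needed = len(text.encode("utf-8")) + 1
    return needed <= worst_case_buffer_size(size_in_chars)


# Ten 4-byte code points: the worst case for a varchar(10)
widest = "\U0001F600" * 10
print(worst_case_buffer_size(10), fits(widest, 10))
```

With this sizing any string of up to n code points fits, at the cost of
reserving up to four times the space for plain ASCII data.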
Hm, according to
http://docwiki.embarcadero.com/VCL/XE/en/DB.TStringField.Size
Size is the number of characters,
but according to
http://docwiki.embarcadero.com/VCL/en/DB.TFieldDef.Size
Size is the number of bytes in the underlying database.

But TField is created from TFieldDef, and TField.Size=TFieldDef.Size ...
so isn't that curious?
> Note that when you want to use UTF-16 (or 32) you have to use
> TWideStringFields.
So TWideStringField is not an encoding-agnostic field (is it designed to
always be UTF-16 encoded)?

