[fpc-devel] TStringField, String and UnicodeString and UTF8String

LacaK lacak at zoznam.sk
Thu Jan 13 09:15:03 CET 2011


> Didn't I explain this to you and others a few times?
>   
;-) If so, then please excuse me

> The database-components itself are encoding-agnostic. This means:
> encoding in = encoding out.
>
> So it is up to the developer what codepage he wants to use. So
> TField.Text can have the encoding _you_ want.
>
> So, if you want to work with Lazarus, which uses UTF-8, you have to use
> UTF-8 encoded strings in your database. 
>   
So this is the answer I was looking for:
"In Lazarus, TStringField MUST hold UTF-8 encoded strings."

But my guess (my theory) is that at the time Borland introduced 
TStringField, the design goal was:
TStringField was designed for SBCS string data (because DataSize=Size+1) 
encoded in the system ANSI code page
and
TWideStringField was designed for DBCS widestring (UTF-16) character data

Maybe I was misled by this view.
(Or maybe the approach in Delphi ("not agnostic") differs from the 
one in FPC ("agnostic")?)
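
To make the DataSize=Size+1 point concrete, here is a minimal sketch (in Python, used only for brevity) that compares how many bytes the same text needs in a single-byte code page and in UTF-8. The sample text, the choice of cp1250 and the helper `fits_in_field` are assumptions made for illustration; they are not taken from the FPC or Delphi sources.

```python
# Illustration only: sample text, code page and helper name are assumptions.
TEXT = "\u017dlt\u00fd k\u00f4\u0148"   # Slovak sample text, 8 characters


def fits_in_field(data: bytes, size: int) -> bool:
    """True if data plus one terminator byte fits a buffer of size + 1 bytes."""
    return len(data) + 1 <= size + 1


size = len(TEXT)               # field sized by character count
ansi = TEXT.encode("cp1250")   # single-byte code page: one byte per character
utf8 = TEXT.encode("utf-8")    # variable width: one to four bytes per character

print(size, len(ansi), len(utf8))
print(fits_in_field(ansi, size), fits_in_field(utf8, size))
```

If a field buffer really is sized as Size+1 bytes, a UTF-8 string of Size characters can overflow it as soon as it contains non-ASCII characters, which is why such a buffer layout suggests a single-byte design.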

> If there is some strange reason why you don't want the strings in your
> database to be UTF-8 encoded,
SQL Server does not support UTF-8 (AFAIK).
SQL Server provides non-Unicode datatypes (char, varchar, text)
 and Unicode (UCS-2) datatypes (nchar, nvarchar, ntext).

>  you have to convert the strings from the
> encoding your database uses to UTF-8 while reading data from the
> database.
>
> Luckily, you can specify the encoding of strings you want to use for
> most databases. Not only the encoding in which the strings are stored,
> but also the encoding which has to be used when you send and retrieve
> data from the database. And you can set this for each connection made.
>
> Ie: you can resolve the problem by changing the connection-string, or by
> adding some connection-parameter.
>
>   
Yes, that is true, for example, for the MySQL or Firebird ODBC drivers,
 but the SQL Server and PostgreSQL ODBC drivers have no such options
 (although the PostgreSQL ODBC driver exists in ANSI and Unicode versions).
 The SQL Server ODBC driver supports "AutoTranslate", see: 
http://msdn.microsoft.com/en-us/library/ms130822.aspx
 "SQL Server *char*, *varchar*, or *text* data sent to a client 
SQL_C_CHAR variable is converted from character to Unicode using the 
server ACP, then converted from Unicode to character using the client ACP."
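
The two-step conversion described in that quote can be sketched as follows (Python, illustration only). The function names, the sample text and the code pages cp1250/cp1252 are assumptions; the real driver may also substitute similar-looking characters rather than "?", so treat this as a model of the idea and not of the driver's exact behaviour.

```python
# Illustration only: names, sample text and code pages are assumptions.
TEXT = "\u017dlt\u00fd k\u00f4\u0148"   # Slovak sample text


def auto_translate(raw: bytes, server_acp: str, client_acp: str) -> bytes:
    """Server ACP bytes -> Unicode -> client ACP bytes, as in the quoted text."""
    unicode_text = raw.decode(server_acp)
    # Characters missing from the client code page are replaced by "?".
    return unicode_text.encode(client_acp, errors="replace")


def to_utf8(raw: bytes, server_acp: str) -> bytes:
    """Server ACP bytes -> UTF-8, the conversion needed for Lazarus strings."""
    return raw.decode(server_acp).encode("utf-8")


stored = TEXT.encode("cp1250")                          # bytes as kept in a varchar column
same_acp = auto_translate(stored, "cp1250", "cp1250")   # client ACP equals server ACP
other_acp = auto_translate(stored, "cp1250", "cp1252")  # client ACP differs

print(same_acp == stored)                  # round trip keeps every byte
print(other_acp.endswith(b"?"))            # last character is not in cp1252
print(to_utf8(stored, "cp1250").decode("utf-8") == TEXT)
```

The sketch shows that translation through the client ACP is lossless only when the client code page contains every character stored on the server, whereas a conversion straight to UTF-8 keeps all of them.
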
> There's also another solution you can find on the forum and other
> places. You can convert the strings to UTF-8 not only when they are read
> from the database, but also when they are read from the internal memory.
> There's a hook for that.
>
>   
Thanks for your patience
-Laco.
