[fpc-pascal] code example where AnsiString used in FCL (SqlDB) causes data loss

Michael Van Canneyt michael at freepascal.org
Wed May 11 16:38:11 CEST 2016



On Wed, 11 May 2016, Andreas Dorn wrote:

> All in all Graeme is right. FPC looks pretty much broken to me, too.
> For my projects I pulled the emergency brake on anything FPC.
>  
> The most serious flaws for me of FPC 3.0 are:
> - assuming that it's possible to assign an encoding to every string
> - using an (unsafe) guess about the encoding for auto-conversions
>  
> It's not possible to assign a valid encoding to every string (not automatically, and not even manually).

Please stop spreading FUD; this is plainly a false statement.

>  
> Some examples:
> 1) String-Buffers
> Split a UTF-8 String into chunks of 1024 bytes. Trying to assign an encoding to
> those chunks, and allowing auto-conversions will just lead to corruption.
>  
> Where is the string-type for string-buffers gone?

There never was one; this would break in 2.6.4 too.

If you thought there was in 2.6.4, you are simply mistaken.
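
To make the chunking problem concrete, here is a minimal sketch. It is written in Python purely for illustration (not FPC), and the sample text and the 3-byte chunk size, standing in for the 1024 bytes above, are assumptions. It shows that a chunk boundary can land inside a multi-byte UTF-8 sequence, so decoding each chunk on its own fails, whereas treating the chunks as raw bytes and decoding incrementally, or only after reassembly, does not.

```python
import codecs

# Assumed sample text; the two accented letters are 2-byte UTF-8 sequences.
text = "na\u00efve caf\u00e9"
data = text.encode("utf-8")
CHUNK = 3  # stands in for the 1024 bytes from the example

# Split the byte stream at fixed offsets, ignoring character boundaries.
chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

def decodes_alone(chunk: bytes) -> bool:
    """Return True if the chunk is valid UTF-8 on its own."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Chunks that are not valid UTF-8 in isolation.
broken = [c for c in chunks if not decodes_alone(c)]

# An incremental decoder carries incomplete sequences over to the next chunk.
decoder = codecs.getincrementaldecoder("utf-8")()
restored = "".join(decoder.decode(c) for c in chunks)
restored += decoder.decode(b"", final=True)

print(len(chunks), "chunks,", len(broken), "not decodable alone")
print(restored == text)
```

The point is the same in any language: a chunk of an encoded stream is a byte buffer, not a string, and it should not be given a string encoding until the pieces are complete.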

>  
> 2) Most programming languages out there use something like "sequence of UTF-16 codepoints" as a string-type.
> (That's not the same as UTF-16 string !!!!!)
> It's a proper string type for "UTF-16 buffer" - pretty much nobody out there uses a low-level string-type that assumes
> that the content is a complete UTF-16 string.  

No one stops you from using UnicodeString.

> 3) Filenames on Windows
> You can't convert any random filename on Windows to UTF8 and back without data loss.
> There simply isn't any encoding that correctly fits all possible filenames.

You will need to explain what you mean by this.
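
One possible reading, and this is only an assumption since the point is not spelled out: Windows filenames are sequences of 16-bit units that are not validated as UTF-16, so a name can contain an unpaired surrogate. The sketch below (Python, for illustration only; the filename is hypothetical) shows that such a name cannot be encoded as strict UTF-8, that a lossy conversion does not round-trip, and that the 16-bit units themselves survive when kept as they are.

```python
# Hypothetical filename containing a lone high surrogate (U+D800).
name = "report_\ud800.txt"

# Strict UTF-8 refuses to encode an unpaired surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# A lossy conversion substitutes a replacement and cannot be reversed.
lossy = name.encode("utf-8", errors="replace").decode("utf-8")

# Keeping the raw 16-bit units preserves the name exactly.
raw = name.encode("utf-16-le", errors="surrogatepass")
roundtrip = raw.decode("utf-16-le", errors="surrogatepass")

print(strict_ok, lossy == name, roundtrip == name)
```

If this is the case that is meant, it concerns names that are not valid UTF-16 to begin with.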

> A lot of APIs use buffers. You can try to assign an encoding to a buffer, but if you use that encoding
> to auto-convert anything you made a blatant mistake. Assuming that anything from the outside world
> (WindowsAPI, C#, Java...) is UTF-16 is yet another blatant mistake...
>  
> 4) some Barcodes,
> 5) Various File-Format-Standards,
> 6) anything that uses ASCII + some Control-Bytes for communication,
> 7) some encodings used in databases, ...
> all that won't fit into the FPC scheme of 'known encodings'.

FPC 3.0.0 has not changed relative to 2.6.4 in this regard.
   
> The most obvious showstoppers for FPC 3.0 are:
> FPC 3.0 doesn't have a useful type for string-buffers.

Please explain what you mean by 'string buffers'.

When using e.g. Windows or C APIs, the string buffer you need to use is
either "array of char" or "array of widechar".

Which one you should use depends on the API you want to access.

In the case of array of char, you must take care of the encoding yourself, but this was the case in 2.6.4 as well.

Nothing has changed in this regard.
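
As a rough illustration of the two buffer kinds (in Python with ctypes, not FPC; the sample bytes and buffer sizes are assumptions), a char buffer holds bytes whose encoding the caller must know, while a wide-char buffer holds wide characters directly:

```python
import ctypes

# Byte buffer, analogous to "array of char": the API hands back raw bytes.
# Assumed content: "café" encoded as Latin-1.
buf = ctypes.create_string_buffer(b"caf\xe9", 16)

# The caller has to know the encoding; guessing UTF-8 here fails.
try:
    buf.value.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

as_latin1 = buf.value.decode("latin-1")

# Wide buffer, analogous to "array of widechar": no byte encoding to guess.
wbuf = ctypes.create_unicode_buffer("caf\u00e9", 16)

print(utf8_ok, as_latin1 == wbuf.value)
```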

> FPC 3.0 doesn't have a useful type for Filenames

Just use the native filename type, or UnicodeString.

> FPC 3.0 adds unsafe auto-conversions

Why do you think it is unsafe?

Michael.

