[fpc-pascal] Status of UTF8

Andrew Brunner atbrunner at aurawin.com
Sat Sep 28 14:38:59 CEST 2013


I wanted to get the status of WideString support, specifically 
multilingual (ANSI/UTF-8) string handling in multi-threaded code.

I noticed that FPC now automatically converts (encodes/decodes) 
the contents of a string during various operations.  These conversions 
seem to use the system locale settings as the basis for selecting which 
code page to apply.

I did see that I can change the default code page, but that is not 
acceptable where incoming MIME data is concerned.  In MIME messages, 
the content type and character set are passed along with each section of 
text.  Each block specifies which code page to use.

The problem I have is that I process content in many different code 
pages in parallel.  String manipulation happens across multiple threads, 
so switching a global default code page per block is not safe: one 
thread's setting would affect conversions running in the others.

What are the best practices for taking chunks of string data (in 
various code pages) and converting them to UTF-8 in a multi-threaded 
application?
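
To make the question concrete, here is a minimal sketch of the kind of 
per-block conversion in question.  It assumes the code-page-aware string 
support in the FPC development branch (RawByteString, SetCodePage, 
CP_UTF8).  BlockToUTF8 and its parameters are hypothetical names used 
only for illustration, not an existing RTL routine.  Whether the 
conversion it triggers is safe to call from several threads at once is 
exactly the open question.

```pascal
program BlockToUtf8Sketch;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cthreads, cwstring;  // installs a widestring manager for conversions on Unix
{$endif}

// Hypothetical helper, not an existing RTL routine.
// Raw:      undecoded bytes of one MIME part.
// SourceCP: code page taken from that part's charset parameter.
function BlockToUTF8(const Raw: RawByteString;
  SourceCP: TSystemCodePage): UTF8String;
var
  Tmp: RawByteString;
begin
  Tmp := Raw;
  // Label the bytes with their real code page; the payload is not changed.
  SetCodePage(Tmp, SourceCP, False);
  // Convert the payload to UTF-8.  No global default code page is modified.
  SetCodePage(Tmp, CP_UTF8, True);
  Result := Tmp;
end;

var
  Raw: RawByteString;
  U: UTF8String;
begin
  // A single byte, $E9, to be interpreted as Windows-1252 (assumed input).
  SetLength(Raw, 1);
  Raw[1] := #$E9;
  U := BlockToUTF8(Raw, 1252);
  WriteLn('UTF-8 byte length: ', Length(U));
end.
```

The point of this shape is that the source code page travels with each 
call as a parameter, so no thread has to change a shared setting before 
converting its block.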

Thread-safe string manipulation across arbitrary code pages is a 
basic requirement moving forward.

If there is interest, I can contribute a converter I wrote that handles 
encoding/decoding of strings.  It might serve as inspiration for 
built-in FPC support for such a mechanism.


Thanks,

Andrew Brunner
Aurawin LLC

512.574.6298

https://aurawin.com


