[fpc-pascal] Status of UTF8
Andrew Brunner
atbrunner at aurawin.com
Sat Sep 28 14:38:59 CEST 2013
I wanted to get the status of WideString support, specifically
multilingual (ANSI/UTF-8) string handling in threaded applications.
I noticed that FPC now automatically converts (encodes/decodes) the
contents of a string during various operations. These conversions
seem to use the system locale settings as the basis for selecting
which code page to use.
I did see that I can change the default code page, but that is not
acceptable where incoming MIME data is concerned. With MIME messages,
the content type and character set are passed along with each section
of text. Each block specifies which code page to use.
The problem I have is that I process content in many different code
pages in parallel. String manipulation happens across multiple threads,
so switching a global variable such as the default code page is
dangerous: one thread's setting would affect every other thread.
What are the best practices for taking chunks of string data (in
various code pages) and converting them to UTF-8 in a multi-threaded
application?
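To make the question concrete, here is a minimal sketch of the kind of
per-string conversion I am after. It assumes the code page aware string
types (RawByteString, UTF8String) and the SetCodePage routine from the
System unit of recent FPC versions. ChunkToUTF8 and CP_LATIN1 are
hypothetical names used only for illustration. The source code page is
passed in per call (as it would be taken from the MIME charset of each
block), so no global default is changed. Whether the underlying
widestring manager is fully thread-safe on every platform is something
I have not verified.

```pascal
program MimeChunkToUtf8;

{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring; // installs a widestring manager; conversions need one on Unix
{$endif}

const
  CP_LATIN1 = 28591; // ISO-8859-1, e.g. taken from "charset=iso-8859-1"

// Hypothetical helper: interpret the bytes in Raw as SourceCP and
// return them converted to UTF-8. Only local variables are touched.
function ChunkToUTF8(const Raw: RawByteString;
  SourceCP: TSystemCodePage): UTF8String;
var
  Tmp: RawByteString;
begin
  Tmp := Raw;
  // Label the bytes with their real code page without converting them.
  SetCodePage(Tmp, SourceCP, False);
  // Now convert the labelled string to UTF-8.
  SetCodePage(Tmp, CP_UTF8, True);
  Result := Tmp;
end;

var
  U: UTF8String;
  I: Integer;
begin
  // 'caf' followed by byte $E9, which is e-acute in ISO-8859-1.
  U := ChunkToUTF8('caf'#$E9, CP_LATIN1);
  // Dump the resulting bytes as hex for inspection.
  for I := 1 to Length(U) do
    Write(HexStr(Ord(U[I]), 2), ' ');
  WriteLn;
end.
```

In UTF-8 the Latin-1 byte $E9 is encoded as the two bytes $C3 $A9, so
that is what a correct conversion should yield for the last character.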
Supporting thread-safe system-wide code page string manipulation is a
basic requirement moving forward.
If interested, I can contribute a converter I wrote that offers
encoding/decoding of strings as perhaps some inspiration that may lead
to FPC internal support for such a mechanism.
Thanks,
Andrew Brunner
Aurawin LLC
512.574.6298
https://aurawin.com