[fpc-devel] Unicode support (yet again)

Thu Sep 15 20:52:18 CEST 2011

Graeme Geldenhuys wrote:
> On 14/09/2011 19:17, Hans-Peter Diettrich wrote:
>> How many users will have to deal with chars outside the Unicode BMP?
> 
> You are very narrow minded! It depends on the application you are
> developing. Let's take a Science application as an example. Many
> scientific symbols fall outside the BMP. Now let's take another
> example. Egyptian Hieroglyphs. Or a Music program. Or your next version
> of Skype or some IM app using all those emoticons.

I'm well aware of that, but you failed to answer my question: How many 
users?

And which of these users has already written code for your examples, 
using 8-bit chars?

> Looking at the following chart [see url below], Emoticons, Transport
> and Map symbols, Alchemical Symbols, Pictographs, Playing Cards,
> Mathematical symbols etc. all fall outside the BMP. These are all symbols
> that could quite easily be used in a variety of applications - so yes,
> accessing symbols in Planes 1-14 is rather important!
> 
> http://en.wikipedia.org/wiki/Basic_Multilingual_Plane#Supplementary_Multilingual_Plane
> 
> The applications we develop at our work use symbols outside the BMP -
> Maths, Science, Alchemical etc.

Feel free to use symbols as you like :-)

But what's the difference in code, between using astral characters in 
UTF-8 and UTF-16?
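
A minimal sketch of that comparison, assuming Python 3 and an arbitrarily chosen astral character (U+1D11E, MUSICAL SYMBOL G CLEF, picked here only for illustration). It encodes the same code point both ways and counts the code units each encoding needs:

```python
# Illustration only: U+1D11E lies outside the BMP (plane 1).
clef = "\U0001D11E"

utf8 = clef.encode("utf-8")        # sequence of 8-bit code units
utf16 = clef.encode("utf-16-be")   # big-endian, no BOM

utf8_units = len(utf8)             # number of 8-bit code units
utf16_units = len(utf16) // 2      # number of 16-bit code units

print(utf8.hex())    # f09d849e -> four 8-bit units
print(utf16.hex())   # d834dd1e -> two 16-bit units (a surrogate pair)
```

In both encodings the character occupies more than one code unit, so code that indexes by code unit has to handle a multi-unit sequence either way.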

>> outside the BMP UTF-8 is a waste of space, and lacks indexed char
>> access in any case.
> 
> Yeah, and indexed access for UTF-16 encoded strings needs to check for
> surrogate pairs too! Otherwise your app is not Unicode enabled but
> rather UCS-2 only.

See above. If you honestly *want* to use the full range of Unicode 
characters, you have to write appropriate code. But who will *force* 
other coders to do the same?
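
What "appropriate code" means for UTF-16 can be sketched as follows. This is only an illustration in Python 3; the helper name `code_points_from_utf16` is hypothetical, and the input is assumed to be a plain list of 16-bit code units:

```python
def code_points_from_utf16(units):
    """Yield code points from a sequence of 16-bit code units.

    Hypothetical helper for illustration; unpaired surrogates are
    passed through unchanged rather than reported as errors.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Surrogate pair: combine the high and low halves.
            low = units[i + 1]
            yield 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00)
            i += 2
        else:
            # BMP character (or a lone surrogate): one unit, one value.
            yield u
            i += 1


# 'A', U+1D11E as a surrogate pair, 'B'
units = [0x0041, 0xD834, 0xDD1E, 0x0042]
points = list(code_points_from_utf16(units))
print([hex(p) for p in points])  # ['0x41', '0x1d11e', '0x42']
```

Four code units yield three characters here; code that treats `units[2]` as "the third character" works only as long as the text stays inside the BMP.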

When I want a program for German or French users, I'll hire a coder
with experience in those *languages*, not with experience only in Unicode.

> And considering the amount of text processing apps I have written
> (plenty of them), indexed character access is really not a top priority
> or an often used feature.

Right, text processing deserves special coding, and again much more
experience than only with Unicode. Now tell me the number of coders
who are *capable* of writing a text processing application, and how
many of these have problems with Unicode and encodings?

DoDi