[fpc-devel] Unicode support (again)
    Jonas Maebe 
    jonas.maebe at elis.ugent.be
       
    Tue Nov 11 13:44:43 CET 2008
    
    
  
On 11 Nov 2008, at 13:39, Michael Schnell wrote:
>> a) "ü": "LATIN SMALL LETTER U WITH DIAERESIS", encoded as $C3 $BC
>> b) "ü": "LATIN SMALL LETTER U", encoded as $75, followed by  
>> "COMBINING DIAERESIS", which is encoded as $CC $88
> I see, but I fail to see the sense of providing two different UTF8  
> code variants for the same unicode character.
Probably because different kinds of string processing can work more
efficiently with one form or the other. In any case, why this is so
is moot: the two forms are different code point sequences, so both are
possible regardless of whether you use UTF-8, UTF-16 or UTF-32, and
therefore you have to deal with them when you use Unicode.
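
As an illustration only (this sketch is in Python rather than Pascal, and
the helper name same_text is hypothetical), the two forms quoted above can
be compared by normalizing both to the same form first. It assumes that NFC
normalization is acceptable for the comparison at hand:

```python
import unicodedata

# Two code point sequences that render as the same "ü".
composed = "\u00fc"      # LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"   # LATIN SMALL LETTER U + COMBINING DIAERESIS

# Their UTF-8 bytes match the ones quoted above.
assert composed.encode("utf-8") == b"\xc3\xbc"
assert decomposed.encode("utf-8") == b"\x75\xcc\x88"

# A plain comparison treats them as different strings.
assert composed != decomposed

# The difference is in the code points, so it shows up in UTF-16 as well.
assert composed.encode("utf-16-le") != decomposed.encode("utf-16-le")


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# After normalization the two forms compare equal.
assert same_text(composed, decomposed)
```

Whether to normalize on input, on comparison, or not at all is a design
choice that depends on the kind of string processing involved.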
Jonas
    
    