[fpc-devel] Unicode support (yet again)

cobines cobines at gmail.com
Fri Sep 16 01:03:47 CEST 2011


2011/9/15 Hans-Peter Diettrich <DrDiettrich1 at aol.com>:
> cobines schrieb:
>> When doing:
>> MyChar := MyString[1]
>>
>> appropriate function retrieves first unicode character, regardless of
>> encoding.
>
> This is just wrong :-(
>
> MyString[1] accesses the first element of the *physical* character array,
> regardless of any encoding. Also Length returns the array size, not the
> number of *logical* characters in it.

Right. My point was that if I come from Ansi, knowing that MyString[1]
retrieves the first character, and know nothing about Unicode, I might
still think it continues to retrieve the first character in Unicode
regardless of string encoding (i.e. that the RTL handles that). It is
wrong, as you say, hence the need for the developer to adapt the code
if he uses such access, but people might not know this. Having a
UTF-16 RTL might help them in the sense that they will never have to
learn, until they deal with characters outside of the BMP.
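
To make the physical versus logical distinction concrete, here is a
minimal Free Pascal sketch. It assumes the AnsiString holds valid UTF-8
bytes, and Utf8CodePointCount is a hypothetical helper written for this
example, not an RTL function. The strings are built unit by unit so the
result does not depend on the source file encoding.

```pascal
program CodeUnitDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: counts code points in a UTF-8 byte sequence by
// skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes S contains valid UTF-8.
function Utf8CodePointCount(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
  W: UnicodeString;
begin
  // U+00E9 (e with acute) as two UTF-8 bytes, followed by 'a'
  SetLength(S, 3);
  S[1] := #$C3;
  S[2] := #$A9;
  S[3] := 'a';
  WriteLn(Length(S));              // 3: bytes, not characters
  WriteLn(Ord(S[1]));              // 195: only the lead byte of U+00E9
  WriteLn(Utf8CodePointCount(S));  // 2: logical characters

  // U+1F600 lies outside the BMP, so UTF-16 needs a surrogate pair
  SetLength(W, 2);
  W[1] := WideChar($D83D);
  W[2] := WideChar($DE00);
  WriteLn(Length(W));              // 2: code units for one character
end.
```

With UTF-8 the mismatch shows up for any non-ASCII text, while with
UTF-16 it stays hidden until a surrogate pair appears, and W[1] then
returns only half of the character.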

>> Whether it's utf8, utf16, utf32 or any other future encoding the code
>> should work the same.
>
> Very new functions are required for dealing with *logical* characters, in
> every MBCS encoding.

Hence the need to remove indexed access like MyString[1].

--
cobines


