[fpc-devel] Unicode support (yet again)
waldo kitty
wkitty42 at windstream.net
Sat Sep 17 03:10:13 CEST 2011
On 9/15/2011 19:03, cobines wrote:
> 2011/9/15 Hans-Peter Diettrich<DrDiettrich1 at aol.com>:
>> cobines schrieb:
>>> When doing:
>>> MyChar := MyString[1]
>>>
>>> appropriate function retrieves first unicode character, regardless of
>>> encoding.
>>
>> This is just wrong :-(
>>
>> MyString[1] accesses the first element of the *physical* character array,
>> regardless of any encoding. Also Length returns the array size, not the
>> number of *logical* characters in it.
>
> Right. My point was if I come from Ansi knowing MyString[1] retrieves
> first character and know nothing about Unicode, I might still think it
> continues to retrieve first character in Unicode regardless of string
> encoding
+100000000000000000~
this is something that i'm having to deal with after 30+ years of pascal
programming... i'm still trying to wrap my head around this GUI coding stuff...
while i do have some similar experiences with other languages from way back
(dBIII, dBIV and such that have/had forms) forms style coding is still alien to
me... i'm used to simply clearing an 80x25 screen and then drawing my next
screen... if i need to easily return to a previous screen, i might redraw it or
i might restore it from a saved buffer and "blit" it back onto the screen...
i don't know the difference between thisstring and thatstring and there are
times that this is one of the worst problems i face... my first hurdle was
clearing the 255-character strings that i'm so used to dealing with... it used
to be that i used a custom-written ASCIIZ converter routine but now it seems
that these are in the run time libraries and i need only to choose the proper
strings to convert between... even then, it can be quite the chore :?
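
As a rough illustration of the string kinds mentioned above, here is a minimal sketch, assuming Free Pascal in objfpc mode with {$H+}. It shows the old 255-character ShortString, the heap-allocated AnsiString, and a PChar view of zero-terminated (ASCIIZ) data. The program and variable names are made up for the example.

```pascal
program StringKinds;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  Short: ShortString;   // classic Turbo Pascal string, at most 255 characters
  Long: AnsiString;     // heap-allocated, length limited only by memory
  Z: PChar;             // pointer to zero-terminated (ASCIIZ) data
begin
  Short := 'hello';
  Long := Short;        // the compiler converts ShortString to AnsiString
  Z := PChar(Long);     // the typecast exposes the AnsiString as ASCIIZ
  WriteLn(Length(Short), ' ', Length(Long), ' ', StrLen(Z));   // 5 5 5

  Long := StringOfChar('x', 300);
  Short := Long;        // silently truncated to 255 characters
  WriteLn(Length(Long), ' ', Length(Short));                   // 300 255
end.
```

The assignments do the conversion work; the main thing to watch for is the silent truncation when a long string is assigned back to a ShortString.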
> (RTL handles that). It is as you say wrong, therefore the
> need to adapt the code by developer if he uses such access, but people
> might not know this. Having UTF-16 RTL might help them in a sense
> that they will never have to learn, until they deal with characters
> outside of the BMP.
more old school stuff here... a BMP is a windows style graphic... what are you
guys calling a BMP???
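
In the quoted text, BMP most likely means Unicode's Basic Multilingual Plane, the code points U+0000 through U+FFFF, not the Windows bitmap format. In UTF-16, a character outside that range takes two 16-bit code units (a surrogate pair). The sketch below assumes Free Pascal's UnicodeString type and uses U+1F600 purely as an example character.

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}
var
  W: UnicodeString;
begin
  // U+1F600 lies outside the BMP; UTF-16 stores it as the pair D83D DE00.
  SetLength(W, 2);
  W[1] := WideChar($D83D);   // high surrogate
  W[2] := WideChar($DE00);   // low surrogate
  WriteLn(Length(W));                        // 2 code units, 1 character
  WriteLn(HexStr(LongInt(Ord(W[1])), 4));    // D83D, only half a character
end.
```

So with UTF-16 strings, W[1] and Length behave like the old one-character-per-element strings only as long as every character stays inside the BMP.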
and agreeing (fully!) there really should be some sort of "hidden(?)" overrides
taking place so that folks like myself don't really have to worry about this
stuff... but then again, maybe? i dunno? i'm not sure, these days, after a
year+, if i'm on the right track or not...
>>> Whether it's utf8, utf16, utf32 or any other future encoding the code
>>> should work the same.
>>
>> Very new functions are required for dealing with *logical* characters, in
>> every MBCS encoding.
>
> Hence the need to remove indexed access like MyString[1].
removing that is not a GoodThing<tm> is it? really? as i say above, it would
seem to me to be best for the library to handle all of this stuff in the same
way that borland handled similar things way back when... on the one hand, it
would seem easy enough to handle with automatic overrides but then again, i may
not be thinking of things in the same way as others used to working with this OO
method of coding...
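
To make the physical versus logical distinction from the quoted discussion concrete, here is a minimal sketch of what a library-side helper could look like. FirstCodePointUTF8 is a hypothetical name, not an RTL routine, and it assumes the AnsiString holds well-formed UTF-8.

```pascal
program FirstCharDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of the RTL: returns the bytes that make up
// the first UTF-8 encoded code point in S. Assumes S is well-formed UTF-8.
function FirstCodePointUTF8(const S: AnsiString): AnsiString;
var
  Len: Integer;
begin
  if S = '' then
    Exit('');
  case Ord(S[1]) of
    $00..$7F: Len := 1;   // plain ASCII
    $C0..$DF: Len := 2;   // lead byte of a 2-byte sequence
    $E0..$EF: Len := 3;   // lead byte of a 3-byte sequence
    $F0..$F7: Len := 4;   // lead byte of a 4-byte sequence
  else
    Len := 1;             // continuation or invalid byte: fall back to 1
  end;
  Result := Copy(S, 1, Len);
end;

var
  S: AnsiString;
begin
  // 'é' (bytes C3 A9) followed by 'a', built byte by byte.
  SetLength(S, 3);
  S[1] := Chr($C3);
  S[2] := Chr($A9);
  S[3] := 'a';
  WriteLn(Length(S));                       // 3 bytes for 2 logical characters
  WriteLn(Ord(S[1]));                       // 195: S[1] is only the first byte
  WriteLn(Length(FirstCodePointUTF8(S)));   // 2: both bytes of the first character
end.
```

Indexed access stays cheap and byte-based here, while the helper does the extra work of finding where the first logical character ends.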