[fpc-devel] ansistrings and widestrings
drdiettrich at compuserve.de
Fri Jan 7 16:06:18 CET 2005
Florian Klaempfl wrote:
> > The only universal international representation for strings is Unicode
> > (currently 32 bit), that doesn't require any conversions.
> That's not true. E.g. the German umlauts can be represented by 2 chars
> when using UTF-32 (the char and the two dots); the same applies to a lot
> of other languages.
Okay, this is where I didn't understand the difference between code
points and the rest. For umlauts and accented letters, doesn't a unique
glyph with a corresponding code exist that could be used in the first
place? In other languages (Arabic...) the glyph may vary with the
context; here I have no idea how to compare such text, but the native
writers (speakers) of such glyphs should know ;-)
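
To make the two representations concrete, here is a minimal sketch. It is in Python rather than Pascal only because Python's standard library ships the Unicode tables; the helper name same_text is made up for illustration. It shows that a precomposed code for "ü" does exist, that the two-code form is nevertheless valid text, and that the two only compare equal after normalizing to one form (NFC here):

```python
import unicodedata

precomposed = "\u00fc"   # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"   # "u" followed by U+0308 COMBINING DIAERESIS

def same_text(a, b):
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(len(precomposed), len(decomposed))   # 1 2
print(precomposed == decomposed)           # False: different code point sequences
print(same_text(precomposed, decomposed))  # True: same text after normalization
```

So a fixed-width encoding alone does not give "one code per character"; some normalization step is still needed before comparing.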
> Encoding isn't the main problem, you need dedicated procedures and
> functions for Unicode comparison, upper/lower conversion etc.
Agreed, these will become the string class methods. It may be necessary
to partition Unicode into code pages, with different methods for
In the worst case, if we cannot find or agree about a so-far unique
representation for text, an "uncomparable" value has to become a valid
result of a comparison.
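
What such a comparison could look like, as a rough sketch only (again Python for brevity; CompareResult, to_canonical and compare_text are invented names, not anything that exists in FPC). The assumption here is that text arrives as raw bytes plus an encoding name, and that "no unique representation" means the bytes cannot be brought into normalized Unicode:

```python
import unicodedata
from enum import Enum, auto

class CompareResult(Enum):
    LESS = auto()
    EQUAL = auto()
    GREATER = auto()
    UNCOMPARABLE = auto()   # a valid outcome, not an error

def to_canonical(raw, encoding):
    """Decode bytes and normalize to NFC; return None if that is impossible."""
    try:
        return unicodedata.normalize("NFC", raw.decode(encoding))
    except (LookupError, UnicodeDecodeError):
        # Unknown encoding name, or bytes that are invalid in that encoding.
        return None

def compare_text(a, b):
    """a and b are (raw_bytes, encoding_name) pairs."""
    ca, cb = to_canonical(*a), to_canonical(*b)
    if ca is None or cb is None:
        return CompareResult.UNCOMPARABLE
    if ca == cb:
        return CompareResult.EQUAL
    # Plain code point order, NOT a language-aware collation.
    return CompareResult.LESS if ca < cb else CompareResult.GREATER
```

With this, ("ü" in UTF-8) and ("ü" in Latin-1) compare EQUAL, while a byte string that is invalid in its declared encoding yields UNCOMPARABLE instead of raising an exception.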
> To achieve this platform-independently is very hard ...
How so? I agree that here the existence of definitely
compatible/portable OS services is not guaranteed. But when the methods
have to be implemented for platforms that do not have such services at
all, then these implementations can be used on all other platforms as well.
All in all I'd say that we do not intend to implement a text processing
or translation system. What we can do is to define a string or text
class, that contains text in a well defined form, for processing with
all specified methods. The key point is the import of text into an
object of any such class. If no appropriate class has been implemented,
the import is simply impossible. Inside, i.e. between these classes, all
the methods should work. Perhaps with graceful "uncomparable" or
"unconvertible" results, when somebody insists on using incompletely
We don't want the impossible, the doable will be sufficient ;-)