[fpc-devel] String and UnicodeString and UTF8String

Marco van de Voort marcov at stack.nl
Thu Jan 13 10:57:26 CET 2011


In our previous episode, Hans-Peter Diettrich said:
> >>> "non-native" strings, it can also be a performance win).
> >> IMO a single encoding, i.e. UTF-8, can cover all cases.
> > 
> > Well, for starters, it doesn't cover the existing Delphi/unicode codebase.
> 
> Because it's bound to UTF-16? That's not a problem, because WideString 
> will continue to exist, and according conversions are still inserted by 
> the compiler.

That is DIY compatibility, or, in other words, no compatibility.

WideString will also grind the application to a halt, because on Windows it
is COM based (a BSTR, which is not reference counted).
 
> >> While some hard core Ansi coders may whine about such a convention, the
> >> absence of implicit string conversions (except in external library calls)
> >> will make such applications more performant than mixed-encoding versions.
> > 
> > I don't see why this is the case. A current system encoding application does
> > not do any conversion. (except for GUI output, and that can be considered
> > negligible compared to the actual GUI overhead)
> 
> When system encoding changes with the target platform, indexed access to 
> such strings can lead to different results. Unless the compiler can read 
> the coder's mind...

It doesn't have to. The Delphi model provides a string type for the system
encoding, so all strings coming from the system can be labeled as such. Where
they meet other string types, the necessary conversions can then be added.

Likewise, e.g. win32 console routines can be labeled with OEMString (since
Windows uses a different default encoding for the console).
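
To make the labeling idea concrete, here is a minimal sketch in Python rather
than Pascal, so it is an illustration only and not FPC or Delphi code. The
LabeledString class and the choice of cp1252 and cp437 as "system" and
"console" encodings are assumptions made for the example. It shows that
indexed access gives different results per encoding, and that once a string
carries its encoding as a label, the conversion at a boundary is mechanical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledString:
    """Hypothetical stand-in for a string type tagged with its encoding."""
    data: bytes
    encoding: str

    def convert(self, target: str) -> "LabeledString":
        # The conversion is derived from the two labels alone; no guessing.
        if target == self.encoding:
            return self
        return LabeledString(self.data.decode(self.encoding).encode(target), target)

    def unit(self, index: int) -> bytes:
        # Indexed access returns one code unit (one byte here), not one character.
        return self.data[index:index + 1]


text = "Grüße"

# Assumed encodings for the example: cp1252 as "system", cp437 as "console".
system = LabeledString(text.encode("cp1252"), "cp1252")
console = system.convert("cp437")
utf8 = system.convert("utf-8")

# Same index, different content and different lengths per encoding.
for s in (system, console, utf8):
    print(s.encoding, len(s.data), s.unit(2).hex())

# Converting back through the labels recovers the same text.
print(console.convert("utf-8") == utf8)
```

The point of the sketch is only that the label, not the programmer's
intention, determines which conversion is needed at each boundary.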
 
> >> Why spend time in the design of multiple RTL/LCL versions, when 
> >> a single version will be perfectly sufficient?
> > 
> > Why spend 13 years being compatible when you can throw it away in a
> > second?
> 
> It's sufficient to throw away what's no more needed :-)

The previous message from Jeff shows that even shortstring is still in major
production use. Nothing is unused, and nothing can be clipped without a
long-winded transition or painful breaks like those of Delphi 2009.

Moreover, these discussions are useless, since you know as well as I do that
no one string type will ever satisfy everybody. So IMHO it is time to draw
the conclusions from the 500 posts on the Unicode subject on this and other
FPC/Lazarus lists and start thinking about solutions to manage that, instead
of reiterating the "one type to rule them all" mantra ad infinitum.


