[fpc-devel] RFC: proper interpretation and implementation of Unicode Support

Michael Schnell mschnell at lumino.de
Tue Dec 2 13:58:58 CET 2014


On 11/28/2014 08:19 PM, Hans-Peter Diettrich wrote:
>
> In that discussion I found several errors, which are not detected by 
> the compiler nor handled in the RTL. In the concrete entry the illegal 
> use of the *generic* CP_NONE identifier is mentioned. That's why I 
> felt a need to address several specific topics in above draft.
Yep.

You can't have a type whose encoding brand is static and dynamic at the 
same time.

This is what causes the complete mess introduced by RawByteString (in 
Delphi and in fpc).

So IMHO the only way to go is to suggest to the users (or force them) to 
use the type RawByteString (i.e. CP_NONE) exactly as the name suggests: 
no encoding brand is known, so it can't be auto-converted to any other 
encoding, and it can't preserve the encoding of anything that is 
assigned to it.

This said, we don't have any (pseudo-) dynamically encoded type any 
more, and hence the "encoding-type" (and "element-size") field in the 
string header does not make any sense any more and can be dropped 
altogether.

But as the implementation (in Delphi and) in fpc already provides 
"encoding-type" and "element-size" fields, I suggest using them for an 
additional, properly dynamic type "DynamicString" (CP_ANY = $FF00), which 
(IMHO) can be introduced without breaking any compatibility or 
introducing any noticeable performance degradation, and allows writing 
versatile code (including standard library APIs).
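
To make the three cases concrete, here is a rough toy model in Python 
(not FPC code, and not how the RTL works today). CP_NONE and CP_ANY are 
the values mentioned above; the names Str, assign and CODECS are made up 
for illustration. Two things are assumptions rather than anything the 
suggestion spells out: that a DynamicString simply keeps the header of 
whatever is assigned to it, and that an attempt to convert unbranded 
bytes is rejected.

```python
from dataclasses import dataclass

CP_NONE = 0xFFFF  # FPC/Delphi constant: no encoding brand
CP_ANY = 0xFF00   # value suggested above for the hypothetical DynamicString
CP_UTF8 = 65001
CP_1252 = 1252

# Python codec names for the two code pages used in this sketch
CODECS = {CP_UTF8: "utf-8", CP_1252: "cp1252"}


@dataclass(frozen=True)
class Str:
    """Toy stand-in for a string header plus payload (illustrative only)."""
    data: bytes
    codepage: int      # models the "encoding-type" header field
    elem_size: int = 1  # models the "element-size" header field


def assign(src: Str, declared_cp: int) -> Str:
    """Model 'dst := src' where dst's declared type has code page declared_cp."""
    if declared_cp == CP_ANY:
        # Assumed DynamicString rule: payload and header are kept as they are.
        return src
    if declared_cp == CP_NONE:
        # RawByteString rule as suggested: bytes are copied, the brand is dropped.
        return Str(src.data, CP_NONE, 1)
    if src.codepage == CP_NONE:
        # Assumption: unbranded bytes are never auto-converted.
        raise TypeError("RawByteString has no encoding to convert from")
    if src.codepage == declared_cp:
        return src
    # Statically branded target: convert to the declared code page.
    text = src.data.decode(CODECS[src.codepage])
    return Str(text.encode(CODECS[declared_cp]), declared_cp, 1)
```

With a UTF-8 source, assigning to a CP1252-typed target converts the 
bytes, assigning to the dynamic type keeps the UTF-8 brand, and assigning 
to RawByteString keeps the bytes but forgets the brand.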

-Michael


