[fpc-devel] new string - question on usage

Hans-Peter Diettrich DrDiettrich1 at aol.com
Tue Oct 11 11:20:46 CEST 2011


Michael Schnell wrote:
> On 10/11/2011 12:23 AM, Martin wrote:
>>
>>
>> Utf8ToLower is (and should be) declared expecting a Utf8String.
> Why should a function Utf8ToLower be used (or even be defined for 
> normal use)?

Because it expects a UTF-8 argument and returns a UTF-8 result, so 
that no further conversions are required when it is used with strings 
of exactly that encoding.

> With dynamically encoded Strings "ToLower" should work for any encoding.

You mean something like this?
   function ToLower(s: RawByteString): RawByteString;
[dunno whether RawByteString is an allowed Result type at all]

Then this function has to determine the encoding internally, convert 
strings of unhandled encodings, and then apply the lowercasing 
implemented for the given or converted encoding. When the result is 
used, the compiler has to insert another check of the encoding and 
possibly another conversion.
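
A rough sketch of those steps, not a definitive implementation. It 
assumes an FPC version with code page aware AnsiStrings, where 
StringCodePage and SetCodePage are available from the System unit. 
Utf8ToLowerAscii is a hypothetical stand-in that folds only ASCII 
letters; a real Utf8ToLower would also have to handle multi-byte 
sequences.

```pascal
program ToLowerSketch;
{$mode objfpc}{$H+}

{ Hypothetical stand-in for Utf8ToLower: folds only ASCII 'A'..'Z'.
  Bytes >= $80 (parts of multi-byte UTF-8 sequences) are left untouched. }
function Utf8ToLowerAscii(const s: UTF8String): UTF8String;
var
  i: Integer;
begin
  Result := s;
  for i := 1 to Length(Result) do
    if Result[i] in ['A'..'Z'] then
      Result[i] := Chr(Ord(Result[i]) + 32);
end;

{ Generic variant: must inspect the dynamic code page at run time. }
function ToLower(const s: RawByteString): RawByteString;
var
  tmp: RawByteString;
begin
  tmp := s;
  { Step 1: determine the encoding; step 2: convert unhandled encodings. }
  if StringCodePage(tmp) <> CP_UTF8 then
    SetCodePage(tmp, CP_UTF8, True);
  { Step 3: run the routine implemented for the one handled encoding. }
  Result := Utf8ToLowerAscii(tmp);
end;

var
  u: UTF8String;
begin
  u := 'HELLO';
  WriteLn(ToLower(u));
end.
```

In this sketch the result always comes back as UTF-8, so a caller 
holding a string of some other encoding pays for a second conversion 
when it assigns the result.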


IMO you should understand that the new "string" type is bound to one 
specific encoding; dynamic re-encoding is not possible. Even Delphi 
does not work with "polymorphic" strings: its generic "string" type is 
UTF-16 encoded.

Use RawByteString instead if you want strings with no fixed encoding. 
But RawByteStrings imply an overhead, since the compiler must insert 
checks and conversions whenever two strings of *possibly* different 
encoding are involved in any operation, maybe even in an assignment...

DoDi



