[fpc-devel] Unicode proceedings

Michael Schnell mschnell at lumino.de
Wed Nov 16 16:30:22 CET 2011


On 11/16/2011 02:56 PM, Hans-Peter Diettrich wrote:
> Delphi uses the native/generic AnsiString(0),
A native/generic type is exactly what is _not_ available in suggestion 
A) of the definitions. There, each type name stands for exactly one 
encoding variant. Dynamic encoding (implemented by having a variable 
hold a field that denotes which encoding is used) is not possible.
>
>> There still are ambiguous cases:
>>  - "intersexual" variables (strict Typed variable that happens to be 
>> correctly done but with an encoding that does not match the type name)
>
> This should never occur.
In a system that provides type names and additionally does dynamic 
encoding, such beasts _can_ occur (e.g. if you do a diabolic "move()"). 
So the implementation definition needs to state what has to be done if 
such a case is detected (e.g. ignore it or raise an exception).
>
>>  - RAW variables that in fact have a dynamic encoding definition of 
>> "NONE/RAW" and are to be converted into a strictly typed variable.
>> Handling this might be considered an "implementation detail". So 
>> Delphi compatibility is not necessary.
>
> Nobody wants that, except you.

RAW strings are exceptionally useful.

E.g.:
You want to write a function that returns the first (ASCII-coded) 
number in a string.

If you write this function with a RAW parameter, you don't need 
multiple overloaded variants of it, and the code in the function can 
take care of the actual encoding of the string's content without doing 
any (time-consuming) conversion, as the ASCII numerals to be extracted 
never need a conversion.
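
A minimal sketch of such a function, written in Python rather than 
Pascal purely for illustration. The name first_ascii_number and its 
parameters unit_size and byteorder are hypothetical; the caller is 
assumed to know the code unit size of the data (1 for UTF-8 or a 
single-byte ANSI code page, 2 for UTF-16). It also assumes an encoding 
in which the values $30..$39 never occur inside a multi-unit sequence, 
which holds for UTF-8 and UTF-16 but not, for example, for GB18030.

```python
from typing import Optional


def first_ascii_number(raw: bytes, unit_size: int = 1,
                       byteorder: str = "little") -> Optional[int]:
    """Return the first run of ASCII digits in `raw` as an int, or None.

    `raw` is treated as a sequence of fixed-size code units; nothing is
    decoded or converted to another encoding.
    """
    digits = []
    for offset in range(0, len(raw) - unit_size + 1, unit_size):
        unit = int.from_bytes(raw[offset:offset + unit_size], byteorder)
        if 0x30 <= unit <= 0x39:      # code unit is '0'..'9'
            digits.append(unit - 0x30)
        elif digits:                  # the first run of digits has ended
            break
    if not digits:
        return None
    value = 0
    for d in digits:
        value = value * 10 + d
    return value
```

The same body serves first_ascii_number(data) for UTF-8 or ANSI data 
and first_ascii_number(data, unit_size=2) for UTF-16 data, so no 
overloads and no conversion are needed.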

Moreover, strings can be used simply to hold a sequence of bytes, 
words or DWORDs, just to take advantage of the reference counting 
provided by the string types. Here, of course, it does not make any 
sense to define an encoding scheme.

In this case a "NONE" encoding-variant for the string type "RAW" is 
valuable.

Delphi provides this via the encoding number $FFFF.

Moreover, Delphi seems to define the encoding number $0000 as "to be 
done": it determines the conversion to be done from the dynamic 
encoding of the target of a ":=" statement. I am not sure if this 
really is the case, but IMHO this would be an extremely nasty 
implementation. The history of the target of ":=" should never affect 
what is done in this statement.

-Michael


