[fpc-devel] cpstrrtl/unicode branch merged to trunk

Michael Schnell mschnell at lumino.de
Mon Sep 9 12:09:17 CEST 2013


On 09/06/2013 03:59 PM, Hans-Peter Diettrich wrote:
> You defer the problem from the interface to the implementation. When a 
> procedure has a string parameter of an unknown encoding, the 
> implementation must check every time the *current* encoding of the 
> parameter, and proceed as appropriate.

IMHO you are partly right. If a function needs to know the encoding of a 
string it handles, it of course has to initiate a conversion or some 
other encoding-specific action. With the appropriate compiler magic for 
this type (which Delphi does not predefine), the conversion would happen 
automatically as soon as the user assigns the input string to a variable 
with whatever fixed, predefined string encoding he chooses. In this case 
the overhead would be close to none compared with a conversion done when 
calling the function.

But there are many cases where the called function does not need to know 
(and IMHO should not know) the encoding type of the string(s) it works on:

A1) A function that just calls some other function with the string it 
got as an argument. (Silly example: a function that sticks three things 
together into one.) (Supposedly some library functions like "copy()" 
might have built-in compiler magic to avoid conversion when used with 
- say - UTF8, anyway.)

A2) A function that does not interpret the content of a string but 
simply stores the string. The most prominent example is TStrings (and 
with it TStringList as its descendant). To store the string it does not 
need to know the encoding of the string but just its byte count, which 
is easy enough to calculate.


Or
B1) A (specially designed) function that is able to work on multiple 
encoding types without the necessity to completely convert the string 
into some predefined encoding.
An example would be a function that counts the printable characters (or, 
simpler to do, counts the number of digits) contained in a string.
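
A minimal sketch of such a B1 function, assuming FPC's RawByteString 
parameter type so that no conversion happens at the call. CountDigits is 
a hypothetical helper, not an RTL routine, and the plain byte scan is 
only valid for UTF-8 and for single-byte ASCII-compatible code pages (it 
would not do for UTF-16 or GB18030, for instance):

```pascal
program CountDigitsDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of the RTL.
// Counts ASCII digits by scanning the raw bytes of S.
// Assumption: S is UTF-8 or a single-byte ASCII-compatible code page,
// where the bytes $30..$39 never occur inside another character.
function CountDigits(const S: RawByteString): SizeInt;
var
  I: SizeInt;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if S[I] in ['0'..'9'] then
      Inc(Result);
end;

var
  U: UTF8String;
begin
  U := 'Room 42, floor 3';
  // Passed as RawByteString, so the string is scanned as it is.
  WriteLn(CountDigits(U));  // prints 3
end.
```

The function never asks which code page it was given. It only relies on 
a property that all the encodings it is meant to support have in common.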

> I.e. *all* possible encodings must be handled (implemented) inside the 
> procedure, which is almost impossible. 
Not true at all (see above). This seems to be _the_ misconception with 
this issue. Especially TStrings/TStringList and friends would suffer 
severely when implemented in a Delphi-ish (thus Windows-ish) way with 
a fixed, predefined encoding in the interface to user code.

Regarding the said (specially designed) function, the implementer of 
course needs to know which encodings he has to support and thus has to 
handle in the implementation.

-Michael


