[fpc-pascal] Generic String Functions
Michael Schnell
mschnell at lumino.de
Fri Feb 28 15:01:48 CET 2014
On 02/28/2014 01:04 PM, Marco van de Voort wrote:
> Moreover, will operations that use character access make sense at all
> if you don't know what the actual encoding is?
The administrative record of each "New Delphi string" contains the
encoding type and the byte count per code unit. So "you" (the compiler
and the RTL) do know it.
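A minimal sketch of reading that administrative record, assuming an FPC 3.0 or later compiler, where (as far as I know) the System unit provides StringCodePage and StringElementSize as Delphi 2009+ does. The program and variable names are made up for illustration:

```pascal
program ShowStringHeader;
{$mode objfpc}{$H+}

var
  U8: UTF8String;     // statically declared as AnsiString(CP_UTF8)
  W: UnicodeString;   // UTF-16 string, two bytes per code unit
begin
  U8 := 'abc';
  W := 'abc';
  // Both values are read from the string's hidden header at run time.
  WriteLn('UTF8String:    code page ', StringCodePage(U8),
          ', bytes per code unit ', StringElementSize(U8));
  WriteLn('UnicodeString: code page ', StringCodePage(W),
          ', bytes per code unit ', StringElementSize(W));
end.
```

The point is only that the information travels with the string value itself, so the RTL can consult it without being told the encoding separately.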
The "only" shortcoming in Delphi is that the handling is completely
"static":
- if the encoding definition of the type the string is created with is
not "RAW", the encoding needs to be known at compile time (i.e. the
encoding type is not allowed to be modified at run time)
- if the encoding definition of the type the string is created with is
"RAW", auto-conversion from this string to a non-RAW string is not done.
Hence (including, but not only, for decent use on multiple OSes) an
additional "fully dynamically encoded" type (I suggest calling this
string type "Generic") is necessary.
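The "static" handling described above can be sketched roughly as follows. This assumes FPC 3.0 or later; TCP1252String is a hypothetical type name introduced here, and on Unix-like systems a unit such as cwstring would normally be needed for real code page conversions. It does not show the proposed "Generic" type, only the existing behaviour:

```pascal
program StaticEncoding;
{$mode objfpc}{$H+}

type
  // Hypothetical type: the code page is fixed here, at compile time.
  TCP1252String = type AnsiString(1252);

var
  U8: UTF8String;
  L1: TCP1252String;
  Raw: RawByteString;
begin
  U8 := 'abc';
  L1 := U8;    // differing static code pages: a conversion is inserted
  Raw := U8;   // RawByteString target: payload and tag are taken over as-is
  WriteLn('L1  code page: ', StringCodePage(L1));
  WriteLn('Raw code page: ', StringCodePage(Raw));
  // Re-encoding a RawByteString is an explicit request by the programmer.
  SetCodePage(Raw, 1252, True);
  WriteLn('Raw code page after SetCodePage: ', StringCodePage(Raw));
end.
```

Every assignment between differently declared types is a potential hidden conversion, which is where the speed concern for string-heavy code comes from.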
> (not only s[] but also
> pos,delete,insert etc). The same code can seem to behave differently
> because different code-paths make the same parameter have different
> encodings.
I suppose that you are right. But it is not only the "funny" position
numbers used by pos(), delete(), insert() and friends that create a
problem; the String type they are defined to use does, too:
- If any statically encoded type is used for them, it is close to
impossible to create decently fast programs for string manipulation
(unless they by chance use the correct encoding type), as
auto-conversion to and fro is invisibly introduced.
- If using the suggested dynamically encoded type, we will have
problems when combining strings of different types in a code snippet
that calls these functions.
I don't know if / how / to_what_extent compiler magic can help here
(doing auto-conversion "when necessary" similar to when simply assigning
strings of different encoding types).
In the end, I feel it would be very undesirable, but it might be the
only "easy" solution, to go with full Delphi compatibility and handle
all string encodings other than UTF16 rather poorly. This would force
Lazarus to provide a (Delphi compatible) LCL-API done completely in
UTF16. This completely contradicts all they did in the last few years :-) .
-Michael