[fpc-devel] Performance of string handling in trunk
Michael Schnell
mschnell at lumino.de
Thu Jun 27 12:55:06 CEST 2013
On 06/26/2013 06:29 PM, Hans-Peter Diettrich wrote:
>
> Then you have two choices:
> 1) convert the string as required
> 2) copy the content unconverted, but update the encoding
What do you mean by "you have two choices" ?
In fact the compiler designer has the choice to implement some behavior:
1) convert the string as required
(seems most sensible)
2) copy the content unconverted, but update the encoding
(does not seem sensible at all, as the static encoding type of the
normal target String then no longer matches the dynamic encoding type.
At other locations the code the compiler creates will implicitly use
the static encoding type (e.g. to decide whether or not a conversion is
necessary) and the content will be interpreted wrongly.)
3) issue a warning or (better) an error at compile time for any
assignment of a RawByteString to a normal String
(as conversion is not implemented and not converting leads to
unpredictable behavior)
4) issue an exception at runtime when the types don't match
(not nice but consistent)
Of course appropriate "Delphi Quirks" modes could influence the compiler
in that respect.
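To illustrate why option 2 reads the content wrongly, here is a minimal sketch in Python (not FPC code; the choice of UTF-8 and Latin-1 is just an assumption for the example). The same bytes are kept, only the encoding label changes, and the text comes out differently:

```python
# Illustration only: Python's bytes/str stand in for a string payload
# plus its encoding label. The encodings chosen here are arbitrary.
text = "\u00e9"                      # LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = text.encode("utf-8")    # payload as stored with a UTF-8 label

# Option 2: copy the bytes unconverted, but relabel them as Latin-1.
relabelled = utf8_bytes.decode("latin-1")

# Option 1: convert the payload to the target encoding.
converted = text.encode("latin-1")

print(len(utf8_bytes), len(converted))        # payload sizes differ
print(relabelled == text)                     # relabelling changed the text
print(converted.decode("latin-1") == text)    # conversion preserved it
```

With option 2 the reader sees two characters where the writer stored one; with option 1 the text survives.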
>
> IMO a reasonable decision should take into account the use of the
> RawByteString type in RTL code, e.g. for concatenation.
The RTL of course needs to perfectly match the compiler. But as both are
"under construction" right now (regarding the behavior with this kind of
Strings <however they are called>) I think that is easily doable.
>
> Can you show us your intended code for these functions?
What functions? We are talking about compiler behavior.
I think I already wrote down what I meant (the version with just
RawByteString, not the one with an additional String type of another
name, which might be even more "attractive").
I can do this again as a matrix instead of the text version I wrote.
When assigning "such" Strings (I hope the monospace is visible on the
list):
(The compiler does the test for encoding using the static (compile time)
encoding type with normal strings and the dynamic (in the string record)
encoding type value for RawByteString.)
Source:                                      | normal String   | RawByteString
target:                                      |                 |
normal String with the same static encoding  | set pointer     | set pointer (after checking dynamic encoding)
normal String with different static encoding | call conversion | call conversion (after checking dynamic encoding)
RawByteString (dynamic type ignored)         | set pointer     | set pointer (checking dynamic encoding not necessary)
Note:
- if the static types match (be it Raw or not) just set pointer.
- the compiler only needs to emit code to check the dynamic type
if the source is RawByteString.
- the dynamic type of the target is ignored by the compiler. Only
the conversion function will use it.
- the static type of source and target is not used by the conversion
library function. It can work according to the dynamic types and thus
just needs to be given the two string variables (Pointers) in the call
the compiler creates (in assembler object code).
- for a normal String, a mismatch between static and dynamic type
(that would be erroneous in DXE as well) can't happen.
- for RawByteString, a "normal" dynamic type means: "this is
printable information", and a dynamic type of $FFFF (assigned to the
string when it was instantiated) means: "this String holds just bytes,
with no encoding assumed".
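The matrix and notes above can be sketched as a small decision function. This is only a model in Python, not compiler code: the names `Str` and `plan_assignment` are made up for the illustration, a static encoding of `None` stands for RawByteString, and the code page numbers 65001 (UTF-8) and 1252 are just sample values. What the conversion routine does with a $FFFF source is not covered by the matrix and is left open here as well.

```python
from dataclasses import dataclass
from typing import Optional

CP_NONE = 0xFFFF  # dynamic type meaning "just bytes, no encoding assumed"


@dataclass
class Str:
    """Hypothetical stand-in for a string record: payload plus dynamic encoding."""
    data: bytes
    dynamic_cp: int


def plan_assignment(target_static_cp: Optional[int],
                    source_static_cp: Optional[int],
                    source: Str) -> str:
    """Return the action from the matrix: 'set pointer' or 'call conversion'.

    A static code page of None models RawByteString.
    """
    if target_static_cp is None:
        # RawByteString target: dynamic type is ignored, never convert.
        return "set pointer"
    if source_static_cp is not None:
        # Both are normal Strings: decided from static types alone.
        if source_static_cp == target_static_cp:
            return "set pointer"
        return "call conversion"
    # RawByteString source, normal target: the only case that needs
    # a check of the dynamic encoding stored in the string record.
    if source.dynamic_cp == target_static_cp:
        return "set pointer"
    return "call conversion"


utf8 = Str(b"abc", 65001)
raw = Str(b"\x00\x01", CP_NONE)
print(plan_assignment(65001, 65001, utf8))  # same static encoding
print(plan_assignment(1252, 65001, utf8))   # different static encoding
print(plan_assignment(65001, None, utf8))   # Raw source, dynamic type matches
print(plan_assignment(None, None, raw))     # Raw target
```

Only the third branch depends on the `source` value itself; the first two are fixed by the static types, which is the point of the second note.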
-Michael