[fpc-devel] Trying to understand the wiki-Page "FPC Unicode support"
DrDiettrich1 at aol.com
Sat Nov 29 17:36:16 CET 2014
Jonas Maebe wrote:
> On 28/11/14 21:30, Hans-Peter Diettrich wrote:
>> I prefer to specify and document everything *before* coding, so that
>> everybody can expect that the code will behave as specified.
> If certain behaviour is explicitly undefined, it *is* specified and
> documented. It means that your program is buggy if it triggers such
> behaviour, and that the effect of triggering it could be anything.
> An example from FPC itself is accessing an array beyond its bounds when
> range checking is switched off.
After this hint I reviewed the "Code page identifiers" section again, and
I think I have found the source of the misunderstandings.
CP_NONE: this value indicates that no code page information has been
associated with the string data. The result of any explicit or implicit
operation that converts this data to another code page is undefined.
Does this mean "CP_NONE is not an allowed *dynamic* (string *data*)
encoding", just like any other undefined encoding value?
In this case the description is correct, but it describes a special
case of some *undefined* general rule about valid and invalid dynamic
encodings. Then this general rule should be documented first, not only
for CP_NONE. Documentation of the *intended* purpose of CP_NONE, as the
*static* encoding of the RawByteString type, is also missing entirely.
As Delphi doesn't allow a dynamic encoding of CP_NONE, I don't
understand the purpose of the FPC description. In turn, some FPC
developer might have misunderstood the (Delphi) handling of
RawByteStrings, assuming that it was okay to omit a conversion in an
assignment of a RawByteString to an AnsiString of a different encoding.
That's why I think that the incorrect handling of such RawByteString
assignments in FPC should be fixed, according to the general rule for
assignments to a string of a different (static) encoding. CP_NONE
definitely *is* different from any other encoding, and Delphi does not
define an exception for RawByteStrings.
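
A minimal sketch for checking what a given FPC version actually does
with such an assignment. The program name, the type name TCp1252String
and the sample bytes are made up for illustration; StringCodePage and
SetCodePage are the System unit routines. No output is claimed here,
because the observed result is exactly what is in question:

```pascal
program RawAssignProbe;
{$mode objfpc}{$H+}

type
  // Hypothetical target type with a static encoding of code page 1252.
  TCp1252String = type AnsiString(1252);

var
  raw: RawByteString;
  target: TCp1252String;
begin
  // Two bytes that form one character when interpreted as UTF-8.
  raw := #$C3#$A4;
  // Only relabel the bytes as UTF-8, do not convert them.
  SetCodePage(raw, CP_UTF8, False);

  // Assignment from RawByteString to a string with a different
  // static encoding: is the payload converted or copied as-is?
  target := raw;

  WriteLn('raw dynamic code page:    ', StringCodePage(raw));
  WriteLn('target dynamic code page: ', StringCodePage(target));
  WriteLn('target length in bytes:   ', Length(target));
end.
```

If the target still reports the source code page, or keeps the
two-byte length, then no conversion took place in the assignment.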
> Exactly the same goes for converting strings with code page CP_NONE to a
> different code page: your program is broken when it tries to do that,
> and we cannot guarantee any outcome. This is exactly what "the behaviour
> is undefined" means.
When a string *really* has a *dynamic* encoding of CP_NONE, this is of
course illegal and thus leads to an undefined result. ACK so far. But
since Delphi (quietly) turns a SetCodePage to CP_NONE into the current
CP_ACP, the undefined situation (invalid dynamic encoding) must have
been forced by some illegal *hack* before, or in the FPC case by some
erroneous (not Delphi-conforming) RTL code.
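
The same kind of probe can show whether FPC lets a string end up with
a dynamic encoding of CP_NONE at all. Again only a sketch, with a
made-up program name; it prints the numeric values so that they can be
compared, and assumes nothing about which value FPC reports:

```pascal
program SetCpNoneProbe;
{$mode objfpc}{$H+}

var
  s: RawByteString;
begin
  s := 'abc';
  WriteLn('before SetCodePage:    ', StringCodePage(s));

  // Request CP_NONE without converting the payload.
  SetCodePage(s, CP_NONE, False);

  WriteLn('after SetCodePage:     ', StringCodePage(s));
  // Reference values for reading the two lines above.
  WriteLn('CP_NONE:               ', CP_NONE);
  WriteLn('CP_ACP:                ', CP_ACP);
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
end.
```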