[fpc-devel] [Suggestion] Enumeration range-check intrinsic

Ondrej Pokorny lazarus at kluug.net
Thu Jul 4 23:02:34 CEST 2019


On 04.07.2019 21:20, Jonas Maebe wrote:
> On 03/07/2019 09:26, Ondrej Pokorny wrote:
>> On 02.07.2019 23:34, Jonas Maebe wrote:
>>> Invalid data means undefined behaviour, always. "is" is not a special
>>> case that is immune to this.
>> Don't you really see the need to handle invalid data with a /defined/
>> behavior?
> My point is that is impossible to do so, so trying to do it in a way
> that works in some/most cases, is much more dangerous than categorically
> refusing to try to do it, as it creates a false sense of security.
>
> For example:
>
> {$mode objfpc}
> type
>    tc = class
>      a: array[1..2000] of longint;
>    end;
>
>    tcc = class of tc;
>
>    trec = record
>      p: pointer;
>    end;
>    prec = ^trec;
>
> var
>    c: tc;
>    p: prec;
>    cc: tcc;
> begin
>    new(p);
>    cc:=tc;
>    p^.p:=pointer(cc);
>    c:=tc(p);
>    writeln(c is tc);
> end.
>
> This will print true, even though it's obviously bogus.
>
>> Enum and subrange types /can/ store invalid data and they /do/
>> store invalid data.
> The same goes for classes, and by extension for any pointer type.

You have such a validity check for objects in FPC on the compiler side: 
there are compiler intrinsics fpc_check_object & fpc_check_object_ext 
that can be enabled with -Cr and -CR. So actually you do try to check 
the validity of objects.

But I see that this is an internal feature of FPC that is not to be 
executed manually and there is no assurance of correctness - it is only 
a helper feature for debugging.

With this said, I do accept that it is a no-go to check a valid 
enumeration value against its own type in the code (MyEnum is TMyEnum). 
We are left with a check of a simple value (ordinal, enumeration, 
subrange) against a different enumeration/subrange type:

(MyInt is TMyEnum)
(MyEnum2 is TMyEnum)
etc. as a shorthand for ((Ord(Low(TMyEnum)) <= MyInt) and (MyInt <= 
Ord(High(TMyEnum))).

This is still better than nothing.

+++++++

BUT (big BUT): I suggest to do an enumeration/subrange validity check on 
the compiler side when range checking is used. This is similar to the 
current fpc_check_object intrinsic.

When range checking is used, this check should be injected (at least) 
before every case-of and the new is/as operators if they are replaced 
with true. When range checking is disabled, the check should not be 
injected at all. Examples for range checking:

1.) case MyEnumValue of // <<< here an enumeration validity check should 
be injected
2.) IsValid := MyEnumValue is TEnumValue; //  <<< replace with TRUE but 
still insert a validity check intrinsic if range checking is used. If 
not replaced with true, do not insert validity check (so that false can 
be returned without a runtime error).

IMO such a solution is acceptable for both sides:
Your point: The clarity of the language will stay untouched because the 
new validity check will be only informative and used only for debugging 
without any assurance of correctness. (You see the difference: 
MyEnumValue is TEnumValue will raise an exception for an invalid value 
and not return false! False will be returned only for (MyInt is 
TEnumValue).)
My point: We will get an information about invalid enum values (range 
check error) instead of "undefined behavior" when debugging - and we 
will be able to find code where invalid enum values can occur and fix 
the code where these values were assigned. Although this information may 
not be (theoretically) absolutely accurate (the same as fpc_check_object 
is not absolutely accurate), it will still be pretty accurate and it 
will be a big help during the debugging.

Ondrej



More information about the fpc-devel mailing list