[fpc-devel] Dangerous optimization in CASE..OF

Jonas Maebe jonas at freepascal.org
Sun Jul 16 13:40:32 CEST 2017


On 16/07/17 09:12, Ondrej Pokorny wrote:
> On the one hand you say that the compiler can generate invalid values 
> and on the other hand you say that the compiler can assume the enum 
> holds only valid values.

It can assume this because a program's behaviour is only defined if you 
initialise your variables first with valid values. This is not just the 
case for enums, but for any type. If you don't initialise a local longint 
variable, the compiler could transform your program as if it was 
initialised with -12443, for example. The reason is that the behaviour 
of your program is undefined in that case, so any behaviour the compiler 
could come up with would be "correct".

The same goes if you use an enum variable that does not contain a valid 
value.

> For now, there is absolutely no range checking 
> and type safety for enums - so you can't use it as an argument.

There is just as much range checking and type safety for enums as there 
is for any other type.

>  1.) Give us full type safe enums with full range checking that CANNOT hold invalid values after any kind of operation (pointer, typecast, assembler ...). Then I am fully with you: keep the case optimization as it is (and introduce more optimizations).
> 
> 2.) Keep the enums as they are (not type safe) and don't do any optimizations on type safety assumptions on the compiler level. Because there is no type safety.
> 
> From my knowledge, the (1) option is utopia in a low-level languages along with Pascal

With your argument, there is no type safety for ansistrings either. 
After all, it's very easy to get an ansistring with an invalid initial 
value with something like this:

type
   trec = record
     a: ansistring;
   end;
   prec = ^trec;
var
   p: prec;
begin
   getmem(p,sizeof(trec));
   p^.a:='abc'; // undefined behaviour
end.

Nevertheless, the compiler never adds code to check whether an 
ansistring actually points to valid ansistring data before it uses it. 
It simply assumes that you, as a programmer, made sure it is properly 
initialised at all times.
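
For contrast, here is a minimal sketch of how the same record could be set 
up so that the assignment is well defined. It reuses the types from the 
example above; the program name and the choice of initialize/finalize 
(rather than new/dispose, which would also work) are assumptions made for 
illustration:

```pascal
program InitManagedRecord;
{$mode objfpc}{$H+}

type
   trec = record
     a: ansistring;
   end;
   prec = ^trec;
var
   p: prec;
begin
   getmem(p,sizeof(trec));
   initialize(p^);   // put the managed field "a" into a valid (empty) state
   p^.a:='abc';      // well defined now: the old value is a valid ansistring
   writeln(p^.a);
   finalize(p^);     // release the string before the raw memory is freed
   freemem(p);
end.
```

The compiler still does not verify any of this. The call to initialize is 
simply the programmer making sure the memory holds a valid value for its 
type before it is used.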

The fact that (Borland/FPC-style) Pascal is a low-level language and 
includes features like arbitrary explicit/unchecked typecasts, pointers, 
inline assembly, and that it does not force initialisation of memory 
with values that are valid for the type of the data you will store in 
that location (or even require you to define the type of a memory block 
when you allocate it), means that as a programmer you are responsible 
for upholding your end of the bargain as far as the type-safety is 
concerned. It's not a one-way street, and never has been (not for a 
single type).
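
As an illustration of what upholding that end of the bargain can look like 
for an explicit typecast, here is a small sketch that validates an integer 
before casting it to an enum. The TColor type and the TryIntToColor helper 
are hypothetical names invented for this example and are not part of the 
RTL:

```pascal
program EnumValidate;
{$mode objfpc}

type
   TColor = (cRed, cGreen, cBlue);   // hypothetical enum without gaps

// Hypothetical helper: returns true and sets c only if v is the ordinal
// of a declared TColor value. Valid only for enums without gaps.
function TryIntToColor(v: longint; out c: TColor): boolean;
begin
   Result:=(v>=ord(low(TColor))) and (v<=ord(high(TColor)));
   if Result then
     c:=TColor(v)        // explicit typecast, safe because v was checked
   else
     c:=low(TColor);     // still leave c holding a valid value
end;

var
   c: TColor;
begin
   if TryIntToColor(1,c) then
     writeln('1 maps to ordinal ',ord(c));
   if not TryIntToColor(7,c) then
     writeln('7 is not a valid TColor');
end.
```

The explicit check is the programmer's job here, because the typecast 
itself is unchecked.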

The compiler can check either statically or dynamically whether there 
are any potential errors when you implicitly convert values from one 
type to another, in case the source may contain values that are invalid 
for the target type. The programmer, on the other hand, is responsible 
for ensuring that all source values are properly initialised. As mentioned 
before, this is the garbage-in, garbage-out principle.
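
The dynamic check on an implicit conversion can be sketched as follows. 
This is only an illustration: the TSmall subrange and the variable names 
are made up, and it assumes range checking is enabled with {$R+} and that 
SysUtils is used so the run-time error surfaces as an ERangeError 
exception:

```pascal
program RangeCheckDemo;
{$mode objfpc}
{$R+}   // enable range checking for implicit conversions

uses
   SysUtils;

type
   TSmall = 1..10;   // hypothetical subrange type

var
   src: longint;
   dst: TSmall;
begin
   src:=42;          // properly initialised, but invalid for TSmall
   try
     dst:=src;       // implicit conversion, checked at run time
     writeln('assigned ',dst);
   except
     on E: ERangeError do
       writeln('range check error caught');
   end;
end.
```

Note that the check only helps because src itself was initialised. If src 
held garbage, the behaviour would be undefined regardless of the check.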

Your argument is therefore unrelated to enum types. You are, however, 
correct that as far as base enums specifically are concerned, it does 
seem they should be treated like in C (i.e., as a shorthand for plain 
constant declarations). You will still have the same problem with (at 
least integer, and possibly also enum) subranges though.


Jonas


