[fpc-devel] Dangerous optimization in CASE..OF

Martok listbox at martoks-place.de
Sun Jul 16 18:43:02 CEST 2017


> And you also have subranges of enum types. Can any assumptions be made
> about those in your opinion?

> Does that mean that you would consider the same transformation of a
> case-statement when using a subrange type as correct? And that putting a
> value outside the range of a subrange into such a variable as a
> programmer error? (as opposed to doing the same with a non-subrange enum
> type?)
Depends on the compiler version, sadly :/

Subranges in TP5 are documented as "don't rely on anything at runtime, we only
check at compile time"; TP7 documents "outside range is an RTE (independent of
$R state)"; Delphi is documented like TP5 again.

My intuition was shaped by learning the language with D4 (and D3 books), but
I've always thought of that as weird, and it makes subranges a bit pointless.

I would think that
type
  TEnum = (a,b,c);
  TSubEnum = a..c;

should have the same semantics, but at the same time they can't if subranges are
strict and enums are not. I see now where you're coming from.
(I'll get back to that example at the end.)

And then there's bitpacked records...


> But I finally understand where the disconnect comes from. I have always 
> thought of enums as more or less equivalents of subrange types, simply 
> with an optional name for the values. You, and indeed the Pascal 
> standards, treat them differently.
Getting back to the terms Ondrej introduced yesterday, I think that "normal"
enums may or may not be High-Level enumerations, but enums with explicit
assignment can *only* be Low-Level enumerations. Can we safely distinguish them
in the compiler? Does it even make sense to add that complexity?

This gets weirder. I think Borland already made that distinction, but... not?
<http://docwiki.embarcadero.com/RADStudio/XE5/en/Simple_Types#Enumerated_Types_with_Explicitly_Assigned_Ordinality>

"""An enumerated type is, in effect, a subrange whose lowest and highest values
correspond to the lowest and highest ordinalities of the constants in the
declaration. [...] but the others are accessible through typecasts and through
routines such as Pred, Succ, Inc, and Dec."""

So that's about the "gaps": they're valid, just unnamed.
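
To make the "gaps" concrete, here is a minimal sketch. TSize and its values are
made up for illustration and do not come from the Borland docs; the typecast
goes through an Integer variable so that no compile-time constant check gets in
the way. Treat it as an assumption-laden example, not as a statement about what
any particular compiler guarantees.

```pascal
program EnumGaps;
{$ifdef FPC}{$mode delphi}{$endif}

type
  // Hypothetical enum with explicitly assigned ordinalities.
  // Per the quoted text it behaves like the subrange 5..15.
  TSize = (Small = 5, Medium = 10, Large = 15);

var
  i: Integer;
  s: TSize;
begin
  i := 7;            // lies inside 5..15, but no constant is named 7
  s := TSize(i);     // reach the unnamed value through a typecast
  Writeln(Ord(s));   // shows the ordinal that was actually stored
  if (s <> Small) and (s <> Medium) and (s <> Large) then
    Writeln('s holds none of the named constants');
end.
```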
But for subranges, they write:

"""incrementing or decrementing past the boundary of a subrange simply converts
the value to the base type."""
So we can also leave the min..max range and transparently drop to the parent
type. This raises a range check error in $R+, _but is valid otherwise_. (* This
is the exact same text as in the TP5 langref *)
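
A small sketch of the situation that sentence describes. TDigit is a
hypothetical type chosen for illustration. With $R- the quoted documentation
says the value simply continues in the base type; whether FPC treats the result
as valid is exactly the point under discussion, so I am deliberately not
claiming an output here.

```pascal
program SubrangeInc;
{$ifdef FPC}{$mode delphi}{$endif}
{$R-}  // range checking off: the case the quoted text calls valid

type
  TDigit = 0..9;   // hypothetical subrange, base type is an integer type

var
  d: TDigit;
begin
  d := 9;          // upper boundary of the subrange
  Inc(d);          // steps past the boundary
  // Per the quoted Borland text the value is now interpreted in the
  // base type; other compilers may regard this as undefined.
  Writeln(d);
end.
```

Switching the directive to {$R+} is the variant that, per the text above,
should raise instead.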

Logical conclusion from that: a variable of a subrange of a
 1) High-Level enum becomes invalid when we leave the declared enum elements
 2) Low-Level enum remains valid by way of dropping to the base type.
Having both variants in the type system is too complex IMO - although it would
be something where the programmer clearly has to state her intentions.


My initial proposed trivial solution was to keep this undefined (maybe document
the difference to BP), and simply change codegen to be undefined-safe normally
and only undefined-unsafe in -O4. I am, however, no longer so sure if that is
really a good solution.
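
For reference, this is the kind of code the "undefined-safe" question is about,
as a sketch only. Describe and TEnum are made-up names. All declared elements
have a label, so a compiler that trusts the declared range may treat the else
branch as unreachable; an undefined-safe codegen would still reach it for an
out-of-range value. Which of the two happens depends on compiler and
optimization level, which is why no output is given.

```pascal
program CaseGuard;
{$ifdef FPC}{$mode delphi}{$endif}
{$R-}

type
  TEnum = (a, b, c);

function Describe(e: TEnum): string;
begin
  case e of
    a: Result := 'a';
    b: Result := 'b';
    c: Result := 'c';
  else
    // Only reachable if the compiler does not assume e is in a..c.
    Result := 'out of range';
  end;
end;

var
  i: Integer;
begin
  Writeln(Describe(b));
  i := 5;                       // not the ordinal of any element
  Writeln(Describe(TEnum(i)));  // behaviour here is the open question
end.
```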

There has to be a reason why everybody else chose Low-Level enums, other than
that it is far simpler to implement, right?


> And it would also require us to conditionalise every future optimisation 
> based on type, in particular separating the treatment of enums from that 
> of integers. That's a lot of (future) work and care to deal with what I 
> still consider to be bad programming.
Delphi always optimizes based on the full-range base type:

type
  TB = (a,b,c,d,e); // SizeOf(TB) = 1
  TT = a..e;
var
  t: TT;
begin
  t := TT(2);
  if t <= e then           // comparison does not get removed
    Writeln('t <= e');
  if Ord(t) <= 255 then    // warning: 'Condition is always true'
    Writeln('Ord(t) <= 255');
end.




Martok


