[fpc-devel] Dangerous optimization in CASE..OF

Martok listbox at martoks-place.de
Sun Jul 2 00:26:28 CEST 2017


Hi all,

continuing the discussion from bug 32079 here, as per request by Jonas. New
thread instead of the DFA one because this is about a concrete issue *with the
optimizer*.

TL;DR: there is an incredibly dangerous, unexpected and unavoidable
level1-optimization in the codegen for case..of, leading to potential arbitrary
code execution in any code that switches on enums.

The full story is going to get a bit long-winded, so grab a cup...


CASE $Ordinal OF is defined in the Reference Manual, 13.2.2 like so:
"""
The compiler will evaluate the case expression. If one of the case constants’
value matches the value of the expression, the statement that follows this
constant is executed. After that, the program continues after the final end.

If none of the case constants match the expression value, the statement list
after the else or otherwise keyword is executed. This can be an empty statement
list. If no else part is present, and no case constant matches the expression
value, program flow continues after the final end.
"""

Depending on the number of case labels, tcgcasenode.pass_generate_code
(ncgset.pas:1070) may choose one of 3 strategies for the "matching" task:
jumptable, jmptree or linearlist. Jmptree and linearlist are basically "lots of
if/else", while jumptable is a computed goto. The address of every possible
label's code block is put into a table that is then indexed in a jmp.
Example for x86, switching on a byte-sized #EXPRESSIONRESULT:
   mov    #EXPRESSIONRESULT,%al
   sub    #LOWESTLABEL,%al
   and    $ff,%eax
   jmp    *0x40c000(,%eax,4)

What if EXPRESSIONRESULT is larger than the largest case label? To be safe from
that, the compiler generates a range check, so the example above actually looks
like this:
   mov    #EXPRESSIONRESULT,%al
   sub    #LOWESTLABEL,%al
   cmp    (#HIGHESTLABEL-#LOWESTLABEL),%al
   ja     $#ELSE-BLOCK
   and    $ff,%eax
   jmp    *0x40c000(,%eax,4)

This is very fast because modern CPUs will correctly branch-predict the JA and
start caching the jumptable so we effectively get the check for free, and still
way faster than the equivalent series of "if/else" of the other strategies on
old CPUs.

So far, so good. This is where Delphi is done and just emits the code.
FPC however attempts one more optimization (at LEVEL1, so at the same level that
enables jumptables in the first place): if at all possible, the range check is
omitted (which was probably a reasonable idea back when branches were always
expensive). The only criterion for that is if the highest and lowest value of
EXPRESSIONRESULT's type have case labels, ie. if the jumptable will be "full".
This makes sense for simple basetypes like this:
  case aByteVar of
    0: DoSomething;
    // ... many more cases
    255: DoSomethingLast;
  end;

The most likely case where one might encounter this is however is with
enumerations. Here, the criterion becomes "are the first and last declared
element present as case labels?", and we're no longer necessarily talking about
highest and lowest value of the basetype.

This is fine if (and only if) we can be absolutely sure that the
EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax in
the example above may be anywhere up to high(basetype)'th element of the
jumptable, loading an address from anything that happens to be located after our
jumptable and jumping there. This is, I cannot stress this enough, extremely
dangerous! I expect not everyone follows recent security research topics, so
just believe me when I say that: if there is any way at all to jump "anywhere",
a competent attacker will find a way to make that "anywhere" be malicious code.

So, to be able to do that optimisation safely (remember, LEVEL1 should be safe
and not change the behaviour) we must be absolutely sure that this cannot happen.
Good thing we're talking about enumerations here, so only declared elements can
ever occur, right? Right!?
Or, as Jonas put it:
> The only way to get data with an invalid value in an enum in Pascal is by 
> using an unchecked (aka explicit) typecast, by executing code without range 
> checking (assigning an enum from a larger parent enum type into a smaller 
> sub-enum type), or by having an uninitialised variable.

Turns out this is really not true. There are also as "esoteric" things as using
Read(File). Or TStream.Read. Or the socket implementation of your choice. Or by
calling a library function. There are many ways to have an invalid value in an
enum in any meaningful code. Pretty much everything that is not a direct
assignment from a constant is a potential candidate.

So, the only choice for us is to assume that enumerations are just what they are
in C: fancy constants on a base type. This is, incidentally, exactly the wording
used in the documentation cited above, so if we'd want to play by
'rules-as-written' that is exactly what we should have assumed anyway.

> Just compile with {$RANGECHECKS ON}, then!
I've been trying really hard for the past couple of hours, but I haven't gotten
the compiler to emit a single check at all when doing anything with enums. And
even if, a runtime error is usually not what you want. You'd probably want to
tell the user a file is corrupt instead of killing the program...

> But that still doesn't mean we have to worry about any of this in the
> codegen for CASE..OF - just tell the programmer to manually check their input
> after reading from wherever!
Well, yeah, except... there is no way to do that.
  if EnumValue in [low(TEnumType)..high(TEnumType)] then
will not work for sparse enums or a basetype larger than Byte, and
  case Enumvalue of
    All,
    Expected,
    Values.... : doSomething;
  else
    raise EFileError.Create('Invalid data');
  end;
will obviously also not work because this is just what we're trying to do here
in the first place (NB: this was my original use case).

So, we have a problem here: either the type system is broken because we can put
stuff in a type without being able to check if it actually belongs there, or
Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to
be clever by omitting an essentially free check for very little benefit.
I know which interpretation I would choose: the one with the easier fix ;-)


I would very much like for someone to at least acknowledge that there is a
problem here, because I can think of several more or less clever fixes for that
(except the obvious) and would prefer discussing these instead of having to
prove that "not-as-defined"-behaviour is not the same as "undefined behaviour".


Kind regards,

Martok



More information about the fpc-devel mailing list