[fpc-devel] Dangerous optimization in CASE..OF

Florian Klämpfl florian at freepascal.org
Sun Jul 2 10:19:39 CEST 2017


Am 02.07.2017 um 00:26 schrieb Martok:
> Depending on the number of case labels, tcgcasenode.pass_generate_code
> (ncgset.pas:1070) may choose one of 3 strategies for the "matching" task:
> jumptable, jmptree or linearlist. Jmptree and linearlist are basically "lots of
> if/else", while jumptable is a computed goto. The address of every possible
> label's code block is put into a table that is then indexed in a jmp.
> Example for x86, switching on a byte-sized #EXPRESSIONRESULT:
>    mov    #EXPRESSIONRESULT,%al
>    sub    #LOWESTLABEL,%al
>    and    $ff,%eax
>    jmp    *0x40c000(,%eax,4)
> 
> What if EXPRESSIONRESULT is larger than the largest case label? To be safe from
> that, the compiler generates a range check, so the example above actually looks
> like this:
>    mov    #EXPRESSIONRESULT,%al
>    sub    #LOWESTLABEL,%al
>    cmp    (#HIGHESTLABEL-#LOWESTLABEL),%al
>    ja     $#ELSE-BLOCK
>    and    $ff,%eax
>    jmp    *0x40c000(,%eax,4)
> 
> This is very fast because modern CPUs will correctly branch-predict the JA and
> start caching the jumptable so we effectively get the check for free, and still

If you read about branch-prediction a little bit more in detail, then you will learn that branch
prediction fails to work well as soon as their is more than one cond. jump per 16 bytes, this
basically applies to all CPUs.

> way faster than the equivalent series of "if/else" of the other strategies on
> old CPUs.
> 
> So far, so good. This is where Delphi is done and just emits the code.
> FPC however attempts one more optimization (at LEVEL1, so at the same level that
> enables jumptables in the first place): if at all possible, the range check is
> omitted (which was probably a reasonable idea back when branches were always
> expensive). The only criterion for that is if the highest and lowest value of
> EXPRESSIONRESULT's type have case labels, ie. if the jumptable will be "full".
> This makes sense for simple basetypes like this:
>   case aByteVar of
>     0: DoSomething;
>     // ... many more cases
>     255: DoSomethingLast;
>   end;
> 
> The most likely case where one might encounter this is however is with
> enumerations. Here, the criterion becomes "are the first and last declared
> element present as case labels?", and we're no longer necessarily talking about
> highest and lowest value of the basetype.
> 
> This is fine if (and only if) we can be absolutely sure that the
> EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax in
> the example above may be anywhere up to high(basetype)'th element of the
> jumptable, loading an address from anything that happens to be located after our
> jumptable and jumping there. This is, I cannot stress this enough, extremely
> dangerous! I expect not everyone follows recent security research topics, so
> just believe me when I say that: if there is any way at all to jump "anywhere",
> a competent attacker will find a way to make that "anywhere" be malicious code.

Indeed. The same problem as with any array on the stack. You have to ensure by any means, that the
index of the array is within the declared range of the array. If you have an array with an enum as
index, you have to ensure that the enum is within the declared range, else you get the same problem
as with the case.

> 
> So, we have a problem here: either the type system is broken because we can put
> stuff in a type without being able to check if it actually belongs there, or
> Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to
> be clever by omitting an essentially free check for very little benefit.
> I know which interpretation I would choose: the one with the easier fix ;-)

Yes, checking the data. I can easily create a similar problem as above with the "range checks" for
the jump table by reading a negative value into the enum. Unfortunately, the checks are unsigned ...

The correct solution is to provide a function which checks an integer based on rtti if it is valid
for a certain enum. Everything else is curing only symptoms.



More information about the fpc-devel mailing list