[fpc-devel] Const optimization is a serious bug

Mattias Gaertner nc-gaertnma at netcologne.de
Thu Jul 7 10:04:26 CEST 2011


On Thu, 07 Jul 2011 00:15:54 -0500
Chad Berchek <ad100 at vobarian.com> wrote:

>[...]
> The difference between a feature and a bug is the specifications. Here 
> the specifications are the documentation. I have not found any 
> documentation in either FPC or Delphi that there is some implicit 
> contract whereby the programmer promises not to modify other variables 
> which happen to refer to the same instance as a const parameter. Many 
> people have repeatedly stated that this is the programmer's fault. If 
> there is an implicit agreement with the programmer, then yes I agree 
> with these statements and I believe it is not a compiler bug (although 
> certainly not good language design).

It is a general problem of languages supporting pointers that you can
change "const" data. Const can never protect 100%. 
Maybe this should be mentioned more prominently in the docs, probably with
some examples, although I don't know where a good place would be.

And it is a general problem that compile time checks/protection only
work for a limited scope. Code outside the scope cannot be checked by
the compiler. A const parameter is only checked within the scope of the
procedure body. Once you leave the body, the compiler cannot help.
And sharing memory (s1:=s2) is a run time problem, which means it
is outside of the compiler's scope too.
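
A minimal sketch of the scope limit described above, assuming FPC in
objfpc mode with long strings enabled. The names Global and Show are
made up for illustration and do not come from the thread. The body of
Show never assigns to S, so the compile time check passes, yet the
data behind S is released through another variable:

```pascal
program ConstAlias;
{$mode objfpc}{$H+}

var
  Global: AnsiString;   // hypothetical global, for illustration only

procedure Show(const S: AnsiString);
begin
  // The compiler only rejects direct assignments to S in this body.
  // This assignment does not touch S, so it compiles. Because a const
  // parameter does not add a reference, it may release the only
  // counted reference to the data S points to.
  Global := 'changed';
  // S may now refer to freed memory; the result of this read is undefined.
  WriteLn(S);
end;

begin
  // StringOfChar builds the string at run time, so it lives on the heap
  // and is reference counted (unlike a string constant).
  Global := StringOfChar('o', 8);
  Show(Global);
end.
```

What this prints, or whether it crashes, is not defined. That is the
point: the error happens at run time, outside what the compiler checks.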

 
>[...]
> Yes, of course you can always fool the compiler, it just shouldn't be
> the other way around. The example you gave is very different for one
> very important reason: you show using explicit allocating and freeing of
> an object. With strings, the programmer does not, and cannot, explicitly
> allocate or deallocate the resources, and the problem lies in the
> behavior of the automatic allocation and deallocation. Thus there is
> nothing in common between this example and the problem at hand.

Why do you think so? You can explicitly allocate and deallocate using
SetLength, UniqueString, ...

 
>[...]
> Florian wrote:
> > It affects more types, even shortstring suffers from it
> 
> I must respectfully disagree. In the case of shortstring, the value of 
> the const parameter does get modified, but that is to be expected. If my 
> understanding is correct (and I'm open to be corrected), the semantics 
> of ShortString are different. With AnsiString, assigning one string 
> variable to another is supposed to create the illusion that they are 
> unique instances.

Yes, but it is documented that it is only an illusion.
AnsiStrings share data.


> Hence there is copy-on-write. With short strings, 
> assigning one to another literally means they are the same instance. 
> Again this comes back to the difference between instance and variable, 
> and the illusion implicit in AnsiString and dynamic arrays, which I 
> think is not the case with ShortString (but again I could be wrong).
> 
> The problem at issue here is the fact that the compiler can actually 
> free memory prematurely. In the case of shortstring, it won't crash. 

A crash is just a boundary check by the OS.
With AnsiString the crash happens some time after the real bug. And it
does not need to manifest as a crash. The same happens if you do that
with shortstrings or records or whatever.
As Florian already wrote, this belongs to a whole category of nasty bugs,
and there are tools to find them.


>[...]
> Now I will acknowledge that const can be used in certain limited 
> situations without harm. 

The term "limited" is pretty wrong. For example, const string is
used in several thousand places in the Lazarus and FPC sources. And
afaik it was used wrongly fewer than ten times. In many of those cases
the code was right and became wrong because some called function changed.


>[...]
> To summarize:
> 
> 1. To the programmer, each AnsiString and dynamic array variable is 
> supposed to be unique, i.e., after doing A := B, modifying A does not 
> affect B and vice versa.

That's only true for a limited set of functions and operators.
For example using Move on the characters will modify both.
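
A small sketch of the Move case, again assuming FPC in objfpc mode with
long strings. The variable names are invented for illustration, and the
PChar typecast is my assumption about how to reach the characters
without triggering the implicit unique call:

```pascal
program MoveShared;
{$mode objfpc}{$H+}

var
  A, B, Src: AnsiString;   // hypothetical names, for illustration only

begin
  B := StringOfChar('b', 4);  // heap-allocated, reference-counted data
  A := B;                     // A and B now share the same data
  Src := 'XXXX';
  // Move writes raw bytes through the pointer. No copy-on-write takes
  // place, so the shared data behind both A and B is overwritten.
  Move(Src[1], PChar(A)^, Length(Src));
  WriteLn(A);
  WriteLn(B);
end.
```

With a normal assignment such as A := 'XXXX' only A would change. Here
both variables are expected to show the new characters.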

 
> 2. The fact that multiple variables (or parameters) actually can refer 
> to a single instance is an implementation optimization. That is of no 
> concern to the programmer.

Huh?
It is one of the main reasons to use AnsiString instead of ShortString.
And many implementations rely on the run time difference of assignment,
O(1) vs O(n).


> The optimization is implemented by the 
> combination of reference counting and copy-on-write.
> 
> 3. The programmer cannot be aware of what the compiler decides to do 
> regarding how it implements reference counting and copy-on-write. The 
> programmer should simply know that unique variables are unique instances 
> for all practical purposes (except var parameters obviously, since that 
> is the whole point of having to declare it var).

What 'all practical purposes' means is not clear.
I guess some people think Move is a practical function.

 
> 4. If the programmer cannot be aware of when an instance is shared with 
> multiple variables, an implicit contract that the programmer cannot 
> modify other variables which by chance are using the same instance is 
> impossible to obey, and therefore const would be useless.

And still it is not useless.

 
>[...]

Mattias


