[fpc-devel] Boehm garbage collector for freepascal

Hans-Peter Diettrich DrDiettrich1 at aol.com
Tue Nov 16 18:12:13 CET 2010

Thaddy wrote:
> There are many rants about finalizers versus destructors in the context 
> of GC.
> Let me make it clear that these discussions almost invariably presume 
> the context of the C(++) language.
> The main thing being that C++ allocates objects/classes from the stack 
> by default. In Freepascal (in Delphi mode or objfpc mode) 
> objects/classes are guaranteed to be allocated from the heap.

I doubt that this is a reasonable argument, since stack objects do not 
deserve any GC at all.

> In the context of stack allocated objects there is a conflict (there are 
> many!) between finalizers and destructors which makes them mutually 
> exclusive.
> In the context of heap allocating languages like Freepascal or Delphi 
> these conflicts do not exist, not even in a serialized context.

Except for Object, the deprecated C-style object implementation.

> Those arguments are therefore invalid. Even local objects are from the 
> heap in Freepascal, as soon as the object goes out of scope it is marked 
> for deletion by the GC

That's not a general rule, unless there exist language features that 
control (disallow) e.g. passing such references to subroutines, or 
storing them in other data structures (lists...) of extended lifetime.

> For a good understanding why this is so you have to understand how a 
> destructor in object pascal relates to the finalizer as used in the Boehm 
> GC.
> The finalizer is called - and only called - if the memory (object) is 
> already marked for deletion. This means the memory is technically 
> already unreachable.
> It is perfectly legal to call a destructor on such an object even in an 
> asynchronous context.

Maybe, but what about owned objects and other references, still residing 
in an unreachable object? And how to prevent multiple destructor calls?

>> IMO destructors and finalizers are mutually exclusive, I remember a 
>> note like "Why a garbage collector never should call a destructor", 
>> that at least applies to mark-sweep GC.
> This is *only* true for stack allocated objects like in C++ but 
> definitely not for heap allocated objects like in freepascal and Delphi. 
> Strongly put: the fact that Freepascal allocates from the heap makes it 
> extremely suitable for the GC.

Object *references* that are distributed across the stack, the objects 
themselves, and global variables are the biggest problem in GC. The 
distinction between stack and heap objects, in contrast, is not a 
problem, since the GC knows about all heap objects.

> If you read the documentation for the Boehm collector you can deduce 
> that. (Also in the context of the Java discussions on the same subject.)
>> It should be clear that a destructor, that destroys further (owned) 
>> objects, will confuse a mark-sweep garbage collector, since it can 
>> invalidate the marks. Consequently all allocated memory areas/objects 
>> should be flagged as either managed or unmanaged. Then FreeMem can 
>> decide, inside the memory manager, whether the memory block should be 
>> released immediately (if unmanaged), or should be marked for later 
>> deletion (if managed). Dunno about the concrete Boehm implementation...
> This is not the case: When the finalizer is called the memory (always 
> allocated from the heap) is already outside of the scope of the normal 
> program flow and can safely be released by the GC.

Finalization of an object can occur at any time, not only during GC.

> Mark/sweep will work safely.

It is safe only by virtue of the conservative approach (if in doubt, let 
it survive).

> The Boehm GC is smart enough to "see" the live pointers that nest inside 
> the main object.

At a high cost, as long as it has to guess what *are* pointers.
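
To illustrate the guessing, here is a minimal sketch in C (the language of the Boehm collector). It is not the collector's actual code: `Block` and `conservative_scan` are hypothetical names, and the "heap" is a single address range. Without type information, every scanned word whose value falls inside a heap block has to be treated as a pointer to it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap block, described only by its address range. */
typedef struct {
    uintptr_t start;   /* first address of the block */
    size_t    size;    /* size of the block in bytes */
    int       marked;  /* set when something appears to reference it */
} Block;

/* Conservative marking: the scanner cannot tell integers from pointers,
   so any word whose value lies inside a block marks that block. */
static void conservative_scan(const uintptr_t *words, size_t nwords,
                              Block *blocks, size_t nblocks)
{
    for (size_t i = 0; i < nwords; i++) {
        for (size_t b = 0; b < nblocks; b++) {
            if (words[i] >= blocks[b].start &&
                words[i] <  blocks[b].start + blocks[b].size) {
                blocks[b].marked = 1;  /* survives, genuine pointer or not */
            }
        }
    }
}
```

Every word is compared against the heap layout (here by a naive linear search), and an integer that merely looks like an address keeps its block alive.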

>>> Regarding refcounted strings: the way it is implemented here doesn't 
>>> carry any prize for beauty, but it seems to work alright.
>> That's a different GC model, not mark-sweep. Eventually the un/managed 
>> flag has to be extended, into managed-by-refcount and 
>> managed-by-mark-sweep.
> Mark sweep will - empirically, granted - work after e.g. the refcount is 
> 0, not before. The inner workings of the string mechanism are simply not 
> changed. That's mainly why I consider my solution not pretty, btw.
> And, frankly, I am not sure, as I wrote before.

I don't see why refcounted objects should be subject to another 
garbage collection. When a mark-sweep GC tries to deal with e.g. dynamic 
strings or arrays, the result can only be *false* references to other 
objects. That's why both managers should work independently.
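
A rough sketch of the flagging idea in C, under loud assumptions: `ManageKind`, `Header` and `release_block` are hypothetical names, not part of FPC's memory manager or of the Boehm collector, and the `freed` field only records that a release would have happened instead of really returning memory.

```c
/* Hypothetical per-block tag telling the memory manager who owns the block. */
typedef enum { UNMANAGED, BY_REFCOUNT, BY_MARK_SWEEP } ManageKind;

typedef struct {
    ManageKind kind;
    int refcount;  /* only meaningful for BY_REFCOUNT blocks */
    int freed;     /* illustration only: 1 once the block was released */
} Header;

/* FreeMem-like entry point. Returns 1 if the block was released now,
   0 if it was left alone. */
static int release_block(Header *h)
{
    switch (h->kind) {
    case UNMANAGED:
        h->freed = 1;              /* release immediately */
        return 1;
    case BY_REFCOUNT:
        if (h->refcount > 0 && --h->refcount == 0) {
            h->freed = 1;          /* last reference is gone */
            return 1;
        }
        return 0;
    case BY_MARK_SWEEP:
        return 0;                  /* left for the collector to reclaim */
    }
    return 0;
}
```

The point of the tag is that the two managers never touch each other's blocks: the refcount path never waits for a collection, and the mark-sweep path never frees anything eagerly.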

> For a heap based language there is no conflict between a finalizer and a 
> destructor. It is perfectly acceptable to call a destructor in the 
> finalizer, because it doesn't touch the stack at all. This eliminates the 
> standard "objections".

Your stack argument is meaningless, for several reasons. Stack objects 
are never recognized or handled by a mark-sweep GC, and their 
"destructors" don't deallocate their memory at all. When FreeInstance does nothing in 
the Pascal Class model - as it *must* do in a GC environment - there is 
no more difference between the destruction of stack and heap objects.

Nonetheless I still doubt whether (typical) destructor code is 
applicable as a finalizer, since it may affect other objects by calling 
their destructors. We all know about the problems with multiple calls to 
the destructor of an object, when the destructor does not finalize (nil) 
all references to the destroyed objects. Leaving them alone is common 
practice, because a destroyed object goes away immediately after the 
destructor has been called, and some people argue that the use of 
FreeAndNil in a destructor indicates a bad design! And now you are going 
to call the destructor of an object during finalization, without knowing 
whether that destructor has already been called! :-(
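
One conceivable guard, sketched in C with hypothetical names (`Obj`, `obj_destroy`, `finalize`). It assumes that the memory of a destroyed object stays readable until the collector reclaims it, i.e. that the FreeInstance equivalent does nothing. It is only an illustration of the problem, not what the Boehm collector or the FPC RTL does.

```c
#include <stddef.h>

typedef struct Obj {
    struct Obj *owned;        /* object owned by this one, or NULL */
    int destroyed;            /* guard: destructor body already ran */
    int destructor_runs;      /* illustration only: counts body runs */
} Obj;

/* Idempotent destructor: later calls on the same object are no-ops. */
static void obj_destroy(Obj *o)
{
    if (o == NULL || o->destroyed)
        return;
    o->destroyed = 1;
    o->destructor_runs++;
    obj_destroy(o->owned);    /* destroy the owned object ...            */
    o->owned = NULL;          /* ... and nil the reference (FreeAndNil)  */
}

/* Hypothetical finalizer hook: the collector may call it in any order,
   possibly after the program has already destroyed the object itself. */
static void finalize(Obj *o)
{
    obj_destroy(o);
}
```

With the flag, finalizing the owned object first and its owner afterwards (or the same object twice) runs each destructor body only once. Without the flag, the owner's destructor would run the owned object's destructor a second time.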

The real finalization problems reside elsewhere; see the Boehm 
documentation about the finalization semantics, how to avoid loops etc. 
All of that applies to heap objects as well.

