[fpc-devel] Boehm garbage collector for freepascal
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Tue Nov 16 18:12:13 CET 2010
Thaddy wrote:
> The rants about finalizers versus destructors in the context of GC are
> many.
> Let me make it clear that these discussions almost invariably presume
> the context of the C(++) language.
> The main thing being that C++ allocates objects/classes from the stack
> by default. In Freepascal (in Delphi mode or objfpc mode)
> objects/classes are guaranteed to be allocated from the heap.
I doubt that this is a reasonable argument, since stack objects do not
need any GC at all.
> In the context of stack allocated objects there is a conflict (there
> are many!) between finalizers and destructors which makes them mutually
> exclusive.
> In the context of heap allocating languages like Freepascal or Delphi
> these conflicts do not exist, not even in a serialized context.
Except for Object, the deprecated C-style object implementation.
> Those arguments are therefore invalid. Even local objects are from the
> heap in Freepascal, as soon as the object goes out of scope it is marked
> for deletion by the GC
That's not a general rule, unless there are language features that
control (disallow) e.g. passing such references to subroutines, or
storing them in other data structures (lists...) with a longer lifetime.
> For a good understanding why this is so you have to understand how a
> destructor in object pascal relates to the finalizer as used in the Boehm
> GC.
> The finalizer is called - and only called - if the memory (object) is
> already marked for deletion. This means the memory is technically
> already unreachable.
> It is perfectly legal to call a destructor on such an object even in an
> asynchronous context.
Maybe, but what about owned objects and other references, still residing
in an unreachable object? And how to prevent multiple destructor calls?
>> IMO destructors and finalizers are mutually exclusive, I remember a
>> note like "Why a garbage collector never should call a destructor",
>> that at least applies to mark-sweep GC.
>>
> This is *only* true for stack allocated objects like in C++ but
> definitely not for heap allocated objects like in freepascal and Delphi.
> Strongly put: the fact that Freepascal allocates from the heap makes it
> extremely suitable for the GC.
Object *references* scattered across the stack, inside other objects,
and in global variables are the biggest problem in GC. The distinction
between stack and heap objects, in contrast, is not a problem, since the
GC knows about all heap objects.
> If you read the documentation for the Boehm collector you can deduce
> that. (Also in the context of the Java discussions on the same subject.)
>> It should be clear that a destructor, that destroys further (owned)
>> objects, will confuse a mark-sweep garbage collector, since it can
>> invalidate the marks. Consequently all allocated memory areas/objects
>> should be flagged as either managed or unmanaged. Then FreeMem can
>> decide, inside the memory manager, whether the memory block should be
>> released immediately (if unmanaged), or should be marked for later
>> deletion (if managed). Dunno about the concrete Boehm implementation...
>>
> This is not the case: When the finalizer is called the memory (always
> allocated from the heap) is already outside of the scope of the normal
> program flow and can safely be released by the GC.
Finalization of an object can occur at any time, not only during GC.
> Mark/sweep will work safely.
Safety by the conservative approach (if in doubt, let it survive).
> The Boehm GC is smart enough to "see" the live pointers that nest inside
> the main object.
At a high cost, as long as it has to guess which words *are* pointers.
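
To make the "guessing" concrete, here is a toy model of a conservative mark phase. It is only a sketch, not the Boehm implementation: the heap layout, the addresses and the name conservative_mark are all made up for illustration. Every word that happens to equal a block address is treated as a pointer, so every word has to be tested, and a plain integer can keep garbage alive.

```python
def conservative_mark(heap, roots):
    """Return the addresses of all blocks considered live when every word
    that equals a known block address is treated as a pointer."""
    marked = set()
    pending = [word for word in roots if word in heap]
    while pending:
        addr = pending.pop()
        if addr in marked:
            continue
        marked.add(addr)
        # Every word of the block must be tested, pointer or not.
        pending.extend(word for word in heap[addr] if word in heap)
    return marked


# Hypothetical heap: block address -> words stored in that block.
heap = {
    0x1000: [0x2000, 7],  # holds a real pointer to the block at 0x2000
    0x2000: [42],
    0x3000: [99],         # garbage: no real pointer refers to it
}

# The second root is just an integer whose value happens to equal the
# address 0x3000; the collector cannot tell it from a pointer.
roots = [0x1000, 0x3000]

live = conservative_mark(heap, roots)
garbage = set(heap) - live  # empty: the false reference keeps 0x3000 alive
```

With exact type information the block at 0x3000 would be collected; here it survives ("if in doubt, let it survive").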
>>> Regarding refcounted strings: the way it is implemented here doesn't
>>> carry any prize for beauty, but it seems to work alright.
>>
>> That's a different GC model, not mark-sweep. Eventually the un/managed
>> flag has to be extended, into managed-by-refcount and
>> managed-by-mark-sweep.
>>
> Mark sweep will - empirically, granted - work after e.g. the refcount is
> 0, not before. The inner workings of the string mechanism are simply not
> changed. That's mainly why I consider my solution not pretty, btw.
> And, frankly, I am not sure, as I wrote before.
I see no reason why refcounted objects should be subject to a second
garbage collection. When a mark-sweep GC tries to deal with e.g. dynamic
strings or arrays, the result can only be *false* references to other
objects. That's why both managers should work independently.
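
The managed/unmanaged flag quoted above, extended by the kind of manager, could be modelled roughly like this. It is a toy sketch only; Manager, Block, free_mem, dec_ref and collector_candidates are invented names, not part of the FPC memory manager or of the Boehm collector.

```python
from enum import Enum


class Manager(Enum):
    UNMANAGED = 0   # released explicitly, at once
    REFCOUNT = 1    # released when the reference count drops to zero
    MARK_SWEEP = 2  # released by the collector, some time later


class Block:
    def __init__(self, manager):
        self.manager = manager
        self.refcount = 1 if manager is Manager.REFCOUNT else 0
        self.released = False
        self.pending_sweep = False


def free_mem(block):
    """Decide inside the memory manager what 'free' means for this block."""
    if block.manager is Manager.MARK_SWEEP:
        block.pending_sweep = True  # only flagged; the collector reclaims it
    else:
        block.released = True       # unmanaged or refcounted: release now


def dec_ref(block):
    """Refcount manager: works without any help from the collector."""
    block.refcount -= 1
    if block.refcount == 0:
        free_mem(block)


def collector_candidates(blocks):
    """Mark-sweep manager: looks only at its own blocks."""
    return [b for b in blocks if b.manager is Manager.MARK_SWEEP]


text = Block(Manager.REFCOUNT)   # e.g. a dynamic string
obj = Block(Manager.MARK_SWEEP)  # e.g. a class instance
raw = Block(Manager.UNMANAGED)

dec_ref(text)   # refcount reaches 0, released immediately
free_mem(obj)   # only flagged for the next sweep
free_mem(raw)   # released immediately
```

The point of the sketch is that the collector never scans or releases the refcounted block, so it cannot produce false references from string or array contents.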
> For a heap based language there is no conflict between a finalizer and a
> destructor. It is perfectly acceptable to call a destructor in the
> finalizer, because it doesn't touch the stack at all. Eliminating the
> standard "objections".
Your stack argument is meaningless, for several reasons. Stack objects
are never recognized or handled by a mark-sweep GC, and their
"destructors" do not deallocate their memory at all. When FreeInstance
does nothing in the Pascal Class model - as it *must* in a GC
environment - there is no longer any difference between the destruction
of stack and heap objects.
Nonetheless I still doubt whether (typical) destructor code is
applicable as a finalizer, since it may affect other objects by calling
their destructors. We all know about the problems with multiple calls to
the destructor of an object when the destructor does not nil all
references to the objects it destroys. Leaving them dangling is common
practice, because a destroyed object goes away immediately after the
destructor has been called, and some people argue that the use of
FreeAndNil in a destructor indicates bad design! And now you are going
to call the destructor of an object during finalization, without knowing
whether that destructor has already been called! :-(
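
The double-call problem can be shown with a small toy model. This is a sketch under my own assumptions, not FPC code: destroy stands in for a Pascal destructor, finalize for a GC finalizer that blindly calls it, and all class names are invented.

```python
class Child:
    def __init__(self):
        self.destroy_calls = 0

    def destroy(self):
        self.destroy_calls += 1  # anything above 1 is a double free


class Owner:
    """Typical destructor: destroys the owned object, keeps the reference."""

    def __init__(self):
        self.child = Child()

    def destroy(self):
        self.child.destroy()


class GuardedOwner(Owner):
    """FreeAndNil-style destructor: clears the reference before destroying."""

    def destroy(self):
        child, self.child = self.child, None
        if child is not None:
            child.destroy()


def finalize(obj):
    """Stand-in for a finalizer that calls the destructor unconditionally."""
    obj.destroy()


plain = Owner()
plain_child = plain.child
plain.destroy()   # explicit destruction by the program
finalize(plain)   # later: the collector finalizes the same object

guarded = GuardedOwner()
guarded_child = guarded.child
guarded.destroy()
finalize(guarded)

double_destroyed = plain_child.destroy_calls    # 2
single_destroyed = guarded_child.destroy_calls  # 1
```

Only the destructor that nils its references survives being called again from a finalizer, and that is exactly the style many consider unnecessary today.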
The real finalization problems reside elsewhere, see the Boehm
documentation about the finalization semantics, how to avoid loops etc.
- all of that applies to heap objects as well.
DoDi