[fpc-devel] DW_AT_external and other additions to FPC generated dwarf
Martin Frb
lazarus at mfriebe.de
Thu Mar 23 09:45:21 CET 2023
After the brief exchange on
https://gitlab.com/freepascal.org/fpc/source/-/issues/40208
there are various considerations (ideas/requests) that will hopefully help
improve the debugging experience.
I have recently added 3 issues, but there are more. I wanted to add a
bit of background here, since it is not all black and white.
1) Scoped Enums https://gitlab.com/freepascal.org/fpc/source/-/issues/40208
2) Unit Search order
https://gitlab.com/freepascal.org/fpc/source/-/issues/40209
3) DW_AT_external for types
https://gitlab.com/freepascal.org/fpc/source/-/issues/40210
4) "official" marker for string vs pchar vs array
5) Duplicated (artificial) types under Windows
5a) Missing address for class methods
6) "var param" for function calls / managed param
...
-----------
1) is simple to reason about (IMHO).
There is an example in sysutils:
    {$SCOPEDENUMS OFF}
    TUseBoolStrs = (False, True);
If the debugger reads this before it gets to the definition of the
Boolean "True", then expressions that contain the Boolean constants
true/false could fail.
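To show the clash at the source level, here is a minimal sketch. It assumes the enum is declared with scoped enums switched on, which is what the issue title implies; the program and variable names are made up for illustration. The compiler keeps the two "True" identifiers apart, and the debug info would need to let the debugger do the same.

```pascal
program ScopedEnumSketch;
{$mode objfpc}

{$SCOPEDENUMS ON}
type
  // Assumption: declared as a scoped enum. The members share their names
  // with the Boolean constants, but are only reachable via the type name.
  TUseBoolStrs = (False, True);
{$SCOPEDENUMS OFF}

var
  B: Boolean;
  U: TUseBoolStrs;
begin
  B := True;               // resolves to the Boolean constant
  U := TUseBoolStrs.True;  // resolves to the enum member
  WriteLn('Boolean True: ', B);
  WriteLn('Enum True ordinal: ', Ord(U));
end.
```

If the debugger sees the enum members as plain, unscoped names, a watch such as "B = True" may pick the enum member instead of the Boolean constant.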
-----------
2) is required for looking up global vars.
A global var of the same name can exist in different units.
If paused on some code such as
    if GlobalFoo > 5 then
the debugger needs to work out which GlobalFoo that is.
2) may or may not have an impact on type lookups. See 3).
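What the debugger could do once the search order is available can be sketched as follows. This is only an illustration, not how fpdebug is implemented: the symbol table, the unit names and ResolveGlobal are hypothetical. It relies on the usual Pascal rule that units in a uses clause are searched from last to first (the current unit itself would be checked before that).

```pascal
program UnitSearchOrderSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical entry: one global variable as a debugger might index it.
  TGlobalSym = record
    InUnit: string;
    Name: string;
  end;

const
  // Hypothetical symbol table: GlobalFoo exists in two units.
  Syms: array[0..2] of TGlobalSym = (
    (InUnit: 'unita'; Name: 'GlobalFoo'),
    (InUnit: 'unitb'; Name: 'GlobalFoo'),
    (InUnit: 'unitb'; Name: 'GlobalBar')
  );

// Returns the unit that provides AName, or '' if no listed unit does.
// AUsesOrder holds the uses clause in source order; it is walked
// backwards because the last unit listed takes precedence.
function ResolveGlobal(const AName: string;
  const AUsesOrder: array of string): string;
var
  i, j: Integer;
begin
  Result := '';
  for i := High(AUsesOrder) downto Low(AUsesOrder) do
    for j := Low(Syms) to High(Syms) do
      if SameText(Syms[j].InUnit, AUsesOrder[i]) and
         SameText(Syms[j].Name, AName) then
        Exit(AUsesOrder[i]);
end;

begin
  WriteLn(ResolveGlobal('GlobalFoo', ['unita', 'unitb']));
  WriteLn(ResolveGlobal('GlobalFoo', ['unitb', 'unita']));
end.
```

The two calls differ only in the order of the uses clause, which is exactly the information the debugger currently does not get.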
-----------
3) DW_AT_external (or visibility) for types
After reconsidering, that one is actually more debatable. But IMHO it is
still useful.
    unit foo;
    interface
    implementation
    type PCHAR = ^widechar; // does not want to be seen outside this unit.
Granted this is not the most likely case to happen. But it may happen.
At first, types seem to be safe-ish. If a variable is declared in the
current unit (or otherwise found in the correct unit, according to "unit
search order"), then the debug info of that variable points to the
correct type.
No ambiguity, not even with global types.
The issue occurs when a user writes a watch using type casts with
global types (that aren't from the current unit).
    pchar(foo)
    pinteger(foo)   // this one can be ^smallint from unit system,
                    // though that is not an implementation vs interface case
    TForm1(Sender)
In each case the debugger needs to find the correct type (if more than
one exists). And in each case, that is never 100% accurate, unless only
one type exists.
But IMHO the lookup can still benefit from the difference between
implementation and interface. Unless the name is fully qualified, the user
is unlikely to want the above "pchar=^widechar" from some unit (maybe one
not even known to them).
As a side note, initially I thought that once unit-search-order is
known, the issue would be solved for good. But it won't be. For
"TForm1(Sender)": "Sender: TControl" can be in units that do not use
"unit1". Yet the user would expect the debugger to find it.
And (on Windows) a "uses unit1; var TempForm: TForm1" copies the
definition of "TForm1" into that unit. In that case the debugger will
always think of "TForm1" as belonging to that unit. That will likely be
correct while paused in that unit, but may not be correct if paused in
another unit and just searching for the global definition of "TForm1".
So in the end the debugger will need to deal with the possibility of
ambiguity.
=> Whether that includes "types from implementation" is therefore not so
big of an issue (it still might be useful).
-----------
4) "official" marker for string vs pchar vs array
Not sure if that is reported already. Depending on the DWARF version,
"string" (ansistring) is a pointer (either TAG pointer/reference or
location expression) to
- char (dwarf 2)
- array of char (dwarf 3)
Currently for dwarf 2, the debugger can't tell the difference. If the
user says: foo[1]
the debugger does not know if the first or second char is meant (0 or 1
based index).
With dwarf 3 the difference would be in the display format: "('a', 'b',
'c')" vs "'abc'".
But currently the debugger (fpdebug) can tell the difference, because
fpc has a tiny difference in how it encodes the "stride".
That is obviously an implementation detail, and not very future proof.
Therefore an "official" marker would be nice.
- it appears there is none in dwarf
- it could be a custom addition to dwarf
- alternatively, an "implementation detail" (such as the stride) could
be documented, so fpdebug can safely rely on it.
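The 0 vs 1 based ambiguity for foo[1] mentioned above comes straight from the language, as this small sketch shows (the program and variable names are only for illustration). The same buffer is indexed differently depending on whether it is seen as an ansistring or as a pchar.

```pascal
program StringIndexSketch;
{$mode objfpc}{$H+}

var
  S: AnsiString;
  P: PChar;
begin
  S := 'abc';
  P := PChar(S);
  // AnsiString is 1-based, PChar is 0-based: both lines below
  // read the first character of the same buffer.
  WriteLn('S[1] = ', S[1]);
  WriteLn('P[0] = ', P[0]);
  // The same index [1] on the PChar is already the second character.
  WriteLn('P[1] = ', P[1]);
end.
```

Without a marker in the debug info, the debugger has to guess which of the two index rules the user had in mind.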
-----------
5) Duplicated (artificial) types under Windows
As mentioned, declaring
    var foo: TStringList
copies the type definition of TStringList to that unit (Windows only).
On Linux there is a cross compilation unit reference (well, at least if
the source unit has debug info, otherwise IIRC it is also a copy).
Maybe those copies should be marked DW_AT_artificial?
From the DWARF spec:
>> A compiler may wish to generate debugging information entries for
>> objects or types that were not actually declared in the source of the
>> application
>> Any debugging information entry representing the declaration of an
>> object or type artificially generated by a compiler and not explicitly
>> declared by the source program may have a DW_AT_artificial attribute.
Then again, contrary to those statements, in the list of attributes for
each DW_TAG, many tags that match the description do not have it listed.
Knowing it is only a "copy" means fewer entries to consider when looking
up a type across units.
This matters especially if the debugger ends up having to determine
whether two types in two units are equal or not, which for structure types
can mean a lot of work. And they may even differ, because the copy omits
addresses for methods.
And that is the 2nd issue (5a): the copy omits addresses for methods.
If a user does
    MyStringList.count()
fpdebug cannot call the function.
Knowing the type def is a copy would be half of the solution. The debugger
would still need to know where the original resides, to find the address.
For this DW_AT_decl_file could help => though only if that file has full
debug info.
- if that file has no debug info, then there is an issue
- if that file declares the type, but does not contain a variable of
that type, the type def may have been stripped too.
Not currently sure how best to solve this.
-----------
6) "var param" for function calls
I actually have to double check some of this....
They are simply implemented by either
- DW_TAG_reference (which, if only used for this, would be ok)
- deref in the location expression (which would also apply to pointer types)
The problem occurs when fpdebug does function eval in watches.
procedure Foo(var Bar: integer);
procedure Other(Bar: PInteger);
If the user adds the watch: Foo(myInt);
the debugger needs to decide whether to call that function or to give an
error for non-matching params.
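For reference, this is how the compiler treats the two declarations at the source level (a minimal sketch; the procedure bodies and the program name are made up for illustration). Both receive an address, but only the var param accepts a plain variable.

```pascal
program VarParamSketch;
{$mode objfpc}

procedure Foo(var Bar: Integer);
begin
  Bar := Bar + 1;
end;

procedure Other(Bar: PInteger);
begin
  Bar^ := Bar^ + 1;
end;

var
  MyInt: Integer;
begin
  MyInt := 5;
  Foo(MyInt);     // var param: the compiler passes the address implicitly
  Other(@MyInt);  // pointer param: the caller must pass an address
  // Other(MyInt) is rejected by the compiler; the debugger should be
  // able to reject it too, and still accept Foo(MyInt).
  WriteLn(MyInt);
end.
```

To make the same decision for a watch, the debugger needs the debug info to keep a var param distinguishable from a pointer param.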
Something similar happens for managed types. Currently the debugger must
rely on hardcoded, fpc-specific info to know if it must call ansi_decref
on a string returned from a function call (which must not be done if it
is a pchar, and if it is an array, yet another helper function needs to
be called).
Not only does the debugger need to know if a type needs calls to manage
refcounts, it also needs to know what those calls are.