[fpc-devel] Record types with unbounded trailing data
J. Gareth Moreton
gareth at moreton-family.com
Mon Jun 24 05:57:35 CEST 2019
Hi everyone,
So after a problem over at https://bugs.freepascal.org/view.php?id=35753
that led to a change of https://bugs.freepascal.org/view.php?id=35671 ,
it became apparent that there may need to be better support for
unbounded arrays in record types - not open arrays, but unbounded ones.
One of the main culprits is the LOGPALETTE structure that is defined in
the Windows unit:
    LOGPALETTE = record
      palVersion : WORD;
      palNumEntries : WORD;
      palPalEntry : array[0..0] of PALETTEENTRY;
    end;
    PLogPalette = ^LOGPALETTE;
(PALETTEENTRY is a relatively simple record type with four Byte fields)
This record type, or rather, a pointer to it, is used for passing in new
palette data to the CreatePalette() function. The number of entries
that the API function reads is equal to the palNumEntries field.
However, writing these entries beforehand is difficult to do without
raising compiler warnings because the array index will be out of
bounds. Because the Windows API uses such constructs in places, it is
necessary to support them to retain full functionality. Normally, you
have to call GetMem to allocate memory for a pointer of type
PLogPalette, equal to "SizeOf(LOGPALETTE) + (EntryCount - 1) *
SizeOf(PALETTEENTRY)" - this itself is untidy because SizeOf(LOGPALETTE)
already includes one PALETTEENTRY, the array being defined as having one
element (granted, forgetting the "- 1" is usually not critical, as it
only over-allocates by one entry).
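For reference, the allocation idiom described above can be sketched roughly as follows. This is only an illustration: TPaletteEntry and TLogPalette are local stand-ins that I assume match the layout of the Windows unit types (so it compiles on any target), LogPaletteSize is a hypothetical helper, and the entries are written through a typed pointer so that no index is ever checked against [0..0].

```pascal
program TrailingArrayDemo;

{$mode objfpc}
{$R-} // range checking off: entries past index 0 are written on purpose

const
  EntryCount = 256;

type
  // Local stand-ins mirroring the Windows unit types (assumption: same
  // field layout), so the sketch compiles on any target.
  TPaletteEntry = record
    peRed, peGreen, peBlue, peFlags: Byte;
  end;
  PPaletteEntry = ^TPaletteEntry;

  TLogPalette = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array[0..0] of TPaletteEntry;
  end;
  PLogPalette = ^TLogPalette;

// Hypothetical helper: bytes needed for the header plus Count entries
// (Count >= 1). The "- 1" accounts for the one entry that
// SizeOf(TLogPalette) already includes.
function LogPaletteSize(Count: Integer): SizeInt;
begin
  Result := SizeOf(TLogPalette) + (Count - 1) * SizeOf(TPaletteEntry);
end;

var
  Pal: PLogPalette;
  Entries: PPaletteEntry;
  I: Integer;
begin
  GetMem(Pal, LogPaletteSize(EntryCount));
  try
    Pal^.palVersion := $0300;
    Pal^.palNumEntries := EntryCount;
    // Index through a typed pointer instead of the [0..0] array field.
    Entries := @Pal^.palPalEntry[0];
    for I := 0 to EntryCount - 1 do
    begin
      Entries[I].peRed := Byte(I);
      Entries[I].peGreen := Byte(I);
      Entries[I].peBlue := Byte(I);
      Entries[I].peFlags := 0;
    end;
    WriteLn('Allocated ', LogPaletteSize(EntryCount), ' bytes');
    // On Windows, the block could now be handed to CreatePalette().
  finally
    FreeMem(Pal);
  end;
end.
```

The pointer detour is exactly the sort of untidiness in question: it avoids the out-of-bounds warning, but only by giving up any bounds information at all.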
We can't really use a new keyword to define an unbounded array, so I'm
wondering if we can work with the syntax for open arrays. For example,
say we redefined LOGPALETTE to the following:
    LOGPALETTE = record
      palVersion : WORD;
      palNumEntries : WORD;
      palPalEntry : array of PALETTEENTRY;
    end;
Currently, this doesn't work because palPalEntry is then a dynamic
array: the field holds only a hidden pointer to separately allocated
data, with the length stored alongside that data (which is why
"@OpenArray" is not equal to "@OpenArray[0]"), so passing a pointer to
this record into CreatePalette() will cause apparent garbage to appear
in the first few
entries, but I'm wondering if it might be possible to rework the
compiler slightly so if an open array exists as part of a record type,
the pointer and length are instead stored before the record's fields in
memory. That is, something akin to:
    [Array pointer]   -$10
    [Array length]    -$08
    [palVersion]      +$00
    [palNumEntries]   +$02
    [palPalEntry]     +$04  (Array length * SizeOf(PALETTEENTRY))
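To illustrate why the "array of" version cannot be passed to the API as things stand, here is a small sketch of the current behaviour. TPaletteEntry and TDynLogPalette are local stand-in types of my own (not the Windows unit declarations); the point is only that the record stays small regardless of the array length and that the field's address differs from the address of its first element.

```pascal
program DynArrayFieldDemo;

{$mode objfpc}

type
  // Local stand-in for PALETTEENTRY (assumption: four Byte fields).
  TPaletteEntry = record
    peRed, peGreen, peBlue, peFlags: Byte;
  end;

  // Hypothetical redefinition using "array of" for the trailing data.
  TDynLogPalette = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array of TPaletteEntry;
  end;

var
  Pal: TDynLogPalette;
begin
  SetLength(Pal.palPalEntry, 256);
  Pal.palNumEntries := Length(Pal.palPalEntry);

  // The record itself only holds the header plus one pointer-sized
  // field, so it is far smaller than 256 entries.
  WriteLn('Record smaller than its entries: ',
    SizeOf(Pal) < 256 * SizeOf(TPaletteEntry));

  // The field's address is inside the record; the first element lives
  // in a separate heap block.
  WriteLn('Field address equals first element: ',
    Pointer(@Pal.palPalEntry) = Pointer(@Pal.palPalEntry[0]));
end.
```

So an API function reading from @Pal would find the pointer value where it expects the first palette entries, which is the garbage mentioned above.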
With such a setup, one can declare a variable of type LOGPALETTE rather
than a pointer, use SetLength to set the array length, fill in the
elements conventionally, then pass the pointer (or a var parameter) into
CreatePalette(). There are some nuances to work out, like does
@LogPaletteVar return the address of the first field or the hidden array
pointer. The latter is more consistent, but might lead to problems with
external API support since it won't point to the expected data. The
former approach fixes this, but this leads to the problem of how a
pointer of type PLogPalette is handled by GetMem and FreeMem.
Other things to consider are what happens if a record type has more than
one unbounded array, or a regular field is declared after it. Also,
what if a variable is defined as an array of the record... how is
that stored in memory? Truth be told, when I ask these questions, it
makes me wonder if we should use a new keyword, such as "open record" or
"unbounded record" or declaring the field as "open array of
PALETTEENTRY;" or "unbounded array of PALETTEENTRY;", as it will then be
easier to enforce the limitations and minimise the chance of the
compiler doing something unexpected. This leads to the problem of a new
keyword possibly causing existing code to no longer compile, although
since we're dealing with variable and type definitions, the compiler and
the code editor (for the sake of highlighting it) should be smart enough
to realise that the word "open", if it appears in a code block, is a
method and not a directive, and hopefully people won't declare a new
type named "Open" (truthfully, "Unbounded" sounds more likely as a type
name, possibly to represent an untyped pointer to a large block of
data). Additionally, due to how such a record is put together,
variables of such types become internal pointers rather than direct values.
One final point... if the record type is known to be unbounded, then a
potential syntax choice could be to define a variable as follows
"PaletteData: LOGPALETTE[256]". From a code generation point of view,
all the memory required for the variable can be reserved in the function
prologue in one go (possibly on the stack if it's not too large), saving
an expensive reallocation by not requiring a call to SetLength, plus it
gives the compiler a distinct upper index to work with, thus returning a
genuine error if you attempt to go out of bounds.
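For comparison, the closest thing available today (as far as I can tell) is to declare a second record type with the same header and a fixed-size array, then cast its address to PLogPalette. The sketch below is not the proposed syntax, just an approximation of its effect: TLogPalette256, FillGreyRamp and Demo are hypothetical names, and the other types are local stand-ins assumed to match the Windows unit layout.

```pascal
program FixedOverlayDemo;

{$mode objfpc}

type
  // Local stand-ins mirroring the Windows unit types (assumption).
  TPaletteEntry = record
    peRed, peGreen, peBlue, peFlags: Byte;
  end;

  TLogPalette = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array[0..0] of TPaletteEntry;
  end;
  PLogPalette = ^TLogPalette;

  // Hypothetical helper type: same header, but room for 256 entries.
  TLogPalette256 = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array[0..255] of TPaletteEntry;
  end;

// Fills the palette with a grey ramp; every index here is within the
// declared bounds, so range checking stays meaningful.
procedure FillGreyRamp(out Pal: TLogPalette256);
var
  I: Integer;
begin
  Pal.palVersion := $0300;
  Pal.palNumEntries := Length(Pal.palPalEntry);
  for I := Low(Pal.palPalEntry) to High(Pal.palPalEntry) do
  begin
    Pal.palPalEntry[I].peRed := Byte(I);
    Pal.palPalEntry[I].peGreen := Byte(I);
    Pal.palPalEntry[I].peBlue := Byte(I);
    Pal.palPalEntry[I].peFlags := 0;
  end;
end;

procedure Demo;
var
  Pal256: TLogPalette256; // local variable: no GetMem or SetLength call
  P: PLogPalette;
begin
  FillGreyRamp(Pal256);
  // View the fixed-size record through the API's pointer type.
  P := PLogPalette(@Pal256);
  WriteLn(P^.palNumEntries, ' entries, ', SizeOf(Pal256), ' bytes');
end;

begin
  Demo;
end.
```

The drawback is that a separate type has to be declared for every size, which is what a "LOGPALETTE[256]" style declaration would avoid.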
It's a lot to think about for sure, and is probably overengineering a
solution to a very niche problem (I personally can't think of where one
would use such constructs instead of something with a fixed-size array
or just having the header and open array stored separately), but is it
worth discussing?
Gareth aka. Kit