[fpc-devel] Record types with unbounded trailing data

J. Gareth Moreton gareth at moreton-family.com
Mon Jun 24 05:57:35 CEST 2019


Hi everyone,

So after a problem over at https://bugs.freepascal.org/view.php?id=35753 
that led to a change of https://bugs.freepascal.org/view.php?id=35671 , 
it became apparent that there may need to be better support for 
unbounded arrays in record types - not open arrays, but unbounded ones.

One of the main culprits is the LOGPALETTE structure that is defined in 
the Windows unit:

LOGPALETTE = record
   palVersion : WORD;
   palNumEntries : WORD;
   palPalEntry : array[0..0] of PALETTEENTRY;
end;
PLogPalette = ^LOGPALETTE;

(PALETTEENTRY is a relatively simple record type with four Byte fields)

This record type, or rather, a pointer to it, is used for passing in new 
palette data to the CreatePalette() function.  The number of entries 
that the API function reads is equal to the palNumEntries field.  
However, writing these entries beforehand is difficult to do without 
raising compiler warnings because the array index will be out of 
bounds.  Because the Windows API uses such constructs in places, it is 
necessary to support them to retain full functionality.  Normally, you 
have to call GetMem to allocate a block for a pointer of type 
PLogPalette, of size "SizeOf(LOGPALETTE) + (EntryCount - 1) * 
SizeOf(PALETTEENTRY)" - this itself is untidy because SizeOf(LOGPALETTE) 
already includes one PALETTEENTRY, the array being declared with a 
single element (granted, forgetting the "- 1" is usually not critical, 
as it merely over-allocates by one entry).
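
As a rough illustration of the GetMem approach just described, here is a 
minimal sketch.  The types are local stand-ins that mirror the Windows 
unit declarations so that it compiles on any platform; the names 
TPaletteEntry, TLogPalette, LogPaletteSize and EntryCount are assumptions 
made for this example, and $300 is the version value conventionally used 
for LOGPALETTE.

```pascal
program TrailingArrayDemo;
{$mode objfpc}
{$R-} // range checking off: indexing past [0..0] is intentional here

type
  // Local stand-ins for the Windows unit types (assumed layout).
  TPaletteEntry = packed record
    peRed, peGreen, peBlue, peFlags: Byte;
  end;
  TLogPalette = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array[0..0] of TPaletteEntry;
  end;
  PLogPalette = ^TLogPalette;

// Bytes needed for the header plus Count entries (Count >= 1).
// SizeOf(TLogPalette) already includes one entry, hence the "- 1".
function LogPaletteSize(Count: SizeInt): SizeInt;
begin
  Result := SizeOf(TLogPalette) + (Count - 1) * SizeOf(TPaletteEntry);
end;

const
  EntryCount = 256;
var
  Pal: PLogPalette;
  I: Integer;
begin
  GetMem(Pal, LogPaletteSize(EntryCount));
  try
    Pal^.palVersion := $300;
    Pal^.palNumEntries := EntryCount;
    for I := 0 to EntryCount - 1 do
    begin
      // With a variable index and range checking off, the access past
      // element 0 is not checked; the memory was reserved by GetMem above.
      Pal^.palPalEntry[I].peRed := Byte(I);
      Pal^.palPalEntry[I].peGreen := Byte(I);
      Pal^.palPalEntry[I].peBlue := Byte(I);
      Pal^.palPalEntry[I].peFlags := 0;
    end;
    // On Windows, Pal^ would now be handed to CreatePalette().
    WriteLn('Allocated ', LogPaletteSize(EntryCount), ' bytes for ',
      Pal^.palNumEntries, ' entries');
  finally
    FreeMem(Pal);
  end;
end.
```

The size helper keeps the "- 1" in one place, which is about as tidy as 
this pattern gets today.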

We can't really use a new keyword to define an unbounded array, so I'm 
wondering if we can work with the syntax for open arrays.  For example, 
say we redefined LOGPALETTE to the following:

LOGPALETTE = record
   palVersion : WORD;
   palNumEntries : WORD;
   palPalEntry : array of PALETTEENTRY;
end;

Currently, this doesn't work because a field declared this way is a 
dynamic array: palPalEntry holds only a hidden pointer to separately 
allocated data, with the length stored alongside that data (which is why 
"@DynArray" is not equal to "@DynArray[0]").  Passing a pointer to this 
record into CreatePalette() will therefore cause apparent garbage to 
appear in the first few entries.  I'm wondering if it might be possible 
to rework the compiler slightly so that, if such an array exists as part 
of a record type, the pointer and length are instead stored before the 
record's fields in memory, with the elements following the fields.  That 
is, something akin to:

[Array pointer]   -$10
[Array length]    -$08
[palVersion]      +$00
[palNumEntries]   +$02
[palPalEntry]     +$04 (Array length * SizeOf(PALETTEENTRY))
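
For comparison with that proposed layout, the behaviour of today's 
compiler can be observed with a small test.  This is only a sketch: 
TDynLogPalette and EntriesAreInline are hypothetical names introduced 
here, and TPaletteEntry is a local stand-in for PALETTEENTRY.

```pascal
program DynArrayFieldDemo;
{$mode objfpc}

type
  // Local stand-in for the Windows unit's PALETTEENTRY (assumed layout).
  TPaletteEntry = packed record
    peRed, peGreen, peBlue, peFlags: Byte;
  end;
  // The redefinition from the text, as the compiler treats it today.
  TDynLogPalette = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array of TPaletteEntry; // dynamic array: a managed pointer
  end;

// True only if the first element lives at the field's own address, which
// is what an API such as CreatePalette() would need.
function EntriesAreInline(const P: TDynLogPalette): Boolean;
begin
  Result := Pointer(@P.palPalEntry) = Pointer(@P.palPalEntry[0]);
end;

var
  Pal: TDynLogPalette;
begin
  SetLength(Pal.palPalEntry, 256);
  Pal.palNumEntries := Length(Pal.palPalEntry);
  // The record stays small however many entries are allocated, because
  // the elements are stored in a separate heap block.
  WriteLn('SizeOf(record)  = ', SizeOf(Pal));
  WriteLn('entries inline? = ', EntriesAreInline(Pal));
end.
```

The record's size does not grow with the array, which is exactly why the 
block cannot be passed to the API as it stands.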

With such a setup, one can declare a variable of type LOGPALETTE rather 
than a pointer, use SetLength to set the array length, fill in the 
elements conventionally, then pass the pointer (or a var parameter) into 
CreatePalette().  There are some nuances to work out, such as whether 
@LogPaletteVar returns the address of the first field or of the hidden 
array pointer.  The latter is more consistent, but might lead to problems 
with external API support since it won't point to the expected data.  The 
former approach fixes this, but raises the question of how a pointer of 
type PLogPalette is handled by GetMem and FreeMem.

Other things to consider are what happens if a record type has more than 
one unbounded array, or a regular field is declared after it.  Also, 
what if a variable is defined as an array of the record... how is 
that stored in memory?  Truth be told, when I ask these questions, it 
makes me wonder if we should use a new keyword after all, such as "open 
record" or "unbounded record", or declaring the field as "open array of 
PALETTEENTRY;" or "unbounded array of PALETTEENTRY;", as it will then be 
easier to enforce the limitations and minimise the chance of the 
compiler doing something unexpected.

This leads to the problem of a new keyword possibly causing existing code 
to no longer compile.  However, since we're dealing with variable and 
type definitions, the compiler and the code editor (for the sake of 
highlighting it) should be smart enough to realise that the word "open", 
if it appears in a code block, is a method and not a directive, and 
hopefully people won't declare a new type named "Open" (truthfully, 
"Unbounded" sounds more likely as a type name, possibly to represent an 
untyped pointer to a large block of data).  Additionally, due to how such 
a record is put together, variables of such types become internal 
pointers rather than direct values.

One final point... if the record type is known to be unbounded, then a 
potential syntax choice could be to define a variable as follows 
"PaletteData: LOGPALETTE[256]".  From a code generation point of view, 
all the memory required for the variable can be reserved in the function 
prologue in one go (possibly on the stack if it's not too large), saving 
an expensive reallocation by not requiring a call to SetLength, plus it 
gives the compiler a distinct upper index to work with, thus returning a 
genuine error if you attempt to go out of bounds.
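
Something close to this can already be approximated by hand, which may 
help to picture the idea.  The sketch below is not the proposed syntax: 
TLogPalette256, FillGreyRamp and Demo are hypothetical names, and the 
types are local stand-ins for the Windows unit declarations.  It declares 
a record with the same header and a fixed 256-entry array, then views it 
through PLogPalette.

```pascal
program FixedPaletteDemo;
{$mode objfpc}
{$R+} // the bounds are known, so range checking can stay on

type
  // Local stand-ins for the Windows unit types (assumed layout).
  TPaletteEntry = packed record
    peRed, peGreen, peBlue, peFlags: Byte;
  end;
  TLogPalette = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array[0..0] of TPaletteEntry;
  end;
  PLogPalette = ^TLogPalette;

  // Hypothetical helper type: same header, room for 256 entries.
  TLogPalette256 = record
    palVersion: Word;
    palNumEntries: Word;
    palPalEntry: array[0..255] of TPaletteEntry;
  end;

// Fills a grey ramp; no GetMem or SetLength is involved.
procedure FillGreyRamp(out Pal: TLogPalette256);
var
  I: Integer;
begin
  Pal.palVersion := $300;
  Pal.palNumEntries := Length(Pal.palPalEntry);
  for I := Low(Pal.palPalEntry) to High(Pal.palPalEntry) do
  begin
    Pal.palPalEntry[I].peRed := I;
    Pal.palPalEntry[I].peGreen := I;
    Pal.palPalEntry[I].peBlue := I;
    Pal.palPalEntry[I].peFlags := 0;
  end;
end;

procedure Demo;
var
  Pal: TLogPalette256; // local variable, so reserved on the stack
  P: PLogPalette;
begin
  FillGreyRamp(Pal);
  P := PLogPalette(@Pal); // view it as the type the API expects
  WriteLn(P^.palNumEntries, ' entries in ', SizeOf(Pal), ' bytes');
end;

begin
  Demo;
end.
```

The obvious drawback is that a separate type is needed for every size, 
which is what the "LOGPALETTE[256]" notation would avoid.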

It's a lot to think about for sure, and is probably overengineering a 
solution to a very niche problem (I personally can't think of where one 
would use such constructs instead of something with a fixed-size array 
or just having the header and open array stored separately), but is it 
worth discussing?

Gareth aka. Kit

