[fpc-devel] RFC: Support for new type "tuple" v0.1

Sun Jan 27 02:35:51 CET 2013

Sven Barth wrote:

> * Description
> 
> What are tuples? Tuples are an accumulation of values of different or 
> same type where the order matters. Sounds familiar? They are in this 
> regard similar to records, but it's only the order of an element that 
> matters, not its name. So what does make them special? Unlike records 
> you can only query or set all the elements of a tuple at once. They 
> basically behave like multiple assignments. In effect they allow you to 
> return e.g. a multivalued result value without resorting to the naming 
> of record fields (you'll still need to declare a tuple type) or the need 
> for out parameters. This in turn allows you to use them for example in 
> "for-in" loops.

The lack of element names results in bloated code and runtime overhead. 
See below.

> * Declaration:
[...]

> The usage of constructors and destructors also allows a realisation of 
> group assignment:
> 
> === code begin ===
> 
> var
>   a, b, e: Integer;
>   c, d: String;
> begin
>   a := 42;
>   c := 'Hello World';
>   (b, d) := (a, c);
>   a := 21;
>   b := 84;

>   (a, b) := (b, a); // the compiler needs to ensure the correct usage of temps here!

What will happen here?

At compile time a tuple type (integer; integer) has to be defined, and 
an instance must be allocated for it. Initialization and finalization 
information/code must be added if required.

At runtime the arguments are copied into that tuple instance, then 
copied into the target variables. All "copies" may be subject to type 
conversions and reference counting.

Consider memory usage and runtime when tuples are nested, or contain 
large data structures (records, static arrays...).
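
As a rough sketch of the steps described above, here is the swap expanded 
by hand with today's records. TIntPair and tmp are hypothetical names that 
stand in for the hidden tuple type and its instance; a real compiler might 
well optimize some of these copies away.

```pascal
program TupleSwapSketch;
{$mode objfpc}

type
  // Hypothetical stand-in for the anonymous tuple type (Integer; Integer)
  TIntPair = record
    f1, f2: Integer;
  end;

var
  a, b: Integer;
  tmp: TIntPair; // hidden temp instance that would have to be allocated

begin
  a := 21;
  b := 84;
  // (a, b) := (b, a) expanded by hand:
  tmp.f1 := b;   // first copy: arguments into the tuple instance
  tmp.f2 := a;
  a := tmp.f1;   // second copy: tuple instance into the target variables
  b := tmp.f2;
  WriteLn(a, ' ', b); // prints: 84 21
end.
```

With managed or large element types, each of these four assignments would 
carry its own reference counting or block copy.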

>   a := 42;
>   (a, e) := (a * 2, a); // (a, e) should be (84, 42), not (84, 84)

Such code tends to become cryptic with larger tuples.
High level (source code) debugging will be impossible :-(

[...]
> * Possible extensions
> 
> Note: This section is not completely thought through!
> 
> A possible extension would be to allow the assignment of tuples to 
> records and/or arrays (and vice versa). [...]

Without references to individual tuple elements, the coder has to provide 
local variables for *all* tuple elements and decompose the *entire* 
tuple before a single element can be accessed. This may take less 
source code when a tuple can be assigned to a record variable, but then 
it would be simpler to use records *instead* of tuples.

When a record type is modified during development, all *compatible* 
tuples and tuple types must be updated accordingly.

> * Possible uses
> 
> - use for group assignments which can make the code more readable
... or unreadable (see above).

> - use for multivalued return values which can make the code more 
> readable (instead of using records or out parameters)

This IMO makes sense only when such tuples are passed along many times 
before their elements are referenced. Otherwise the full tuple must be 
decomposed even when e.g. only a success/failure indicator in the 
result tuple has to be examined.
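
For comparison, a minimal sketch of the same situation with a record 
result, where the indicator alone can be examined. TParseResult and 
ParseNumber are hypothetical names made up for this example.

```pascal
program RecordResultSketch;
{$mode objfpc}

type
  // Hypothetical result type; with a record, each element has a name
  TParseResult = record
    Ok: Boolean;
    Value: Integer;
  end;

// Hypothetical helper: converts a string to an Integer and reports success
function ParseNumber(const s: String): TParseResult;
var
  code: Integer;
begin
  Val(s, Result.Value, code); // code = 0 means the conversion succeeded
  Result.Ok := (code = 0);
end;

begin
  // Only the success indicator is examined; no other element is unpacked.
  // With the proposed tuples this would need two locals and something like
  //   (ok, value) := ParseNumber('42');
  // before ok could be tested.
  if ParseNumber('42').Ok then
    WriteLn('ok')
  else
    WriteLn('failed');
end.
```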

> - use as result value for iterators (this way e.g. key and data of 
> containers can be queried)

This reminds me of SQL "SELECT [fieldlist]", where *specified* record 
fields are copied. But I wonder how confusion about the order of the 
tuple elements can be avoided. Will (k,v) or (v,k) be the right order 
for key and value? What are the proper types of key and value?

> * Implementation notes
> 
> Tuples need to pay attention to managed types (strings, interfaces, 
> etc.). Thus an Init RTTI will be required (which needs to be handled by 
> fpc_initalize/fpc_finalize accordingly).
> It might be worthwhile to add a new node type for tuple 
> constructors/deconstructors (one node type should be sufficient) and 
> handle them in assignment nodes accordingly.

I'd reuse the record type node for that purpose.

> * Open issues
> 
> Should anonymous tuples (together with tuple constructors) be allowed to 
> participate in operator search as well? This would on the one hand allow 
> the following code, but on the other hand make operator lookup rules 
> less clear (because of assignment compatibility rules):
> 
> === code begin ===
> 
> type
>   TDoubleVector = tuple of (Double, Double, Double, Double);
> 
> operator + (aLeft, aRight: TDoubleVector): TDoubleVector;
> // implement by e.g. using SSE instructions
> 
> // somewhere else
> begin
>   (d1, d2, d3, d4) := (d1, d2, d3, d4) + (1.0, 2.0, 3.0, 4.0);
> end;
> 
> === code end ===

SSE should be used with array types, where all elements *definitely* 
have the same type. Then one "+" operator can be implemented for open 
arrays of any size, which looks quite impossible for tuples.
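
A minimal sketch of the array-based alternative, assuming a plain function 
instead of an overloaded "+" and a scalar loop instead of SSE 
instructions. AddVectors and TDoubleArray are hypothetical names; the 
point is only that one routine covers every length.

```pascal
program OpenArrayAddSketch;
{$mode objfpc}

type
  TDoubleArray = array of Double;

// Hypothetical helper: element-wise sum of two open arrays of any length.
// If the lengths differ, only the common prefix is added.
function AddVectors(const aLeft, aRight: array of Double): TDoubleArray;
var
  i, n: Integer;
begin
  Result := nil;
  n := Length(aLeft);
  if Length(aRight) < n then
    n := Length(aRight);
  SetLength(Result, n);
  for i := 0 to n - 1 do
    Result[i] := aLeft[i] + aRight[i]; // a SIMD version would replace this loop
end;

var
  r: TDoubleArray;
begin
  r := AddVectors([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]);
  WriteLn(Length(r)); // prints: 4
end.
```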

Conclusion:

IMO tuples are *abstract* templates (mathematical notation) for 
*concrete* (record...) implementations. I see no need or purpose in 
introducing such an abstract type into any concrete language, unless 
that language lacks a record (or equivalent) type.

Nonetheless the discussion revealed some possible improvements of record 
handling, like default constructors/initializers for records, outside 
"const" clauses.

DoDi