[fpc-devel] RFC: Support for new type "tuple" v0.1

Hans-Peter Diettrich DrDiettrich1 at aol.com
Sun Jan 27 18:46:32 CET 2013


Preface: In the following I assume that tuples can be implemented by 
records. The proposed syntax extensions can be applied to records as 
well, they are not restricted to a new type.

Alexander Klenin schrieb:
> On Sun, Jan 27, 2013 at 12:35 PM, Hans-Peter Diettrich
> <DrDiettrich1 at aol.com> wrote:
>>>   (a, b) := (b, a); // the compiler needs to ensure the correct usage of
>>> temps here!
>>
>> What will happen here?
>>
>> At compile time a tuple type (integer; integer) has to be defined, and an
>> instance must be allocated for it. Initialization and finalization
>> information/code must be added if required.
>>
>> At runtime the arguments are copied into that tuple instance, then copied
>> into the target variables. All "copies" may be subject to type conversions
>> and reference counting.
>>
>> Consider memory usage and runtime when tuples are nested, or contain large
>> data structures (records, static arrays...).
> 
> Any time large data structures are used as values (as opposed to references),
> the incur memory and run-time overhead -- parameter passing, assignment, etc.
> Tuples are neither better not worse in this aspect.

My point is the artificial overhead, introduced in above example. The 
code may *look* elegant, but in fact is a resource hog.


> Also note that in my version of proposal, nested tuples are impossible.

I consider this an inacceptable restriction. Records can contain other 
records.

> 
>>>   a := 42;
>>>   (a, e) := (a * 2, a); // (a, e) should be (84, 42), not (84, 84)
>> Such code tends to become cryptic with larger tuples.
>> High level (source code) debugging will be impossible :-(
> Why do you think so? I think adding tuples as watchable item to debugger
> is rather straight forward, although a bit redundand.

Watching tuples (records) is not enough for debugging such complex 
assignments.

> Actually, even totally independently from tuples,
> allowing to enter several comma-separated values in evaluate/modify window
> to be displayed at once would be a useful feature.

Indeed it would be nice to have more specialized watches for e.g. 
records or classes. When e.g. a class (TWinControl...) contains records 
or lists, these elements may hide following and more interesting members 
of derived classes in the output.

>>> * Possible extensions
>>>
>>> Note: This section is not completely thought through!
>>>
>>> An possible extension would be to allow the assignment of tuples to
>>> records and/or arrays (and vice versa). [...]
>> Without references to distinct tuple elements the coder has to provide local
>> variables for *all* tuple elements, then decompose the *entire* tuple,
>> before access to a single element will be possible. This may be accomplished
>> with less source code when a tuple can be assigned to a record variable, but
>> then it would be simpler to use records *instead* of tuples.
> This is why I think tuples as a data type are not needed.
> Instead, I suggest them as a powerful way to manipulate
> records, arrays, and parameter lists using one relatively simple concept.

I doubt that a concept alone will help. I see the proposed extensions as 
an attempt to introduce elements from *functional programming* 
languages, for which OPL lacks many prerequisites.

> Also, note "nil as lvalue" paragraph of my proposal, which allows to
> deconstruct only part of tuple,
> ignoring the rest.

Just another syntax for accessing record members.


> BTW, I just remembered another tuple usage example:
> type
>   T3DMatrix = array of array of array of Double;
>   TMatrixPos = array [1..3] of Double;
> 
> procedure P(A: T3DMatrix; APos1: TMatrixPos);
> begin
>   A[Tuple(APos1)] := 1.0; // Instead of A[APos[1], APos[2], APos[3]]
> end;

I'd accept a TMatrixPos as an array of Integer, but not of Double. What 
result do you expect from an index array of (-5.2, 3.7, 9.5)? Pascal 
requires explicit rounding (round, trunc, floor...) of real values to 
integers, for good reason. Tuple deconstructors can use only one of 
these functions for *every* tuple, so that the choice may be good for 
one use, but inappropriate in other situations, regardless of the 
choosen method.

Why do you expect any difference between
    A[Tuple(APos1)] := 1.0;
and
    A[APos1] := 1.0;
?

The compiler will have to create the same code for both notations, or 
use the same RTL subroutine for doing the job.


>> When a record type is modified, during development, all *compatible* tuples
>> and tuple types must be updated accordingly.
> True.

This was my point.

> Similarly, when a procedure is modified, all call sites must be updated --
> note that there is deliberate analogue between tuples and parameter
> lists in my proposal.

That's something very different, IMO. I frequently miss the lack of 
subroutine argument "templates", when e.g. implementing event handlers. 
Currently it's impossible to declare a method like
   procedure OnClick(TNotifyEvent);
or
   handler OnClick: TNotifyEvent;
with the compiler creating the right method signature (of TNotifyEvent), 
or of some much more complex event handler type.

But subroutine signatures are *declarative* elements with their own 
syntax, different from record or tuple declarations. Would you really 
extend the tuple declaration syntax, so that it could be used like e.g.
   type MyTuple: (var Integer; array Of Char; const String default '');


>>> - use for multivalues return values which can make the code more readable
>>> (instead of using records or out parameters)
>> This IMO makes sense only when such tuples are passed along many times,
>> before references to their elements occur. Otherwise full tuple
>> decomposition is required when e.g. only a succ/fail indicator in the result
>> tuple has to be examined.
> 
> Not necessarily. There are, as you know, two basic styles of error handling:
> exceptions and error codes. Most of Pascal code is based on exceptions,
> but there are cases, for example embedded programming, server/service
> programming,
> where using error codes is advantageous.
> In that case, using (result, error) tuple as a standard return value
> is a good convention.
> There are languages (Go is a most modern example), where this convention is used
> as a primary method across all standard library.

SEH has one big advantage: nobody can forget to test an error value. 
That's why it is used in Java, with further support (throws...).

One problem with error codes is their data type. Booleans are not always 
sufficient, integers have no obvious meaning. That's why I prefer to use 
enumerated types with descriptive names. Which types would your 
*standard* return tuple contain?

Again your proposal is syntactic sugar, possibly causing overhead during 
construction and deconstruction of the tuple. Since the memory for a 
complex (record) result has to be allocated by the caller, the 
subroutine copies the return values into that memory, from which they 
have to be retrieved by the caller. Of course this could be optimized, 
when the compiler is extended to move the result tuple into distinct 
hidden subroutine arguments (types omitted):
from
   function P(some; args): (result, error);
into
   procedure P(some; args; out result; out error);
what could be achieved as well by the following shorter syntax:
   procedure P(some; args; out(result; error));
with out(...) meaning that "out" applies to all parameters in the list.


>>> - use as result value for iterators (this way e.g. key and data of
>>> containers can be queried)
>> This reminds me of SQL "SELECT [fieldlist]", where *specified* record fields are copied.
> That is correct analogy -- in some variants of relational theory,
> term "tuple" is used to mean "database record".
> 
>> But I wonder how confusion can be eliminated in the order of the
>> tuple elements. Will (k,v) or (v,k) be the right order for key and value?
>> What are the proper types of key and value?
> If you want to find a substring in a string, do you write
> Pos(str, substr) or Pos(substr, str) ?
> To find out, use function signature (in the case of tuples, return
> value type fields).

The compiler cannot find out when two or more (tuple) members of the 
same type are in wrong order.

> If all else fails, there is (maybe) documentation :)

I often have to consult the documentation in such cases :-(


> Note that in my version of tuples proposal tuples are decoupled from
> "records without field names".

I'm not sure what you exactly mean. In my proposal, that tuples are 
nothing but records, it's immediately clear whether and how every syntax 
extension can be implemented. Any "decoupling" risks to run into dead 
ends, because the implementation efforts are unpredictable.


>> SSE should be used with array types, where all elements *definitely* have
>> the same type. Then one "+" operator can be implemented for open arrays of
>> any size, what looks quite impossible for tuples.
> 
> Speaking specifically of SSE, array size is fixed by the processor architecture,
> so there is no need for arbitrary sizes.
> Nevertheless, I agree that in this example, tuples are useful mostly
> as an array constructor/
> deconstructor, and the operation itself should be defined on array type.

I never used SSE (etc.), perhaps I confused it with the Cray vector 
processing.


>> IMO tuples are *abstract* templates (mathematical notation) for *concrete*
>> (record...) implementations. I see no need or purpose in the introduction of
>> such an abstract type into any concrete language, except when that languages
>> lacks an already existing record (or equivalent) type.
> I agree with the premise, but not with the conclusion. As already
> stated, I propose tuples
> as a tool/syntax for manipulation of records, not as a separate type.

So we agree basically?

It would help a lot in understanding your ideas, when you mention the 
record-based implementation for every proposed syntax extension.

DoDi




More information about the fpc-devel mailing list