Tuples are a simple mathematical concept which turns out to be useful in programming languages as well. For instance, if a function receives multiple parameters, this is equivalent to receiving a tuple of values as a single parameter. Similarly, a function could easily return multiple values by returning a tuple of values, without needing to define a particular record or class type for that purpose or resorting to VAR, out, or reference parameters.
Combined with a template or generics mechanism,
the approach to interpret multiple parameters as a single tuple parameter
opens the way for type-safe variable parameter lists.
In the example below,
an overloaded function print
is defined
for several different parameter types,
such as char
, int
, double
, etc.
Furthermore,
a templated version of print
is defined
which accepts any type of parameter T
including tuple types:
// Simple definitions. void print (char c) { cout << c; } void print (int i) { cout << i; } void print (double d) { cout << d; } // Template definition. template <typename T> void print (T values) { print(^values); print($values); }Calls to
print
with a single argument,
such as print(1)
or print(2.0)
,
directly invoke the corresponding simple definition
whose parameter type is matched exactly.
A call with multiple arguments,
such as print('x', 1, 2.0)
,
is interpreted as a call with a single tuple argument ('x', 1, 2.0)
that invokes the template definition of the function
with the parameter values
representing this tuple value.
^
and $
are assumed to extract the first member of the given tuple (as a simple value)
and the remaining members (as a tuple if necessary), respectively.
Therefore,
^values
returns the character value 'x'
,
while $values
returns the tuple (1, 2.0)
.
Calling print
with the former
invokes the simple definition with parameter type character
,
while calling it with the latter
“recursively” invokes the template definition
with parameter values
equal to the shortened tuple (1, 2.0)
.
^values
and $values
correspond to the simple values 1
and 2.0
,
respectively,
i. e., the calls to print
will invoke the simple definitions with parameter type int
and double
,
respectively.
A sequence is a kind of tuple
whose elements are all of the same type,
similar to an array.
In contrast to arrays (and other standard container types), however,
sequences are immutable,
similar to strings in Java and other languages.
There are operators to determine the length of a sequence s
(#s
),
to retrieve a single element (s[i]
),
to select arbitrary subsequences (s(i, j)
),
and to concatenate two sequences (s # t
).
By using these primitive operations,
it is easily possible to perform more complex operations,
e. g.:
s = s # x // Append x to s. s = x # s // Prepend x to s. s = s(2, #s) // Remove first element of s. s = s(1, #s - 1) // Remove last element of s. s = s(1, i-1) # s(i+1, #s) // Remove i-th element of s.In the examples above, the expression on the right hand side of the assignment operator constructs a new sequence by selecting subsequences of the existing sequence
s
(which is immutable)
and/or concatenating sequences to form larger sequences.
Afterwards,
the resulting sequence is assigned to the variable s
that contained the original sequence.
The fact that sequences are immutable
has at least two important conceptual advantages
over mutable containers:
First,
an iteration through a sequence
can never be disturbed by update operations.
Second,
if some type T
is a subtype of some other type S
(e. g., Student
is a subtype of Person
),
i. e., objects of type T
can polymorphically be used as objects of type S
,
sequences of T
s can safely be used as sequences of S
s, too
(e. g., a sequence of Student
s can be used
where a sequence of Person
s is required).
This is similar to the subtyping relationship
between the array types T[]
and S[]
in Java,
but without the possibility of ever receiving an ArrayStoreException
,
since assignments to individual sequence elements are generally forbidden.
Furthermore,
immutable sequences offer several possibilities
for an efficient implementation:
To construct a subsequence of a given sequence, especially a large one, it is not necessary to copy all selected elements; instead, a new sequence can be constructed which simply references the selected part of the original one. Therefore, subsequences of any length can basically be constructed in constant time.
Similarly, the concatenation of two sequences can be constructed as a new sequence which simply references the two original ones, resulting again in a constant time effort.
Even when sequences are copied or passed as function arguments by value, it suffices to copy a reference to the original sequence.
Strings are nothing else but sequences of characters. In fact, sequences are a generalization of “Ropes: An Alternative to Strings,” proposed by H.-J. Boehm et al. in Software–Practice and Experience 25 (12) December 1995.
It is possible to define sequences
whose elements are not specified explicitly,
but are computed on demand by invoking a user-defined function
with the element's index as a parameter.
Such virtual sequences are useful in various ways.
For example,
an operating system file
might be represented as a virtual sequence of bytes or characters,
whose computation function reads and returns the character
at a specified index position.
In contrast to reading the entire file at once
and constructing a normal sequence containing all its characters,
this approach can result in significant space and execution time savings.
In an application such as a text editor,
such a virtual sequence can be used in the same way as a normal sequence,
e. g., by selecting subsequences,
concatenating them with normal or other virtual sequences, etc.
Sequences are implemented as a tree-like structure whose leaf nodes contain small to medium-sized arrays of elements. Not too large subsequences and concatenations are constructed by copying their elements, while larger result sequences are represented by tree nodes which simply reference (parts of) the trees representing the original sequences. To make sure that accessing individual elements still remains reasonably fast, care must be exercised to produce trees of limited depth whose subtrees are similar in size. In contrast to Boehm et al. (cited above), who use binary trees, a data structure more akin to B* trees, combined with a clever indexing strategy, is used in order to speed up random access to individual elements.