User-Defined Operator Symbols and Control Structures

Concept

Programming languages such as Ada and C++ support operator overloading, i.e., the ability to redefine the meaning of built-in operators for user-defined types. Other languages, e.g., Smalltalk, Prolog, and modern functional languages such as ML and Haskell, also allow the programmer to introduce new operator symbols in order to express application-specific operations more directly and naturally than with overloaded built-in operators or with methods or functions.
The introduction of new operator symbols (especially infix operators) immediately raises the question of their binding properties, i.e., their precedence with respect to built-in and other user-defined operators, and their associativity. In the above languages, the programmer introducing a new operator symbol is forced to assign it a fixed precedence level on a predefined absolute scale (e.g., an integer between 0 and 9). This approach is both inflexible (for example, it is impossible to define a new operator that binds stronger than + and - but weaker than * and /, if there is no gap between these operator classes in the predefined precedence scale) and overly prescriptive (because the programmer is always forced to establish precedence relationships between all operators, even though some of them might be completely unrelated and never appear together in a single expression).
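
To make the contrast with an absolute scale concrete, the following standard C++ sketch models precedence as a set of pairwise "binds stronger than" facts. It is only an illustration under stated assumptions: the class `PrecedenceGraph`, the function `example_graph`, the class names `additive` and `multiplicative`, and the operator `%%` are all hypothetical and do not come from any existing implementation.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model: precedence stored as "a binds stronger than b" facts
// instead of one number per operator.
class PrecedenceGraph {
public:
    // Record that `strong` binds stronger than `weak`.
    void add_stronger(const std::string& strong, const std::string& weak) {
        weaker_than_[strong].insert(weak);
    }

    // True if `a` binds stronger than `b`, directly or transitively.
    bool stronger(const std::string& a, const std::string& b) const {
        std::set<std::string> visited;
        return reaches(a, b, visited);
    }

    // True if no precedence relationship between `a` and `b` is known.
    bool unrelated(const std::string& a, const std::string& b) const {
        return a != b && !stronger(a, b) && !stronger(b, a);
    }

private:
    // Depth-first search along the recorded "stronger than" edges.
    bool reaches(const std::string& from, const std::string& to,
                 std::set<std::string>& visited) const {
        if (!visited.insert(from).second) return false;  // already explored
        auto it = weaker_than_.find(from);
        if (it == weaker_than_.end()) return false;
        for (const std::string& next : it->second) {
            if (next == to || reaches(next, to, visited)) return true;
        }
        return false;
    }

    std::map<std::string, std::set<std::string>> weaker_than_;
};

// Two new operators placed between the additive class (+, -) and the
// multiplicative class (*, /). No gap in a numeric scale is required.
PrecedenceGraph example_graph() {
    PrecedenceGraph g;
    g.add_stronger("+/", "additive");
    g.add_stronger("multiplicative", "+/");
    g.add_stronger("%%", "additive");          // %% is a made-up operator
    g.add_stronger("multiplicative", "%%");
    return g;
}
```

In this model, `multiplicative` is stronger than `additive` by transitivity, while `+/` and `%%` remain unrelated: nothing forces the programmer to rank them against each other.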

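The approach adopted for advanced procedural programming languages improves on these schemes. As the examples below show, the precedence of a new operator is declared relative to existing operators (stronger, weaker, or equal) rather than on an absolute scale, and operators may additionally be flexary, take lazily evaluated operands, or be combined into new control structures.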

Examples

Unary Operators

The following declarations introduce new unary operators $ and # and define their meaning when applied prefix and/or postfix to any container type C:

    new operator $ unary;
    new operator # unary;
 
    // Return first element of container c (prefix application).
    template <typename C>
    typename C::value_type operator $ (const C& c) {
        return c.front();
    }
 
    // Return last element of container c (postfix application
    // indicated as in C++ by an additional dummy argument of type int).
    template <typename C>
    typename C::value_type operator $ (const C& c, int postfix) {
        return c.back();
    }
 
    // Return number of elements of container c (prefix or postfix
    // application since dummy argument is optional).
    template <typename C>
    typename C::size_type operator # (const C& c, int postfix = 0) {
        return c.size();
    }
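
Since `$` and `#` are not operator symbols in standard C++, the declarations above cannot be compiled by an ordinary C++ compiler. As a rough sketch of what the three operator functions compute, they can be written as ordinary function templates. The names `first_of`, `last_of`, and `size_of` are hypothetical and chosen only for this illustration.

```cpp
#include <vector>

// Counterpart of prefix $: first element of container c.
template <typename C>
typename C::value_type first_of(const C& c) {
    return c.front();
}

// Counterpart of postfix $: last element of container c.
template <typename C>
typename C::value_type last_of(const C& c) {
    return c.back();
}

// Counterpart of prefix or postfix #: number of elements of container c.
template <typename C>
typename C::size_type size_of(const C& c) {
    return c.size();
}

// Usage: sums what would presumably be written $v + v$ with the new operators.
int first_plus_last(const std::vector<int>& v) {
    return first_of(v) + last_of(v);
}
```

As with `front()` and `back()` themselves, the first two functions must not be called on an empty container.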

Binary Operators

The following declarations introduce a new right-associative operator ^^ that binds stronger than the built-in operator * and define its meaning when applied infix or prefix to operands of type double:

    new operator ^^ right stronger *;
 
    // Return x raised to the power of y (infix application).
    double operator ^^ (double x, double y) {
        return pow(x, y);
    }
 
    // Return e (base of the natural logarithm) raised to the power of x
    // (prefix application).
    double operator ^^ (double x) {
        return exp(x);
    }
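
The effect of the `right` and `stronger *` attributes can be illustrated in standard C++ by writing out the grouping explicitly. This is a sketch only: the function names `power`, `right_assoc`, `left_assoc`, and `with_multiplication` are hypothetical stand-ins, and the groupings in the comments are how the declaration above would be expected to parse.

```cpp
#include <cmath>

// Named counterpart of the infix operator ^^.
double power(double x, double y) {
    return std::pow(x, y);
}

// 2 ^^ 3 ^^ 2 with right associativity groups as 2 ^^ (3 ^^ 2), i.e. 2^9.
double right_assoc() {
    return power(2, power(3, 2));
}

// For comparison: a left-associative operator would group the same
// expression as (2 ^^ 3) ^^ 2, i.e. 8^2.
double left_assoc() {
    return power(power(2, 3), 2);
}

// 2 * 3 ^^ 2: because ^^ binds stronger than *, this is 2 * (3 ^^ 2).
double with_multiplication() {
    return 2 * power(3, 2);
}
```

The two groupings give different results (512 versus 64), which is why associativity has to be part of the operator declaration.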

Flexary Operators

The following declarations introduce a new left-associative operator +/ whose precedence is between + and / and define its meaning when applied to any number of double operands:

    new operator +/ stronger + weaker /;
 
    // Auxiliary structure to store intermediate result.
    struct Avg {
        double sum; // Sum of values processed so far.
        int num;    // Number of these values.
        Avg (double s, int n) : sum(s), num(n) {}
    };
 
    // Application of +/ to first two operands.
    Avg operator +/ (double x, double y) {
        return Avg(x+y, 2);
    }
 
    // Application of +/ to intermediate result and next operand.
    Avg operator +/ (Avg a, double z) {
        return Avg(a.sum + z, a.num + 1);
    }
 
    // Conversion of intermediate result to final result.
    // This pseudo-operator function is called implicitly
    // at the end of an expression or subexpression.
    double operator ... (Avg a) {
        return a.sum / a.num;
    }
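
To show how the three operator functions cooperate, the following standard C++ sketch spells out the call sequence that an expression such as `1.0 +/ 2.0 +/ 6.0` would presumably expand to. The function names `avg_start`, `avg_next`, `avg_finish`, and `mean_of_three` are hypothetical. The expansion order is an assumption based on the comments in the declarations above.

```cpp
// Intermediate result, as in the declarations above.
struct Avg {
    double sum;  // Sum of values processed so far.
    int num;     // Number of these values.
    Avg(double s, int n) : sum(s), num(n) {}
};

// Counterpart of operator +/ (double, double): first two operands.
Avg avg_start(double x, double y) {
    return Avg(x + y, 2);
}

// Counterpart of operator +/ (Avg, double): each further operand.
Avg avg_next(Avg a, double z) {
    return Avg(a.sum + z, a.num + 1);
}

// Counterpart of operator ... (Avg): final conversion.
double avg_finish(Avg a) {
    return a.sum / a.num;
}

// Assumed expansion of: double m = 1.0 +/ 2.0 +/ 6.0;
double mean_of_three() {
    return avg_finish(avg_next(avg_start(1.0, 2.0), 6.0));
}
```

Because the operator is left-associative, each further operand extends the same intermediate `Avg` value, and the division happens only once at the end.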

Operators with Lazily Evaluated Operands

The following declarations introduce a new operator AND with lazily evaluated operands whose behaviour is exactly equivalent to the C++ built-in operator &&. In particular, its second operand is evaluated only when necessary.

    new operator AND left equal && lazy;
 
    template <typename X, typename Y>
    bool operator AND (lazy<X> x, lazy<Y> y) {
        return x() && y();
    }
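
Lazy operands can be approximated in standard C++ by passing each operand as a callable that is invoked only on demand. The sketch below assumes this is a fair analogy for `lazy<X>`. The names `lazy_and` and `count_right_evaluations` are hypothetical.

```cpp
// Each operand is a callable; calling it evaluates the operand.
template <typename X, typename Y>
bool lazy_and(X x, Y y) {
    // The built-in && short-circuits, so y() runs only if x() is true.
    return x() && y();
}

// Returns how often the right operand was evaluated for a given left value.
int count_right_evaluations(bool left_value) {
    int evaluations = 0;
    lazy_and([&]() -> bool { return left_value; },
             [&]() -> bool { ++evaluations; return true; });
    return evaluations;
}
```

With a false left operand the right operand is never evaluated. With a true left operand it is evaluated exactly once.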

Fixary Operator Combinations

It is possible to define operator combinations such as FIRST/ALL/COUNT - FROM - WHERE that can be used similarly to database queries in SQL, e.g.:

    struct Person {
        string name;
        bool male;
        ......
    };
    set<Person> db;
 
    Person p;
    Person ch = FIRST p FROM db WHERE p.name == "Heinlein";
    set<Person> men = ALL p FROM db WHERE p.male;
    int abcd = COUNT p FROM db WHERE "A" <= p.name && p.name < "E";
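
For comparison, the three queries can be approximated in standard C++ with algorithms from `<algorithm>`. This is a sketch under assumptions, not a description of how the operator combinations are implemented: the function names are hypothetical, `Person` is reduced to the two members shown above, a `std::vector` replaces the set so that no ordering on `Person` has to be assumed, and returning a null pointer when FIRST finds no match is a choice made only for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

struct Person {
    std::string name;
    bool male;
};

// Roughly: FIRST p FROM db WHERE p.name == name
// Returns a pointer to the first match, or nullptr if there is none.
const Person* first_named(const std::vector<Person>& db,
                          const std::string& name) {
    auto it = std::find_if(db.begin(), db.end(),
                           [&](const Person& p) { return p.name == name; });
    return it == db.end() ? nullptr : &*it;
}

// Roughly: ALL p FROM db WHERE p.male
std::vector<Person> all_male(const std::vector<Person>& db) {
    std::vector<Person> men;
    std::copy_if(db.begin(), db.end(), std::back_inserter(men),
                 [](const Person& p) { return p.male; });
    return men;
}

// Roughly: COUNT p FROM db WHERE "A" <= p.name && p.name < "E"
int count_a_to_d(const std::vector<Person>& db) {
    return static_cast<int>(std::count_if(
        db.begin(), db.end(),
        [](const Person& p) { return "A" <= p.name && p.name < "E"; }));
}
```

In each case the WHERE condition becomes a lambda that is evaluated once per element, which corresponds to treating it as a lazily evaluated operand.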

User-Defined Control Structures

By allowing statement blocks to be used as lazily evaluated operands, it is possible to define new control structures such as repeat - until or foreach which can be used as follows:

    char c;
    repeat {
        cout << "Enter y or n: ";
        cin >> c;
    } until (c == 'y' || c == 'n');
 
    set<Person> db; Person p;
    foreach (p in db) {
        cout << p.name;
        if (p.male) cout << " male";
        else cout << " female";
        ......
        cout << endl;
    }
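
The idea of treating a statement block as a lazily evaluated operand can be imitated in standard C++ by passing the block and the condition as callables. The following is a minimal sketch of repeat - until along these lines. The names `repeat_until` and `count_up_to` are hypothetical, and the call syntax is necessarily less natural than the control structure shown above.

```cpp
// Executes body at least once and repeats it until condition() is true.
template <typename Body, typename Condition>
void repeat_until(Body body, Condition condition) {
    do {
        body();
    } while (!condition());
}

// Usage: increments a counter until it reaches limit.
int count_up_to(int limit) {
    int i = 0;
    repeat_until([&] { ++i; },                  // statement block
                 [&] { return i >= limit; });   // until condition
    return i;
}
```

Both callables are re-evaluated on every iteration, which is exactly what distinguishes lazy operands from ordinary arguments that are evaluated once before the call.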

Publications

[1] C. Heinlein: "C+++: User-Defined Operator Symbols in C++." In: P. Dadam, M. Reichert (eds.): INFORMATIK 2004 - Informatik verbindet. Band 2 (Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e. V.; September 2004; Ulm). Lecture Notes in Informatics P-51, Gesellschaft für Informatik e. V., Bonn, 2004, 459-468.
Describes the basic concepts of C+++, an extension of C++ supporting user-defined operator symbols.

[2] C. Heinlein: "Concept and Implementation of C+++, an Extension of C++ to Support User-Defined Operator Symbols and Control Structures." Nr. 2004-02, Ulmer Informatik-Berichte, Fakultät für Informatik, Universität Ulm, August 2004.
A significantly extended version of the previous paper, describing the full range of C+++ features, especially the possibility to define user-defined control structures. Furthermore, the implementation of C+++ by means of a "stupid" precompiler for C++ is described in detail.


Christian Heinlein, 22.09.09