Null Values

Motivation

Many programming languages provide a distinguished null value for pointer or reference types to indicate that an object reference currently does not refer to any object. On the other hand, it is usually not possible to indicate that a variable of a primitive type (such as int or char) currently does not contain any value. However, there are many circumstances where null values for all types of the language would be useful:

It is important to note that in all these circumstances, there is an essential difference between a null value representing no value at all and a default value such as zero for integral types.
It is also important to note that the first two cases (uninitialized variables and functions returning no value) cannot, in general, be detected statically at compile time, since the corresponding statements might be executed conditionally. Attempting to do so anyway necessarily results in conservative approximations by the compiler (as in Java, for example), which sometimes force a programmer to provide unnecessary dummy initializations or return statements just to get a program to compile.

Concept

Every type of an advanced procedural programming language, whether built-in or user-defined, primitive or structured, possesses a unique null value representing no value at all. (Strictly speaking, the notion of a null value is therefore a contradiction in terms.)

Null values are implicitly used in the following circumstances:

Furthermore, there is a generic null value constant, null, which is compatible with any type and can be used to explicitly indicate a missing value.

Null values are propagated through all arithmetic operations on integral and floating-point values, i.e., if one operand of an arithmetic expression is null, the value of the entire expression is null as well.
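The propagation rule can be sketched in C++ as follows. This is only an illustration under assumptions: the type NullableInt, its members, and check_propagation are hypothetical names that do not appear in the original text, and the pair representation is just one of the possible encodings.

```cpp
#include <cassert>

// Hypothetical pair representation: a null flag plus a payload.
struct NullableInt {
    bool is_null = true;  // a default-constructed NullableInt is null
    int value = 0;        // meaningful only while is_null is false

    constexpr NullableInt() = default;
    constexpr NullableInt(int v) : is_null(false), value(v) {}
};

// If either operand is null, the result is null; otherwise compute normally.
constexpr NullableInt operator+(NullableInt a, NullableInt b) {
    return (a.is_null || b.is_null) ? NullableInt{} : NullableInt{a.value + b.value};
}

constexpr NullableInt operator*(NullableInt a, NullableInt b) {
    return (a.is_null || b.is_null) ? NullableInt{} : NullableInt{a.value * b.value};
}

// A null anywhere in an expression makes the whole expression null.
inline void check_propagation() {
    const NullableInt missing{};                           // null
    const NullableInt r = (NullableInt{2} + missing) * 5;  // 5 converts implicitly
    assert(r.is_null);
    assert((NullableInt{2} + 3).value == 5);
}
```

Because the operators are non-member functions, plain int operands on either side are converted implicitly, so mixed expressions such as `NullableInt{2} + 3` need no extra overloads.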

The null value of a particular type is equal to itself, but different from all other values of the type. (This is in contrast to a floating-point NaN value, which is different from all values, including itself and other NaN values.) Furthermore, a null value is neither less nor greater than any other value of the type, i.e., it is incomparable to other values.
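A minimal sketch of these comparison rules, again using a hypothetical NullableInt type (redefined here so that the example stands alone). The operators <= and >= are deliberately left out, since the text above does not say how they should treat null.

```cpp
#include <cassert>
#include <limits>

// Hypothetical pair representation: a null flag plus a payload.
struct NullableInt {
    bool is_null = true;
    int value = 0;

    constexpr NullableInt() = default;
    constexpr NullableInt(int v) : is_null(false), value(v) {}
};

// Null equals null and nothing else; non-null values compare by payload.
constexpr bool operator==(NullableInt a, NullableInt b) {
    return (a.is_null || b.is_null) ? (a.is_null && b.is_null)
                                    : (a.value == b.value);
}
constexpr bool operator!=(NullableInt a, NullableInt b) { return !(a == b); }

// Ordering holds only between two non-null values; null is incomparable.
constexpr bool operator<(NullableInt a, NullableInt b) {
    return !a.is_null && !b.is_null && a.value < b.value;
}
constexpr bool operator>(NullableInt a, NullableInt b) { return b < a; }

// For contrast: an IEEE 754 NaN compares unequal even to itself.
inline bool nan_equals_itself() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    return nan == nan;
}

inline void check_comparison() {
    assert(NullableInt{} == NullableInt{});    // null equals itself
    assert(NullableInt{} != NullableInt{0});   // null differs from zero
    assert(!(NullableInt{} < NullableInt{1}));
    assert(!(NullableInt{} > NullableInt{1}));
    assert(!nan_equals_itself());
}
```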

Any value of any type is implicitly convertible to a Boolean value by interpreting null as false and all other values as true. Consequently, the Boolean null value is equivalent to the Boolean false value, i.e., the Boolean true value is the only "other" value of the Boolean type.
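The Boolean interpretation might look like this in C++. The names NullableInt and value_or are hypothetical, and the conversion operator is marked explicit as a design choice of this sketch: it still applies in conditions (if, while, !, ?:) but cannot silently turn a nullable value into the integer 0 or 1. Note that, unlike a plain C++ int, a non-null zero converts to true.

```cpp
#include <cassert>

// Hypothetical pair representation: a null flag plus a payload.
struct NullableInt {
    bool is_null = true;
    int value = 0;

    constexpr NullableInt() = default;
    constexpr NullableInt(int v) : is_null(false), value(v) {}

    // null -> false; every non-null value, including 0, -> true.
    constexpr explicit operator bool() const { return !is_null; }
};

// Returns the payload if one is present, otherwise the given fallback.
constexpr int value_or(NullableInt n, int fallback) {
    return n ? n.value : fallback;
}

inline void check_truthiness() {
    assert(!NullableInt{});                    // null is false
    assert(static_cast<bool>(NullableInt{0})); // zero is a value, hence true
    assert(value_or(NullableInt{}, -1) == -1);
    assert(value_or(NullableInt{7}, -1) == 7);
}
```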

Implementation

Ideally, null values, especially those of numeric types, should be supported directly by the hardware in order to implement arithmetic operations without performance penalties. Since off-the-shelf hardware usually does not support them, however, software implementations are required. These either reserve a single bit of a value's representation as a null value indicator or store each value as a pair consisting of a Boolean null value indicator and the actual value.
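The two software representations can be contrasted with a small sketch. All names (kNullBit, kValueMask, PackedNullable, PairNullable) are hypothetical, and the choice of a 32-bit unsigned payload with the top bit as indicator is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNullBit   = 0x80000000u;  // top bit flags null
constexpr std::uint32_t kValueMask = 0x7FFFFFFFu;  // remaining 31 payload bits

// Variant 1: one bit of the representation is the null indicator.
struct PackedNullable {
    std::uint32_t bits = kNullBit;  // a default-constructed value is null

    constexpr PackedNullable() = default;
    constexpr explicit PackedNullable(std::uint32_t v) : bits(v & kValueMask) {}

    constexpr bool is_null() const { return (bits & kNullBit) != 0; }
    constexpr std::uint32_t value() const { return bits & kValueMask; }
};

// Variant 2: a separate Boolean indicator stored next to the full-range value.
struct PairNullable {
    bool null_flag = true;
    std::uint32_t payload = 0;

    constexpr PairNullable() = default;
    constexpr explicit PairNullable(std::uint32_t v) : null_flag(false), payload(v) {}

    constexpr bool is_null() const { return null_flag; }
    constexpr std::uint32_t value() const { return payload; }
};

inline void check_representations() {
    assert(PackedNullable{}.is_null());
    assert(PackedNullable{42u}.value() == 42u);
    // The packed variant gives up the top bit of the value range ...
    assert(PackedNullable{0xFFFFFFFFu}.value() == 0x7FFFFFFFu);
    // ... while the pair variant keeps the full range.
    assert(PairNullable{0xFFFFFFFFu}.value() == 0xFFFFFFFFu);
}
```

The trade-off in this sketch is that the packed variant stays the size of the underlying integer but loses one bit of range, whereas the pair variant keeps the full range at the cost of extra storage for the indicator.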

In C++, it is possible to define wrapper types for all primitive types and to overload the arithmetic operators for these in order to implement "null-valuable primitive types." Furthermore, it is possible to define a "null type" with a single instance that is implicitly convertible to any other type in order to implement the generic null value constant null.
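One way such a wrapper and null type could be written is sketched below. It is not taken from the cited publication: the names Nullable, Null, and check_wrapper are hypothetical, only + and == are overloaded, and the null constant converts only to Nullable<T> rather than to every type.

```cpp
#include <cassert>

// Hypothetical generic wrapper: a null flag plus a payload of type T.
template <typename T>
class Nullable {
    bool null_ = true;  // a default-constructed wrapper is null
    T value_{};

public:
    constexpr Nullable() = default;
    constexpr Nullable(T v) : null_(false), value_(v) {}

    constexpr bool is_null() const { return null_; }
    constexpr T value() const { return value_; }  // meaningful only if !is_null()

    // Hidden friends, so that plain T and the null constant convert implicitly.
    friend constexpr Nullable operator+(Nullable a, Nullable b) {
        return (a.null_ || b.null_) ? Nullable{}
                                    : Nullable(static_cast<T>(a.value_ + b.value_));
    }
    friend constexpr bool operator==(Nullable a, Nullable b) {
        return (a.null_ || b.null_) ? (a.null_ && b.null_)
                                    : (a.value_ == b.value_);
    }
};

// "Null type" whose single instance converts to the null value of any Nullable<T>.
struct Null {
    template <typename T>
    constexpr operator Nullable<T>() const { return Nullable<T>{}; }
};
constexpr Null null{};

inline void check_wrapper() {
    const Nullable<int> x = 3;
    const Nullable<int> y = null;     // generic null constant
    const Nullable<double> d = null;  // same constant, different type
    assert(y.is_null() && d.is_null());
    assert((x + 4).value() == 7);     // 4 converts to Nullable<int>
    assert((x + null).is_null());     // null propagates through +
    assert(y == null);                // null equals itself
}
```

Defining the operators as friends inside the class template, rather than as free function templates, is what lets `x + 4` and `x + null` compile: template argument deduction would otherwise reject operands that are not already a Nullable<T>.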

Publications

[1] C. Heinlein: "Null Values in Programming Languages." In: H. R. Arabnia (ed.): Proc. Int. Conf. on Programming Languages and Compilers (PLC'05) (Las Vegas, NV, June 2005), 123-129.
Describes the concept of null values in more detail.


Christian Heinlein, 22.09.09