GCO4020/CSC428 - Advanced Object Oriented Techniques In C++
Week 9

 

Topic 17: Fallible Types

Introduction

Robust programs check for computational and other errors they may encounter, and handle those errors in some predictable fashion. To assist in developing robust programs many standard function calls (scanf(), fork(), some forms of operator new, etc.) return error codes along with (or instead of) the values they produce.

Unfortunately, programmers frequently fail to check for such "error flags", often because embedding code to perform such checks can increase the code size dramatically, compromise performance, and/or obscure the algorithm.

This topic looks at a simple technique which augments return values with specific validity information, in such a way that an exception is automatically generated whenever an "invalid" value is used. The technique also provides some other benefits "for free", including "chained" comparisons and a simple syntax for alternative values in assignments.


The virtue of "fallibility"

In the Icon programming language, many values are "conditional". That is, they either have a defined value or the special value "fail", indicating that the operation which generated them did not succeed. Failure values propagate "outwards" through a surrounding expression, and may cause an entire command to "fail" as well.

For example, Icon's "less than" operation returns a conditional value, which is the value of the second argument (if the comparison succeeds), or else "fail" (if the first argument is not less than the second). This arrangement means that a test (in Icon) like:

        if a < b < c then
            write("well-ordered")

works "as expected" (whereas the equivalent code would be "dubious" at best in C++).
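To see why the C++ equivalent is "dubious", the following small sketch spells out what the built-in operators actually do with a < b < c. The function names naive_chain and well_ordered are illustrative only and do not come from the course source code.

```cpp
#include <cassert>

// What C++ makes of "a < b < c" for built-in types: the first comparison
// yields a bool, which is promoted to 0 or 1 and then compared with c.
bool naive_chain(int a, int b, int c) {
    int first = (a < b);   // 0 or 1, NOT the value of b
    return first < c;
}

// What the programmer almost certainly meant.
bool well_ordered(int a, int b, int c) {
    return a < b && b < c;
}
```

With a = 5, b = 1 and c = 3, the naive chain compares 0 with 3 and reports success, even though the three values are not in order.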

This is because, in Icon, the first part of the multiple comparison (a < b) evaluates either to the value of b (if a is less than b) or to "fail" (if a isn't less than b). If it evaluates to "fail", that failure immediately propagates "outwards", causing the entire if statement to fail. Otherwise the second half of the multiple comparison becomes equivalent to: b < c (since the value of b was the return value of the first comparison). This too, either fails (causing the if statement to fail) or succeeds, in which case the body of the if statement is then executed.

In C++, there is a similar "propagation of failure" mechanism built into the I/O streams library. Each istream and ostream object has a status built into it which is examined whenever the stream object is involved in an operation. For example, if an input operation leaves an istream object in a "not good" state, subsequent input operations via that object are short-circuited. Hence, if a statement like:

        cin >> a >> b >> c >> d;

is executed, and the user types in a value which cannot be read into a, the subsequent reads to b, c and d all fail (without actually attempting to read anything).
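The short-circuiting can be observed with a string stream standing in for cin. This is only a sketch: the ReadResult struct, the read_three function and the -1 sentinel are assumptions made for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct ReadResult {
    bool ok;   // true if the stream is still usable after all three reads
    int b;
    int c;
};

// Attempts three chained reads from `text`.
ReadResult read_three(const std::string& text) {
    std::istringstream in(text);
    int a = -1, b = -1, c = -1;   // -1 marks "never written to"
    in >> a >> b >> c;            // once one read fails, the rest do nothing
    ReadResult r = { !in.fail(), b, c };
    return r;
}
```

Given the input "oops 2 3", the read into a fails, and b and c keep their sentinel values because the later extractions are never attempted.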

The following two sections describe a proxy class (Fallible) which adds a bool status field to any type, and defines specific behaviours which enable that status to be checked before the value is used. Subsequent sections explore extensions to this idiom which provide (amongst other things) Icon-like multiple comparisons.

See also: Exercise 1


Creating a Fallible proxy - the "fallible" bit

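The features that must be added to any class to make it "fallible" are: a status flag recording whether the value is valid, a record of whether that status has been checked, a means of testing and of setting the status, and a check that throws an exception when an "invalid" value is used without having been tested.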

These features are all independent of the actual type of the value being stored and so we can abstract them into a base class (which the Fallible proxy will inherit). The base class (Failure) looks like this (see source code listing: Fallible.C):

        class Failure {
        public:
            Failure(bool status = false);

            bool IsValid(void) const;
            bool Failed(void) const;
            operator bool(void) const;

        protected:
            void SetStatus(bool status);
            void CheckStatus(void) const;

            bool myStatus;
            mutable bool myChecked;
        };

The constructor sets the internal myStatus and myChecked flags, indicating whether the value associated with a Failure object (see below) is valid and whether that validity has been checked (which, of course, it hasn't been yet):

        Failure::Failure(bool status)
            : myStatus(status)
            , myChecked(false)
        {}

The Failure::IsValid(), Failure::Failed() and "conversion-to-bool" member functions can be used to check the validity of a Failure object (Failed() and operator bool() are simply "convenience" wrappers around IsValid()). All three also record that the check has been performed. Note that myChecked is declared mutable so that these tests can be called on (nominally) const objects:

        bool Failure::IsValid() const {
            myChecked = true;
            return myStatus;
        }

        Failure::operator bool() const {
            return IsValid();
        }

        bool Failure::Failed() const {
            return !IsValid();
        }

Class Failure also provides two protected member functions. The first - CheckStatus() - throws an exception when called on a Failure object whose status is false and unchecked. Note that, since this action constitutes a check in itself, the myChecked member is set to true just before the exception is thrown (this prevents a single "failure to check" from raising more than a single exception).

        void Failure::CheckStatus(void) const {
            if (!myStatus && !myChecked) {
                myChecked = true;
                throw Failure();
            }
        }

Observe too that the exception raised is of the type Failure. In other words, the Failure class acts as its own exception class (!)

The final member function provides a means of manually setting the status of a Failure object, and ensures that such an object is subsequently marked as "unchecked":

        void Failure::SetStatus(bool status) {
            myStatus = status;
            myChecked = false;
        }
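The behaviour of the Failure class can be exercised on its own with a small harness. In the sketch below the member functions are written inline for brevity, and the Probe class and the use_throws function are hypothetical helpers (not part of Fallible.C) which merely expose the protected members so that the "throw once only" rule can be observed.

```cpp
#include <cassert>

// Failure as described above, with the member functions defined inline.
class Failure {
public:
    Failure(bool status = false) : myStatus(status), myChecked(false) {}
    bool IsValid() const { myChecked = true; return myStatus; }
    bool Failed() const { return !IsValid(); }
    operator bool() const { return IsValid(); }
protected:
    void SetStatus(bool status) { myStatus = status; myChecked = false; }
    void CheckStatus() const {
        if (!myStatus && !myChecked) { myChecked = true; throw Failure(); }
    }
    bool myStatus;
    mutable bool myChecked;
};

// Hypothetical helper: makes the protected members callable from outside.
class Probe : public Failure {
public:
    explicit Probe(bool status) : Failure(status) {}
    using Failure::CheckStatus;
    using Failure::SetStatus;
};

// Returns true if "using" p without a prior test raises a Failure exception.
bool use_throws(const Probe& p) {
    try { p.CheckStatus(); } catch (const Failure&) { return true; }
    return false;
}
```

An unchecked "invalid" Probe throws on first use but not on the second, since the thrown exception counts as the check. Calling SetStatus() re-arms the object.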


Creating a Fallible proxy - the proxy bit

The Failure class provides uniform state information required by the Fallible class (much as the ios class provides uniform state information for the various stream classes in the I/O streams library).

The Fallible class builds on this state by associating a value of a specified type with it (see source code listing: Fallible.C):

        template< class DataType >
        class Fallible : public Failure {
        public:
            Fallible(void);
            Fallible(const DataType& value, bool status = true);
            Fallible(const Fallible& f);
            Fallible(const Fallible& f, bool status);

            Fallible& operator=(const Fallible& value);

            operator DataType() const;

            friend istream& operator>>(istream& is, Fallible& f);
            friend ostream& operator<<(ostream& os, const Fallible& f);

        private:
            DataType myValue;
        };

The Fallible proxy is templated on the data type of the value to be stored, and inherits state information and error checking behaviour from the Failure class.

The constructors must set both value and state information appropriately. The semantics selected here (by no means the only choice) are quite conservative: uninitialized objects are "invalid", whilst initialized values are valid unless explicitly or implicitly specified otherwise (implicitly specified "invalidity" occurs when a Fallible object is initialized with another, "invalid" Fallible object):

        template <class DataType>
        Fallible<DataType>::Fallible(void)
            : Failure(false)
        {}

        template <class DataType>
        Fallible<DataType>::Fallible(const DataType& value, bool status)
            : Failure(status)
            , myValue(value)
        {}

        template <class DataType>
        Fallible<DataType>::Fallible(const Fallible& f)
            : Failure(f.myStatus)
            , myValue(f.myValue)
        {}

        template <class DataType>
        Fallible<DataType>::Fallible(const Fallible& f, bool status)
            : Failure(status)
            , myValue(f.myValue)
        {}

The assignment operator varies subtly from the default "member-by-member" assignment semantics. It does copy the value and status of its argument (as expected), but it copies the status by calling Failure::SetStatus(), thereby resetting to false the myChecked flag on the object being assigned to (implying that the value which is assigned to will have to be explicitly checked before it can be used):

        template< class DataType >
        Fallible<DataType>& Fallible<DataType>::operator=(const Fallible& f) {
            myValue = f.myValue;
            SetStatus(f.myStatus);   // ALSO MARKS *this AS UNCHECKED
            return *this;
        }

In order to allow the proxy to function as a value of type DataType, we provide a user-defined type conversion:

        template< class DataType >
        Fallible<DataType>::operator DataType() const {
            CheckStatus();
            return myValue;
        }

Note that, before the required value is made available, the Failure::CheckStatus() member function is called. This means that any attempt to access the value of a Fallible object - without first checking its validity - will fail if the value is "invalid".

These semantics are comparatively "weak" (we could instead always fail on accessing an invalid value, or always fail on accessing an unchecked value), but (in the normal C++ manner) they guard against mistakes, not against deliberate misuse.
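The three cases (valid, invalid and unchecked, invalid but checked) can be seen in a condensed sketch. It is not the Fallible.C listing: operator bool, assignment and I/O are left out, the default constructor value-initialises myValue (an assumption made here so that the last case reads a defined value), and value_or is a hypothetical helper.

```cpp
#include <cassert>

// Condensed stand-ins for the Failure and Fallible classes described above.
class Failure {
public:
    Failure(bool status = false) : myStatus(status), myChecked(false) {}
    bool IsValid() const { myChecked = true; return myStatus; }
    bool Failed() const { return !IsValid(); }
protected:
    void CheckStatus() const {
        if (!myStatus && !myChecked) { myChecked = true; throw Failure(); }
    }
    bool myStatus;
    mutable bool myChecked;
};

template <class DataType>
class Fallible : public Failure {
public:
    Fallible() : Failure(false), myValue() {}   // value-initialised (assumption)
    Fallible(const DataType& value, bool status = true)
        : Failure(status), myValue(value) {}
    operator DataType() const { CheckStatus(); return myValue; }
private:
    DataType myValue;
};

// Hypothetical helper: converts f to int, or yields fallback if that throws.
int value_or(const Fallible<int>& f, int fallback) {
    try { return f; } catch (const Failure&) { return fallback; }
}
```

A valid object converts normally and an unchecked "invalid" one throws. An "invalid" object whose status has already been tested converts without complaint, which is exactly the "weak" guarantee described above.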

Most proxies provide I/O operators as a convenience, and Fallible does too (in this case, mainly because the semantics of I/O on Fallible objects is somewhat complicated by the need to set or check status).

Input to Fallible objects is unusual in that input failures set status information in both the input stream (as per usual) and also in the receiving object. Specifically, if the input operation fails, the status of the receiving Fallible object is set to false:

        friend istream& operator>>(istream& is, Fallible& f) {
            DataType val;
            if (is >> val) {
                f = val;
            }
            else {
                f.SetStatus(false);
            }
            return is;
        }

Output of Fallible objects is also interesting. Since it involves accessing the value of the object, it should fail (throwing a Failure exception) if that object is in an "invalid" state. We achieve this by calling CheckStatus() before attempting to perform the actual output operation:

        friend ostream& operator<<(ostream& os, const Fallible& f) {
            f.CheckStatus();
            return os << f.myValue;
        }
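Both operators can be tried out against string streams. The sketch below is a cut-down version of the classes (the conversion operator and explicit copy operations are omitted), with the two friends defined inside the class body, as the friend keyword on the definitions above suggests. The parse_is_valid and to_text functions are hypothetical helpers.

```cpp
#include <cassert>
#include <sstream>
#include <string>

class Failure {
public:
    Failure(bool status = false) : myStatus(status), myChecked(false) {}
    bool IsValid() const { myChecked = true; return myStatus; }
protected:
    void SetStatus(bool status) { myStatus = status; myChecked = false; }
    void CheckStatus() const {
        if (!myStatus && !myChecked) { myChecked = true; throw Failure(); }
    }
    bool myStatus;
    mutable bool myChecked;
};

template <class DataType>
class Fallible : public Failure {
public:
    Fallible() : Failure(false), myValue() {}
    Fallible(const DataType& value, bool status = true)
        : Failure(status), myValue(value) {}

    // A failed read marks the receiving object "invalid" as well as the stream.
    friend std::istream& operator>>(std::istream& is, Fallible& f) {
        DataType val;
        if (is >> val) { f = val; } else { f.SetStatus(false); }
        return is;
    }
    // Writing counts as using the value, so the status is checked first.
    friend std::ostream& operator<<(std::ostream& os, const Fallible& f) {
        f.CheckStatus();
        return os << f.myValue;
    }
private:
    DataType myValue;
};

// Hypothetical helper: reads one Fallible<int> and reports its validity.
bool parse_is_valid(const std::string& text) {
    std::istringstream in(text);
    Fallible<int> f;
    in >> f;
    return f.IsValid();
}

// Hypothetical helper: writes f to a string, or "<failure>" if that throws.
std::string to_text(const Fallible<int>& f) {
    std::ostringstream out;
    try { out << f; } catch (const Failure&) { return "<failure>"; }
    return out.str();
}
```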

See also: Exercise 2

See also: Exercise 3


Using Fallible values

Fallible proxy objects can, for the most part, be used just like objects of the type(s) they are proxying. The only difference is that, if those objects are given an "invalid" status then any subsequent attempt to access their values will throw a Failure exception.

For example, the following code:

        Fallible<double> reciprocal(double n) {
            if (n == 0) {
                return Fallible<double>();
            }
            return 1/n;
        }

        // AND LATER...

        double n;
        while (cin >> n) {
            cout << reciprocal(n) << endl;   // MAY THROW Failure EXCEPTION
        }

will throw an exception (from within the output operation) on any attempt to divide by zero. Note that this exception is not generated from within the call to reciprocal(), which returns successfully whether the reciprocation succeeds or not. It is only when operator<< attempts to access the "invalid" value of the Fallible object returned by reciprocal() that the exception is thrown.

Alternatively, we might choose to forestall the exception by checking the return value explicitly:

        double n;
        while (cin >> n) {
            Fallible<double> r = reciprocal(n);
            if (r.IsValid()) {
                cout << r << endl;
            }
        }

The value of returning a Fallible object is that we can write a function (like reciprocal) which can be used with this type of explicit validity testing, and yet the same function can also provide automatically thrown exceptions if we choose not to test explicitly (or forget to provide such a test!)

As another alternative, we might instead choose to push the "fallibility" back a step, by creating "fallible division" (or other such operators):

        template <class DataType>
        Fallible<DataType> operator/ (Fallible<DataType> n, Fallible<DataType> d) {
            if (n.Failed() || d.Failed() || (DataType)d == 0) {
                return Fallible<DataType>();
            }
            else {
                return (DataType)n / (DataType)d;
            }
        }

        // AND LATER...

        Fallible<double> n;

        while (cin >> n) {
            // BOTH OPERANDS MUST ALREADY BE Fallible<double>: TEMPLATE ARGUMENT
            // DEDUCTION DOES NOT CONVERT A PLAIN 1 TO A Fallible<double>
            cout << Fallible<double>(1) / n << endl;   // MAY THROW Failure EXCEPTION
        }

In this case, any division of one Fallible object by another first checks the status of both arguments as well as the value of the denominator, looking for conditions that would result in failure of the division. If any such condition is found, the operation returns an uninitialized value (with the all-important "invalid" status). Otherwise, it simply converts both arguments to the underlying type and performs the division. Note that a plain numerator such as 1 has to be wrapped explicitly, because operator/ is a function template and implicit conversions are not considered when its template argument is deduced.

Hence, attempting to divide "invalid" values produces an "invalid" result, thereby propagating the point of "failure" outwards to the surrounding expression (as in Icon). The following two sections expand on this notion of overloading "fallible" operators to propagate failure.
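The outward propagation can be checked with a two-step division. The sketch below reuses a condensed form of the classes (operator bool, assignment and I/O omitted, myValue value-initialised as an assumption). The quotient_is_valid function is a hypothetical helper.

```cpp
#include <cassert>

class Failure {
public:
    Failure(bool status = false) : myStatus(status), myChecked(false) {}
    bool IsValid() const { myChecked = true; return myStatus; }
    bool Failed() const { return !IsValid(); }
protected:
    void CheckStatus() const {
        if (!myStatus && !myChecked) { myChecked = true; throw Failure(); }
    }
    bool myStatus;
    mutable bool myChecked;
};

template <class DataType>
class Fallible : public Failure {
public:
    Fallible() : Failure(false), myValue() {}
    Fallible(const DataType& value, bool status = true)
        : Failure(status), myValue(value) {}
    operator DataType() const { CheckStatus(); return myValue; }
private:
    DataType myValue;
};

// "Fallible division": an invalid operand or a zero divisor gives an
// invalid result rather than an exception.
template <class DataType>
Fallible<DataType> operator/(Fallible<DataType> n, Fallible<DataType> d) {
    if (n.Failed() || d.Failed() || static_cast<DataType>(d) == 0) {
        return Fallible<DataType>();
    }
    return static_cast<DataType>(n) / static_cast<DataType>(d);
}

// Hypothetical helper: evaluates (a / b) / c and reports its validity.
bool quotient_is_valid(double a, double b, double c) {
    typedef Fallible<double> F;
    F result = F(a) / F(b) / F(c);
    return result.IsValid();
}
```

When b is zero the inner division yields an "invalid" value. The outer division sees that its numerator has failed and returns "invalid" in turn, so no exception is raised anywhere in the expression.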

See also: Exercise 4


Chaining "fallible" operations

As an example of the utility of chaining "fallible operators" together to propagate failure, let us return to the Icon example of multiple comparisons.

We can provide the same ability to chain inequalities in C++, simply by overloading the various comparison operators for Fallible arguments. For example (see source code listing: FallibleCompare.C):

        template <class DataType>
        Fallible<DataType> operator< (Fallible<DataType> a, Fallible<DataType> b) {
            if (a.IsValid() && b.IsValid() && (DataType)a < (DataType)b) {
                return b;
            }
            else {
                return Fallible<DataType>();
            }
        }

With this operator defined, a "less than" comparison between two Fallible values returns a Fallible value. That value is "invalid" if either argument was "invalid", or if the comparison was not true. Otherwise, the value returned is that of the second argument. Hence, by the same logic as presented for the equivalent construct in Icon, the following code will now act as might reasonably be expected:

        Fallible<string> first;
        Fallible<string> last;
        Fallible<string> next;

        cin >> first >> last;

        while (cin >> next) {
            if (first < next < last) {
                cout << "In range" << endl;
            }
        }
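The same chain can be tried without any input, using integers. As before, this is a condensed sketch rather than the FallibleCompare.C listing: operator bool is omitted (so the result is tested with IsValid()), and in_range is a hypothetical helper.

```cpp
#include <cassert>

class Failure {
public:
    Failure(bool status = false) : myStatus(status), myChecked(false) {}
    bool IsValid() const { myChecked = true; return myStatus; }
    bool Failed() const { return !IsValid(); }
protected:
    void CheckStatus() const {
        if (!myStatus && !myChecked) { myChecked = true; throw Failure(); }
    }
    bool myStatus;
    mutable bool myChecked;
};

template <class DataType>
class Fallible : public Failure {
public:
    Fallible() : Failure(false), myValue() {}
    Fallible(const DataType& value, bool status = true)
        : Failure(status), myValue(value) {}
    operator DataType() const { CheckStatus(); return myValue; }
private:
    DataType myValue;
};

// Icon-style "less than": yields the right-hand operand on success,
// or an invalid value on failure.
template <class DataType>
Fallible<DataType> operator<(Fallible<DataType> a, Fallible<DataType> b) {
    if (a.IsValid() && b.IsValid()
        && static_cast<DataType>(a) < static_cast<DataType>(b)) {
        return b;
    }
    return Fallible<DataType>();
}

// Hypothetical helper: true only if lo < x and x < hi.
bool in_range(int lo, int x, int hi) {
    typedef Fallible<int> F;
    return (F(lo) < F(x) < F(hi)).IsValid();
}
```

For in_range(1, 20, 10) the first comparison succeeds and yields 20, and the second comparison (20 < 10) then fails, so the chain as a whole is "invalid". Some compilers may warn about the chained comparison, which is harmless here.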

See also: Exercise 5

See also: Exercise 6


A "fallible" selection operator

Another use for chained "fallible operations" is to provide a means of selecting the first "valid" value in a series of potentially "invalid" Fallible objects. We can achieve this by overloading operator| as follows (we assume it isn't required for its normal bitwise or'ing duties - see source code listing: FallibleSelect.C):

        template <class DataType>
        Fallible<DataType> operator| (Fallible<DataType> a, Fallible<DataType> b) {
            if (a.IsValid()) {
                return a;
            }
            else {
                return b;
            }
        }

Now, if we have a series of Fallible objects (say, t_local, t_global, and t_default) and we wish to assign the value of the first "valid" one to a variable (say t), we can do it like so:

        Fallible<char> t = t_local | t_global | t_default | Fallible<char>('\n');

Provided the three t_... variables are of type Fallible<char>, the subexpression t_local | t_global returns the value of t_local if it is "valid"; otherwise it returns the value of t_global (whether it is "valid" or not!). This returned value then becomes the first argument to the ... | t_default subexpression, and the selection process repeats.

Finally, the selected value (which may still be "invalid") is paired with a temporary Fallible object containing '\n'. The temporary is constructed explicitly because operator| is a function template, and implicit conversions (such as char to Fallible<char>) are not considered when its template argument is deduced. This object will definitely be "valid" (because the Fallible constructor marks any value it is given as "valid" unless told otherwise) and will be selected if no better "valid" alternative was found already.
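The selection chain can be tested directly. The classes below are again a condensed sketch (operator bool, assignment and I/O omitted), and the pick function is a hypothetical helper, not something taken from FallibleSelect.C.

```cpp
#include <cassert>

class Failure {
public:
    Failure(bool status = false) : myStatus(status), myChecked(false) {}
    bool IsValid() const { myChecked = true; return myStatus; }
    bool Failed() const { return !IsValid(); }
protected:
    void CheckStatus() const {
        if (!myStatus && !myChecked) { myChecked = true; throw Failure(); }
    }
    bool myStatus;
    mutable bool myChecked;
};

template <class DataType>
class Fallible : public Failure {
public:
    Fallible() : Failure(false), myValue() {}
    Fallible(const DataType& value, bool status = true)
        : Failure(status), myValue(value) {}
    operator DataType() const { CheckStatus(); return myValue; }
private:
    DataType myValue;
};

// Selection: the left operand if it is valid, otherwise the right operand.
template <class DataType>
Fallible<DataType> operator|(Fallible<DataType> a, Fallible<DataType> b) {
    if (a.IsValid()) { return a; }
    return b;
}

// Hypothetical helper: first valid candidate, or '\n' if none is valid.
char pick(Fallible<char> local, Fallible<char> global, Fallible<char> dflt) {
    Fallible<char> t = local | global | dflt | Fallible<char>('\n');
    return t;   // cannot throw: the final alternative is always valid
}
```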

See also: Exercise 7

See also: Exercise 8


Exercises

There are 8 exercises associated with this topic.

 


This material is part of the GCO4020/CSC428 - Advanced Object Oriented Techniques In C++ course.
Copyright © Damian Conway, 1997. All rights reserved.

Last updated: Fri Feb 18 11:18:33 2000