GCO4020/CSC428 - Advanced Object Oriented Techniques In C++
Week 11

 

Topic 21: Functors I

Introduction

This topic describes a range of related techniques which centre around the use of objects to encapsulate functions.


Synopsis


Source Code Examples


The functor concept

A "functor" is an object which acts like (and is used in place of) a function or operator. Typically, that means that such objects have one or more overloaded operator() member functions, and so can be "called" just like a regular C++ function.

The obvious question is: if they are called like a function, and act like a function, and are used in place of a function, why not just use an actual function in the first place?

There are various reasons for using functors instead of functions. They include:


Customized ordering in an STL¤ map

Functor classes are integral to the use of the Standard Template Library¤. For example, when declaring an STL¤ map object, we typically write something like:

map<string, int, less<string> > map_object;

In this declaration, the less<string> template parameter is actually the name of a (templated) functor class, which is defined as:

template <class T> struct less { bool operator()(const T& x, const T& y) const { return x < y; } };

This definition means that an object of type less<string> provides an overloaded operator(), which takes two string references and returns true if the first is less than the second.

The STL¤ map class uses an object of this type (as specified by its third template parameter) to sort its map keys. The useful thing about such an arrangement is that we can specialize the map class¤ with a parameter specifying any class with a overloaded operator() and thus get other kinds of sorts.

For example, defining and using the following class (see source code listing: MapOrder.C):

class Reversed { public: bool operator()(string s1, string s2) const { return s1 > s2; } };

map<string, int, Reversed> rev_map;

means that rev_map will store its keys in reverse alphabetical order (because the "less than" operator it will use to sort them is reversed). Alternatively, the class:

class Longest { public: bool operator()(string s1, string s2) const { return s1.length() > s2.length() || s1.length() == s2.length() && s1 < s2; } };

map<string, int, Longest> long_map;

means that long_map will store its keys such that longer keys appear first (whilst keys of the same length stay in alphabetical order).

This approach to making comparison operations modular (that is, decoupling them from the actual code that uses them) is a powerful generic¤ programming technique that is widely used in the Standard Template Library¤. For example, the STL¤ sort function is declared:

template <class Iterator, class Compare> void sort(Iterator first, Iterator last, Compare comp);

This means that any "random access" STL¤ container can be sorted using any suitable comparison functor. For example, suppose we had a vector of strings which had to be sorted so that all strings are grouped alphabetically by their first letter (regardless of its case), but otherwise retain their existing order within each letter grouping. We can accomplish this moderately challenging task with a surprisingly simple functor and the standard STL¤ sort function:

void sortByFirstLetter(vector<string>& strvector) { struct GroupFirstLetter { bool operator()(const string& a, const string& b) const { return tolower(a[0]) <= tolower(b[0]); } }

sort(strvector.begin(), strvector.end(), GroupFirstLetter()); }

Note that the use of a functor has two important benefits here:

See also: Exercise 1


Unifying the WhileReceptor and UntilReceptor classes

This idea of using functor classes to "insert" fragments of code into templated classes is used widely in the Standard Template Library¤, but has many other applications. For example, in "Topic 18: Receptors" it was pointed out that the WhileReceptor and UntilReceptor classes could be combined by templating them on an appropriate functor. This section describes how that goal may be achieved (you may find it useful to revise "Topic 18: Receptors" before studying this section).

We proceed by observing that the actual input loops for the WhileReceptor and UntilReceptor classes (that is, their respective ReceiveFrom() member functions) are identical except for a single comparison operator (at the commented lines):

istream& WhileReceptor::ReceiveFrom(istream& is) { if (!is) { return is; } myValue = ""; char nextChar; while (1) { if (is.peek() == EOF) { if (myValue.length()==0) {is.get(nextChar);} break; } if (!is.get(nextChar)) { break; } /* HERE */ if (myCharSet.find(nextChar) == string::npos) { is.putback(nextChar); break; } myValue += nextChar; if (myMaxCount > 0 && myValue.length() >= myMaxCount) { break; } } return is; }

istream& UntilReceptor::ReceiveFrom(istream& is) { if (!is) { return is; } myValue = ""; char nextChar; while (1) { if (is.peek() == EOF) { if (myValue.length()==0) {is.get(nextChar);} break; } if (!is.get(nextChar)) { break; } /* HERE */ if (myCharSet.find(nextChar) != string::npos) { is.putback(nextChar); break; } myValue += nextChar; if (myMaxCount > 0 && myValue.length() >= myMaxCount) { break; } } return is; }

If this uniquely distinct line were to be replaced by a call to a functor (specified as a template argument), then the two classes could be implemented as a single template and specialized appropriately.

That template class (CharSetReceptor) is declared as follows (see source code listing: WhileUntilReceptor.h):

template <class Functor> class CharSetReceptor : public Receptor { public: CharSetReceptor(string charSet, int maxCount = 0);

protected: virtual istream& ReceiveFrom(istream& is); };

Note that, apart from being templated on a functor type, CharSetReceptor is identical in declaration to both the WhileReceptor and UntilReceptor classes.

The constructor, like that of all other derived Receptor types, simply passes its arguments to the base class constructor:

template <class Functor> CharSetReceptor<Functor>::CharSetReceptor(string charSet, int maxCount) : Receptor(charSet, maxCount) {}

whilst the ReceiveFrom() member function is identical to those of the WhileReceptor and UntilReceptor classes, except that it uses a static functor object (compare) to perform the embedded comparison:

template <class Functor> istream& CharSetReceptor<Functor>::ReceiveFrom(istream& is) { static Functor compare; if (!is) { return is; }

myValue = ""; char nextChar; while (1) { if (is.peek() == EOF) { if (myValue.length()==0) { is.get(nextChar); } break; } if (!is.get(nextChar)) { break; }

if (compare(myCharSet.find(nextChar),string::npos)) { is.putback(nextChar); break; } else { myValue += nextChar; }

if (myMaxCount > 0 && myValue.length() >= myMaxCount) { break; } } return is; } }

Having thus decoupled the comparison from the input loop, we can create suitable functors to implement the required equality and inequality tests:

class Equality { public: bool operator()(string::size_type a, string::size_type b) { return a==b; } };

class Inequality { public: bool operator()(string::size_type a, string::size_type b) { return a!=b; } };

With these definitions, class WhileReceptor can be replaced with CharSetReceptor<Equality>, and class UntilReceptor with CharSetReceptor<Inequality>. Better still, since these new class names are somewhat obscure, we could reinstate the original names using a typedef:

typedef CharSetReceptor<Equality> WhileReceptor; typedef CharSetReceptor<Inequality> UntilReceptor;

See also: Exercise 2

See also: Exercise 3

See also: Exercise 4


A simple encoder functor

Yet another use for functor objects is to simplify and enhance the encapsulation¤ of state information within a function. Typically, if a function needs some static state (that is, one or more variables whose values persist between calls to the function), a local static variable is used:

string encode(string s) { static char state = 'a'; for (int i=0; i<s.length(); i++) { state = s[i] = (state + s[i]) & 0x7F; } return s; }

The above function implements a simple encryption algorithm for character strings. Each character in the string passed to the function is encoded by adding that character to the (cumulative) state information stored in the local static state variable (and then clipping the result to seven bits).

Hence calling encode("the quick brown fox") results in the string "U="B3(t_aSB9'G-", because

        ('t'+'a') & 0x7F  --->  'U'
        ('h'+'U') & 0x7F  --->  '='
        ('e'+'=') & 0x7F  --->  '"'
        (' '+'"') & 0x7F  --->  'B'
        ('q'+'B') & 0x7F  --->  '3'

et cetera.

This approach is well-structured and modular in its encapsulation¤ of the function's state information, but suffers from two main problems:

In order to overcome these problems, we need to store the function's internal state in some other form, typically as private data in an object (which could still be called encode). If that object is set up as a functor, we can still write encode("the quick brown fox"), but with the added benefit that other member functions of the encode object can allow us to reset the internal state if necessary.

The required functor class looks like this: (see source code listing: Encoder.C):

class EncodeFunctor { public: EncodeFunctor(char state);

void Reset(char state);

string operator()(string s);

private: char myState; };

The constructor simply initializes the functors internal state:

EncodeFunctor::EncodeFunctor(char state) : myState(state) {}

whilst the Reset() member function provides "write-only" access to that state, allowing us to reset it at any point:

void Reset(char state) { myState = state; }

The overloaded operator() member function looks almost exactly like the original encode() function (as it should, since they implement exactly the same algorithm). The only difference between the two is that EncodeFunctor::operator() does not need to declare a local static state variable, since it has access to the functor object's myState data member, which fulfils the same purpose:

EncodeFunctor(char state) string operator()(string s) { for (int i=0; i<s.length(); i++) { myState = s[i] = (myState + s[i]) & 0x7F; } return s; }

Hence, instead of storing state in a static variable within the function itself, the encoding state is stored in a data member of the functor object. The most obvious advantage is that we now have controlled access to the state information, so that it can be reset whenever necessary:

EncodeFunctor encode('a');

cout << encode("the quick brown fox") << endl;

encode.Reset('a');

cout << encode("jumps over the lazy dog") << endl;

Better still we can set up more than one EncodeFunctor object and do parallel or nested encoding:

EncodeFunctor encode1('a'); EncodeFunctor encode2('b'); EncodeFunctor encode3('a'); EncodeFunctor encode4('b');

while (cin >> str) { cout << encode1(str) << endl; cout << encode2(str) << endl; cout << encode3(encode4(str)) << endl; }

See also: Exercise 5


Exercises

There are 5 exercises associated with this topic.

 


This material is part of the GCO4020/CSC428 - Advanced Object Oriented Techniques In C++ course.
Copyright © Damian Conway, 1997. All rights reserved.

Last updated: Fri Feb 18 11:18:58 2000