GCO4020/CSC428 - Advanced Object Oriented Techniques In C++
Week 11
This topic describes a range of related techniques which centre around the use of objects to encapsulate functions.
map
WhileReceptor and UntilReceptor classes
A "functor" is an object which acts like (and is used in place of) a function
or operator. Typically, that means that such objects have one or more
overloaded operator() member functions, and so can be "called" just like a
regular C++ function.
The obvious question is: if they are called like a function, and act like a function, and are used in place of a function, why not just use an actual function in the first place?
There are various reasons for using functors instead of functions. They include:
map
Functor classes are integral to the use of the Standard Template Library¤.
For example, when declaring an STL¤ map object, we typically write
something like:
map<string, int, less<string> > map_object;
In this declaration, the less<string> template parameter is actually the
name of a (templated) functor class, which is defined as:
template <class T> struct less { bool operator()(const T& x, const T& y) const { return x < y; } };
This definition means that an object of type less<string>
provides an overloaded operator(), which takes two string
references and returns true if the first is less than the second.
The STL¤ map class uses an object of this type (as specified by
its third template parameter) to sort its map keys. The useful thing
about such an arrangement is that we can specialize the map class¤ with
a parameter specifying any class with a overloaded
operator() and thus get other kinds of sorts.
For example, defining and using the following class (see source code listing: MapOrder.C):
class Reversed { public: bool operator()(string s1, string s2) const { return s1 > s2; } };
map<string, int, Reversed> rev_map;
means that rev_map will store its keys in reverse alphabetical
order (because the "less than" operator it will use to sort them is
reversed). Alternatively, the class:
class Longest { public: bool operator()(string s1, string s2) const { return s1.length() > s2.length() || s1.length() == s2.length() && s1 < s2; } };
map<string, int, Longest> long_map;
means that long_map will store its keys such that longer keys
appear first (whilst keys of the same length stay in alphabetical
order).
This approach to making comparison operations modular (that is,
decoupling them from the actual code that uses them) is a powerful
generic¤ programming technique that is widely used in the Standard
Template Library¤. For example, the STL¤ sort function is
declared:
template <class Iterator, class Compare> void sort(Iterator first, Iterator last, Compare comp);
This means that any "random access" STL¤ container can be sorted using
any suitable comparison functor. For example, suppose we had a vector
of strings which had to be sorted so that all strings are grouped
alphabetically by their first letter (regardless of its case), but
otherwise retain their existing order within each letter grouping. We
can accomplish this moderately challenging task with a surprisingly
simple functor and the standard STL¤ sort function:
void sortByFirstLetter(vector<string>& strvector) { struct GroupFirstLetter { bool operator()(const string& a, const string& b) const { return tolower(a[0]) <= tolower(b[0]); } }
sort(strvector.begin(), strvector.end(), GroupFirstLetter()); }
Note that the use of a functor has two important benefits here:
See also: Exercise 1
WhileReceptor and UntilReceptor classes
This idea of using functor classes to "insert" fragments of code into
templated classes is used widely in the Standard Template Library¤, but
has many other applications. For example, in "Topic 18: Receptors" it
was pointed out that the WhileReceptor and UntilReceptor
classes could be combined by templating them on an appropriate
functor. This section describes how that goal may be achieved (you may
find it useful to revise "Topic 18: Receptors" before studying this
section).
We proceed by observing that the actual input loops for the
WhileReceptor and UntilReceptor classes (that is, their
respective ReceiveFrom() member functions) are identical except
for a single comparison operator (at the commented lines):
istream& WhileReceptor::ReceiveFrom(istream& is) { if (!is) { return is; } myValue = ""; char nextChar; while (1) { if (is.peek() == EOF) { if (myValue.length()==0) {is.get(nextChar);} break; } if (!is.get(nextChar)) { break; } /* HERE */ if (myCharSet.find(nextChar) == string::npos) { is.putback(nextChar); break; } myValue += nextChar; if (myMaxCount > 0 && myValue.length() >= myMaxCount) { break; } } return is; }
istream& UntilReceptor::ReceiveFrom(istream& is) { if (!is) { return is; } myValue = ""; char nextChar; while (1) { if (is.peek() == EOF) { if (myValue.length()==0) {is.get(nextChar);} break; } if (!is.get(nextChar)) { break; } /* HERE */ if (myCharSet.find(nextChar) != string::npos) { is.putback(nextChar); break; } myValue += nextChar; if (myMaxCount > 0 && myValue.length() >= myMaxCount) { break; } } return is; }
If this uniquely distinct line were to be replaced by a call to a functor (specified as a template argument), then the two classes could be implemented as a single template and specialized appropriately.
That template class (CharSetReceptor) is declared as follows
(see source code listing: WhileUntilReceptor.h):
template <class Functor> class CharSetReceptor : public Receptor { public: CharSetReceptor(string charSet, int maxCount = 0);
protected: virtual istream& ReceiveFrom(istream& is); };
Note that, apart from being templated on a functor type,
CharSetReceptor is identical in declaration to both the
WhileReceptor and UntilReceptor classes.
The constructor, like that of all other derived Receptor types, simply
passes its arguments to the base class constructor:
template <class Functor> CharSetReceptor<Functor>::CharSetReceptor(string charSet, int maxCount) : Receptor(charSet, maxCount) {}
whilst the ReceiveFrom() member function is identical to those of the WhileReceptor and UntilReceptor
classes, except that it uses a static functor object (compare) to
perform the embedded comparison:
template <class Functor> istream& CharSetReceptor<Functor>::ReceiveFrom(istream& is) { static Functor compare; if (!is) { return is; }
myValue = ""; char nextChar; while (1) { if (is.peek() == EOF) { if (myValue.length()==0) { is.get(nextChar); } break; } if (!is.get(nextChar)) { break; }
if (compare(myCharSet.find(nextChar),string::npos)) { is.putback(nextChar); break; } else { myValue += nextChar; }
if (myMaxCount > 0 && myValue.length() >= myMaxCount) { break; } } return is; } }
Having thus decoupled the comparison from the input loop, we can create suitable functors to implement the required equality and inequality tests:
class Equality { public: bool operator()(string::size_type a, string::size_type b) { return a==b; } };
class Inequality { public: bool operator()(string::size_type a, string::size_type b) { return a!=b; } };
With these definitions, class WhileReceptor can be replaced with
CharSetReceptor<Equality>, and class UntilReceptor with
CharSetReceptor<Inequality>. Better still, since these new class
names are somewhat obscure, we could reinstate the original names
using a typedef:
typedef CharSetReceptor<Equality> WhileReceptor; typedef CharSetReceptor<Inequality> UntilReceptor;
See also: Exercise 2
See also: Exercise 3
See also: Exercise 4
Yet another use for functor objects is to simplify and enhance the encapsulation¤ of state information within a function. Typically, if a function needs some static state (that is, one or more variables whose values persist between calls to the function), a local static variable is used:
string encode(string s) { static char state = 'a'; for (int i=0; i<s.length(); i++) { state = s[i] = (state + s[i]) & 0x7F; } return s; }
The above function implements a simple encryption algorithm for character
strings. Each character in the string passed to the function is encoded
by adding that character to the (cumulative) state information stored in the
local static state variable (and then clipping the result to seven bits).
Hence calling encode("the quick brown fox") results in the string
"U="B3(t_aSB9'G-", because
('t'+'a') & 0x7F ---> 'U'
('h'+'U') & 0x7F ---> '='
('e'+'=') & 0x7F ---> '"'
(' '+'"') & 0x7F ---> 'B'
('q'+'B') & 0x7F ---> '3'
et cetera.
This approach is well-structured and modular in its encapsulation¤ of the function's state information, but suffers from two main problems:
encode() function.
encode() to consistently
encode multiple strings, since the encoding of the
one string changes the initial state used to
encode the next.
encode). If that object is
set up as a functor, we can still write
encode("the quick brown fox"), but with the added benefit that
other member functions of the encode object can allow us to reset
the internal state if necessary.
The required functor class looks like this: (see source code listing: Encoder.C):
class EncodeFunctor { public: EncodeFunctor(char state);
void Reset(char state);
string operator()(string s);
private: char myState; };
The constructor simply initializes the functors internal state:
EncodeFunctor::EncodeFunctor(char state) : myState(state) {}
whilst the Reset() member function provides "write-only" access to that state, allowing us to reset it at any point:
void Reset(char state) { myState = state; }
The overloaded operator() member function looks almost exactly
like the original encode() function
(as it should, since they implement exactly the same algorithm). The
only difference between the two is that EncodeFunctor::operator() does
not need to declare a local static state variable, since it has access
to the functor object's myState data member, which fulfils the
same purpose:
EncodeFunctor(char state) string operator()(string s) { for (int i=0; i<s.length(); i++) { myState = s[i] = (myState + s[i]) & 0x7F; } return s; }
Hence, instead of storing state in a static variable within the function itself, the encoding state is stored in a data member of the functor object. The most obvious advantage is that we now have controlled access to the state information, so that it can be reset whenever necessary:
EncodeFunctor encode('a');
cout << encode("the quick brown fox") << endl;
encode.Reset('a');
cout << encode("jumps over the lazy dog") << endl;
Better still we can set up more than one EncodeFunctor object and
do parallel or nested encoding:
EncodeFunctor encode1('a'); EncodeFunctor encode2('b'); EncodeFunctor encode3('a'); EncodeFunctor encode4('b');
while (cin >> str) { cout << encode1(str) << endl; cout << encode2(str) << endl; cout << encode3(encode4(str)) << endl; }
See also: Exercise 5
Last updated: Fri Feb 18 11:18:58 2000