Object Encapsulation Made Easy

Damian Conway

School of Computer Science and Software Engineering
Monash University
Clayton 3168, Australia

damian@csse.monash.edu.au
http://www.csse.monash.edu.au/~damian

Abstract

Encapsulation is one of the cornerstones of object orientation, but it's the area in which Perl's support for object-oriented programming is weakest. This paper reviews two existing approaches to implementing encapsulated objects -- via closures and scalars -- and then describes a new module that is simpler to use and more powerful than either.

The problem

In Perl, objects are just variables that have been associated with a particular package. Typically they're blessed hashes, or arrays, or scalars; occasionally they're darker mysteries, like typeglobs or closures. And because they are usually just standard variables, the attribute values they store are freely accessible everywhere in a program.

So, even if the object has accessor methods to control how the object's attributes are manipulated:

$obj->set_name("ob1");
print $obj->get_name();

it's still possible to access the data directly:

$obj->{_name} = "ob1";
print $obj->{_name};

But if the get_name and set_name methods do anything other than simply retrieve and set the underlying hash entry (for example, checking the assigned value's validity, or logging retrievals), then directly accessing the data in this way may introduce subtle bugs into the program.

In practice, this lack of a built-in encapsulation mechanism rarely seems to be a problem in Perl. Most object-oriented Perl programmers use hashes as the basis of their objects, and get by quite happily with the principle of "encapsulation by good manners". The lack of protection for attribute values doesn't matter because users of a class either respect the official interface of its objects (i.e. their methods), or they're smart enough to get away with poking around inside an object without breaking anything.

The only problem is that this culturally enforced encapsulation doesn't scale very well. It's fine for a few hundred lines of code written by a single programmer, but is less successful when the code is tens of thousands of lines long and developed by a group of people. Even if the entire team can be trusted to maintain sufficient programming discipline and to consistently respect the notional encapsulation of attributes (a dubious proposition), accidents and mistakes will happen, especially in rarely used parts of the system.

Moreover, deliberate decisions to circumvent the conventions of encapsulation are rarely documented adequately, leading to problems much later in the development cycle. For example, consider a notionally "private" attribute of an object, which for efficiency reasons is accessed directly in an obscure part of a large system. If the implementation of the object's class changes, that attribute may cease to exist. In a more static language, this would cause an error message to be generated when next some external code attempts to access the (now non-existent) attribute. However, Perl's autovivification of hash entries will silently "recreate" the former attribute whenever it's accessed. The direct access operation proceeds, but now it retrieves or modifies a "phantom" attribute. Bugs such as this can be painfully difficult to diagnose and track down, especially if the original programmer has moved on by the time the problem is discovered.
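The "phantom" attribute scenario can be reproduced with a few lines of core Perl. The sketch below is illustrative only: the Widget class, its '_title' attribute, and the stale '_name' access are hypothetical names, chosen to mimic a class whose attribute was renamed after some client code had already started accessing it directly.

```perl
use strict;
use warnings;

# Hypothetical class: a later version renamed the '_name' attribute to '_title'.
package Widget;

sub new {
    my ($class, $title) = @_;
    return bless { _title => $title }, $class;
}

sub get_title { return $_[0]{_title} }

package main;

my $obj = Widget->new("ob1");

# Old client code, written against the earlier version, still pokes at '_name'.
# No error or warning is raised: the assignment simply creates the entry.
$obj->{_name} = "ob2";

my $has_phantom = exists $obj->{_name};   # a "phantom" attribute now exists
my $title       = $obj->get_title();      # the real attribute is untouched

print "phantom _name exists\n" if $has_phantom;
print "title is still '$title'\n";
```

The assignment succeeds silently, yet the object's real state never changes, which is exactly the kind of mismatch that is hard to trace later.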

Existing Perl encapsulation techniques

Encapsulation via closures

Tom Christiansen describes a technique[1] adapted from functional programming, which uses a blessed anonymous subroutine as the object, and an (otherwise inaccessible) lexical to store that object's attributes.

In this approach, the class's constructor creates a lexical hash (say, my %data) and initializes it with the appropriate attribute values. It then creates a new anonymous subroutine that acts as a closure, preserving access to the lexical %data variable, even after the constructor finishes. Finally the constructor blesses the anonymous subroutine into the class and returns a reference to it. In other words, each object of the class is a subroutine that is preserving an otherwise inaccessible lexical variable. Figure 1 illustrates the arrangement.
 

[Figure showing a scalar referring to a subroutine which has access to a lexical hash]
Figure 1: Structure of a closure-based object

The key to making this unusual set-up work is the behaviour of the anonymous subroutine. Typically, it takes up to two arguments: a string indicating which attribute is to be accessed, and an optional value to be assigned to that attribute. The subroutine then analyses the arguments and determines whether the access is permitted.

For example, if no second argument is specified, the anonymous subroutine might just return the value of the attribute specified by the first argument. If a second argument is specified (indicating a "set" operation), the subroutine might check whether the attribute is modifiable, perhaps by consulting another lexical hash (say, %public). If the attribute is not universally accessible, then the subroutine might check whether the request came from the right package (i.e. see who caller is) before proceeding.

Once that functionality is in place, each of the object's accessor methods just calls the corresponding anonymous subroutine. The accessor passes the subroutine its own name, and any "new value" argument, whereupon the subroutine decides what to do. Every accessor is therefore structurally identical, so it's easiest to implement them all using a single AUTOLOAD method.

Putting all those components together produces a class declaration like this:

package Data;      # Personal data
$VERSION = 1.00;

my %public =       # Access info
        ( name=>1, age=>1, phone=>1 );

sub new {
        my ($class, %data) = @_;
        my $self = sub {
                my ($attr, $newval) = @_;

                # Enforce the encapsulation
                die "no such attribute: $attr" unless exists $data{$attr};
                die "inaccessible: $attr" unless $public{$attr} || caller eq $class;

                # Provide the access
                $data{$attr}=$newval if @_ > 1;
                return $data{$attr};
        };
        bless $self, $class;
}

sub AUTOLOAD {
        $AUTOLOAD =~ s/.*:://;
        # Object destruction also arrives here; it is not an attribute access
        return if $AUTOLOAD eq 'DESTROY';
        return shift()->($AUTOLOAD,@_);
}

The result is that each Data object is a blessed subroutine, and has the only remaining access to the lexical %data. It uses that hash as its own private storage area, getting or setting entries in %data.

The next time Data::new is invoked, a new (and entirely distinct) lexical hash—also called %data—will be created within the constructor. Then a new (and entirely distinct) anonymous subroutine will be created, blessed, and returned. That subroutine will subsequently have the only access to the new %data hash. In this way, every call to Data::new creates a hash that's "guarded" by its own personal subroutine.

Oddly enough, the resulting encapsulation is far stronger than that provided by most other object-oriented languages. Not even the methods of its own class have direct access to an object's data. Instead, they must request access via the encapsulating subroutine.
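The following self-contained sketch illustrates two of the points above: each constructor call captures its own lexical hash, and the data cannot be reached by treating the object as a hash. The Counter class and its count accessor are hypothetical, and an explicit accessor replaces AUTOLOAD purely to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical minimal closure-based class, for illustration only.
package Counter;

sub new {
    my ($class, %data) = @_;          # a fresh %data for every call
    my $self = sub {
        my ($attr, $newval) = @_;
        die "no such attribute: $attr\n" unless exists $data{$attr};
        $data{$attr} = $newval if @_ > 1;
        return $data{$attr};
    };
    return bless $self, $class;
}

# Explicit accessor instead of AUTOLOAD, to keep the sketch short.
sub count { my $self = shift; return $self->('count', @_) }

package main;

my $first  = Counter->new(count => 1);
my $second = Counter->new(count => 10);

$first->count(2);                     # changes only $first's %data

my @counts = ($first->count, $second->count);

# Treating the object as a hash fails: it is a code reference.
my $direct_ok = eval { my $peek = $first->{count}; 1 };
my $error     = $@;

print "counts: @counts\n";
print "direct access failed: $error" unless $direct_ok;
```

Updating the first object leaves the second untouched, and the attempted hash dereference raises an exception rather than exposing the stored value.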

Encapsulation via scalars

An equally secure—but less well-known—approach to encapsulation is the flyweight pattern[2,3], which uses blessed scalars as objects. Each scalar stores the index of a particular element of a lexical array (say, @data). That array contains a series of references to anonymous hashes, each of which provides storage for the attributes of a single object. Figure 2 illustrates this arrangement.
 
[Figure showing a scalar referring to another scalar which holds the address of an array whose first element refers to a hash]
Figure 2: Structure of a flyweight object

The various accessors for the class use the index stored in an object to access the corresponding element in the @data array, where the object's attributes are actually stored. But @data is a lexical variable and is declared inside a block, so only those subroutines that were also defined in the same block have access to it. And, of course, the only subroutines that will be defined in the block are the constructor and generic accessor (i.e. AUTOLOAD) for the class.

The necessary code looks like this:

package Data;
$VERSION = 2.00;
{
        my %public = ( name=>1, age=>1, phone=>1 );

        my @data;

        sub new {
                my ($class, %data) = @_;        
                # Add new hash to secret array
                push @data, \%data;

                # Determine the hash's index
                # and bless it as the object
                my $index = $#data;
                bless \$index, $class;
        }

        sub AUTOLOAD {
                my ($self, $newval) = @_;

                # Determine attribute name
                $AUTOLOAD =~ /.*::(.*)/;
                my $attr = $1;

                # Object destruction also arrives here; ignore it
                return if $attr eq 'DESTROY';

                # Determine the index where the
                # object's attrs are stored
                my $index = ${$self};

                # Enforce the encapsulation
                die "no such attribute" unless exists $data[$index]{$attr};
                die "non-public attribute" unless $public{$attr};

                # Provide the access
                $data[$index]{$attr} = $newval if @_ > 1;
                return $data[$index]{$attr};
        }
}

So even though the users of the class have the keys (i.e. the index stored in each blessed scalar), lexical scope prevents them from reaching the lock (i.e. @data). This provides the desired encapsulation.
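The "keys without the lock" arrangement can be seen in a small self-contained sketch. The Tag class and its name accessor are hypothetical stand-ins for the Data class above, reduced to a single attribute.

```perl
use strict;
use warnings;

package Tag;    # hypothetical flyweight class

{
    my @data;   # visible only inside this block

    sub new {
        my ($class, %attrs) = @_;
        push @data, { %attrs };
        my $index = $#data;
        return bless \$index, $class;
    }

    sub name {
        my $self = shift;
        $data[$$self]{name} = shift if @_;
        return $data[$$self]{name};
    }
}

package main;

my $a_tag = Tag->new(name => "alpha");
my $b_tag = Tag->new(name => "beta");

my @indices = ($$a_tag, $$b_tag);             # the "keys" held by each object
my @names   = ($a_tag->name, $b_tag->name);

# Outside the block, @data is not in scope. Under "use strict" an attempt
# to refer to it does not even compile, so the string eval fails.
my $reachable = eval q{ my $x = $data[0]; 1 };

print "indices: @indices, names: @names\n";
print "\@data is out of reach\n" unless $reachable;
```

Client code can read the index stored in each object, but it has no name by which to reach the array that the index refers to.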

The new problem

Most object-oriented languages provide encapsulation that comes in varying strengths. For example, in C++ and Java, object and class data members can be declared as "public", "protected", or "private". "Public" attributes are available everywhere, "protected" attributes are restricted to a particular class hierarchy, and "private" attributes are only visible to the current class. Likewise, attributes in Eiffel can be given an export list to control which other classes can access them.

In contrast, the encapsulation techniques described above are inherently "all-or-nothing" propositions. Every attribute is completely encapsulated from the rest of the program. In C++/Java terms, they're all "private"; in Eiffel terms, none of them is "exported". It's up to the accessor subroutines to provide the necessary logic (i.e. die unless $public{$attr} || caller eq $class) to grant different levels of access. And, of course, this logic has to be manually coded in each encapsulating closure.

A more significant drawback is that both techniques are moderately hard to understand and to code correctly—particularly by beginners, who probably benefit most from proper encapsulation. Both techniques are based on the closure properties of Perl subroutines, which are not well understood by many programmers. Both are most efficiently implemented using relatively obscure code, which reduces the maintainability of the resulting classes.

All in all, the costs of building encapsulated classes seem to outweigh the benefits. It's hardly surprising that, as elegant as they are, such classes are used so rarely. What's really needed is a mechanism that will allow objects to be implemented in the usual way (i.e. by blessing hashes) and yet enable the implementer to designate some of the attributes of the resulting objects as "protected" or "private".

A limited-access hash

The Tie::SecureHash module[4] does just that. Hashes that are tied to it continue to provide most of the behaviours of a normal hash, but also allow their keys to be fully qualified—as if they were independent package variables. The module then uses these key qualifiers to restrict the accessibility of the corresponding entries in a tied hash.

A Tie::SecureHash object (or securehash) can be created by explicitly tie'ing an existing hash:

my %securehash;
tie %securehash, 'Tie::SecureHash';

or by calling the module's constructor method:

my $securehash_ref = Tie::SecureHash->new();

The constructor version returns a reference to an anonymous hash that has been tied to the Tie::SecureHash package, and which has also been blessed into the Tie::SecureHash class.

Either way, a securehash acts like a regular hash and provides the usual hash operations, including keys, values, each, and exists.

The module provides object methods corresponding to each of these operations: $securehash_ref->values(), $securehash_ref->each(), $securehash_ref->exists($key), etc.

Securehashes also support deletion of individual entries and direct assignment, with some limitations.

Building objects from securehashes

When using a securehash as the basis of an object (i.e. blessing it in some class's constructor), it's tedious to have to create the hash, tie it, and then bless it as well:

sub MyClass::new {
        my $class = ref($_[0]) || $_[0];
        tie my %hash, 'Tie::SecureHash';
        my $self = bless \%hash, $class;

        # initialization of attrs here

        return $self;
}

Because securehashes are principally intended as object implementations, the Tie::SecureHash module makes the process easier by providing the method Tie::SecureHash::new. When called with a single argument, this method creates a new securehash (i.e. ties an ordinary anonymous hash to the Tie::SecureHash package) and then blesses it into the class named by that argument (or into the same class as the argument, if it's an object reference). That simplifies MyClass::new to this:

sub MyClass::new {
        my $self = Tie::SecureHash->new($_[0]);

        # initialization of attrs here

        return $self;
}

Declaring securehash entries

Both versions of MyClass::new shown above leave space for initialization. That's because the various entries of a securehash have to be explicitly "declared" before they can be used. In other words, securehash entries aren't autovivifying.

This may seem inconvenient at first, but it actually saves an inordinate amount of time and effort tracking down "spelling bugs" like this:

package Disk::Recovery;

sub new {
        my ($class, @files) = @_;
        bless {
                _retrieved  => [ @files ],
                _attempts   => 0,
                _wierd_data => undef,
              }, $class;
}

sub report {
        my ($self) = @_;
        print "Made $self->{_attempts} attempts to recover:\n";
        print "\t$_\n" foreach (@{$self->{retreived}});          # misspelt: should be '_retrieved'
        print "Failed (weird data)\n" if $self->{_weird_data};   # constructor used '_wierd_data'
}

Unlike the regular hash in the above example, the entries of a securehash can't be accessed until they've been "created". A specific entry is created by referring to it using a qualified key, which is a key string consisting of any characters except ':', preceded by a standard Perl package qualifier. Table 1 illustrates some typical qualified keys.
 
Qualified key         Key          Qualifier
'Class::key'          'key'        'Class::'
'Class::a key'        'a key'      'Class::'
'My::CD::_tracks'     '_tracks'    'My::CD::'
'Railway::_tracks'    '_tracks'    'Railway::'
'Crypt::__passwd'     '__passwd'   'Crypt::'
'main::key_berm'      'key_berm'   'main::'
'::key_berm'          'key_berm'   'main::'
Table 1: Some typical qualified keys

Each qualifier indicates the package that "owns" the key. Hence, the first two keys above are owned by class Class and the last two by the main package.

Qualified keys that have the same key but different qualifiers (for example, 'Railway::_tracks' and 'My::CD::_tracks') are treated as being distinct, even if they label two entries in the same securehash.
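The structure of a qualified key, as laid out in Table 1, can be captured with a single regular expression. The helper below is a hypothetical illustration (split_qualified_key is not part of Tie::SecureHash); it assumes only what the text states: the key is the trailing run of non-':' characters, the qualifier is everything before it, and a bare '::' qualifier stands for 'main::'.

```perl
use strict;
use warnings;

# Hypothetical helper: split a qualified key into (qualifier, key).
# Returns an empty list if the string carries no package qualifier.
sub split_qualified_key {
    my ($qualified) = @_;

    # The key is the run of non-':' characters at the end;
    # everything before it, ending in '::', is the qualifier.
    my ($qualifier, $key) = $qualified =~ /\A(.*::)([^:]+)\z/s
        or return;

    $qualifier = 'main::' if $qualifier eq '::';   # '::key' means 'main::key'
    return ($qualifier, $key);
}

my @cd    = split_qualified_key('My::CD::_tracks');
my @space = split_qualified_key('Class::a key');
my @main  = split_qualified_key('::key_berm');
my @bare  = split_qualified_key('key');

print "qualifier '$cd[0]', key '$cd[1]'\n";
print "qualifier '$main[0]', key '$main[1]'\n";
print "'key' has no qualifier\n" unless @bare;
```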

Typically, entries in a securehash are created by referring to their fully-qualified names at some point in a class's constructor:

sub MyClass::new {
        my $self = Tie::SecureHash->new($_[0]);

        $self->{'MyClass::attr1'}  = $_[1];
        $self->{'MyClass::_attr2'} = $_[2];
        $self->{'MyClass::__attr3'}= $_[3];

        return $self;
}

In this case, the entries with the keys "attr1", "_attr2", and "__attr3" are all "owned" by the class MyClass. For reasons that will be made clear in the next section, an entry must be declared within its owner's package. In practice, that means that the qualifier for any entry declaration will always be the name of the current package, as in the example above.

Key qualifiers are only required during the creation of entries (and occasionally to resolve ambiguities). After the declarations, they can usually be ignored:

sub MyClass::set_attr2 {
        my ($self, $newval) = @_;
        $self->{_attr2} = $newval if @_>1;
}

though using the fully qualified key is always acceptable:

sub MyClass::set_attr2 {
        my ($self, $newval) = @_;
        $self->{'MyClass::_attr2'} = $newval if @_>1;
}

Easier initialization

It's annoying to have to repeat the same class name when declaring each attribute in the constructor, so Tie::SecureHash allows Tie::SecureHash::new to take extra parameters which declare attributes without individual qualifiers. Or rather, the qualifier for each attribute passed to new is assumed to be the class name that is passed as the first argument.

For example, the constructor for MyClass could also be written like this:

sub MyClass::new {
      my $self = Tie::SecureHash->new($_[0], attr1   => $_[1],
                                             _attr2  => $_[2],
                                             __attr3 => $_[3],
                                     );
      return $self;
}

This is the only way that entries can be declared without an explicit qualifier.

Access constraints

Securehashes use an extension of a common Perl custom (underscoring) to determine the accessibility of their various entries. In Perl, a leading underscore in the key of an entry suggests that the particular entry is "not for public use". Tie::SecureHash formalizes that idea by treating any entry whose key begins with a single underscore as being inaccessible outside its owner's class hierarchy. In other words, an underscored key indicates a "protected" attribute.

Tie::SecureHash treats keys that begin with two (or more) underscores even more carefully. The entries for such keys are only accessible from code in their owner's package and in the same file as they were originally declared. In other words, a double underscored key indicates a "private" and "pseudo-lexical" key.

The only other possibility is a key with no leading underscore. Predictably, no underscore indicates that an entry is "public" and universally accessible.
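The three access levels can be summarised in a short sketch. Both helpers below (access_level and may_access) are hypothetical and are not the module's implementation; they encode only the underscore convention as described here, and they deliberately omit the same-file condition on private keys.

```perl
use strict;
use warnings;

# Hypothetical classes used only to exercise the checks below.
{ package MyClass;        sub new { return bless {}, shift } }
{ package Derived::Class; our @ISA = ('MyClass'); }

# Hypothetical helper: classify an unqualified key by its leading underscores.
sub access_level {
    my ($key) = @_;
    return 'private'   if $key =~ /\A__/;   # two or more underscores
    return 'protected' if $key =~ /\A_/;    # exactly one underscore
    return 'public';
}

# Hypothetical check: may code in package $from use a key owned by $owner?
# The same-file requirement for private keys is omitted from this sketch.
sub may_access {
    my ($owner, $key, $from) = @_;
    my $level = access_level($key);
    return 1               if $level eq 'public';
    return $from eq $owner if $level eq 'private';
    return $from->isa($owner);              # protected: owner's hierarchy
}

my $public_from_main     = may_access('MyClass', 'attr1',   'main');
my $protected_from_child = may_access('MyClass', '_attr2',  'Derived::Class');
my $protected_from_main  = may_access('MyClass', '_attr2',  'main');
my $private_from_child   = may_access('MyClass', '__attr3', 'Derived::Class');

print "public from main: ",     ($public_from_main     ? "yes" : "no"), "\n";
print "protected from child: ", ($protected_from_child ? "yes" : "no"), "\n";
print "protected from main: ",  ($protected_from_main  ? "yes" : "no"), "\n";
print "private from child: ",   ($private_from_child   ? "yes" : "no"), "\n";
```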

This is reasonably consistent with existing Perl conventions about key naming, but the important difference is that securehashes enforce the convention at run-time. If a doubly-underscored key is accessed outside its owner's package or its declaration file, an exception is immediately thrown. The same thing happens if a singly-underscored key is accessed outside its native class hierarchy. For example:

package Derived::Class;
@ISA = qw( MyClass );

sub dump {
        my ($self) = @_;
        print $self->{attr1};    # okay
        print $self->{_attr2};   # okay
        print $self->{__attr3};  # error
}

The first print is okay because the lack of a leading underscore indicates that 'attr1' is a public attribute, accessible from any package. The second print is okay too because the single leading underscore indicates that '_attr2' is a protected attribute, accessible from any package in MyClass's hierarchy. But the last print tries to access an attribute with two leading underscores, causing the exception:

Private key 'MyClass::__attr3' of tied SecureHash
is inaccessible from package Derived::Class.

Likewise, an access attempt such as:

package main;
my $obj = MyClass->new();
print $obj->{_attr2};

would die with the message:

Protected key 'MyClass::_attr2' of tied SecureHash
is inaccessible from package main

(unless main inherits from MyClass, of course).

Access constraints also apply to the functions each, keys, values, and delete, when applied to securehashes. A key will only be iterated, listed, or deleted if it is accessible at the point where the operation is invoked.

This also has implications for direct assignment to a securehash. A statement such as:

%securehash = ();

is equivalent to a series of delete operations, and hence will only succeed if every key in the securehash is accessible from that point. If any key is inaccessible, an exception will be thrown (and the securehash will be unchanged).

Another difficulty with reassigning a securehash is that every new key being assigned must be appropriately qualified with the name of the current package. In other words, the standard securehash entry declaration rules still apply. For example:

package SomeClass;
%securehash = (
        attr1 => $val1,
        attr2 => $val2,
);

will throw an exception because the keys 'attr1' and 'attr2' don't exist in the newly-cleared %securehash. To successfully reinitialize the securehash, each new key requires a fully qualified name:

package SomeClass;
%securehash = (
        'SomeClass::attr1' => $val1,
        'SomeClass::attr2' => $val2,
);

Ambiguous keys in a securehash

The ability to access securehash entries by unqualified keys is an important convenience. It can also be a useful programming technique when using inheritance, since it provides "polymorphic" attributes (see below). But it creates problems under some circumstances.

The convenience aspect is obvious. Requiring that securehash keys always be fully qualified would flout the cardinal virtue of Laziness. No-one would want to use a securehash if they always had to write $self->{MyClass::__attr3}, instead of just $self->{__attr3}. In most cases, each attribute of an object will be uniquely named, so each securehash will contain only a single matching unqualified key. The qualifier would be redundant and annoying.

Inheritance, however, brings a difficulty known as the "data inheritance problem"[5]. When one class inherits from another, it's all too easy to accidentally reuse the name of a base class attribute in a derived class. For example:

package Settable;
$VERSION = 1.00;        #uses normal hashes

sub new {
        my ($class, $is_set) = @_;
        bless
                my $self = {_set => $is_set},
                $class;
}

sub set {
        my ($self) = @_;
        # access Settable's _set attr
        $self->{_set} = 1;
}

package Set;
@ISA = qw( Settable );

sub new {
        my ($class, %items) = @_;
        my $self = $class->SUPER::new();
        $self->{_set} = { %items };
        # Oops!
        return $self;
}

sub list {
        my ($self) = @_;
        print keys %{$self->{_set}};
        # Err...was that Set's '_set'
        # or Settable's '_set'?
}

The problem is that both Settable and Set want to use a '_set' entry, but Set objects have to share the same hash as their Settable base parts, and hence there can be only one such entry.
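To see the collision concretely, the following self-contained sketch condenses the two classes and then calls the inherited set method on a Set object. The class and attribute names follow the example above; the specific items and the assumption that list returns (rather than prints) the keys are illustrative choices.

```perl
use strict;
use warnings;

package Settable;
sub new { my ($class, $is_set) = @_; return bless { _set => $is_set }, $class }
sub set { my ($self) = @_; $self->{_set} = 1; return $self }

package Set;
our @ISA = ('Settable');

sub new {
    my ($class, %items) = @_;
    my $self = $class->SUPER::new();
    $self->{_set} = { %items };       # overwrites Settable's flag
    return $self;
}

sub list { my ($self) = @_; return sort keys %{ $self->{_set} } }

package main;

my $set    = Set->new(apple => 1, pear => 1);
my @before = $set->list;              # the items are listed as expected

$set->set;                            # inherited method stores 1 in '_set'

# The items are gone. Under "strict refs", list() now dies, because the
# value in '_set' is no longer a hash reference.
my $listed = eval { my @after = $set->list; 1 };

print "before: @before\n";
print "after set(): list() failed: $@" unless $listed;
```

A single call to an inherited method is enough to destroy the derived class's data, and nothing in either class signals the problem until list is next called.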

The use of qualified keys in a securehash solves the problem (in fact, it's the same solution as suggested in the Perl Cookbook):

package Settable;
$VERSION = 2.00;        #uses securehashes

sub new {
        my ($class, $set) = @_;
        my $self =
                Tie::SecureHash->new($class);
        $self->{'Settable::_set'} = $set;
        return $self;
}


sub set {
        my ($self) = @_;
        $self->{'Settable::_set'} = 1;
        # Definitely Settable's _set
}

package Set;
@ISA = qw( Settable );

sub new {
        my ($class, %items) = @_;
        my $self = $class->SUPER::new();
        $self->{'Set::_set'} = { %items };
        # Different key so no "collision"
        return $self;
}

sub list
{
        my ($self) = @_;
        print keys %{$self->{'Set::_set'}};
        # Definitely Set's _set
}

But securehashes are even smarter than that. Any qualifier/key combination that is unique creates an entry whose unqualified key is unique within its owner's namespace. So it's also possible to write:

package Settable;
$VERSION = 3.00;        #uses securehashes

sub new {
        my ($class, $set) = @_;
        my $self =
                Tie::SecureHash->new($class);
        $self->{'Settable::_set'} = $set;
        return $self;
}

sub set {
        my ($self) = @_;
        $self->{_set} = 1;
        # Definitely Settable's '_set' (!)
}

package Set;
@ISA = qw( Settable );

sub new {
        my ($class, %items) = @_;
        my $self = $class->SUPER::new();
        $self->{'Set::_set'} = { %items };
        # Different key so no "collision"
        return $self;
}

sub list
{
        my ($self) = @_;
        print keys %{$self->{_set}};
        # Definitely Set's _set (!)
}

The unqualified keys are unambiguous because the Tie::SecureHash module keeps track of where an access was requested, and works out which key was intended from that context. When Set::list accesses the '_set' key, it probably wants the entry for 'Set::_set', not 'Settable::_set'. The securehash is aware of the context of the access and returns the correct attribute.

Another way of looking at it is to think of securehash entries that are defined in a base class as being "hidden" by derived class entries of the same name (just like inherited attributes are in most other object-oriented languages). Of course, if the inherited entry is needed in a derived class method, it can still be accessed by fully qualifying it:

sub Set::list {
        my ($self) = @_;
        print keys %{$self->{_set}}
                if $self->{'Settable::_set'};
}

That's not to say that a securehash can always correctly guess the intended entry for an unqualified key. Consider the following two classes:

package Chemical;

sub new {
        my ($class, $chemname) = @_;
        Tie::SecureHash->new($class, name => $chemname);
}

package Medicine;
@ISA = qw( Chemical );

sub new {
        my ($class, $medname, $chemname) = @_;
        my $self = Chemical->new($class, $chemname);
        $self->{'Medicine::name'} = $medname;
        return $self;
}

Within the Chemical class, the unqualified public key 'name' will always be assumed to be referring to 'Chemical::name'. Similarly, inside any of Medicine's methods the same key is unambiguously resolved to 'Medicine::name'. But what about accesses from the main package? For example:

package main;
my $medicine =  Medicine->new("Dydroxifen","dihydrogen oxide");
print $medicine->{name};

Since the 'name' entry isn't being accessed from a method of either class, there's no way to decide which entry was intended. Tie::SecureHash resolves the ambiguity by immediately throwing an exception.

The solution is to explicitly qualify any ambiguous case:

print $medicine->{'Medicine::name'};

Problems of a similar type occur with protected keys as well, whenever a class inherits from two or more classes. If both classes use a protected attribute of the same name then, in a class that derives from both, it's impossible to tell which inherited attribute was intended:

package Dessert::Topping;

sub new { Tie::SecureHash->new($_[0], _shaken => 0) }

sub shake { $_[0]->{_shaken} = 1 }


package Floor::Wax;

sub new { Tie::SecureHash->new($_[0], _shaken => 0 ) }

sub shake { $_[0]->{_shaken}++ }


package Jiffy::Whip;
@ISA = qw(Dessert::Topping Floor::Wax);

sub shaken { $_[0]->{_shaken} }   # Dessert::Topping's '_shaken'
                                  # or Floor::Wax's '_shaken'?

Once again, since it can't decide which of the two attributes was intended, Tie::SecureHash simply throws an exception.
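The resolution behaviour described in this section can be approximated with a plain hash of qualified keys. The resolve_key function below is a hypothetical, heavily simplified sketch and should not be read as the module's algorithm: it ignores access levels and inheritance order, and assumes only that an entry owned by the accessing package wins, that a single match is unambiguous, and that anything else is an error.

```perl
use strict;
use warnings;

# Hypothetical, simplified resolver. %$entries maps qualified keys to values.
sub resolve_key {
    my ($entries, $key, $caller) = @_;
    return $key if $key =~ /::/;                       # already qualified

    # Prefer an entry owned by the package making the access.
    return "${caller}::$key" if exists $entries->{"${caller}::$key"};

    # Otherwise the unqualified key must match exactly one owner.
    my @matches = grep { /::\Q$key\E\z/ } sort keys %$entries;
    die "No such key '$key'\n"              if @matches == 0;
    die "Ambiguous key '$key' (@matches)\n" if @matches > 1;
    return $matches[0];
}

my %medicine = (
    'Chemical::name' => 'dihydrogen oxide',
    'Medicine::name' => 'Dydroxifen',
);

my $in_chemical = resolve_key(\%medicine, 'name', 'Chemical');
my $in_medicine = resolve_key(\%medicine, 'name', 'Medicine');
my $in_main     = eval { resolve_key(\%medicine, 'name', 'main') };
my $main_error  = $@;

print "from Chemical: $in_chemical\n";
print "from Medicine: $in_medicine\n";
print "from main: $main_error" unless defined $in_main;
```

Even in this reduced form, the access from main has no owner to prefer and two candidates to choose from, so raising an exception is the only safe outcome.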

Debugging a securehash

In a more complicated hierarchy than the ones shown above, ambiguities can be quite difficult to detect and defuse. The Tie::SecureHash module provides a method (named debug) that can be called to dump the contents of a securehash to STDERR. The debug method can be called on any securehash, regardless of the class into which it's been blessed, with an explicit method call:

sub Jiffy::Whip::shaken {
        my ($self) = @_;
        $self->Tie::SecureHash::debug();   # Find the source...
        return $self->{_shaken};           # ...of this problem:
}

Tie::SecureHash::debug reports the current location details (package, file, line and subroutine) and the key and value of each entry of the securehash, categorized by owner. More importantly, debug reports the accessibility of each entry at the point where it was called (either "accessible", "inaccessible", or "ambiguous") and explains why.

"Fast" securehashes

Securehashes provide an easy means of controlling the accessibility of object attributes on a per-attribute basis. Unfortunately, that ease and flexibility comes at a cost. For a start, accessing the entries of any kind of tied hash is significantly slower than for untied hashes, often taking 5 to 10 times as long per access. On top of that performance hit, securehashes have to perform some moderately expensive tests (involving the UNIVERSAL::isa subroutine) before they can grant access to an entry. These tests can double the cost again, so accesses to securehashes are often 10 to 20 times slower than accesses to an untied hash. That makes the use of securehashes impractical in most production code.
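The baseline cost of tie'ing can be measured with core modules alone. The sketch below uses Tie::StdHash (supplied by the core Tie::Hash module) as a stand-in for a tied hash that performs no access checks at all; it is not a measurement of Tie::SecureHash itself, and the iteration count is an arbitrary choice. The timings reported will vary with the machine and the Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use Tie::Hash;               # core module that provides Tie::StdHash

my %plain = (name => 'ob1');

tie my %tied, 'Tie::StdHash';
%tied = (name => 'ob1');

my ($from_plain, $from_tied);

# Perform each read 200_000 times and report the timings.
timethese(200_000, {
    untied => sub { $from_plain = $plain{name} },
    tied   => sub { $from_tied  = $tied{name}  },
});
```

Any additional checking performed on each access, such as the inheritance tests mentioned above, adds to the tied figure.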

Fortunately, production code doesn't actually need the security of encapsulation. That's because all that checking of access restrictions is only actually required when a piece of code incorrectly attempts to violate those restrictions. Since production code is always thoroughly tested (ahem!), such bugs will have been caught and eliminated, so the checks are redundant. In other words, if no one can ever break the law, you no longer need any police to enforce it.

Thus, the solution is to develop the application using Tie::SecureHash to enforce proper encapsulation, test it thoroughly to ensure that there are no improper accesses anywhere in the code, and then optimize the final code by converting every securehash to a normal hash.

Because a securehash's interface mimics the interface of a regular hash, converting from securehashes to the regular kind is surprisingly easy. It's not necessary to change any of the code that accesses a securehash, only the code that creates it. In fact, that's exactly what encapsulation is all about: hiding implementation details behind a standard interface so that client code doesn't have to worry when those details change.

Of course, in the typical large application where encapsulation is most useful, hunting for every situation where a securehash is created and then replacing it with a regular hash could still be time-consuming and error-prone. Fortunately, even that isn't necessary.

Tie::SecureHash provides a special "fast" mode, in which a call to Tie::SecureHash::new returns a reference to an ordinary hash, rather than to a securehash. Hence, in "fast" mode, there's no need to replace any code like:

$self = Tie::SecureHash->new($_[0]);

because it correctly adjusts its behaviour automatically.

Of course, that doesn't solve the problem of any "raw" tie-ing:

tie %$self, 'Tie::SecureHash';

but that's just another reason to use Tie::SecureHash::new instead. Indeed, in "fast" mode, Tie::SecureHash generates a warning whenever a raw tie such as this is used.

"Fast" mode is enabled by importing the entire module with an extra argument:

use Tie::SecureHash "fast";

"Strict" securehashes

This "develop-with-restrictions-then-run-without-them" approach works well, but there are two caveats: Tie::SecureHash::new must always be used to create securehashes, and unqualified keys can never be used to access them.

The need to use Tie::SecureHash::new was explained above: Tie::SecureHash::new knows about "fast" mode and can adjust for it, but the in-built tie function doesn't and can't.

The second caveat imposes a more significant restriction. One of the useful features of a securehash is that, once an entry has been declared with its full qualifier, any code can refer to it without the qualifier and expect the securehash to do the right thing in all unambiguous cases. However, when the securehash is replaced with a regular hash, that "do what I mean" intelligence disappears. That can lead to subtle bugs, because regular hashes autovivify and will happily create unrelated entries when both qualified and unqualified versions of a key are used.
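The failure mode is easy to demonstrate with an ordinary hash standing in for an object created in "fast" mode. The sketch below is illustrative; the key names are borrowed from the MyClass example, and the stored values are arbitrary.

```perl
use strict;
use warnings;

# A plain hash standing in for an object created in "fast" mode.
my $self = {};

$self->{'MyClass::_attr2'} = 'declared value';   # qualified, as in the constructor
$self->{_attr2}            = 'updated value';    # unqualified, as in an accessor

my @keys     = sort keys %$self;                 # two separate entries
my $original = $self->{'MyClass::_attr2'};       # never updated

print scalar(@keys), " entries: @keys\n";
print "qualified entry still holds '$original'\n";
```

With a securehash both accesses would refer to the same entry; with a regular hash the update lands in a second, unrelated entry and the original value is silently left stale.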

These two restrictions are not particularly onerous, but they can be difficult to apply consistently in a large application. To make conversion to "fast" mode easier, Tie::SecureHash offers another mode, called "strict". Like "fast" mode, this mode can be invoked by importing the module with the appropriate argument:

use Tie::SecureHash "strict";

In "strict" mode, securehashes control access in their normal way, except that they also produce warnings whenever a hash is explicitly tied to Tie::SecureHash, and whenever an unqualified key is used to access a securehash. Thus, code that uses securehashes and runs without warnings in "strict" mode is guaranteed to have the same behaviour in "fast" mode.

The formal access rules

The access rules for a securehash are designed to provide secure encapsulation with minimal inconvenience and maximal intuitiveness. However, to produce this appearance of intelligence, the formal access rules are quite complicated...

All entries

Public entries

Protected entries

Private entries

Conclusion

The Tie::SecureHash module provides a simple and effective way to implement hash-based objects that provide enforced encapsulation of their attributes. The use of key qualifiers also solves the problem of collisions between a class's attributes and those of its ancestors.

The module provides debugging facilities, and enables developers to generate safely encapsulated classes without any performance penalty, using the module's "strict" and "fast" options.

The module is available from the CPAN.

References

  1. Christiansen, T., perltoot, standard Perl documentation set.
  2. Gamma, E., et al., Design Patterns: Elements of Reusable Object-Oriented Software, Chapter 4, Addison-Wesley, 1995.
  3. Conway, D., Object Oriented Perl, Chapter 11, Manning Publications, 1999.
  4. http://www.perl.com/CPAN/authors/id/DCONWAY/
  5. Christiansen, T. & Torkington, N., Perl Cookbook, Recipe 13.12, O'Reilly & Associates, 1997.