C++ Report "Pattern Hatching" column for February '96 issue
John Vlissides

Title: Protection, Part I: The Hollywood Principle

Last time I digressed from our on-going file system design to offer some advice on how to write design patterns. I don't want to make a habit of that, as I'd much rather show design patterns in action than engage in metadiscussions. Still, it's all too tempting to make patterns an end in themselves. If you ever find me doing it, some e-mail of reproof will be in order.

We've looked at patterns (Composite and Proxy) for defining the file system structure and a pattern (Visitor) for introducing new capabilities uninvasively, that is, by *adding* code rather than *changing* code. Herein lies another bromide of good object-oriented design: you maximize flexibility and maintainability when your system can be modified without touching existing code. If you can still say that after others have leveraged your software, then congratulations---you've delivered on much of the promise of object technology!

But I'm digressing again. Another major design issue in our file system has to do with security. There are at least two relevant subissues:

1. Protection from inadvertent and malicious corruption.
2. Maintaining integrity in the face of hardware and software errors.

I'll focus on the first of these subissues here, leaving the second to you as an exercise. (If you take that as a challenge and would like to herald your solution, let me know. I'd be privileged to share this forum with a pattern compatriot.)

I. SINGLE-USER PROTECTION

Anyone who's used a computer intensively has a horror story to tell about how he or she lost vital data through an unfortunate syntax error, a wayward mouse click, or just one of those late-night brain-faults. Deleting the wrong file at the right time is a common catastrophe. Another is inadvertent editing---changing a file that shouldn't be changed casually.
While a truly advanced file system would have an undo feature for recovering from such mishaps, prevention is often preferable to cure. Sadly, most file systems give you a different choice: prevention or regret.

For now I'll concentrate on protecting file system objects (i.e., nodes) from deletion and modification. I'll consider protection as it relates to the programming interface rather than the user interface. We needn't worry too much about that distinction, however, since our programming abstractions correspond closely to user-level abstractions. Also, I assume we're dealing with a single-user file system like you'd find on a classic, unnetworked personal computer (as opposed to a multiuser one, like Unix). That will keep things simple at the outset. I'll consider the implications of multiuser protection next time.

All elements of the file system (including files, directories, and symbolic links) adhere to the Node interface, which currently includes the following operations[footnote0]:

    const char* GetName();
    const Protection& GetProtection();

    void SetName(const char*);
    void SetProtection(const Protection&);

    void StreamIn(istream&);
    void StreamOut(ostream&);

    Node* GetChild(int);
    void Adopt(Node*);
    void Orphan(Node*);

I've said something substantive about each of these operations except GetProtection. Ostensibly it retrieves a node's protection information, but what that means isn't clear yet. What kind of protection are we talking about?

If we're aiming to protect nodes from accidental change or deletion, then all we need is write protection---that is, the node can be either writable or not writable. If we stipulate further that the node should be protected from prying eyes, then we should be able to make it unreadable as well. Of course, that will only protect it from eyes that are both prying and *ignorant*---ignorant of how one changes a node's protection.
Read protection might be useful for keeping stuff away from your kids (or maybe vice versa), but it's not exactly indispensable. It will become more important in a multiuser environment.

To recap, we know that nodes can be readable or unreadable, writable or unwritable. Most file systems have additional protection modes governing things like executability, automatic archiving, and so forth. We can treat those kinds of protection in more or less the same way as readability and writability, so I'll focus on just these two modes to get the point across.

II. ENFORCEMENT

What effect do these protection modes have on a node's behavior? Obviously, an unreadable file shouldn't reveal its contents, which suggests that it shouldn't respond to StreamOut requests. Perhaps less obviously, a client shouldn't have access to the children of an unreadable node, if it has any. So GetChild should be inoperative for the unreadable node. As for writability, an unwritable node should let you change neither its attributes nor its structure; hence SetName, StreamIn, Adopt, and Orphan should be neutralized as well. (SetProtection must be treated gingerly in this regard. I'll talk more about that when we get to multiuser protection.)

Preventing the deletion of an unwritable node poses some interesting language-level challenges. For example, a client can't delete a Node explicitly like other objects. The C++ compiler can catch such an attempt for us, but not by declaring a node *const*, as one might be inclined. A node's protection can change at run-time, after all. Instead, we can *protect the destructor*. Unlike a normal public destructor, a protected destructor makes it illegal for classes outside the Node class hierarchy to delete a node explicitly[footnote1]. Protecting the destructor also has the nice property of disallowing local Node objects, that is, nodes created on the stack.
It prevents an unwritable node from getting deleted automatically when it goes out of scope---an inconsistency that might indicate a bug.

But how do you (attempt to) delete a node now that its destructor is protected? One thing seems certain: we'll end up using some kind of operation that takes the node to be deleted as a parameter. The question is, who defines that operation? There are three possibilities:

1. The Node class (possibly redefined by subclasses).
2. A class outside the Node class hierarchy.
3. A global function.

We can dismiss the third option immediately, as it provides little over a static member function defined on an existing class.

A deletion operation outside the Node hierarchy is rather unappealing as well. It forces the class defining that operation to be a friend of Node. Why? Because if the node happens to be writable and therefore deletable, then someone must call its protected destructor. The only way to accomplish that outside the Node class hierarchy is to make the deleting class a friend of Node. That has the unfortunate side-effect of exposing not just the Node's destructor but everything else it encapsulates as well.

Let's consider the first alternative: defining a Delete operation on the Node base class. If we make Delete a static operation, then it must take a Node instance as a parameter; if it isn't static, then it can be parameterless, since the *this* parameter is implied. Choosing between static, virtual, and nonvirtual member functions boils down to a choice between extensibility and aesthetics. A virtual member function is extensible through subclassing. But some people find the syntax

    node->Delete();

a bit unsettling. I'm not sure why that is, but I suspect people wince at

    delete this;

for the same reason. Too suicidal, perhaps. A static member function can steer clear of this stumbling block...

    Node::Delete(node);

...but it doesn't lend itself to modification in subclasses.
A nonvirtual member function, meanwhile, offers the worst of both worlds. Let's see if we can have our cake and eat it too: enjoy the syntactic advantages of a static member function while allowing extension in subclasses.

III. TEMPLATE METHOD

What is our Delete operation's charter anyway, ignoring for a moment how subclasses might want to extend it? Two things seem invariant: Delete must check whether the node it's passed is writable, and if it is, it deletes it. Subclasses might want to extend the deletion criteria, or they might want to change how deletion is carried out. But the invariants remain, well, invariant. We just need a little help implementing them in an extensible way.

Enter the Template Method design pattern, whose Intent reads:

    Define the skeleton of an algorithm in an operation, deferring some
    steps to subclasses. Template Method lets subclasses redefine
    certain steps of an algorithm without changing the algorithm's
    structure.

According to the first bullet of the pattern's Applicability section, Template Method is applicable whenever you want to implement the invariant parts of an algorithm once and leave it up to subclasses to implement the behavior that can vary. A template method might look like this:

    void BaseClass::TemplateMethod () {
        // an invariant part goes here
        DoSomething();        // a part subclasses can vary
        // another invariant part goes here
        DoSomethingElse();    // another variable part
        // and so forth
    }

BaseClass defines the DoSomething and DoSomethingElse operations to implement default behavior, and subclasses specialize them to do different things. The pattern calls such operations *primitive operations*, because the template method effectively composes them to create a higher-order operation. Primitive operations should be declared virtual, since subclasses must be able to redefine them polymorphically. The pattern suggests we identify primitive operations explicitly by prepending "Do-" to their names.
We should also declare them protected to keep clients from calling them directly, since they might not make sense outside the template method's context. As for the template method itself, the pattern recommends it be declared nonvirtual to ensure that the invariant parts stay invariant. We've gone a step further in our case: our candidate for a template method, the Delete operation, is not just nonvirtual---it's static. While that doesn't mean we can't apply the pattern, it does put a twist on our implementation of it.

But before implementing Delete, let's design our primitive operations. We've already established the invariant parts of the operation, that is, determining if the node is writable and, if so, deleting it. It's not much of a leap from there to the following structure:

    void Node::Delete (Node* node) {
        if (node->IsWritable()) {
            delete node;
        } else {
            cerr << node->GetName() << " cannot be deleted." << endl;
        }
    }

IsWritable[footnote2] is a primitive operation that subclasses can redefine to vary the protection criteria. The base class might define a common default implementation of IsWritable, or it may force subclasses to implement it by declaring it pure virtual:

    class Node {
    public:
        static void Delete(Node*);
        // ...
    protected:
        virtual ~Node();
        virtual bool IsWritable() = 0;
        // ...
    };

The pure virtual declaration avoids storing protection-related state in the abstract base class, but it also precludes reusing that state in subclasses. Although Delete is static rather than nonvirtual, it can still work as a template method in this case. That's because it doesn't need to refer to *this*; it merely delegates to the Node instance it's passed. And since Delete is a member of the Node base class, it can call protected operations like IsWritable and delete on Node instances without breaching encapsulation.

Right now Delete uses just one primitive operation, not counting the destructor.
We should add another primitive to let subclasses vary the error message instead of hard-wiring it in the base class:

    void Node::Delete (Node* node) {
        if (node->IsWritable()) {
            delete node;
        } else {
            node->DoWarning(undeletableWarning);
        }
    }

DoWarning abstracts how the node warns the user of *any* problem, not just an inability to delete. It can be arbitrarily sophisticated, doing anything from printing a string to throwing an exception. It avoids having to define a primitive operation (like DoUndeletableWarning, DoUnwritableWarning, DoThisThatOrTheOtherWarning, *ad nauseam*) for every conceivable situation.

We can apply Template Method to the other Node operations, which don't happen to be static. In doing so we introduce new primitive operations:

    void Node::StreamOut (ostream& out) {
        if (IsReadable()) {
            DoStreamOut(out);
        } else {
            DoWarning(unreadableWarning);
        }
    }

The major difference between the StreamOut and Delete template methods is that StreamOut can call Node operations directly. Delete can't do that, because it's static and can't refer to *this*; it must be passed the node to be deleted, to which it delegates the primitive operations.

IV. THE HOLLYWOOD PRINCIPLE

The Template Method pattern leads to an inversion of control known as the "Hollywood Principle," or, "Don't call us; we'll call you." Subclasses can extend or reimplement the variable parts of the algorithm, but they cannot alter the template method's flow of control and other invariant parts. Therefore when you define a new subclass of Node, you have to think not in terms of control flow but in terms of *responsibility*---the operations you *must* override, those you *might* override, and others you *mustn't* override. Structuring your operations as template methods makes these responsibilities more explicit.

The Hollywood Principle is a key to understanding frameworks.
It lets a framework capture architectural and implementation artifacts that don't vary, deferring the variant parts to application-specific subclasses. The inversion of control is part of what makes framework programming uncomfortable for some. When programming procedurally, one is very much preoccupied with control flow. It's hard to imagine how you can understand a procedural program without knowing the twists and turns it takes, even with impeccable functional decomposition. But a good framework will abstract away control flow details. You end up focusing on objects, which can seem both more and less tangible than control flow. You think in terms of object responsibilities and collaborations. It's a higher-level, slightly more declarative view of the world, with potentially greater leverage and flexibility. The Template Method pattern realizes these benefits on a smaller scale than a framework---at the operation level rather than the object level.

V. MULTIUSER PROTECTION...

...will have to wait until next time, as I'm low on space and still have business to attend to. We'll look at how design patterns can help us extend the design to let multiple users coexist happily in the file system.

VI. MAILBAG

Ranjiv Sharma writes:

    Hi John,

    I read your article on "Visiting Rights" in the [September '95] C++
    Report with interest. However, your implementation of the link
    confused me a bit. The proxy (Link) should have the same interface
    as its subject (Node) so that a proxy can be used wherever a subject
    was expected. However, the link as shown in Figure 1 in the article
    does not implement the GetChild(), Adopt(Node), and Orphan(Node)
    methods. A client would need to do GetSubject()->GetChild() to get
    to the children of a linked directory, which implies that the client
    must know that it is dealing with a Link---which in turn implies
    that it probably needs to do [run-time type identification] using
    dynamic_cast. Am I missing something?
    Thanks,
    Ranjiv

No, you're not missing anything, Ranjiv, but I sure did. When I described the Proxy implementation in "Orphanage, Adoption, and Surrogates" (June '95), I said,

    The last major issue to address concerns how Link implements the
    Node interface. To first approximation it merely delegates each
    operation to the corresponding operation on _subject. For example,
    it might delegate GetChild as follows:

        Node* Link::GetChild (int n) {
            return _subject->GetChild(n);
        }

Unfortunately, the diagram in "Visiting Rights" didn't reflect that statement. The box for the Proxy class should have included the complete Node interface, perhaps with an accompanying implementation box that showed an operation forwarding itself to the subject. Thanks much for pointing it out.

VII. REFLECTIONS ON PLoP

One more thing before I sign off: some ruminations on the second annual Pattern Languages of Programs conference (PLoP '95). This year, like last year, it was held at Allerton Park, Illinois. And this year, unlike last year, it took place in early September. The timing was unfortunate, as it precluded the attendance of more than a few academics with teaching responsibilities---to everyone's loss.

Anyway, PLoP is the first and, as far as I know, *only* conference dedicated to the pattern form. Roughly 75 people from around the world convened for each installment, with about 30 submissions accepted last year and 50 this year. PLoP has an unusual submission process: the program committee doesn't just give a submission the thumbs-up or -down; we iterate with the author of a promising pattern or pattern language to improve it prior to its dissemination at the conference. The process, known as "shepherding," is central to PLoP's mission of fostering a new body of literature, one that's notoriously difficult to produce with any quality. All participants are committed to sharing this burden.

PLoP's format is unique as well.
Instead of the usual talking-head presentations, PLoP is built around *writers' workshops*---forums in which an author's pattern or pattern language is scrutinized and critiqued by his or her peers in real time. It's a marvelous way to get honest feedback from multiple perspectives. Those I know who have been through a writers' workshop are unanimous in their praise for the format.

Nor does the conference mark the end of the review process. Each author uses the feedback from the workshop in one last editing pass before submitting the work for publication in book form. The works from last year appear in *Pattern Languages of Program Design*, capably edited by Jim Coplien and Doug Schmidt. Jim, Norm Kerth, and I are editing next year's installment.

Two things struck me about this year's conference. The first was the high quality of the submissions. People seemed to have a better handle on patterns in general. I guess that shouldn't surprise me, given there were far more published examples to learn from than last year. Yet people were also less inclined to define the term "pattern" and more intent on conveying their expertise effectively.

The second impressive thing was the diversity of expertise. People wrote patterns on everything from storage management in C++ to designing first-rate Web pages to trenchant pedagogy. The telecommunications area was particularly well-represented. In fact, plans are underway to hold area-specific PLoPs in addition to the annual ones held at Allerton and (starting next year) in Europe.

So from my admittedly biased viewpoint, PLoP '95 was an unqualified success. It may not have hit the big time yet, but then again, it might be a shame if it had. For if there's one place the Hollywood Principle *shouldn't* hold, it's PLoP.

FOOTNOTES

[footnote0]: Note that I've added corresponding Set operations for GetName and GetProtection. They do what you'd expect.
[footnote1]: Making the destructor private isn't an option, since that wouldn't let subclasses extend it to delete their children or any other objects they aggregate.

[footnote2]: Okay, so I'm bending the rules. But "DoIsWritable" is just abominable.

ENDNOTES

Coplien, J., and D. Schmidt, eds. *Pattern Languages of Program Design*, Addison-Wesley, Reading, MA, 1995.

Gamma, E., R. Helm, R. Johnson, and J. Vlissides. *Design Patterns: Elements of Reusable Object-Oriented Software*, Addison-Wesley, Reading, MA, 1995.