C++ Report "Pattern Hatching" column for February '96 issue
John Vlissides

Title: Protection, Part I: The Hollywood Principle

Last time I digressed from our on-going file system design to offer some advice on how to write design patterns. I don't want to make a habit of that, as I'd much rather show design patterns in action than engage in metadiscussions. Still, it's all too tempting to make patterns an end in themselves. If you ever find me doing it, some e-mail of reproof will be in order.

We've looked at patterns (Composite and Proxy) for defining the file system structure and a pattern (Visitor) for introducing new capabilities uninvasively, that is, by *adding* code rather than *changing* code. Herein lies another bromide of good object-oriented design: you maximize flexibility and maintainability when your system can be modified without touching existing code. If you can still say that after others have leveraged your software, then congratulations---you've delivered on much of the promise of object technology!

But I'm digressing again. Another major design issue in our file system has to do with security. There are at least two relevant subissues:

1. Protection from inadvertent and malicious corruption.
2. Maintaining integrity in the face of hardware and software errors.

I'll focus on the first of these subissues here, leaving the second to you as an exercise. (If you take that as a challenge and would like to herald your solution, let me know. I'd be privileged to share this forum with a pattern compatriot.)

I. SINGLE-USER PROTECTION

Anyone who's used a computer intensively has a horror story to tell about how he or she lost vital data through an unfortunate syntax error, a wayward mouse click, or just one of those late-night brain-faults. Deleting the wrong file at the right time is a common catastrophe. Another is inadvertent editing---changing a file that shouldn't be changed casually.
While a truly advanced file system would have an undo feature for recovering from such mishaps, prevention is often preferable to cure. Sadly, most file systems give you a different choice: prevention or regret.

For now I'll concentrate on protecting file system objects (i.e., nodes) from deletion and modification. I'll consider protection as it relates to the programming interface rather than the user interface. We needn't worry too much about that distinction, however, since our programming abstractions correspond closely to user-level abstractions. Also, I assume we're dealing with a single-user file system like you'd find on a classic, unnetworked personal computer (as opposed to a multiuser one, like Unix). That will keep things simple at the outset. I'll consider the implications of multiuser protection next time.

All elements of the file system (including files, directories, and symbolic links) adhere to the Node interface, which currently includes the following operations[footnote0]:

    const char* GetName();
    const Protection& GetProtection();

    void SetName(const char*);
    void SetProtection(const Protection&);

    void StreamIn(istream&);
    void StreamOut(ostream&);

    Node* GetChild(int);
    void Adopt(Node*);
    void Orphan(Node*);

I've said something substantive about each of these operations except GetProtection. Ostensibly it retrieves a node's protection information, but what that means isn't clear yet. What kind of protection are we talking about?

If we're aiming to protect nodes from accidental change or deletion, then all we need is write protection---that is, the node can be either writable or not writable. If we stipulate further that the node should be protected from prying eyes, then we should be able to make it unreadable as well. Of course, that will only protect it from eyes that are both prying and *ignorant*---ignorant of how one changes a node's protection.
Read protection might be useful for keeping stuff away from your kids (or maybe vice versa), but it's not exactly indispensable. It will become more important in a multiuser environment.

To recap, we know that nodes can be readable or unreadable, writable or unwritable. Most file systems have additional protection modes governing things like executability, automatic archiving, and so forth. We can treat those kinds of protection in more or less the same way as readability and writability, so I'll focus on just these two modes to get the point across.

II. ENFORCEMENT

What effect do these protection modes have on a node's behavior? Obviously, an unreadable file shouldn't reveal its contents, which suggests that it shouldn't respond to StreamOut requests. Perhaps less obviously, a client shouldn't have access to the children of an unreadable node, if it has any. So GetChild should be inoperative for the unreadable node. As for writability, an unwritable node should let you change neither its attributes nor its structure; hence SetName, StreamIn, Adopt, and Orphan should be neutralized as well. (SetProtection must be treated gingerly in this regard. I'll talk more about that when we get to multiuser protection.)

Preventing the deletion of an unwritable node poses some interesting language-level challenges. For example, a client can't delete a Node explicitly like other objects. The C++ compiler can catch such an attempt for us, but not by declaring a node *const*, as one might be inclined. A node's protection can change at run-time, after all. Instead, we can *protect the destructor*. Unlike a normal public destructor, a protected destructor makes it illegal for classes outside the Node class hierarchy to delete a node explicitly[footnote1]. Protecting the destructor also has the nice property of disallowing local Node objects, that is, nodes created on the stack.
It prevents an unwritable node from getting deleted automatically when it goes out of scope---an inconsistency that might indicate a bug.

But how do you (attempt to) delete a node now that its destructor is protected? One thing seems certain: we'll end up using some kind of operation that takes the node to be deleted as a parameter. The question is, who defines that operation? There are three possibilities:

1. The Node class (possibly redefined by subclasses).
2. A class outside the Node class hierarchy.
3. A global function.

We can dismiss the third option immediately, as it provides little over a static member function defined on an existing class.

A deletion operation outside the Node hierarchy is rather unappealing as well. It forces the class defining that operation to be a friend of Node. Why? Because if the node happens to be writable and therefore deletable, then someone must call its protected destructor. The only way to accomplish that outside the Node class hierarchy is to make the deleting class a friend of Node. That has the unfortunate side-effect of exposing not just the Node's destructor but everything else it encapsulates as well.

Let's consider the first alternative: defining a Delete operation on the Node base class. If we make Delete a static operation, then it must take a Node instance as a parameter; if it isn't static, then it can be parameterless, since the *this* parameter is implied. Choosing between static, virtual, and nonvirtual member functions boils down to a choice between extensibility and aesthetics. A virtual member function is extensible through subclassing. But some people find the syntax

    node->Delete();

a bit unsettling. I'm not sure why that is, but I suspect people wince at

    delete this;

for the same reason. Too suicidal, perhaps. A static member function can steer clear of this stumbling block...

    Node::Delete(node);

...but it doesn't lend itself to modification in subclasses.
A nonvirtual member function, meanwhile, offers the worst of both worlds. Let's see if we can have our cake and eat it too: enjoy the syntactic advantages of a static member function while allowing extension in subclasses.

III. TEMPLATE METHOD

What is our Delete operation's charter anyway, ignoring for a moment how subclasses might want to extend it? Two things seem invariant: Delete must check whether the node it's passed is writable, and if it is, it deletes it. Subclasses might want to extend the deletion criteria, or they might want to change how deletion is carried out. But the invariants remain, well, invariant. We just need a little help implementing them in an extensible way.

Enter the Template Method design pattern, whose Intent reads:

    Define the skeleton of an algorithm in an operation, deferring some
    steps to subclasses. Template Method lets subclasses redefine
    certain steps of an algorithm without changing the algorithm's
    structure.

According to the first bullet of the pattern's Applicability section, Template Method is applicable whenever you want to implement the invariant parts of an algorithm once and leave it up to subclasses to implement the behavior that can vary. A template method might look like this:

    void BaseClass::TemplateMethod () {
        // an invariant part goes here
        DoSomething();        // a part subclasses can vary
        // another invariant part goes here
        DoSomethingElse();    // another variable part
        // and so forth
    }

BaseClass defines the DoSomething and DoSomethingElse operations to implement default behavior, and subclasses specialize them to do different things. The pattern calls such operations *primitive operations*, because the template method effectively composes them to create a higher-order operation. Primitive operations should be declared virtual, since subclasses must be able to redefine them polymorphically. The pattern suggests we identify primitive operations explicitly by prepending "Do-" to their names.
We should also declare them protected to keep clients from calling them directly, since they might not make sense outside the template method's context. As for the template method itself, the pattern recommends it be declared nonvirtual to ensure that the invariant parts stay invariant. We've gone a step further in our case: our candidate for a template method, the Delete operation, is not just nonvirtual---it's static. While that doesn't mean we can't apply the pattern, it does put a twist on our implementation of it.

But before implementing Delete, let's design our primitive operations. We've already established the invariant parts of the operation, that is, determining if the node is writable and, if so, deleting it. It's not much of a leap from there to the following structure:

    void Node::Delete (Node* node) {
        if (node->IsWritable()) {
            delete node;
        } else {
            cerr << node->GetName() << " cannot be deleted." << endl;
        }
    }

IsWritable[footnote2] is a primitive operation that subclasses can redefine to vary the protection criteria. The base class might define a common default implementation of IsWritable, or it may force subclasses to implement it by declaring it pure virtual:

    class Node {
    public:
        static void Delete(Node*);
        // ...
    protected:
        virtual ~Node();
        virtual bool IsWritable() = 0;
        // ...
    };

The pure virtual declaration avoids storing protection-related state in the abstract base class, but it also precludes reusing that state in subclasses. Although Delete is static rather than nonvirtual, it can still work as a template method in this case. That's because it doesn't need to refer to *this*; it merely delegates to the Node instance it's passed. And since Delete is a member of the Node base class, it can call protected operations like IsWritable and delete on Node instances without breaching encapsulation.

Right now Delete uses just one primitive operation, not counting the destructor.
We should add another primitive to let subclasses vary the error message instead of hard-wiring it in the base class:

    void Node::Delete (Node* node) {
        if (node->IsWritable()) {
            delete node;
        } else {
            node->DoWarning(undeletableWarning);
        }
    }

DoWarning abstracts how the node warns the user of *any* problem, not just an inability to delete. It can be arbitrarily sophisticated, doing anything from printing a string to throwing an exception. It avoids having to define a primitive operation (like DoUndeletableWarning, DoUnwritableWarning, DoThisThatOrTheOtherWarning, *ad nauseam*) for every conceivable situation.

We can apply Template Method to the other Node operations, which don't happen to be static. In doing so we introduce new primitive operations:

    void Node::StreamOut (ostream& out) {
        if (IsReadable()) {
            DoStreamOut(out);
        } else {
            DoWarning(unreadableWarning);
        }
    }

The major difference between the StreamOut and Delete template methods is that StreamOut can call Node operations directly. Delete can't do that, because it's static and can't refer to *this*; it must be passed the node to be deleted, to which it delegates the primitive operations.

IV. THE HOLLYWOOD PRINCIPLE

The Template Method pattern leads to an inversion of control known as the "Hollywood Principle," or, "Don't call us; we'll call you." Subclasses can extend or reimplement the variable parts of the algorithm, but they cannot alter the template method's flow of control and other invariant parts. Therefore when you define a new subclass of Node, you have to think not in terms of control flow but in terms of *responsibility*---the operations you *must* override, those you *might* override, and others you *mustn't* override. Structuring your operations as template methods makes these responsibilities more explicit.

The Hollywood Principle is a key to understanding frameworks.
It lets a framework capture architectural and implementation artifacts that don't vary, deferring the variant parts to application-specific subclasses. The inversion of control is part of what makes framework programming uncomfortable for some. When programming procedurally, one is very much preoccupied with control flow. It's hard to imagine how you can understand a procedural program without knowing the twists and turns it takes, even with impeccable functional decomposition. But a good framework will abstract away control flow details. You end up focusing on objects, which can seem both more and less tangible than control flow. You think in terms of object responsibilities and collaborations. It's a higher-level, slightly more declarative view of the world, with potentially greater leverage and flexibility. The Template Method pattern realizes these benefits on a smaller scale than a framework---at the operation level rather than the object level.

V. MULTIUSER PROTECTION...

...will have to wait until next time, as I'm low on space and still have business to attend to. We'll look at how design patterns can help us extend the design to let multiple users coexist happily in the file system.

VI. MAILBAG

Ranjiv Sharma writes:

    Hi John,

    I read your article on "Visiting Rights" in the [September '95] C++
    Report with interest. However, your implementation of the link
    confused me a bit. The proxy (Link) should have the same interface
    as its subject (Node) so that a proxy can be used wherever a subject
    was expected. However, the link as shown in Figure 1 in the article
    does not implement the GetChild(), Adopt(Node), and Orphan(Node)
    methods. A client would need to do GetSubject()->GetChild() to get
    to the children of a linked directory, which implies that the client
    must know that it is dealing with a Link---which in turn implies
    that it probably needs to do [run-time type identification] using
    dynamic_cast. Am I missing something?
    Thanks,
    Ranjiv

No, you're not missing anything, Ranjiv, but I sure did. When I described the Proxy implementation in "Orphanage, Adoption, and Surrogates" (June '95), I said,

    The last major issue to address concerns how Link implements the
    Node interface. To first approximation it merely delegates each
    operation to the corresponding operation on _subject. For example,
    it might delegate GetChild as follows:

        Node* Link::GetChild (int n) {
            return _subject->GetChild(n);
        }

Unfortunately, the diagram in "Visiting Rights" didn't reflect that statement. The box for the Proxy class should have included the complete Node interface, perhaps with an accompanying implementation box that showed an operation forwarding itself to the subject. Thanks much for pointing it out.

VII. REFLECTIONS ON PLoP

One more thing before I sign off: some ruminations on the second annual Pattern Languages of Programs conference (PLoP '95). This year, like last year, it was held at Allerton Park, Illinois. And this year, unlike last year, it took place in early September. The timing was unfortunate, as it precluded the attendance of more than a few academics with teaching responsibilities---to everyone's loss.

Anyway, PLoP is the first and, as far as I know, *only* conference dedicated to the pattern form. Roughly 75 people from around the world convened for each installment, with about 30 submissions accepted last year and 50 this year. PLoP has an unusual submission process: the program committee doesn't just give a submission the thumbs-up or -down; we iterate with the author of a promising pattern or pattern language to improve it prior to its dissemination at the conference. The process, known as "shepherding," is central to PLoP's mission of fostering a new body of literature, one that's notoriously difficult to produce with any quality. All participants are committed to sharing this burden.

PLoP's format is unique as well.
Instead of the usual talking-head presentations, PLoP is built around *writers' workshops*---forums in which an author's pattern or pattern language is scrutinized and critiqued by his or her peers in real time. It's a marvelous way to get honest feedback from multiple perspectives. Those I know who have been through a writers' workshop are unanimous in their praise for the format.

Nor does the conference mark the end of the review process. Each author uses the feedback from the workshop in one last editing pass before submitting the work for publication in book form. The works from last year appear in *Pattern Languages of Program Design*, capably edited by Jim Coplien and Doug Schmidt. Jim, Norm Kerth, and I are editing next year's installment.

Two things struck me about this year's conference. The first was the high quality of the submissions. People seemed to have a better handle on patterns in general. I guess that shouldn't surprise me, given there were far more published examples to learn from than last year. Yet people were also less inclined to define the term "pattern" and more intent on conveying their expertise effectively.

The second impressive thing was the diversity of expertise. People wrote patterns on everything from storage management in C++ to designing first-rate Web pages to trenchant pedagogy. The telecommunications area was particularly well-represented. In fact, plans are underway to hold area-specific PLoPs in addition to the annual ones held at Allerton and (starting next year) in Europe.

So from my admittedly biased viewpoint, PLoP '95 was an unqualified success. It may not have hit the big time yet, but then again, it might be a shame if it had. For if there's one place the Hollywood Principle *shouldn't* hold, it's PLoP.

FOOTNOTES

[footnote0]: Note that I've added corresponding Set operations for GetName and GetProtection. They do what you'd expect.
[footnote1]: Making the destructor private isn't an option, since that wouldn't let subclasses extend it to delete their children or any other objects they aggregate.

[footnote2]: Okay, so I'm bending the rules. But "DoIsWritable" is just abominable.

ENDNOTES

Coplien, J., and D. Schmidt, eds. *Pattern Languages of Program Design*, Addison-Wesley, Reading, MA, 1995.

Gamma, E., R. Helm, R. Johnson, and J. Vlissides. *Design Patterns: Elements of Reusable Object-Oriented Software*, Addison-Wesley, Reading, MA, 1995.