WHAT DOES SUBSTITUTABILITY REALLY MEAN? PART 5
by C. J. Date

 

 

 

(Continued from Part 4)

 

 

This is the fifth part of this six-part series on the Liskov Substitution Principle (LSP), also known as the principle of substitutability.  In this and the forthcoming last part, I want to conclude my examination of the paper A Behavioral Notion of Subtyping, by Barbara Liskov and Jeannette Wing (ACM Transactions on Programming Languages and Systems 16, No. 6, November 1994).

 

 

The Running Example 

 

The Liskov/Wing paper makes extensive use of a running example involving bagsand stacks.  Here's the paper's own introduction to the example: 

 

“Consider a bounded bag type that provides a put method that inserts elements into a bag and a get method that removes an arbitrary element from a bag.  Put has a precondition that checks to see that adding an element will not grow the bag beyond its bound; get has a precondition that checks to see that the bag is nonempty. 

 

Consider also a bounded stack type that has, in addition to push and pop methods, a swap_top method that takes an integer, i, and modifies the stack by replacing its top with i.  Stack's push and pop methods have preconditions similar to bag's put and get, and swap_top has a precondition requiring that the stack is nonempty. “

 

Note:  For clarity, from this point forward I'll set the type names and method names in upper case, even when I'm giving what are otherwise verbatim quotes from the Liskov/Wing paper. 

 

Observe, by the way, that the running example involves bags and stacks containing integers specifically; the much more interesting question of generic bag and stack types isn't discussed.  Anyway, to continue with that example: 

 

“Suppose we want to show STACK is a subtype of BAG.  We need to relate the values of stacks to those of bags.  This can be done by means of an abstraction function ... A given STACK value maps to a BAG value where we abstract from the insertion order on the elements. 

 

We also need to relate STACK's methods to BAG's.  Clearly there is a correspondence between STACK's PUSH method and BAG's PUT and similarly for the POP and GET methods (even though the names of the corresponding methods do not match).” 

 

At this point I'd like to make a few observations about the example.  First of all, note the assumption that the two types already exist, and now we're trying to show that one is a subtype of the other.  In other words, to paraphrase something I said in Part 3 [substitute_3.htm] of this series, the emphasis is on verifying the hypothesis that STACK is a subtype of BAG.  It's categorically not a matter of taking the statement "STACK is a subtype of BAG" as a given and exploring the logical consequences of that fact; rather, it's the exact opposite--in other words, Liskov and Wing are effectively saying that if we have substitutability, then we have subtyping.  Our own approach, by contrast, is to say that if we have subtyping, then we have substitutability. (And so I've been quite unfair to the Liskov/Wing paper in this article so far, because I've almost totally ignored what the authors regard as its main contribution!  But my purpose isn't so much to examine that contribution as such, but rather--to say it one more time--to examine the Liskov Substitution Principle.) 

 

Second, it's presumably because the two types already exist that we get into that business of their having "different value spaces" (see Part 4 of this series).  After all, it's almost certainly the case that bags and stacks will have different storage representations (note the remark about "relating the values of stacks to those of bags").  But I stand by my claim that such questions are an implementation matter merely and shouldn't show through to the model level.  We shouldn't even be talking about them at the model level, even if we're discussing the two types independently and not considering the possibility that one might be a subtype of the other. 

 

Third, it's also because the two types already exist that "corresponding methods have different names"--i.e., the fact that there are different versions of "the same" method is exposed at the model level (note the remark about "relating STACK's methods to BAG's").  Again, however, I stand by my claim that such issues are an implementation matter merely and shouldn't show through to the model level; again, we shouldn't even be talking about them at the model level. 

 

The foregoing conjectures on my part are reinforced by the following facts: 

 

Ø       The paper explicitly states, "subtypes must provide [the] expected methods with compatible signatures."  This remark too suggests an emphasis on preexisting types, because if we were explicitly defining a new type S to be a subtype of an existing type T, there's no way S couldn't have "[the] expected methods with compatible signatures."  (In our approach, in fact, the question of whether the signatures are "compatible" doesn't even make sense.)

 

Ø       The paper also states that "a common case [is] that the subtype adds some extra methods [like SWAP_TOP in the example] but does not change any existing ones" (italics added).  "Changing existing methods" here refers, I'm pretty sure, to the idea that the implementations might be different at the supertype and subtype levels--suggesting, again, that we're really talking about distinct implementation versions of "the same" method, and thus talking about an implementation concern rather than a model one.

 

 

Preconditions and Postconditions

 

Back to the running example.  I assume it's clear, at least in principle, how the necessary mappings of stack values and methods to bag values and methods can be done.  Thus, we are close to being able to say that STACK is a subtype of BAG.  However, we aren't quite all the way there ... According to Liskov and Wing, we also need to show among other things that: 

 

The preconditions for PUT and GET imply those for PUSH and POP, respectively. 

The postconditions for PUSH and POP imply those for PUT and GET, respectively. 

 

For brevity, let's consider just PUT and PUSH and ignore the other two.  Informally, the precondition for PUT is simply that the target bag isn't full, or in other words that its current size is less than the maximum size (recall that the bags we're talking about are bounded ones specifically).  Analogously, the precondition for PUSH is that the target stack isn't full.  So it's easy to see that the precondition for PUT implies that for PUSH (Of course, there are other aspects to the preconditions (and postconditions); for example, the operands for PUT must be of type BAG and INTEGER, respectively.  I'm ignoring such refinements here for simplicity.) 

 

Likewise, the postcondition for PUSH is that the stack now additionally contains the specified integer in its topmost position, while the postcondition for PUT is that the bag now additionally contains the specified integer (we can't say where, because the concept of "position" within a bag has no meaning).  Thus, it's easy to see that the postcondition for PUSH implies that for PUT. 

 

Of course, it goes without saying that in order for the foregoing logical implications to be checked, the preconditions and postconditions in question must be stated explicitly somewhere.  Liskov and Wing propose a required clause for stating preconditions and an ensures clause for stating postconditions; these clauses appear as part of the specification of the method in question.  Note:  At run time, then, it should be possible--at least in principle--for the system to check that any given method invocation does indeed satisfy the stated conditions, but such run-time checking isn't the principal point of the clauses. 

 

Our own type model (our type model, please note, not our inheritance model) currently includes nothing corresponding to the requires and ensures clauses, though there's no reason why we couldn't extend it to do so if we wanted to.  What we definitely wouldn't do, however, is have two distinct versions of the same method, a subtype version and a supertype version, both explicitly user-visible, with two distinct sets of preconditions and postconditions.  In our view, the need to check the various precondition and postcondition implications arises from a defect in the Liskov/Wing model--more precisely, from the fact that the model exposes certain features (i.e., distinct versions of the same method) that ought really to be hidden. 

 

 

Covariance and Contravariance

 

Liskov and Wing also espouse two notions called "covariance and contravariance."  Our own inheritance model supports covariance, too; in fact, you can't not support it if you support substitutability (which is to say, if you support subtyping and inheritance at all, since in our view subtyping and inheritance logically imply substitutability).  Covariance--more precisely referred to as result covariance--says that if a method M is defined to return a result of type T, then an invocation of M is allowed to return a result of any subtype of T.  For example, if "add one" is defined to return a result of type INT, sometimes it will return a result of type EVEN instead.  (Here, as in Part 2 of this article, INT is integers, EVEN is even integers, and EVEN is a subtype of INT.) 

 

Contravariance is another matter, however.  I don't really want to get into a lot of detail here, because the topic is a little complicated; suffice it to say is that I strongly suspect Liskov and Wing are forced into considering it because (as noted earlier) they're trying to define subtyping in terms of substitutability, and their model thus necessarily seems to include some features that I believe should be kept firmly under the covers.  In our own model, by contrast (where we define substitutability in terms of subtyping), we find no good reason for embracing the concept of contravariance at all, and indeed we explicitly reject it.  (In fact, we claim in The Third Manifesto that contravariance seems to be a case of the implementation tail wagging the model dog.)

 

 

Extending the Example

 

Suppose it's been established that STACK is indeed a subtype of BAG in accordance with all of the requirements of the Liskov/Wing proposal.  Now suppose we want to define a new method--more precisely, a new "observer"--called UNION that takes two bags and returns the bag union of those two bags. What happens? 

 

Note: The bag union of two bags A and B is a bag in which the value v appears exactly n times if and only if it appears exactly a times in A and exactly b times in B and n = MAX(a,b).  In fact, the Liskov/Wing paper mentions a union operator for bags in passing, but doesn't give a precise definition for it and doesn't attempt to show it as a method. 

 

Clearly, if the supertype/subtype relationships are to be maintained, UNION has to work when the operands are two stacks, or one stack and one bag, as well as when they're both just bags.  (For simplicity, let's agree until further notice to use the term bag to mean, specifically, a bag that isn't a stack.)  Well, we can obviously write code--I mean, we can implement versions of the UNION method--that will work for these two cases.  However, the result is surely just a bag in both cases; certainly I don't see a sensible way of defining a "union" of two stacks that produces another stack as a result. 

 

The foregoing paragraph notwithstanding, I still don't think there's a problem--not yet, at any rate.  But suppose now that the UNION method is defined to be a mutator, not an observer (i.e., it actually modifies one of its arguments).  According to everything I've understood about the Liskov/Wing model so far, then, UNION will fail if the argument to be modified happens to be a stack, not just a bag!  (As I understand it, objects can't change type, and generalization by constraint isn't supported, in that model.)  In other words, the Subtype Requirement is violated--there's at least one context in which I simply can't substitute stacks for bags.  Am I forced to conclude, therefore, that it's no longer the case that STACK is a subtype of BAG? 

 

Now, I pointed out in Part 3of this series that the Liskov/Wing paper explicitly defines a type to consist, in part, of some specific set of methods, and hence that adding a new method to a given type changes the type.  Thus, it could logically be argued that adding the UNION mutator method to type BAG yields a brand new type, BAG+ say, and "all bets are off"--meaning that all supertype/subtype relationships now need to be reexamined and reestablished, where appropriate.  (In this connection, let me draw your attention to the title of the paper: A Behavioral Notion of Subtyping--emphasis added.  That title does tend to suggest that if the behavior changes, the supertype/subtype relationships change too.) 

 

Such reexamining and reestablishing of existing relationships seems to be a necessary consequence of defining subtyping in terms of substitutability instead of the other way around.  So (to say it again) the position can be logically defended.  But is it desirable?  Myself, I wouldn't have thought so--though it is admittedly still the case that a program that previously worked for bags, and therefore could previously be invoked on stacks instead of bags, can still be invoked on stacks instead of bags, because (by definition) such a program won't be using the new UNION method anyway (except that there aren't any bags now, there are "bag+"s instead.  Or do we have to keep type BAG as it was before, and define BAG+ as a proper subtype of BAG?  But then every BAG value would be a BAG+ value too; thus, to say that BAG+ is a "subtype" of BAG looks to me suspiciously like a hack.  (In fact, in our own inheritance model, we require that if S is a proper subtype of T, then there must exist at least one value of T that isn't a value of S.) 

 

(To Be Continued)

 

 

Posted 08/23/02

 

 

 

[ABOUT] [QUOTES] [LINKS]