(Continued from Part 4)
This is the fifth part of this six-part series on the Liskov
Substitution Principle (LSP), also known as the principle of
substitutability. In this and the
forthcoming last part, I want to conclude my examination of the paper A
Behavioral Notion of Subtyping, by Barbara Liskov and Jeannette Wing (ACM
Transactions on Programming Languages and Systems 16, No. 6, November
1994).
The Running Example
The Liskov/Wing paper makes extensive use of a running
example involving bagsand stacks. Here's the paper's own introduction to the
example:
“Consider a bounded bag type that provides a put method that
inserts elements into a bag and a get method that removes an arbitrary element
from a bag. Put has a precondition that
checks to see that adding an element will not grow the bag beyond its bound;
get has a precondition that checks to see that the bag is nonempty.
Consider also a bounded stack type that has, in addition to push
and pop methods, a swap_top method that takes an integer, i, and modifies the
stack by replacing its top with i.
Stack's push and pop methods have preconditions similar to bag's put and
get, and swap_top has a precondition requiring that the stack is nonempty. “
Note: For clarity, from this point forward
I'll set the type names and method names in upper case, even when I'm giving
what are otherwise verbatim quotes from the Liskov/Wing paper.
Observe, by the way, that the running example involves bags
and stacks containing integers specifically; the much more interesting
question of generic bag and stack types isn't discussed. Anyway, to continue with that
example:
“Suppose we want to show STACK is a subtype of BAG. We need to relate the values of
stacks to
those of bags. This can be done by
means of an abstraction function ... A given STACK value maps to a BAG value
where we abstract from the insertion order on the elements.
We also need to relate STACK's methods to BAG's. Clearly there is a correspondence
between
STACK's PUSH method and BAG's PUT and similarly for the POP and GET methods
(even though the names of the corresponding methods do not match).”
At this point I'd like to make a few observations about the
example. First of all, note the
assumption that the two types already exist, and now we're trying to show that
one is a subtype of the other. In other
words, to paraphrase something I said in Part 3 [substitute_3.htm]
of this series, the emphasis is on verifying the hypothesis that STACK
is a subtype of BAG. It's categorically
not a matter of taking the statement "STACK is a subtype of BAG" as a
given and exploring the logical consequences of that fact; rather, it's the
exact opposite--in other words, Liskov and Wing are effectively saying that if
we have substitutability, then we have subtyping. Our own approach, by contrast, is to say that if
we have subtyping,
then we have substitutability. (And so I've been quite unfair to the
Liskov/Wing paper in this article so far, because I've almost totally ignored
what the authors regard as its main contribution! But my purpose isn't so much to examine that
contribution as
such, but rather--to say it one more time--to examine the Liskov Substitution
Principle.)
Second, it's presumably because the two types already exist
that we get into that business of their having "different value
spaces" (see Part 4 of
this series). After all, it's almost
certainly the case that bags and stacks will have different storage
representations (note the remark about "relating the values of stacks to
those of bags"). But I stand by my
claim that such questions are an implementation matter merely and shouldn't
show through to the model level. We
shouldn't even be talking about them at the model level, even if we're
discussing the two types independently and not considering the possibility that
one might be a subtype of the other.
Third, it's also because the two types already exist that
"corresponding methods have different names"--i.e., the fact that
there are different versions of "the same" method is exposed
at the model level (note the remark about "relating STACK's methods to
BAG's"). Again, however, I stand
by my claim that such issues are an implementation matter merely and shouldn't
show through to the model level; again, we shouldn't even be talking
about them at the model level.
The foregoing conjectures on my part are reinforced by the
following facts:
Ø The
paper explicitly states, "subtypes must provide [the] expected methods
with compatible signatures." This
remark too suggests an emphasis on preexisting types, because if we were
explicitly defining a new type S to be a subtype of an existing type T, there's
no way S couldn't have "[the] expected methods with compatible
signatures." (In our approach, in
fact, the question of whether the signatures are "compatible" doesn't
even make sense.)
Ø The
paper also states that "a common case [is] that the subtype adds some
extra methods [like SWAP_TOP in the example] but does not change any existing
ones" (italics added).
"Changing existing methods" here refers, I'm pretty sure, to
the idea that the implementations might be different at the supertype and
subtype levels--suggesting, again, that we're really talking about distinct
implementation versions of "the same" method, and thus talking about
an implementation concern rather than a model one.
Preconditions and Postconditions
Back to the running example.
I assume it's clear, at least in principle, how the necessary mappings
of stack values and methods to bag values and methods can be done. Thus, we are close to being able
to say that
STACK is a subtype of BAG. However, we
aren't quite all the way there ... According to Liskov and Wing, we also need
to show among other things that:
The preconditions for PUT and GET imply those for PUSH
and POP, respectively.
The postconditions for PUSH and POP imply those for
PUT and GET, respectively.
For brevity, let's consider just PUT and PUSH and ignore the
other two. Informally, the precondition
for PUT is simply that the target bag isn't full, or in other words that its
current size is less than the maximum size (recall that the bags we're talking
about are bounded ones specifically).
Analogously, the precondition for PUSH is that the target stack isn't
full. So it's easy to see that the
precondition for PUT implies that for PUSH (Of course, there are other aspects
to the preconditions (and postconditions); for example, the operands for PUT
must be of type BAG and INTEGER, respectively.
I'm ignoring such refinements here for simplicity.)
Likewise, the postcondition for PUSH is that the stack now
additionally contains the specified integer in its topmost position, while the
postcondition for PUT is that the bag now additionally contains the specified
integer (we can't say where, because the concept of "position" within
a bag has no meaning). Thus, it's easy
to see that the postcondition for PUSH implies that for PUT.
Of course, it goes without saying that in order for the
foregoing logical implications to be checked, the preconditions and
postconditions in question must be stated explicitly somewhere. Liskov and Wing propose a
required
clause for stating preconditions and an ensures clause for stating
postconditions; these clauses appear as part of the specification of the method
in question. Note: At run time, then, it should be possible--at
least in principle--for the system to check that any given method invocation
does indeed satisfy the stated conditions, but such run-time checking isn't the
principal point of the clauses.
Our own type model (our type model, please note, not
our inheritance model) currently includes nothing corresponding to the requires
and ensures clauses, though there's no reason why we couldn't extend it to do
so if we wanted to. What we definitely
wouldn't do, however, is have two distinct versions of the same method, a
subtype version and a supertype version, both explicitly user-visible, with two
distinct sets of preconditions and postconditions. In our view, the need to check the various
precondition and
postcondition implications arises from a defect in the Liskov/Wing model--more
precisely, from the fact that the model exposes certain features (i.e.,
distinct versions of the same method) that ought really to be hidden.
Covariance and Contravariance
Liskov and Wing also espouse two notions called "covariance
and contravariance." Our own
inheritance model supports covariance, too; in fact, you can't not
support it if you support substitutability (which is to say, if you support
subtyping and inheritance at all, since in our view subtyping and inheritance logically
imply substitutability).
Covariance--more precisely referred to as result covariance--says
that if a method M is defined to return a result of type T, then an invocation
of M is allowed to return a result of any subtype of T. For example, if "add one" is
defined to return a result of type INT, sometimes it will return a result of
type EVEN instead. (Here, as in Part 2
of this article, INT is integers, EVEN is even integers, and EVEN is a subtype
of INT.)
Contravariance is another matter, however. I don't really want to get into a lot
of
detail here, because the topic is a little complicated; suffice it to say is
that I strongly suspect Liskov and Wing are forced into considering it because
(as noted earlier) they're trying to define subtyping in terms of
substitutability, and their model thus necessarily seems to include some
features that I believe should be kept firmly under the covers. In our own model, by contrast
(where we
define substitutability in terms of subtyping), we find no good reason for
embracing the concept of contravariance at all, and indeed we explicitly reject
it. (In fact, we claim in The Third
Manifesto that contravariance seems to be a case of the implementation
tail wagging the model dog.)
Extending the Example
Suppose it's been established that STACK is indeed a subtype
of BAG in accordance with all of the requirements of the Liskov/Wing
proposal. Now suppose we want to define
a new method--more precisely, a new "observer"--called UNION that
takes two bags and returns the bag union of those two bags. What happens?
Note: The bag union of two bags A and B is a
bag in which the value v appears exactly n times if and only if
it appears exactly a times in A and exactly b times in B
and n = MAX(a,b). In
fact, the Liskov/Wing paper mentions a union operator for bags in passing, but
doesn't give a precise definition for it and doesn't attempt to show it as a
method.
Clearly, if the supertype/subtype relationships are to be
maintained, UNION has to work when the operands are two stacks, or one stack
and one bag, as well as when they're both just bags. (For simplicity, let's agree until further
notice to use the term
bag to mean, specifically, a bag that isn't a stack.) Well, we can obviously write code--I
mean, we
can implement versions of the UNION method--that will work for these two
cases. However, the result is surely
just a bag in both cases; certainly I don't see a sensible way of defining a
"union" of two stacks that produces another stack as a result.
The foregoing paragraph notwithstanding, I still don't think
there's a problem--not yet, at any rate.
But suppose now that the UNION method is defined to be a mutator,
not an observer (i.e., it actually modifies one of its arguments). According to everything I've
understood
about the Liskov/Wing model so far, then, UNION will fail if the argument to be
modified happens to be a stack, not just a bag! (As I understand it, objects can't change type, and
generalization
by constraint isn't supported, in that model.)
In other words, the Subtype Requirement is violated--there's at least
one context in which I simply can't substitute stacks for bags. Am I forced to conclude, therefore,
that it's
no longer the case that STACK is a subtype of BAG?
Now, I pointed out in Part 3of this series that the
Liskov/Wing paper explicitly
defines a type to consist, in part, of some specific set of methods, and hence
that adding a new method to a given type changes the type. Thus, it could logically be argued that
adding the UNION mutator method to type BAG yields a brand new type, BAG+ say,
and "all bets are off"--meaning that all supertype/subtype relationships
now need to be reexamined and reestablished, where appropriate. (In this connection, let me draw
your
attention to the title of the paper: A Behavioral Notion of Subtyping--emphasis
added. That title does tend to suggest
that if the behavior changes, the supertype/subtype relationships change too.)
Such reexamining and reestablishing of existing relationships
seems to be a necessary consequence of defining subtyping in terms of
substitutability instead of the other way around. So (to say it again) the position can be
logically defended. But is it desirable? Myself, I wouldn't have thought so--though
it is admittedly still the case that a program that previously worked for bags,
and therefore could previously be invoked on stacks instead of bags, can still
be invoked on stacks instead of bags, because (by definition) such a program
won't be using the new UNION method anyway (except that there aren't any bags
now, there are "bag+"s instead.
Or do we have to keep type BAG as it was before, and define BAG+ as a
proper subtype of BAG? But then every
BAG value would be a BAG+ value too; thus, to say that BAG+ is a
"subtype" of BAG looks to me suspiciously like a hack. (In fact, in our own inheritance
model, we
require that if S is a proper subtype of T, then there must exist at least one
value of T that isn't a value of S.)
(To Be Continued)
Posted
08/23/02
[ABOUT]
[QUOTES]
[LINKS]