(Continued from Part 3)
5. Updating Joins
Most previous treatments of the view update problem--including
those by one of the present authors (Date)--have argued that the updatability
or otherwise of a given join depends, at least in part, on whether the join is
one-to-one, one-to-many, or many-to-many.
By contrast, it is the contention of the present paper that joins are always
updatable. Moreover, the update rules
are identical in all three cases, and are essentially quite straightforward.
What makes this claim
plausible--startling though it might seem at first sight--is the new
perspective on the problem that is afforded by adoption of the fundamental
principle stated at the beginning of the "Preliminaries" section in
this series. Broadly speaking, the
overall objective of support for the view mechanism has always been to make
views look as much like base tables as possible, and this objective is indeed a
laudable one. With regard to view
update specifically, however:
Ø It
is usually assumed (implicitly) that it is always possible to update an
individual row of a base table independently of all the other rows in that base
table.
Ø
By contrast, it is manifestly not always possible to
update an individual row of a view independently of all the other rows in that
view. For example, Codd shows that it
is not possible to delete just one row from a certain join, because the effect
would be to leave a table that "is not the join of any two tables
whatsoever" (which means that the result could not possibly satisfy the
table predicate for the view). And the
approach to such view updates historically has always been to reject them
altogether, on the grounds that it is impossible to make them look completely
like base table updates.
Our approach is rather different. We recognize the fact that even with a base table, it is not
always possible to update individual rows independently of all the rest. (Consider what would happen, for example, if
the suppliers base table were subject to the constraint that either suppliers
S1 and S4 both appear or neither of them does.) Typically, therefore, we accept those view updates that have
historically been rejected, interpreting them in an obvious and logically
correct way to apply to the underlying table(s); we accept them, moreover, in
full recognition of the fact that updating those underlying tables might well
have side-effects on the view--side-effects that are, however, required in
order to avoid the possibility that the view might violate its own table
predicate.
With that preamble
out of the way, let us now get down to detail.
In what follows, we first define our terms. Then we present the update rules for joins. Then we consider the implications of those
rules for each of the three cases (one-to-one, one-to-many, many-to-many) in
turn.
First of all, then, we take the term "join" to mean
natural join specifically. Let
the columns of table A be partitioned into two disjoint groups, X
and Y say. Likewise, let the
columns of table B be partitioned into two disjoint groups, Y and
Z say. Now suppose that the
columns of Y (only) are common to the two tables, so that the columns of
X are "the other columns" of A and the columns of Z
are "the other columns" of B.
Suppose also that corresponding columns of Y (i.e., columns with
the same name) are defined on the same domain.
Finally, regard each of X, Y, and Z as a single composite
column. Then the expression
A JOIN B
yields a table with columns {X,Y,Z} consisting of all
rows <x,y,z> such that the row <x,y> appears in A
and the row <y,z> appears in B. The table predicate PJ for J = A JOIN B
is thus
PA ( a ) AND PB
( b )
where for a given row j of the join, a is the
"A-portion" of j (i.e., the row that is derived from j
by projecting away the value j.Z) and b is the "B-portion"
of j (i.e., the row that is derived from j by projecting away the
value j.X). In other words:
"Every row in the join is such
that the A-portion satisfies PA and the B-portion
satisfies PB."
For example, the table predicate for the join of tables S and
SP over S# is as follows:
"Every row <s#,sname,status,city,p#,qty>
in the join is such that the row <s#,sname,status,city> satisfies
the table predicate for S and the row <s#,p#,qty> satisfies the
table predicate for SP."
Here then are the update rules for J = A JOIN B:
Ø
INSERT: The new
row j must satisfy PJ. If the A-portion
of j does not appear in A, it is inserted into A (Note that this INSERT might
have the side-effect of inserting the B-portion into B also, as with INSERTs
on, e.g., unions or intersections. Analogous remarks apply to the DELETE and
UPDATE rules also; for brevity, we do not bother to spell out all such
possibilities here. If the B-portion of j does not appear in B, it is inserted
into B.
Note: The specific procedural manner in which the foregoing rule is
stated ("insert into A, then insert into B") should be
understood purely as a pedagogical device; it should not be taken to mean that
the DBMS would execute exactly that procedure in practice. Indeed, the principle of symmetry--No. 5
from the "Preliminaries" section--implies as much, because neither A
nor B has precedence over the other.
Analogous remarks apply to all of the other rules below.
Ø
DELETE: The A-portion
of the row to be deleted is deleted from A and the B-portion is
deleted from B.
Ø
UPDATE: The row
to be updated must be such that the updated version satisfies PJ. The A-portion is deleted from A,
without performing any triggered actions or table predicate checks, and the B-portion
is deleted from B, again without performing any triggered actions or
table predicate checks. Then, if the A-portion
of the updated version of the row does not appear in A, it is inserted
into A; if the B-portion does not appear in B, it is
inserted into B.
Let us now examine the implications of these rules for the
three different cases.
Case 1 (one-to-one):
The term "one-to-one" here would more accurately be
"(one-or-zero)-to-(one-or-zero)."
In other words, there is a DBMS-known integrity constraint in effect
that guarantees that for each row of A there is at most one matching row
in B and vice versa. More
precisely, the set of columns Y over which the join is performed must
include a subset (not necessarily a proper subset) K, say, such that K
is a candidate key for A and a candidate key for B.
Examples:
Ø
For a first example, the reader is invited to consider
the effect of the foregoing rules on the join of the suppliers table S to
itself over supplier numbers (only).
Ø
By way of a second example, suppose the
suppliers-and-parts database includes another base table, SR { S#, REST },
where S# identifies a supplier and REST identifies that supplier's favorite
restaurant. Assume that not all
suppliers in table S are represented in table SR. The reader is invited to consider the effect of the foregoing
rules on the join of tables S and SR (over S#). What difference would it make if a given supplier could be
represented in table SR and not in table S?
Case 2 (one-to-many):
The term "one-to-many" here would more accurately
be "(zero-or-one)-to-(zero-or-more)." In other words, there is a DBMS-known integrity constraint in effect
that guarantees that for each row of B there is at most one matching row
in A. Typically, what this means
is that the set of "common columns" Y over which the join is
performed must include a subset (not necessarily a proper subset) K,
say, such that K is a candidate key for A and a matching foreign
key for B (If this is in fact the case, and if (as we would strongly
recommend) that foreign key has "nulls not allowed," we can replace
the phrase "zero or one" by "exactly one."
Examples:
Let view SSP be defined as
S JOIN SP
(this is a foreign-key-to-matching-candidate-key join, of
course). Sample values are shown in
Fig. 6.
SSP
+-----------------------------------------+
¦ S# ¦ SNAME ¦ STATUS ¦ CITY ¦ P# ¦ QTY ¦
+----+-------+--------+--------+----+-----¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P1 ¦ 300 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P2 ¦ 200 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P3 ¦ 400 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P4 ¦ 200 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P5 ¦ 100 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P6 ¦ 100 ¦
¦ S2 ¦ Jones ¦ 10 ¦ Paris ¦ P1 ¦ 300
¦
¦ S2 ¦ Jones ¦ 10 ¦ Paris ¦ P2 ¦ 400
¦
¦ S3 ¦ Blake ¦ 30 ¦ Paris ¦ P2 ¦ 200
¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦ P2 ¦ 200 ¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦ P4 ¦ 300 ¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦ P5 ¦ 400 ¦
+-----------------------------------------+
Fig. 6: View SSP (sample values)
An attempt to insert the row
<S4,Clark,20,London,P6,100> into SSP will succeed, and will have the
effect of inserting the row <S4,P6,100> into table SP (thereby adding a
row to the view).
An attempt to insert the row
<S5,Adams,30,Athens,P6,100> into SSP will succeed, and will have the
effect of inserting the row <S5,P6,100> into table SP (thereby adding a
row to the view).
An attempt to insert the row
<S6,Green,20,London,P6,100> into SSP will succeed, and will have the
effect of inserting the row <S6,Green,20,London> into table S and the row
<S6,P6,100> into table SP (thereby adding a row to the view).
Note: Suppose for
the moment that it is possible for SP rows to exist without a corresponding S
row. Suppose moreover that table SP
already includes some rows with supplier number S6 (but not one with supplier
number S6 and part number P1). Then the
INSERT in the example just discussed will have the effect of inserting some
additional rows into the view--namely, the join of the row
<S6,Green,20,London> and those previously existing SP rows for supplier
S6.
·
An attempt to insert the row
<S4,Clark,20,Athens,P6,100> into SSP will fail (why?).
·
An attempt to insert the row
<S5,Adams,30,London,P6,100> into SSP will fail (why?).
·
An attempt to insert the row
<S1,Smith,20,London,P1,400> into SSP will fail (why?).
·
An attempt to delete the row
<S3,Blake,30,Paris,P2,200> from SSP will succeed, and will have the
effect of deleting the row <S3,Blake,30,Paris> from table S and the row
<S3,P2,200> from table SP.
·
An attempt to delete the row
<S1,Smith,20,London,P1,300> from SSP will "succeed" (see the
note below) and will have the effect of deleting the row <S1,Smith,20,London>
from table S and the row <S1,P1,300> from table SP.
Note: Actually the
overall effect of this attempted DELETE will depend on the foreign key delete
rule from SP.S# to S.S#. If the rule is
RESTRICT the overall operation will fail.
If it is CASCADE it will have the side-effect of deleting all other SP
rows for supplier S1 as well. Other
possibilities are left as an exercise for the reader.
·
An attempt to update the SSP row
<S1,Smith,20,London,P1,300> to <S1,Smith,20,London,P1,400> will succeed,
and will have the effect of updating the SP row <S1,P1,300> to
<S1,P1,400>.
·
An attempt to update the SSP row
<S1,Smith,20,London,P1,300> to <S1,Smith,20,Athens,P1,400> will
succeed, and will have the effect of updating the S row <S1,Smith,20,London>
to <S1,Smith,20,Athens> and the SP row <S1,P1,300> to
<S1,P1,400>.
·
An attempt to update the SSP row
<S1,Smith,20,London,P1,300> to <S6,Smith,20,London,P1,300> will
"succeed" (see the note below) and will have the effect of updating
the S row <S1,Smith,20,London> to <S6,Smith,20,London> and the SP
row <S1,P1,300> to <S6,P1,300>.
Note: Actually,
the overall effect of this attempted update will depend on the foreign key
update rule from SP.S# to S.S#. The
details are left as another exercise for the reader.
Case 3 (many-to-many):
The term "many-to-many" here would more accurately
be "(zero-or-more)-to-(zero-or-more)." In other words, there is no DBMS-known integrity constraint in
effect that guarantees that we are really dealing with a Case 1 or Case 2
situation.
Examples:
Let view SCP be defined as
S JOIN P
(join of S and P over CITY--a many-to-many join). Sample values are shown in Fig. 7.
SC
+------------------------------------------------------------+
| S# ¦ SNAME ¦ STATUS ¦ CITY ¦ P# ¦ PNAME ¦ COLOR ¦ WEIGHT ¦
+----+-------+--------+--------+----+-------+-------+--------¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P1 ¦ Nut
¦ Red ¦ 12 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P4 ¦ Screw ¦ Red ¦ 14 ¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦ P6 ¦ Cog
¦ Red ¦ 19 ¦
¦ S2 ¦ Jones ¦ 10 ¦ Paris ¦ P2 ¦
Bolt ¦ Green ¦ 17 ¦
¦ S2 ¦ Jones ¦ 10 ¦ Paris ¦ P5 ¦
Cam ¦ Blue ¦ 12 ¦
¦ S3 ¦ Blake ¦ 30 ¦ Paris ¦ P2 ¦
Bolt ¦ Green ¦ 17 ¦
¦ S3 ¦ Blake ¦ 30 ¦ Paris ¦ P5 ¦
Cam ¦ Blue ¦ 12 ¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦ P1 ¦ Nut
¦ Red ¦ 12 ¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦ P4 ¦ Screw ¦ Red ¦ 14 ¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦ P6 ¦ Cog
¦ Red ¦ 19 ¦
+------------------------------------------------------------+
Fig. 7: The join
of S and P over CITY
·
Inserting the row
<S7,Brown,15,Oslo,P8,Wheel,White,25> will succeed, and will have the
effect of inserting the row <S7,Brown,15,Oslo> into table S and the row
<P8,Wheel,White,25,Oslo> into table P (thereby adding the specified row
to the view).
·
Inserting the row
<S1,Smith,20,London,P7,Washer,Red,5> will succeed, and will have the
effect of inserting the row <P7,Washer,Red,5,London> into table P
(thereby adding two rows to the view--the row <S1,Smith,20,London,P7,Washer,Red,5>
(as specified) and the row <S4,Clark,20,London,P7,Washer,Red,5>).
·
Inserting the row
<S6,Green,20,London,P7,Washer,Red,5> will succeed, and will have the
effect of inserting the row <S6,Green,20,London> into table S and the row
<P7,Washer,Red,5,London> into table P (thereby adding six rows to
the view).
·
Deleting the row
<S1,Smith,20,London,P1,Nut,Red,12> will succeed, and will have the effect
of deleting the row <S1,Smith,20,London> from table S and the row
<P1,Nut,Red,12,London> from table P (thereby deleting four rows
from the view).
Further examples are
left as an exercise for the reader.
(The authors would
like to thank Nagraj Alur, Hugh Darwen, Fabian Pascal, and Paul Winsberg for
their helpful comments on earlier drafts of this article.)
Comments On Republication: Originally published in Database
Programming & Design 7, No. 6 (June 1994) and published as a two-part
article in RELATIONAL DATABASE
WRITINGS 1991-94. It is
republished here by permission of David McGoveran, Miller Freeman Inc. and Pearson Education, Inc. © All rights
reserved by C.J. Date. Research has shown that certain detail level
corrections might be needed, which we may undertake in the future. However, we
still believe strongly that the overall approach is sound.
(Continued in Part 6)
Posted
01/10/03
[ABOUT]
[QUOTES]
[LINKS]