(Continued from Part 2)
5. Updating Unions
The general principles articulated in the Part 2 apply
to all kinds of updates on all kinds of tables. In particular, they apply to updates on joins, restrictions,
projections, etc. For the remainder of
this paper, however, we concentrate on the question of updates on unions,
intersections, and differences specifically (unions in this section,
intersections and differences in the next).
We begin with a few preliminary remarks.
1.
We assume we are updating a table defined by means of an
expression of the form A UNION B or A INTERSECT B
or A MINUS B (as appropriate), where A and B are
arbitrary relational expressions (i.e., they are not necessarily base
tables). A and B must be
type-compatible.
2.
The table predicates corresponding to A and B
are PA and PB, respectively.
3.
Several of the view update rules refer to the possibility of side
effects. Now, it is well known that
side-effects are usually undesirable; the point is, however, that side-effects
might be unavoidable if A and B happen to be overlapping subsets
of the same underlying table, as will frequently be the case with union,
intersection, and difference views.
4.
We limit our attention to single-row updates only, for
simplicity.
Important caveat:
The reader must understand that considering single-row updates only is
in fact an oversimplification, and indeed a distortion of the
truth. Relational operations are always
set-at-a-time; a set containing a single row is merely a special case. What is more, a multi-row update is
sometimes required (i.e., some updates cannot be simulated by a series
of single-row operations). And this
remark is true of both base tables and views, in general. For example, suppose table EMP includes two
additional employees, E8 and E9, and is subject to the constraint that E8 and
E9 must have the same salary. Then a
single-row UPDATE that changes the salary of just one of the two will
necessarily fail.
Since our objective in this paper is merely to present an informal
introduction to our ideas, we will (as stated) describe the update rules in
terms of single-row operations. But the
reader should not lose sight of the foregoing important caveat.
Here then is the INSERT rule for A UNION B:
Ø
The new row must satisfy PA or PB or both. If it satisfies PA, it is inserted into A
(note that this INSERT might have the side-effect of inserting the row into B
also). If it satisfies PB, it is
inserted into B, unless it was inserted into B already as a side effect of
inserting it into A.
Note: The specific
procedural manner in which the foregoing rule is stated ("insert into A,
then insert into B") should be understood purely as a pedagogic
device; it should not be taken to mean that the DBMS will execute exactly
that procedure in practice. Indeed, the
principle of symmetry--No. 4 from the "Further Principles" section--implies
as much, because neither A nor B has precedence over the
other. Analogous remarks apply to all
of the rules discussed in this paper.
Explanation:
Ø
The new row must satisfy at least one of PA and PB
because otherwise it does not qualify for inclusion in A UNION B--i.e.,
it does not satisfy the table predicate, viz. (PA) OR (PB),
for A UNION B. (As an
aside, we note also that the new row must not already appear in either A
or B, because otherwise we would be trying to insert a row that already
exists.)
Ø
If the requirements of the previous paragraph are
satisfied, the new row is inserted into whichever of A or B it
logically belongs to (possibly both).
Examples:
Let view UV be defined as
(EMP WHERE DEPT# = 'D1') UNION (EMP WHERE SAL > 33K)
Fig. 2 shows a sample tabulation of this view, corresponding
to the sample tabulation of EMP shown in Fig. 1.
+----------------------------+
¦ EMP# ¦ ENAME ¦ DEPT# ¦ SAL ¦
+------+-------+-------+-----¦
|
E1 ¦ Lopez ¦ D1 ¦ 25K ¦
¦ E2
¦ Cheng ¦ D1 ¦ 42K ¦
¦ E4
¦ Saito ¦ D2 ¦ 45K ¦
+----------------------------+
Fig. 2: View UV (sample values)
Ø
Let the row to be inserted be
<E5,Smith,D1,30K>. This row
satisfies the table predicate for EMP WHERE DEPT# = 'D1' (though not the table
predicate for EMP WHERE SAL > 33K).
It is therefore inserted into EMP WHERE DEPT# = 'D1'. Because of the rules regarding INSERT on a
restriction (which are fairly obvious and are not spelled out in detail here),
the effect is to insert the new row into the EMP base table.
Ø
Now let the row to be inserted be
<E6,Jones,D1,40K>. This row
satisfies the table predicate for EMP WHERE DEPT# = 'D1' and the table
predicate for EMP WHERE SAL > 33K.
It is therefore logically inserted into both. However, inserting it into either of the two restrictions has the
side-effect of inserting it into the other anyway, so there is no need to
perform the second INSERT explicitly.
Now suppose EMPA and EMPB are two distinct base
tables, EMPA representing employees in department D1 and EMPB representing
employees with salary > 33K (see Fig. 3); suppose view UV is defined as EMPA
UNION EMPB, and consider again the two sample INSERTs previously
discussed. Inserting the row
<E5,Smith,D1,30K> into view UV will cause that row to be inserted into base
table EMPA, presumably as required.
However, inserting the row <E6,Jones,D1,40K> into view UV will
cause that row to be inserted into both base tables! This result is logically correct, although
arguably counterintuitive (it is an example of what we called a "slightly
surprising result" earlier). It
is the authors' position that such surprises can occur only if the database is
badly designed. In particular, it is our position that a design that
permits the very same row to appear in--i.e., to satisfy the table predicate
for--two distinct base tables is by definition a bad design. This (perhaps controversial!) position is
elaborated elsewhere.
EMPA EMPB
+----------------------------+
+----------------------------+
¦
EMP# ¦ ENAME ¦ DEPT# ¦ SAL ¦ ¦ EMP# ¦ ENAME ¦ DEPT# ¦ SAL ¦
+------+-------+-------+-----¦
+------+-------+-------+-----¦
¦ E1
¦ Lopez ¦ D1 ¦ 25K ¦ ¦ E2 ¦ Cheng ¦ D1 ¦ 42K ¦
¦ E2
¦ Cheng ¦ D1 ¦ 42K ¦ ¦ E4 ¦ Saito ¦ D2 ¦ 45K ¦
+----------------------------+
+----------------------------+
Fig. 3: Base tables EMPA and
EMPB (sample values)
Note: To pave the
way for an understanding of “bad design”, the reader might care to meditate on
the fact that the two base tables EMPA and EMPB already both contain the row
<E2,Cheng,D1,42K>. How did this
state of affairs arise?
We turn now to the DELETE rule for A UNION B:
Ø
If the row to be deleted appears in A, it is deleted
from A (note that this DELETE might have the side-effect of deleting the row
from B also). If it (still) appears in
B, it is deleted from B.
Examples to illustrate this rule are left as an exercise for
the reader. Note that (in general)
deleting a row from A or B might cause a cascade DELETE or some
other triggered action to be performed.
Finally, the UPDATE rule:
Ø
The row to be updated must be such that the updated
version satisfies PA or PB or both. If
the row to be updated appears in A, it is deleted from A without performing any
triggered actions (cascade DELETE, etc.) that such a DELETE would normally
cause, and likewise without checking the table predicate for A. Note that this DELETE might have the side
effect of deleting the row from B also.
If the row (still) appears in B, it is deleted from B (again without any
triggered actions or table predicate checks).
Next, if the updated version of the row satisfies PA, it is inserted
into A (note that this INSERT might have the side-effect of inserting the
updated version into B also). Finally,
if the updated version satisfies PB, it is inserted into B, unless it was
inserted into B already as a side effect of inserting it into A.
This UPDATE rule essentially consists of the DELETE rule
followed by the INSERT rule, except that (as indicated) no triggered actions or
table predicate checks are performed after the DELETE (any triggered actions
associated with the UPDATE are conceptually performed after all deletions and
insertions have been done, just prior to the table predicate checks).
It is worth pointing out that one important consequence of
treating UPDATEs in this fashion is that a given UPDATE can effectively cause a
row to move from one table to another.
Given the database of Fig. 3, for example, updating the row
<E1,Lopez,D1,25K> within view UV to <E1,Lopez,D2,40K> will delete
the existing row for Lopez from EMPA and insert the updated row for Lopez into
EMPB.
6. Updating Intersections And Differences
Here now are the rules for updating A INTERSECT B. This time we simply state the rules without
further discussion (they follow the same general pattern as the union
rules). Again, examples to illustrate
the various cases are left as an exercise for the reader.
Ø
INSERT: The new
row must satisfy both PA and PB.
If it does not currently appear in A, it is inserted into A
(note that this INSERT might have the side-effect of inserting the row into B
also). If it (still) does not appear in
B, it is inserted into B.
Ø
DELETE: The row
to be deleted is deleted from A (note that this DELETE might have the
side-effect of deleting the row from B also). If it (still) appears in B, it is deleted from B.
Ø
UPDATE: The row
to be updated must be such that the updated version satisfies both PA
and PB. The row is deleted from A
without performing any triggered actions or table predicate checks (note that
this DELETE might have the side-effect of deleting it from B also); if
it (still) appears in B, it is deleted from B, again without any
triggered actions or table predicate checks.
Next, if the updated version of the row does not currently appear in A,
it is inserted into A (note that this INSERT might have the side-effect
of inserting the row into B also).
If it (still) does not appear in B, it is inserted into B.
And here are the rules for updating A MINUS B:
Ø
INSERT: The new
row must satisfy PA and not PB. It is
inserted into A.
Ø
DELETE: The row
to be deleted is deleted from A.
Ø
UPDATE: The row
to be updated must be such that the updated version satisfies PA and not
PB. The row is deleted from A without
performing any triggered actions or table predicate checks; the updated version
of the row is then inserted into A.
7. Concluding Remarks
We have described a systematic approach to the view-updating
problem in general, and have applied that approach to the question of updating
union, intersection, and difference views in particular. A critical aspect of our approach is that a
given row can appear in a given table only if that row does not cause the table
predicate for that table to be violated, and this observation is just as true
for a view as it is for a base table.
In other words, the table predicate for a given table constitutes the criterion
for update acceptability for that table.
Regarding the rules for union, intersection, and difference
views specifically, the following desirable properties of our approach are
worth calling out explicitly:
1.
Each kind of view supports all three update operations (INSERT,
UPDATE, DELETE). By contrast, other
proposals allow, e.g., DELETE but not INSERT on a union view, implying that the
user might be able to delete a row from a given view and then not be able to
insert that very same row back into that very same view.
2. Certain
important equivalences are preserved.
For example, the expressions A INTERSECT B and A
MINUS (A MINUS B) are semantically identical and should thus
display identical update behavior if treated as view definitions, and so they
do (exercise for the reader!).
3. For
union and difference, INSERT and DELETE are always inverses of each other;
however, for intersection they might not be (quite). For instance, if A and B are distinct base tables,
inserting row r into V = A INTERSECT B might cause r
to be inserted into A only (because it is already present in B);
subsequently deleting r from V will now cause r to be
deleted from both A and B.
(On the other hand, deleting r and then reinserting it will
always preserve the status quo.)
However, it is once again the authors' position that such an asymmetry
can arise only if the database is badly designed (in particular, if the design
permits the very same row to satisfy the table predicate for two distinct base
tables).
Of the foregoing, we remark that 1 (support for all three
update operations) and 3 (INSERT and DELETE inverses of each other) might be
regarded as two more principles that a systematic view updating mechanism
really ought to satisfy if possible. 2
(certain equivalences preserved) is in fact a special case of the second of the
principles already stated in the section "Further Principles"
earlier.
Finally, we note that (of course) few DBMS products today
support any kind of updates at all on union, intersection, and difference
views. It is our hope that this paper
can serve as a guideline to be followed (a) by vendors in adding the necessary
support to their products; (b) by the SQL standards committees in their efforts
to develop the next iteration of the SQL standard [Ed.
Comment: Don’t hold your breath]. In the meantime, DBAs and
application programmers who have to develop workaround solutions (using,
perhaps, stored or triggered procedures) to the problems caused by the current
lack of support would be well advised to adhere to the principles described in
the foregoing sections.
(The authors would like to thank Nagraj Alur, Hugh Darwen,
Fabian Pascal, and Paul Winsberg for their helpful comments on earlier drafts
of this paper.)
Comments On Republication: Originally published in Database
Programming & Design 7, No. 6 (June 1994) and published as a two-part
article in RELATIONAL
DATABASE WRITINGS1991-94. It is republished here by permission of David
McGoveran, Miller Freeman Inc. and Pearson
Education, Inc. © All rights reserved
by C.J. Date. research has shown that certain detail level corrections
might be needed, which we may undertake in the future. However, we still
believe strongly that the overall approach is sound.
(See also Updating
Joins and Other Views Parts 4, 5 and 6.)
References And Bibliography
1.
International Organization for Standardization (ISO): Database
Language SQL, Document ISO/IEC 9075:1992.
Also available as American National Standards Institute (ANSI) Document
ANSI X3.135-1992.
2.
D. D. Chamberlin, J. N. Gray, and I. L. Traiger: "Views,
Authorization, and Locking in a Relational Data Base System," Proc. NCC 44,
Anaheim, Calif. Montvale, N.J.: AFIPS
Press (May 1975).
3.
E. F. Codd: The Relational Model for Database Management
Version 2. Reading, Mass.:
Addison-Wesley (1990).
4.
Hugh Darwen: "Without Check Option," in C. J. Date
and Hugh Darwen, Relational Database Writings 1989-1991. Reading, Mass.: Addison-Wesley (1992).
5.
C. J. Date: An
Introduction to Database Systems (6th edition). Reading, Mass.: Addison-Wesley (1994).
6.
C. J. Date: "A Matter of Integrity" (in three parts)
(in this volume).
7.
C. J. Date: "A Contribution to the Study of Database
Integrity," in Relational Database Writings 1985-1989. Reading, Mass.: Addison-Wesley (1990).
8.
C. J. Date: "Updating Views," in Relational
Database: Selected Writings.
Reading, Mass.: Addison-Wesley (1986).
9.
C. J. Date and Hugh Darwen: A Guide to the SQL Standard
(3rd edition). Reading, Mass.:
Addison-Wesley (1993).
10.
C. J. Date and David McGoveran: "Updating Joins and Other
Views" (in this volume).
11.
C. J. Date and Colin J. White: A Guide to DB2 (4th
edition). Reading, Mass.:
Addison-Wesley (1992).
12.
Umeshwar Dayal and Philip A. Bernstein: "On the Correct
Translation of Update Operations on Relational Views," ACM TODS 7,
No. 3 (September 1982).
13.
L. Furtado and M. A. Casanova: "Updating Relational
Views," in W. Kim, D. Reiner, and D. Batory (eds.), Query Processing in
Database Systems. New York, N.Y.:
Springer Verlag (1985).
14.
Nathan Goodman: "View Update Is Practical," InfoDB
5, No. 2 (Summer 1990).
15.
Arthur M. Keller:
"Algorithms for Translating View Updates to Database Updates for
Views Involving Selections, Projections, and Joins," Proc. 4th ACM
SIGACT-SIGMOD Symposium on Principles of Database Systems, Portland, Ore.
(March 1985).
16.
David McGoveran and C. J. Date: A Guide to SYBASE and SQL
Server. Reading, Mass.:
Addison-Wesley (1992).
17.
David McGoveran and C. J. Date: "A New Database Design
Principle" (in this volume).
18.
M. R. Stonebraker: "Implementation of Views and Integrity
Constraints by Query Modification," Proc. ACM SIGMOD International
Conference on Management of Data, San Jose, Calif. (May 1975).
Posted
12/13/02
[ABOUT]
[QUOTES]
[LINKS]