(Continued from Part 3)
1. Introduction
The problem of view updating has been the subject of considerable
study for many years. In the previous
three-part article, the present authors described an approach to the problem
that seems more satisfactory (i.e., more systematic and more robust) than
previous proposals, and applied that approach to the particular case of union,
intersection, and difference views. The
emphasis in that paper on those operators was deliberate--it was our feeling
that union, intersection, and difference, though important, had been largely neglected
in most previous work; moreover, we also felt that a strong feature of our
scheme was precisely the fact that it treated those operators correctly. However, we clearly need to show how our
approach applies to joins and other operators as well. Hence the present paper.
Note: The discussions
of this article, like those of the previous one, are quite informal. It is our intention to produce a formal
description of our scheme as soon as time permits.
We assume the reader has already seen this article’s
predecessor. The major contribution of
the earlier article was the identification of a series of principles that must
be satisfied by any systematic view updating mechanism.
We now discuss the application of the foregoing principles to
the updating of views whose definition involves relational operations other
than union, intersection, and difference.
The major operators we consider are restriction, projection,
extension, and join. As
before, we limit our attention to single-row updates only, for simplicity;
however, we must first repeat the following remarks from that paper.
"Important caveat: The reader must understand that considering single-row updates
only is in fact an oversimplification, and indeed a distortion of the
truth. Relational operations are always
set-at-a-time; a set containing a single row is merely a special case. What is more, a multi-row update is
sometimes required (i.e., some updates cannot be simulated by a series
of single-row operations). And this
remark is true of both base tables and views, in general. For example, suppose [the employees table
EMP is subject to the constraint that employees] E8 and E9 must have the same
salary. Then a single-row UPDATE that
changes the salary of just one of the two will necessarily fail. Since our
objective in this paper is merely to present an informal introduction to our
ideas, we will (as stated) describe the update rules in terms of single-row
operations. But the reader should not
lose sight of the foregoing important caveat."
One final preliminary remark: We should make it clear at the outset that for the operators
under consideration here, our rules (or some of them, at any rate) will
probably not look all that different from those found in other proposals--at
least at first glance. Nevertheless, we
claim that our rules are still more systematic than (for example) those of the
SQL standard, and in any case part of the point is that our rules must be
seen as a package; i.e., the rules for join (etc.) discussed in this
article cannot be separated from the rules discussed in the previous one for
union etc.
2. The Suppliers-And-Parts Database
All examples in this paper are based on the familiar
suppliers-and-parts database. See Fig.
1 for a simplified database definition and Fig. 2 for a set of sample data
values.
CREATE DOMAIN S# ...;
CREATE DOMAIN NAME ... ;
CREATE DOMAIN STATUS ...;
CREATE DOMAIN CITY ...;
CREATE DOMAIN P# ...;
CREATE DOMAIN COLOR ...;
CREATE DOMAIN WEIGHT ...;
CREATE DOMAIN QTY ...;
CREATE BASE TABLE S
(S# DOMAIN (S#),
SNAME DOMAIN (NAME),
STATUS DOMAIN (STATUS),
CITY DOMAIN (CITY))
PRIMARY KEY (S#);
CREATE BASE TABLE P
(P# DOMAIN (P#),
PNAME DOMAIN (NAME),
COLOR DOMAIN (COLOR),
WEIGHT DOMAIN (WEIGHT),
CITY DOMAIN (CITY))
PRIMARY KEY (P#);
CREATE BASE TABLE SP
(S# DOMAIN (S#),
P# DOMAIN (P#),
QTY DOMAIN (QTY))
PRIMARY KEY (S#,P#)
FOREIGN KEY (S#)
REFERENCES S
FOREIGN KEY (P#)
REFERENCES P;
Fig.
1: The suppliers-and-parts database (data definition)
S SP
+------------------------------+ +---------------+
¦ S# ¦ SNAME
¦ STATUS ¦ CITY ¦ ¦ S# ¦ P# ¦ QTY ¦
+----+-------+--------+--------¦ +----+----+-----¦
¦ S1 ¦ Smith
¦ 20 ¦ London ¦ ¦ S1 ¦ P1 ¦ 300 ¦
¦ S2 ¦ Jones
¦ 10 ¦ Paris ¦
¦ S1 ¦ P2 ¦ 200 ¦
¦ S3 ¦ Blake
¦ 30 ¦ Paris ¦
¦ S1 ¦ P3 ¦ 400 ¦
¦ S4 ¦ Clark
¦ 20 ¦ London ¦ ¦ S1 ¦ P4 ¦ 200 ¦
¦ S5 ¦ Adams
¦ 30 ¦ Athens ¦ ¦ S1 ¦ P5 ¦ 100 ¦
+------------------------------+ ¦ S1 ¦ P6 ¦ 100 ¦
P ¦ S2
¦ P1 ¦ 300 ¦
+--------------------------------------+ ¦ S2 ¦ P2 ¦ 400 ¦
¦ P# ¦ PNAME
¦ COLOR ¦ WEIGHT ¦ CITY ¦ ¦ S3 ¦ P2 ¦ 200 ¦
+----+-------+-------+--------+--------+ ¦ S4 ¦ P2 ¦ 200 ¦
¦ P1 ¦
Nut ¦ Red ¦ 12 ¦ London ¦ ¦ S4 ¦ P4 ¦ 300 ¦
¦ P2 ¦
Bolt ¦ Green ¦ 17 ¦ Paris ¦ ¦ S4 ¦ P5 ¦ 400 ¦
¦ P3 ¦ Screw
¦ Blue ¦ 17 ¦ Rome ¦ +---------------+
¦ P4 ¦ Screw
¦ Red ¦ 14 ¦ London ¦
¦ P5 ¦
Cam ¦ Blue ¦ 12 ¦ Paris ¦
¦ P6 ¦
Cog ¦ Red ¦ 19 ¦ London ¦
+--------------------------------------+
Fig.
2: The suppliers-and-parts database (sample values)
Notation: Throughout this
paper we use italic upper-case letters A, B, ... from near the beginning
of the alphabet to refer generically to tables, and italic upper-case letters X,
Y, ... from near the end of the alphabet to refer generically to columns of
such tables. The table predicates for A,
B, ... are PA, PB, ..., respectively. We use italic lower-case letters a, b, ... to refer
generically to rows of tables A, B, ..., respectively.
3. Updating Restrictions
First of all, note that the table predicate for the table
that results from the restriction operation (also known as a selection
operation)
A WHERE condition
(where condition is, specifically, a restriction
condition) is
( PA ) AND ( condition
)
For example, the table predicate for the restriction S WHERE
CITY = 'London' is
( PS ) AND ( CITY =
'London' )
where PS is the table predicate for the suppliers table
S. It follows that (for example) any
row r presented for insertion into a view defined by means of this
restriction must be such that the conditions PS(r) and r.CITY =
'London' both evaluate to true, or the INSERT will fail (In SQL, however,
such an attempt will fail only if the CREATE VIEW statement explicitly includes
the specification WITH CHECK OPTION--i.e., by default, it will not
fail.)
Here then are the update rules for A WHERE condition:
Ø INSERT:
The new row must satisfy both PA and condition. It is inserted into A.
Ø DELETE:
The row to be deleted is deleted from A.
Ø UPDATE:
The row to be updated must be such that the updated version satisfies both PA
and condition. The row is deleted from
A without performing any triggered actions or table predicate checks. The updated version of the row is then
inserted into A.
Examples: Let
view LS be defined as
S WHERE CITY = 'London'
Fig. 3 shows a sample tabulation of this view, corresponding
to the sample tabulation of S shown in Fig. 2.
LS
+------------------------------+
¦ S# ¦ SNAME ¦ STATUS ¦ CITY ¦
+----+-------+--------+--------¦
¦ S1 ¦ Smith ¦ 20 ¦ London ¦
¦ S4 ¦ Clark ¦ 20 ¦ London ¦
+------------------------------+
Fig. 3: View LS (sample values)
·
An attempt to insert the row <S6,Green,20,London>
into LS will succeed. The new row will
be inserted into table S, and will therefore be effectively inserted into the
view as well.
·
An attempt to insert the row <S1,Green,20,London>
into LS will fail, because it violates the table predicate for table
S--specifically, it violates the uniqueness constraint on the primary key S.S#
for table S.
·
An attempt to insert the row <S6,Green,20,Athens>
into LS will fail, because it violates the restriction condition CITY =
'London'.
·
An attempt to delete the LS row
<S1,Smith,20,London> will succeed.
The row will be deleted from table S, and will therefore be effectively
deleted from the view as well.
·
An attempt to update the LS row
<S1,Smith,20,London> to <S6,Green,20,London> will succeed. An attempt to update that same row
<S1,Smith,20,London> to either <S2,Smith,20,London> or
<S1,Smith,20,Athens> will fail--in the first case because it violates the
primary key uniqueness constraint on table S, in the second case because it
violates the restriction condition CITY = 'London'.
4. Updating Projections
Again we start by considering the relevant table
predicate. Let the columns of table A
be partitioned into two disjoint groups, X and Y say. Regard each of X and Y as a
single composite column. It is
clear, then, that a given row <x> will appear in the projection A[X]
if and only if there exists some value y from the domain of Y-values
such that the row <x,y> appears in A. For example, consider the projection of
table S over S#, SNAME, and CITY. Every
row <s#,sname,city> appearing in that projection is such that
there exists a status value status such that the row <s#,sname,status,city>
appears in table S.
Here then are the update rules for A[X]:
Ø INSERT: Let the row to be inserted be
<x>. Let the default value of Y
be y (A default value might be NULL in SQL.
We do not wish to get sidetracked into a discussion of the problems of
nulls here; for present purposes, it is irrelevant whether the default is null
or something else.) (it is an error if no such default value exists, i.e., if Y
has "defaults not allowed").
The row <x,y> (which must satisfy PA) is inserted into A.
Note: Since candidate keys will usually (but not invariably) have
"defaults not allowed," a projection that does not include all
candidate keys of the underlying table will usually not permit INSERTs.
Ø DELETE: All rows of A with the same X-value as the
row to be deleted from A[X] are deleted from A.
Note: In practice, it will usually be desirable that X include
at least one candidate key of A, so that the row to be deleted from A[X]
is derived from exactly one row a of A. However, there is no logical reason to make this a hard
requirement. (Analogous remarks apply
in the case of UPDATE also--see below.)
Ø UPDATE: Let the row to be updated be <x> and
let the updated version be <x'>.
Let a be a row of A with the same X-value x, and let the value of Y in
row a be y. All such rows a are deleted
from A without performing any triggered actions or table predicate checks. Then, for each such value y, row
<x',y> (which must satisfy PA) is inserted into A.
Note: It is here that the
"slight refinement" mentioned in Principle No. 7 in the
"Preliminaries" section shows itself. Specifically, observe that the final "INSERT" step in
the UPDATE rule reinstates the previous Y-value in each inserted row--it
does not replace it by the applicable default value, as a standalone
INSERT would.
Examples: Let
view SC be defined as
SC {S#, CITY}
Fig. 4 shows a sample tabulation of this view, corresponding
to the sample tabulation of S shown in Fig. 2.
SC
+-------------+
¦ S# ¦ CITY
¦
+----+--------+
¦ S1 ¦ London ¦
¦ S2 ¦ Paris
¦
¦ S3 ¦ Paris
¦
¦ S4 ¦ London ¦
¦ S5 ¦ Athens ¦
+-------------+
Fig. 4: View SC (sample values)
·
An attempt to insert the row <S6,London> into SC
will succeed, and will have the effect of inserting the row <S6,n,t,London>
into table S, where n and t are the default values for columns
S.SNAME and S.STATUS respectively.
·
An attempt to insert the row <S1,London> into SC
will fail, because it violates the table predicate for table S--specifically,
it violates the uniqueness constraint on the primary key S.S# for table S.
·
An attempt to delete the row <S1,London> from SC
will succeed. The row for S1 will be
deleted from table S.
·
An attempt to update the SC row <S1,London> to
<S1,Athens> will succeed; the effect will be to replace the row
<S1,Smith,20,London> in table S by the row <S1,Smith,20,Athens>--not
by the row <S1,n,t,Athens>, please observe.
·
An attempt to update that same SC row <S1,London>
to <S2,London> will fail (why, exactly?).
Consideration of the case in which the projection does not
include a candidate key of the underlying table--for example, the projection of
table S over STATUS and CITY--is left as an exercise for the reader.
5. Updating Extensions
The reader might perhaps be unfamiliar with the relational EXTEND
operator. Here is a brief
explanation. Basically, EXTEND takes a
specified table and--conceptually, at least--returns a new (derived) table that
is similar to the original table but includes an additional column, values of
which are obtained by evaluating some specified computational expression. For example, we might write:
EXTEND P ADD ( WEIGHT * 454 ) AS
GMWT
Assuming sample values for table P as given in Fig. 2, the
result of this expression is as shown in Fig. 5. (The assumption is that WEIGHT values are given in pounds; the
expression WEIGHT * 454 will convert those weights to grams.)
¦ P# ¦ PNAME ¦ COLOR ¦ WEIGHT ¦ CITY ¦ GMWT ¦
+----+-------+-------+--------+--------+------+
¦ P1 ¦ Nut
¦ Red ¦ 12 ¦ London ¦ 5448 ¦
¦ P2 ¦ Bolt
¦ Green ¦ 17 ¦ Paris ¦ 7718 ¦
¦ P3 ¦ Screw ¦ Blue ¦ 17 ¦ Rome ¦ 7718 ¦
¦ P4 ¦ Screw ¦ Red ¦ 14 ¦ London ¦ 6356
¦
¦ P5 ¦ Cam
¦ Blue ¦ 12 ¦ Paris ¦ 5448 ¦
¦ P6 ¦ Cog
¦ Red ¦ 19 ¦ London ¦ 8626 ¦
+---------------------------------------------+
Fig. 5: An example of EXTEND
Note that there is an exact one-to-one correspondence between
rows of the extension and rows of the underlying table.
In general, the table predicate PE for the table E
that results from the extension operation
EXTEND A ADD exp AS X
is as follows:
PA ( a ) AND e.X
= exp ( a )
Here e is a row of table E and a is the
projection of that row e over all columns of A. In stilted English:
"Every row e in the extension is such that (a)
the row a that is derived from e by projecting away the value e.X
satisfies PA, and (b) that value e.X is equal to the result of
applying the expression exp to that row a."
For example, the table predicate for the extension of table P
shown in Fig. 5 is as follows:
"Every row <p#,pname,color,weight,city,gmwt>
in the extension is such that (a) the row <p#,pname,color,weight,city>
satisfies the table predicate for P, and (b) the value gmwt is equal to
the value 454 * weight."
Here then are the update rules for E = EXTEND A
ADD exp AS X:
Ø
INSERT: Let the
row to be inserted be e; e must satisfy PE.
The row a that is derived from e by projecting away the value e.X is
inserted into A.
Ø
DELETE: Let the
row to be deleted be e. The row a that
is derived from e by projecting away the value e.X is deleted from A.
Ø
UPDATE: Let the
row to be updated be e and let the updated version be e'; e' must satisfy
PE. The row a that is derived from e by
projecting away the value e.X is deleted from A without performing any
triggered actions or table predicate checks.
The row a' that is derived from e' by projecting away the value e'.X is
inserted into A.
Examples (refer to Fig. 5):
·
An attempt to insert the row
<P7,Cog,Red,12,Paris,5448> will succeed, and will have the effect of
inserting the row <P7,Cog,Red,12,Paris> into table P.
·
An attempt to insert the row
<P7,Cog,Red,12,Paris,5449> will fail (why?).
·
An attempt to insert the row
<P1,Cog,Red,12,Paris,5448> will fail (why?).
·
An attempt to delete the row for P1 will succeed, and
will have the effect of deleting the row for P1 from table P.
·
An attempt to update the row for P1 to
<P1,Nut,Red,10,Paris,4540> will succeed; the effect will be to replace
the row <P1,Nut,Red,12,London> in table P by the row
<P1,Nut,Red,10,Paris>.
·
An attempt to update that same row to a row for P2
(with all other values unchanged) or a row in which the GMWT value is not equal
to 454 times the WEIGHT value will fail (in each case, why?).
(The authors would
like to thank Nagraj Alur, Hugh Darwen, Fabian Pascal, and Paul Winsberg for
their helpful comments on earlier drafts of this article.)
Comments On Republication: Originally published in Database
Programming & Design 7, No. 6 (June 1994) and published as a two-part
article in RELATIONAL
DATABASE WRITINGS1991-94. It is republished here by permission of David
McGoveran, Miller Freeman Inc. and Pearson
Education, Inc. © All rights reserved
by C.J. Date. Research has shown that certain detail level corrections
might be needed, which we may undertake in the future. However, we still
believe strongly that the overall approach is sound.
(Continued in Part 5)
[ABOUT]
[QUOTES]
[LINKS]