UPDATING VIEWS PART 4: JOINS AND OTHER VIEWS
by Chris Date and David McGoveran

 

 

 

 

(Continued from Part 3)

 

 

1. Introduction

 

The problem of view updating has been the subject of considerable study for many years.  In the previous three-part article, the present authors described an approach to the problem that seems more satisfactory (i.e., more systematic and more robust) than previous proposals, and applied that approach to the particular case of union, intersection, and difference views.  The emphasis in that paper on those operators was deliberate--it was our feeling that union, intersection, and difference, though important, had been largely neglected in most previous work; moreover, we also felt that a strong feature of our scheme was precisely the fact that it treated those operators correctly.  However, we clearly need to show how our approach applies to joins and other operators as well.  Hence the present paper.

 

Note:  The discussions of this article, like those of the previous one, are quite informal.  It is our intention to produce a formal description of our scheme as soon as time permits.

 

We assume the reader has already seen this article’s predecessor.  The major contribution of the earlier article was the identification of a series of principles that must be satisfied by any systematic view updating mechanism. 

 

We now discuss the application of the foregoing principles to the updating of views whose definition involves relational operations other than union, intersection, and difference.  The major operators we consider are restriction, projection, extension, and join.  As before, we limit our attention to single-row updates only, for simplicity; however, we must first repeat the following remarks from that paper.

 

"Important caveat:  The reader must understand that considering single-row updates only is in fact an oversimplification, and indeed a distortion of the truth.  Relational operations are always set-at-a-time; a set containing a single row is merely a special case.  What is more, a multi-row update is sometimes required (i.e., some updates cannot be simulated by a series of single-row operations).  And this remark is true of both base tables and views, in general.  For example, suppose [the employees table EMP is subject to the constraint that employees] E8 and E9 must have the same salary.  Then a single-row UPDATE that changes the salary of just one of the two will necessarily fail. Since our objective in this paper is merely to present an informal introduction to our ideas, we will (as stated) describe the update rules in terms of single-row operations.  But the reader should not lose sight of the foregoing important caveat."

 

One final preliminary remark:  We should make it clear at the outset that for the operators under consideration here, our rules (or some of them, at any rate) will probably not look all that different from those found in other proposals--at least at first glance.  Nevertheless, we claim that our rules are still more systematic than (for example) those of the SQL standard, and in any case part of the point is that our rules must be seen as a package; i.e., the rules for join (etc.) discussed in this article cannot be separated from the rules discussed in the previous one for union etc.

 

 

2. The Suppliers-And-Parts Database

 

All examples in this paper are based on the familiar suppliers-and-parts database.  See Fig. 1 for a simplified database definition and Fig. 2 for a set of sample data values.

 

CREATE DOMAIN S# ...;

CREATE DOMAIN NAME ... ;

CREATE DOMAIN STATUS ...;

CREATE DOMAIN CITY ...;

CREATE DOMAIN P# ...;

CREATE DOMAIN COLOR ...;

CREATE DOMAIN WEIGHT ...;

CREATE DOMAIN QTY ...;

 

CREATE BASE TABLE S

 (S# DOMAIN (S#),

  SNAME DOMAIN (NAME),

  STATUS DOMAIN (STATUS),

  CITY DOMAIN (CITY))

  PRIMARY KEY (S#);

 

CREATE BASE TABLE P

 (P# DOMAIN (P#),

  PNAME DOMAIN (NAME),

  COLOR DOMAIN (COLOR),

  WEIGHT  DOMAIN (WEIGHT),

  CITY DOMAIN (CITY))

  PRIMARY KEY (P#);

 

CREATE BASE TABLE SP

 (S# DOMAIN (S#),

  P# DOMAIN (P#),

  QTY DOMAIN (QTY))

  PRIMARY KEY (S#,P#)

  FOREIGN KEY (S#) REFERENCES S

  FOREIGN KEY (P#) REFERENCES P;

 

Fig. 1: The suppliers-and-parts database (data definition)

 

S                                         SP

+------------------------------+          +---------------+

¦ S# ¦ SNAME ¦ STATUS ¦ CITY   ¦          ¦ S# ¦ P# ¦ QTY ¦

+----+-------+--------+--------¦          +----+----+-----¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦          ¦ S1 ¦ P1 ¦ 300 ¦

¦ S2 ¦ Jones ¦     10 ¦ Paris  ¦          ¦ S1 ¦ P2 ¦ 200 ¦

¦ S3 ¦ Blake ¦     30 ¦ Paris  ¦          ¦ S1 ¦ P3 ¦ 400 ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦          ¦ S1 ¦ P4 ¦ 200 ¦

¦ S5 ¦ Adams ¦     30 ¦ Athens ¦          ¦ S1 ¦ P5 ¦ 100 ¦

+------------------------------+          ¦ S1 ¦ P6 ¦ 100 ¦

P                                         ¦ S2 ¦ P1 ¦ 300 ¦

+--------------------------------------+  ¦ S2 ¦ P2 ¦ 400 ¦

¦ P# ¦ PNAME ¦ COLOR ¦ WEIGHT ¦ CITY   ¦  ¦ S3 ¦ P2 ¦ 200 ¦

+----+-------+-------+--------+--------+  ¦ S4 ¦ P2 ¦ 200 ¦

¦ P1 ¦ Nut   ¦ Red   ¦     12 ¦ London ¦  ¦ S4 ¦ P4 ¦ 300 ¦

¦ P2 ¦ Bolt  ¦ Green ¦     17 ¦ Paris  ¦  ¦ S4 ¦ P5 ¦ 400 ¦

¦ P3 ¦ Screw ¦ Blue  ¦     17 ¦ Rome   ¦  +---------------+

¦ P4 ¦ Screw ¦ Red   ¦     14 ¦ London ¦

¦ P5 ¦ Cam   ¦ Blue  ¦     12 ¦ Paris  ¦

¦ P6 ¦ Cog   ¦ Red   ¦     19 ¦ London ¦

+--------------------------------------+

 

Fig. 2: The suppliers-and-parts database (sample values)

 

Notation:  Throughout this paper we use italic upper-case letters A, B, ... from near the beginning of the alphabet to refer generically to tables, and italic upper-case letters X, Y, ... from near the end of the alphabet to refer generically to columns of such tables.  The table predicates for A, B, ... are PA, PB, ..., respectively.  We use italic lower-case letters a, b, ... to refer generically to rows of tables A, B, ..., respectively.

 

 

3. Updating Restrictions

 

First of all, note that the table predicate for the table that results from the restriction operation (also known as a selection operation)

 

A WHERE condition

 

(where condition is, specifically, a restriction condition) is

 

( PA ) AND ( condition

 

For example, the table predicate for the restriction S WHERE CITY = 'London' is

 

( PS ) AND ( CITY = 'London' )

 

where PS is the table predicate for the suppliers table S.  It follows that (for example) any row r presented for insertion into a view defined by means of this restriction must be such that the conditions PS(r) and r.CITY = 'London' both evaluate to true, or the INSERT will fail (In SQL, however, such an attempt will fail only if the CREATE VIEW statement explicitly includes the specification WITH CHECK OPTION--i.e., by default, it will not fail.)

 

Here then are the update rules for A WHERE condition:

 

Ø       INSERT: The new row must satisfy both PA and condition.  It is inserted into A.

Ø       DELETE: The row to be deleted is deleted from A.

Ø       UPDATE: The row to be updated must be such that the updated version satisfies both PA and condition.  The row is deleted from A without performing any triggered actions or table predicate checks.  The updated version of the row is then inserted into A.

 

Examples: Let view LS be defined as

 

S WHERE CITY = 'London'

 

Fig. 3 shows a sample tabulation of this view, corresponding to the sample tabulation of S shown in Fig. 2.

 

LS

+------------------------------+

¦ S# ¦ SNAME ¦ STATUS ¦ CITY   ¦

+----+-------+--------+--------¦

¦ S1 ¦ Smith ¦     20 ¦ London ¦

¦ S4 ¦ Clark ¦     20 ¦ London ¦

+------------------------------+

Fig. 3: View LS (sample values)

 

 

·   An attempt to insert the row <S6,Green,20,London> into LS will succeed.  The new row will be inserted into table S, and will therefore be effectively inserted into the view as well.

 

·   An attempt to insert the row <S1,Green,20,London> into LS will fail, because it violates the table predicate for table S--specifically, it violates the uniqueness constraint on the primary key S.S# for table S. 

 

·   An attempt to insert the row <S6,Green,20,Athens> into LS will fail, because it violates the restriction condition CITY = 'London'.

 

·   An attempt to delete the LS row <S1,Smith,20,London> will succeed.  The row will be deleted from table S, and will therefore be effectively deleted from the view as well.

 

·   An attempt to update the LS row <S1,Smith,20,London> to <S6,Green,20,London> will succeed.  An attempt to update that same row <S1,Smith,20,London> to either <S2,Smith,20,London> or <S1,Smith,20,Athens> will fail--in the first case because it violates the primary key uniqueness constraint on table S, in the second case because it violates the restriction condition CITY = 'London'.

 

 

4. Updating Projections

 

Again we start by considering the relevant table predicate.  Let the columns of table A be partitioned into two disjoint groups, X and Y say.  Regard each of X and Y as a single composite column.  It is clear, then, that a given row <x> will appear in the projection A[X] if and only if there exists some value y from the domain of Y-values such that the row <x,y> appears in A.  For example, consider the projection of table S over S#, SNAME, and CITY.  Every row <s#,sname,city> appearing in that projection is such that there exists a status value status such that the row <s#,sname,status,city> appears in table S.

 

Here then are the update rules for A[X]:

 

Ø       INSERT:  Let the row to be inserted be <x>.  Let the default value of Y be y (A default value might be NULL in SQL.  We do not wish to get sidetracked into a discussion of the problems of nulls here; for present purposes, it is irrelevant whether the default is null or something else.) (it is an error if no such default value exists, i.e., if Y has "defaults not allowed").  The row <x,y> (which must satisfy PA) is inserted into A. 

 

Note:  Since candidate keys will usually (but not invariably) have "defaults not allowed," a projection that does not include all candidate keys of the underlying table will usually not permit INSERTs. 

 

Ø       DELETE:  All rows of A with the same X-value as the row to be deleted from A[X] are deleted from A. 

 

Note:  In practice, it will usually be desirable that X include at least one candidate key of A, so that the row to be deleted from A[X] is derived from exactly one row a of A.  However, there is no logical reason to make this a hard requirement.  (Analogous remarks apply in the case of UPDATE also--see below.)

 

Ø       UPDATE:  Let the row to be updated be <x> and let the updated version be <x'>.  Let a be a row of A with the same X-value x, and let the value of Y in row a be y.  All such rows a are deleted from A without performing any triggered actions or table predicate checks.  Then, for each such value y, row <x',y> (which must satisfy PA) is inserted into A.

 

Note: It is here that the "slight refinement" mentioned in Principle No. 7 in the "Preliminaries" section shows itself.  Specifically, observe that the final "INSERT" step in the UPDATE rule reinstates the previous Y-value in each inserted row--it does not replace it by the applicable default value, as a standalone INSERT would.

 

Examples: Let view SC be defined as

 

SC {S#, CITY}

 

Fig. 4 shows a sample tabulation of this view, corresponding to the sample tabulation of S shown in Fig. 2.

 

SC

+-------------+

¦ S# ¦ CITY   ¦

+----+--------+

¦ S1 ¦ London ¦

¦ S2 ¦ Paris  ¦

¦ S3 ¦ Paris  ¦

¦ S4 ¦ London ¦

¦ S5 ¦ Athens ¦

+-------------+

Fig. 4: View SC (sample values)

 

 

·   An attempt to insert the row <S6,London> into SC will succeed, and will have the effect of inserting the row <S6,n,t,London> into table S, where n and t are the default values for columns S.SNAME and S.STATUS respectively. 

 

·   An attempt to insert the row <S1,London> into SC will fail, because it violates the table predicate for table S--specifically, it violates the uniqueness constraint on the primary key S.S# for table S. 

 

·   An attempt to delete the row <S1,London> from SC will succeed.  The row for S1 will be deleted from table S. 

 

·   An attempt to update the SC row <S1,London> to <S1,Athens> will succeed; the effect will be to replace the row <S1,Smith,20,London> in table S by the row <S1,Smith,20,Athens>--not by the row <S1,n,t,Athens>, please observe.

 

·   An attempt to update that same SC row <S1,London> to <S2,London> will fail (why, exactly?).

 

Consideration of the case in which the projection does not include a candidate key of the underlying table--for example, the projection of table S over STATUS and CITY--is left as an exercise for the reader.

 

 

5. Updating Extensions

 

The reader might perhaps be unfamiliar with the relational EXTEND operator.  Here is a brief explanation.  Basically, EXTEND takes a specified table and--conceptually, at least--returns a new (derived) table that is similar to the original table but includes an additional column, values of which are obtained by evaluating some specified computational expression.  For example, we might write:

 

EXTEND P ADD ( WEIGHT * 454 ) AS GMWT

 

Assuming sample values for table P as given in Fig. 2, the result of this expression is as shown in Fig. 5.  (The assumption is that WEIGHT values are given in pounds; the expression WEIGHT * 454 will convert those weights to grams.)

 

+---------------------------------------------+

¦ P# ¦ PNAME ¦ COLOR ¦ WEIGHT ¦ CITY   ¦ GMWT ¦

+----+-------+-------+--------+--------+------+

¦ P1 ¦ Nut   ¦ Red   ¦     12 ¦ London ¦ 5448 ¦

¦ P2 ¦ Bolt  ¦ Green ¦     17 ¦ Paris  ¦ 7718 ¦

¦ P3 ¦ Screw ¦ Blue  ¦     17 ¦ Rome   ¦ 7718 ¦

¦ P4 ¦ Screw ¦ Red   ¦     14 ¦ London ¦ 6356 ¦

¦ P5 ¦ Cam   ¦ Blue  ¦     12 ¦ Paris  ¦ 5448 ¦

¦ P6 ¦ Cog   ¦ Red   ¦     19 ¦ London ¦ 8626 ¦

+---------------------------------------------+

Fig. 5: An example of EXTEND

 

 

Note that there is an exact one-to-one correspondence between rows of the extension and rows of the underlying table.

 

In general, the table predicate PE for the table E that results from the extension operation

 

EXTEND A ADD exp AS X

 

is as follows:

 

PA ( a ) AND e.X = exp ( a )

 

Here e is a row of table E and a is the projection of that row e over all columns of A.  In stilted English:

 

"Every row e in the extension is such that (a) the row a that is derived from e by projecting away the value e.X satisfies PA, and (b) that value e.X is equal to the result of applying the expression exp to that row a." 

 

For example, the table predicate for the extension of table P shown in Fig. 5 is as follows:

 

"Every row <p#,pname,color,weight,city,gmwt> in the extension is such that (a) the row <p#,pname,color,weight,city> satisfies the table predicate for P, and (b) the value gmwt is equal to the value 454 * weight."

 

Here then are the update rules for E = EXTEND A ADD exp AS X:

 

Ø       INSERT:  Let the row to be inserted be e; e must satisfy PE.  The row a that is derived from e by projecting away the value e.X is inserted into A. 

Ø       DELETE:  Let the row to be deleted be e.  The row a that is derived from e by projecting away the value e.X is deleted from A. 

Ø       UPDATE:  Let the row to be updated be e and let the updated version be e'; e' must satisfy PE.  The row a that is derived from e by projecting away the value e.X is deleted from A without performing any triggered actions or table predicate checks.  The row a' that is derived from e' by projecting away the value e'.X is inserted into A. 

 

Examples (refer to Fig. 5):

 

·   An attempt to insert the row <P7,Cog,Red,12,Paris,5448> will succeed, and will have the effect of inserting the row <P7,Cog,Red,12,Paris> into table P. 

 

·   An attempt to insert the row <P7,Cog,Red,12,Paris,5449> will fail (why?).

 

·   An attempt to insert the row <P1,Cog,Red,12,Paris,5448> will fail (why?).

 

·   An attempt to delete the row for P1 will succeed, and will have the effect of deleting the row for P1 from table P.

 

·   An attempt to update the row for P1 to <P1,Nut,Red,10,Paris,4540> will succeed; the effect will be to replace the row <P1,Nut,Red,12,London> in table P by the row <P1,Nut,Red,10,Paris>. 

 

·   An attempt to update that same row to a row for P2 (with all other values unchanged) or a row in which the GMWT value is not equal to 454 times the WEIGHT value will fail (in each case, why?).

 

 

(The authors would like to thank Nagraj Alur, Hugh Darwen, Fabian Pascal, and Paul Winsberg for their helpful comments on earlier drafts of this article.)

 

Comments On Republication: Originally published in Database Programming & Design 7, No. 6 (June 1994) and published as a two-part article in RELATIONAL DATABASE WRITINGS1991-94. It is republished here by permission of David McGoveran, Miller Freeman Inc. and Pearson Education,  Inc. © All rights reserved by C.J. Date. Research has shown that certain detail level corrections might be needed, which we may undertake in the future. However, we still believe strongly that the overall approach is sound.

 

(Continued in Part 5)

 

 

 

[ABOUT] [QUOTES] [LINKS]