7.1 Keys and Foreign Keys

Keys and Foreign Keys

Section outline

Introduction to using SQL for defining keys for a relation, specifically PRIMARY KEY or UNIQUE.
Explanation of using the term ‘key’ in connection with referential-integrity constraints.

#1

read: Section 7.1: Keys and Foreign Keys (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Introduction to using SQL for defining keys for a relation, specifically PRIMARY KEY or UNIQUE.
- Explanation of using the term 'key' in connection with referential-integrity constraints.

In SQL, an attribute or a set of attributes can be declared as a key for a relation. This declaration enforces integrity constraints on the data.

Candidate Keys

SQL allows the declaration of a key for a relation using the keywords PRIMARY KEY or UNIQUE. A primary key uniquely identifies each tuple in a relation and cannot contain NULL values. A unique key also ensures that all values in the column are distinct, but it may permit one NULL value.
Foreign Keys

SQL also uses the term “key” to establish a referential integrity constraint between two tables. This is known as a foreign key. A foreign-key constraint requires that a value in an attribute or set of attributes in one relation must also be present as a value in the primary key or a unique key of another (or possibly the same) relation.

Definition: Foreign Key Constraint

Let $R_1$ and $R_2$ be two relations (not necessarily distinct). Let $F$ be a set of attributes of $R_1$ and $G$ be a set of attributes of $R_2$. We say that $F$ is a foreign key of $R_1$ referencing $G$ in $R_2$ if the following two conditions hold:

The set of attributes $G$ must be declared as a candidate key (either PRIMARY KEY or UNIQUE) for the relation $R_2$.
For any tuple $t_1 \in R_1$, if no attribute in $F$ has a NULL value for $t_1$, then there must exist a tuple $t_2 \in R_2$ such that the values of $t_1$ on attributes $F$ are equal to the values of $t_2$ on attributes $G$. Formally: \[ \forall t_1 \in R_1, \left( \bigwedge_{a \in F} t_1[a] \neq \text{NULL} \right) \implies \left( \exists t_2 \in R_2, t_1[F] = t_2[G] \right) \]

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-20 20:19:36
Duration: 14.640941143035889
Cummulative cost: $1.95e-02

Declaring Foreign-Key Constraints

Section outline

Defines a foreign key constraint as an assertion that values for certain attributes must also appear in the primary-key component of another relation.
Lists the twofold implication of a foreign-key declaration: the referenced attribute must be UNIQUE or a PRIMARY KEY, and the foreign key’s values must appear in the referenced attribute’s values.
Describes two methods for declaring a foreign key: inline with the attribute using REFERENCES <table>(<attribute>), or as a separate clause using FOREIGN KEY (<attributes>) REFERENCES <table>(<attributes>).

#2

read: Section 7.1.1: Declaring Foreign-Key Constraints (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Defines a foreign key constraint as an assertion that values for certain attributes must also appear in the primary-key component of another relation.
- Lists the twofold implication of a foreign-key declaration: the referenced attribute must be UNIQUE or a PRIMARY KEY, and the foreign key's values must appear in the referenced attribute's values.
- Describes two methods for declaring a foreign key: inline with the attribute using `REFERENCES <table>(<attribute>)`, or as a separate clause using `FOREIGN KEY (<attributes>) REFERENCES <table>(<attributes>)`.

A foreign key constraint asserts that the values for a set of attributes in one relation must also appear as values for a primary or unique key in some tuple of another relation. This mechanism enforces referential integrity between tables.

Definition: Foreign Key Constraint

Let $R_1$ and $R_2$ be two relations (which may be the same). Let $F$ be a set of attributes of $R_1$, designated as a foreign key. Let $G$ be a set of attributes of $R_2$ that is declared as a PRIMARY KEY or UNIQUE.

The declaration of $F$ as a foreign key referencing $G$ implies that for any tuple $t \in R_1$, if the values for the attributes in $F$, denoted $t[F]$, are not NULL, then there must exist a tuple $s \in R_2$ such that $t[F] = s[G]$.

The declaration of a foreign key has a twofold implication:

The referenced attribute(s) in the second relation must be declared as either a PRIMARY KEY or UNIQUE. The system will not permit a foreign key declaration that references a non-key attribute set.
The values of the foreign key in the first relation must correspond to values in the referenced key of the second relation. The only exception is that foreign key attributes may contain NULL values, in which case the constraint is not checked for that particular tuple.

SQL provides two syntactic forms for declaring a foreign key within a CREATE TABLE statement.

Inline Declaration
For a foreign key consisting of a single attribute, the declaration may follow the attribute’s name and type. The form is REFERENCES <table>(<attribute>).
```
CREATE TABLE Studio (
    name CHAR(30) PRIMARY KEY,
    address VARCHAR(255),
    presC# INT REFERENCES MovieExec(cert#)
);
```
Separate Clause
For foreign keys of one or more attributes, a separate clause can be appended to the list of attribute declarations. The form is FOREIGN KEY (<attributes>) REFERENCES <table>(<attributes>).
```
CREATE TABLE Studio (
    name CHAR(30) PRIMARY KEY,
    address VARCHAR(255),
    presC# INT,
    FOREIGN KEY (presC#) REFERENCES MovieExec(cert#)
);
```

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-20 20:19:50
Duration: 13.400645017623901
Cummulative cost: $1.65e-02

Maintaining Referential Integrity

Section outline

Discusses actions prevented by the DBMS to enforce foreign-key constraints, such as insertions or updates with invalid foreign key values.
Describes three alternative policies to enforce referential integrity when the referenced relation is modified: the default policy (reject), the cascade policy, and the set-null policy.
Explains how to specify these policies using ON DELETE and ON UPDATE clauses with SET NULL or CASCADE.

#3

read: Section 7.1.2: Maintaining Referential Integrity (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Discusses actions prevented by the DBMS to enforce foreign-key constraints, such as insertions or updates with invalid foreign key values.
- Describes three alternative policies to enforce referential integrity when the referenced relation is modified: the default policy (reject), the cascade policy, and the set-null policy.
- Explains how to specify these policies using `ON DELETE` and `ON UPDATE` clauses with `SET NULL` or `CASCADE`.

Referential integrity ensures that relationships between tables remain consistent. When one table has a foreign key to another table, actions on the referenced table can invalidate the foreign key values in the referencing table. The database management system must enforce this constraint.

Definition: Referential Integrity Constraint

Let $R_1$ and $R_2$ be two relations. Let $K$ be a set of attributes that form a primary or unique key for $R_2$. Let $F$ be a corresponding set of attributes in $R_1$. A referential integrity constraint from $F$ in $R_1$ to $K$ in $R_2$ requires that for any tuple $t_1 \in R_1$, the value of its foreign key attributes, $t_1[F]$, must either be wholly NULL or must match the value of the key attributes, $t_2[K]$, for some tuple $t_2 \in R_2$. Formally: \[ \forall t_1 \in R_1, \ (t_1[F] \text{ is NULL}) \lor (\exists t_2 \in R_2 \text{ such that } t_1[F] = t_2[K]) \]

Prevented Actions
The DBMS prevents modifications that violate a foreign-key constraint. For a foreign key in relation $R_1$ referencing a key in $R_2$, the following actions are generally disallowed:
- An insertion or update to a tuple in $R_1$ that introduces a non-NULL foreign key value not present in the referenced key of $R_2$.
- A deletion of a tuple from $R_2$ whose key value is currently referenced by tuples in $R_1$.
- An update to the key value of a tuple in $R_2$ that is currently referenced by tuples in $R_1$.
Policies for Maintaining Integrity

While modifications to the referencing relation ($R_1$) that violate the constraint are always rejected, the schema designer can specify policies for handling violations caused by modifications to the referenced relation ($R_2$). There are three such policies.
- Default Policy (Reject): The system rejects any DELETE or UPDATE operation on the referenced relation that would leave dangling tuples in the referencing relation. This is the standard behavior if no other policy is specified.
- Cascade Policy: Changes to the referenced relation are propagated to the referencing relation. If a referenced tuple is deleted, all referencing tuples are also deleted. If a referenced key is updated, the corresponding foreign key values in all referencing tuples are updated to the new value.
- Set-Null Policy: If a referenced tuple is deleted or its key is updated, the foreign key values of all referencing tuples are set to NULL. This policy is only applicable if the foreign key attributes are permitted to be NULL.
SQL Declaration of Policies

These policies are declared within the foreign key definition using ON DELETE and ON UPDATE clauses. The desired action, CASCADE or SET NULL, follows these clauses. The policies for deletion and update can be specified independently.
```
CREATE TABLE Studio (
    name CHAR(30) PRIMARY KEY,
    address VARCHAR(255),
    presC# INT REFERENCES MovieExec(cert#)
        ON DELETE SET NULL
        ON UPDATE CASCADE
);
```
In this example, if a MovieExec tuple is deleted, the presC# for any studio they preside over is set to NULL. If a cert# in MovieExec is updated, the new value is cascaded to the corresponding presC# in the Studio table.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-20 20:20:07
Duration: 16.833184003829956
Cummulative cost: $2.23e-02

Deferred Checking of Constraints

Section outline

Presents the problem of inserting mutually dependent tuples (circular constraints) where immediate constraint checking would fail.
Introduces the concept of deferred constraint checking, which postpones validation until a transaction is about to commit.
Explains the use of DEFERRABLE and NOT DEFERRABLE keywords in constraint declarations.
Details the options INITIALLY DEFERRED and INITIALLY IMMEDIATE to control the default checking behavior.
Shows how to dynamically change the checking mode for a named constraint using the SET CONSTRAINT statement.

#4

read: Section 7.1.3: Deferred Checking of Constraints (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Presents the problem of inserting mutually dependent tuples (circular constraints) where immediate constraint checking would fail.
- Introduces the concept of deferred constraint checking, which postpones validation until a transaction is about to commit.
- Explains the use of `DEFERRABLE` and `NOT DEFERRABLE` keywords in constraint declarations.
- Details the options `INITIALLY DEFERRED` and `INITIALLY IMMEDIATE` to control the default checking behavior.
- Shows how to dynamically change the checking mode for a named constraint using the `SET CONSTRAINT` statement.

A significant issue arises with circular constraints, where two or more relations have foreign keys that reference each other.

Consider a scenario where a Studio tuple requires its presC# to exist in MovieExec(cert#), and simultaneously, a MovieExec tuple requires its cert# to exist in Studio(presC#). With immediate constraint checking, it becomes impossible to insert a new studio and its new president, as neither can be inserted first without violating a foreign-key constraint.
SQL provides a mechanism to defer constraint validation until a transaction is ready to commit.

This approach allows for the temporary violation of constraints within the scope of a transaction. All constraints must be satisfied by the time the transaction commits; otherwise, the transaction is rolled back.

Definition: Deferrable Constraint

A constraint is deferrable if its checking can be postponed until the end of a transaction. The default behavior is NOT DEFERRABLE, where checks are performed immediately after each data modification statement.

The deferral behavior of a constraint is specified using keywords in its declaration.
A constraint can be declared with one of two properties:
- DEFERRABLE: Allows the constraint checking to be postponed.
- NOT DEFERRABLE: The constraint is always checked immediately. This is the default.
For a DEFERRABLE constraint, the initial checking mode is also specified:
- INITIALLY DEFERRED: Checking is postponed until commit by default.
- INITIALLY IMMEDIATE: Checking is performed immediately after each statement by default, but can be changed to deferred within a transaction.
The checking mode for a named, deferrable constraint can be altered dynamically within a transaction.
If a constraint is named (e.g., MyConstraint) and declared as DEFERRABLE, its behavior can be controlled using the SET CONSTRAINT statement.
- To postpone checking: SET CONSTRAINT MyConstraint DEFERRED;
- To enforce immediate checking: SET CONSTRAINT MyConstraint IMMEDIATE;

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-20 20:20:19
Duration: 12.047591924667358
Cummulative cost: $1.53e-02