Multivalued Dependencies

Section outline

Introduces the concept of a multivalued dependency (MVD) as an assertion that two sets of attributes are independent of one another.
Positions MVD as a generalization of a functional dependency (FD).

#1

read: Section 3.6: Multivalued Dependencies (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Introduces the concept of a multivalued dependency (MVD) as an assertion that two sets of attributes are independent of one another.
- Positions MVD as a generalization of a functional dependency (FD).

A multivalued dependency is an assertion that two attributes or sets of attributes are independent of one another. This condition generalizes the notion of a functional dependency.

Multivalued Dependency (MVD)

An MVD is a statement about a relation $R$ asserting that when values for one set of attributes are fixed, the values in certain other attributes are independent of the values of all other attributes in the relation.
Generalization of Functional Dependency

Every functional dependency (FD) implies a corresponding multivalued dependency. However, some situations involving attribute independence cannot be explained by FDs alone and require the use of MVDs.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-21 11:32:31
Duration: 16.489032983779907
Cummulative cost: $1.58e-02

Attribute Independence and Its Consequent Redundancy

Section outline

Discusses situations where a relation schema is in BCNF but still exhibits redundancy not related to functional dependencies.
Identifies the primary cause of such redundancy as attempting to store two or more independent, set-valued properties of a key within a single relation.
Uses an example of a relation storing movie stars’ multiple addresses and multiple movies to illustrate this type of redundancy.
Establishes that this example relation is in BCNF because it has no nontrivial FDs, yet it contains significant data duplication.

#2

read: Section 3.6.1: Attribute Independence and Its Consequent Redundancy (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Discusses situations where a relation schema is in BCNF but still exhibits redundancy not related to functional dependencies.
- Identifies the primary cause of such redundancy as attempting to store two or more independent, set-valued properties of a key within a single relation.
- Uses an example of a relation storing movie stars' multiple addresses and multiple movies to illustrate this type of redundancy.
- Establishes that this example relation is in BCNF because it has no nontrivial FDs, yet it contains significant data duplication.

revise: At the end, give an example of the `Stars(name, street, city, title, year)` relation.
- Show the redundancy in the multiple (street, city) of the stars.
- Any cell that is considered redundant should be in color red.
- Use markdown table.  For color red, use `<span class="red">...</span>`.

It is possible for a relation schema to be in Boyce-Codd Normal Form (BCNF) and yet still exhibit a form of redundancy not captured by functional dependencies.
The most common source of this redundancy is the attempt to store two or more independent, set-valued properties of a key within a single relation.

When an entity possesses multiple, independent attributes that can each take on a set of values, storing them in a single relation necessitates representing every combination of these values. This leads to a Cartesian product of the value sets for each key, resulting in significant data repetition.
Consider a relation storing information about movie stars, their addresses, and the movies in which they have appeared.

The schema might be Stars(name, street, city, title, year). A star can have multiple addresses and can appear in multiple movies. These two properties—addresses and movies—are independent of one another. To represent this information, each of a star’s addresses must be paired with each of her movies, creating redundancy. For example, each address is repeated for every movie, and each movie is repeated for every address.
The Stars relation, despite its redundancy, is in BCNF.

There are no nontrivial functional dependencies in this relation. For any given star, knowing their name, one address, and one movie does not uniquely determine the year of another movie, or the city of another address. Since no attribute is functionally determined by a proper subset of the others, the only key consists of all five attributes {name, street, city, title, year}. As there are no nontrivial FDs, there can be no BCNF violations.
The table below illustrates the redundancy for an instance of the Stars relation. Each address is repeated for each movie, and each movie is repeated for each address.

name	street	city	title	year
C. Fisher	123 Maple St.	Hollywood	Star Wars	1977
C. Fisher	5 Locust Ln.	Malibu	Star Wars	1977
C. Fisher	123 Maple St.	Hollywood	Empire Strikes Back	1980
C. Fisher	5 Locust Ln.	Malibu	Empire Strikes Back	1980
C. Fisher	123 Maple St.	Hollywood	Return of the Jedi	1983
C. Fisher	5 Locust Ln.	Malibu	Return of the Jedi	1983

gemini-2.5-pro

Version: 2
Model: gemini-2.5-pro
Created: 2025-09-21 11:44:00
Duration: 17.92033576965332
Cummulative cost: $3.77e-02

Definition of Multivalued Dependencies

Section outline

Provides a formal definition for a multivalued dependency (MVD), denoted as $A_1 A_2 \cdots A_n \to\to B_1 B_2 \cdots B_m$.
Explains that an MVD holds if for any pair of tuples $t$ and $u$ agreeing on the $A$’s, there exists a tuple $v$ that agrees with $t$ and $u$ on the $A$’s, with $t$ on the $B$’s, and with $u$ on the remaining attributes.
Notes that this implies that for a fixed value of the $A$’s, the associated values of the $B$’s and the other attributes appear in all possible combinations.

#3

read: Section 3.6.2: Definition of Multivalued Dependencies (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Provides a formal definition for a multivalued dependency (MVD), denoted as $A_1 A_2 \cdots A_n \to\to B_1 B_2 \cdots B_m$.
- Explains that an MVD holds if for any pair of tuples $t$ and $u$ agreeing on the $A$'s, there exists a tuple $v$ that agrees with $t$ and $u$ on the $A$'s, with $t$ on the $B$'s, and with $u$ on the remaining attributes.
- Notes that this implies that for a fixed value of the $A$'s, the associated values of the $B$'s and the other attributes appear in all possible combinations.

A multivalued dependency (MVD) is a constraint that asserts the independence of certain sets of attributes within a relation. It is a generalization of a functional dependency.

Definition: Multivalued Dependency

Let $R$ be a relation schema with attribute set $U$. Let $X$ and $Y$ be subsets of $U$. The multivalued dependency $X \to\to Y$ holds on $R$ if for any two tuples $t_1$ and $t_2$ in a relation instance $r$ of $R$ such that $t_1[X] = t_2[X]$, there must exist a tuple $t_3$ in $r$ with the following properties:

$t_3[X] = t_1[X] = t_2[X]$
$t_3[Y] = t_1[Y]$
$t_3[U - X - Y] = t_2[U - X - Y]$

where $U - X - Y$ represents the set of attributes in $R$ that are not in $X$ or $Y$.

The MVD $A_1 A_2 \cdots A_n \to\to B_1 B_2 \cdots B_m$ is a statement about a relation $R$.

It asserts that when the values for the attributes $\{A_1, \dots, A_n\}$ are fixed, the values of the attributes $\{B_1, \dots, B_m\}$ are independent of the values of all other attributes in the relation.
The formal definition requires the existence of a third tuple for any pair of tuples that agree on the left-hand side attributes.

If two tuples, $t$ and $u$, agree on the values for the attributes $A_1, \dots, A_n$, then another tuple $v$ must exist in the relation. This tuple $v$ agrees with both $t$ and $u$ on the $A$’s, agrees with $t$ on the $B$’s, and agrees with $u$ on all other attributes.
This condition implies that values are combined in a combinatorial fashion.

By interchanging the roles of $t$ and $u$, a fourth tuple can also be shown to exist. The consequence is that for any fixed set of values for the A-attributes, the associated values for the B-attributes and the other attributes must appear in all possible combinations with each other in the relation.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-21 11:33:01
Duration: 14.375041007995605
Cummulative cost: $1.76e-02

Reasoning About Multivalued Dependencies

Section outline

Introduces a set of rules for reasoning about MVDs, analogous to Armstrong’s axioms for FDs.
Defines Trivial MVDs, where the right side is a subset of the left side.
Presents the transitive rule for MVDs.
Explains that MVDs do not obey the splitting part of the splitting/combining rule.
States the FD Promotion rule: every functional dependency implies a corresponding multivalued dependency (if $X \to Y$, then $X \to\to Y$).
Describes the Complementation Rule: If $X \to\to Y$ holds, then $X \to\to Z$ also holds, where $Z$ represents all attributes not in $X$ or $Y$.
Defines another form of trivial MVD where the union of the left and right sides comprises all attributes of the relation.

#4

read: Section 3.6.3: Reasoning About Multivalued Dependencies (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Introduces a set of rules for reasoning about MVDs, analogous to Armstrong's axioms for FDs.
- Defines Trivial MVDs, where the right side is a subset of the left side.
- Presents the transitive rule for MVDs.
- Explains that MVDs do not obey the splitting part of the splitting/combining rule.
- States the FD Promotion rule: every functional dependency implies a corresponding multivalued dependency (if $X \to Y$, then $X \to\to Y$).
- Describes the Complementation Rule: If $X \to\to Y$ holds, then $X \to\to Z$ also holds, where $Z$ represents all attributes not in $X$ or $Y$.
- Defines another form of trivial MVD where the union of the left and right sides comprises all attributes of the relation.

Similar to functional dependencies, a set of formal rules exists for reasoning about multivalued dependencies. These rules allow for the inference of new MVDs from a given set.

Trivial MVDs

An MVD is considered trivial if the attributes on the right-hand side are a subset of the attributes on the left-hand side. Such dependencies hold for any relation. Formally, for attribute sets $X$ and $Y$ in a relation $R$, if $Y \subseteq X$, then the MVD $X \to\to Y$ is trivial.
Transitive Rule

If $X \to\to Y$ and $Y \to\to Z$, then $X \to\to Z-Y$ holds. Note that this rule is more restricted than its counterpart for functional dependencies.
Non-Splitting Rule

Multivalued dependencies do not obey a splitting or decomposition rule. If $X \to\to YZ$, it does not necessarily follow that $X \to\to Y$ or $X \to\to Z$. This is a significant departure from the behavior of functional dependencies.
FD Promotion

Any functional dependency is also a multivalued dependency. This rule establishes a direct link between the two types of dependencies.

Definition: FD Promotion Rule

If a functional dependency $X \to Y$ holds for a relation $R$, then the multivalued dependency $X \to\to Y$ also holds for $R$.
Complementation Rule

This rule states that if an MVD $X \to\to Y$ holds in a relation $R(U)$, then the MVD that relates $X$ to all attributes not in $X$ or $Y$ must also hold.

Definition: Complementation Rule

Let $U$ be the set of all attributes for a relation $R$. If the MVD $X \to\to Y$ holds in $R$, then the MVD $X \to\to (U - X - Y)$ also holds in $R$.
Union Rule for Trivial MVDs

A consequence of the complementation rule is another form of trivial MVD. An MVD is trivial if the union of its left and right sides constitutes all attributes of the relation. Formally, if $X \cup Y = U$ for a relation $R(U)$, then $X \to\to Y$ holds. This follows from applying the complementation rule to the trivial MVD $X \to\to \emptyset$.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-21 11:33:22
Duration: 20.78642201423645
Cummulative cost: $2.40e-02

Fourth Normal Form

Section outline

Introduces Fourth Normal Form (4NF) as the normal form designed to eliminate redundancy caused by MVDs.
Formally defines 4NF: A relation $R$ is in 4NF if for every nontrivial MVD $X \to\to Y$ that holds for $R$, the set of attributes $X$ is a superkey for $R$.
Explains that 4NF is a stricter normal form than BCNF, meaning any relation in 4NF is also in BCNF.
Shows that a relation can be in BCNF but not in 4NF if it contains a nontrivial MVD whose left side is not a superkey.

#5

read: Section 3.6.4: Fourth Normal Form (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Introduces Fourth Normal Form (4NF) as the normal form designed to eliminate redundancy caused by MVDs.
- Formally defines 4NF: A relation $R$ is in 4NF if for every nontrivial MVD $X \to\to Y$ that holds for $R$, the set of attributes $X$ is a superkey for $R$.
- Explains that 4NF is a stricter normal form than BCNF, meaning any relation in 4NF is also in BCNF.
- Shows that a relation can be in BCNF but not in 4NF if it contains a nontrivial MVD whose left side is not a superkey.

The redundancy caused by multivalued dependencies (MVDs) can be eliminated by using these dependencies for decomposition. This leads to a new normal form, called Fourth Normal Form (4NF).
The condition for 4NF is essentially the BCNF condition, but applied to MVDs instead of just FDs.

Definition: Fourth Normal Form (4NF)

A relation schema $R$ is in 4NF with respect to a set of dependencies $F$ (that includes both FDs and MVDs) if for every nontrivial MVD $X \to\to Y$ in $F^+$, $X$ is a superkey for $R$.

A MVD $X \to\to Y$ is considered nontrivial if: 1. $Y$ is not a subset of $X$. 2. $X \cup Y$ is not the set of all attributes in $R$.

4NF is a stricter normal form than BCNF. Every relation in 4NF is also in BCNF.

Since every functional dependency $X \to Y$ is also a multivalued dependency $X \to\to Y$, a violation of BCNF implies a violation of 4NF. Therefore, if a relation is in 4NF, it must also be in BCNF.
A relation can be in BCNF but not in 4NF.

This situation occurs when a relation has a nontrivial MVD that is not also an FD, and the left-hand side of this MVD is not a superkey. For instance, a relation with attributes $\{A, B, C\}$ and MVD $A \to\to B$ might have no nontrivial FDs. In this case, the only key would be $\{A, B, C\}$. The relation is in BCNF. However, it violates 4NF because $A \to\to B$ is a nontrivial MVD, and $A$ is not a superkey.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-21 11:33:38
Duration: 15.362337112426758
Cummulative cost: $1.83e-02

Decomposition into Fourth Normal Form

Section outline

Presents an algorithm for decomposing a relation into a set of relations that are all in 4NF.
Describes the decomposition method: if a relation $R$ has a 4NF violation $X \to\to Y$, it is decomposed into two relations with schemas $R_1(X,Y)$ and $R_2(X, R-Y)$.
States that this decomposition process is recursive and guarantees a lossless join.

#6

read: Section 3.6.5: Decomposition into Fourth Normal Form (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Presents an algorithm for decomposing a relation into a set of relations that are all in 4NF.
- Describes the decomposition method: if a relation $R$ has a 4NF violation $X \to\to Y$, it is decomposed into two relations with schemas $R_1(X,Y)$ and $R_2(X, R-Y)$.
- States that this decomposition process is recursive and guarantees a lossless join.

Decomposition into Fourth Normal Form

The algorithm for decomposing a relation into Fourth Normal Form (4NF) is analogous to the BCNF decomposition algorithm. It iteratively resolves violations of the normal form by splitting the relation schema.

Algorithm 3.33: Decomposition into Fourth Normal Form

INPUT: A relation schema $R_0$ with a set of functional and multivalued dependencies $S_0$.

OUTPUT: A decomposition of $R_0$ into a set of relation schemas that are all in 4NF. The decomposition is guaranteed to have the lossless-join property.

METHOD: 1. Initialize the set of decomposed relations $\mathcal{D} = \{R_0\}$. 2. While there is a relation schema $R \in \mathcal{D}$ that is not in 4NF: a. Find a nontrivial multivalued dependency $X \to\to Y$ that holds on $R$ and violates 4NF (i.e., $X$ is not a superkey for $R$). b. Replace $R$ in $\mathcal{D}$ with two new relation schemas: i. $R_1(X \cup Y)$ ii. $R_2(R - Y)$ which is equivalent to $R_2(X \cup (R - X - Y))$. 3. The final set $\mathcal{D}$ is the desired 4NF decomposition.

The Decomposition Process

When a relation $R$ is found to have a 4NF violation, say a nontrivial MVD $X \to\to Y$ where $X$ is not a superkey, the relation is decomposed into two smaller relations. The first, $R_1$, contains the attributes involved in the MVD ($X$ and $Y$). The second, $R_2$, contains the attributes on the left-hand side of the MVD ($X$) plus all other attributes not on the right-hand side ($R-Y$).
Lossless-Join Property

The decomposition is guaranteed to be lossless. The MVD $X \to\to Y$ is precisely the condition required to ensure that the natural join of the decomposed relations $R_1(X, Y)$ and $R_2(X, R-Y)$ on their common attributes $X$ reconstructs the original relation $R$ without spurious tuples.
Recursive Application

This decomposition procedure is applied recursively. After a relation is split, the resulting schemas must themselves be checked for 4NF violations. Since each step of the decomposition results in schemas with strictly fewer attributes, the process is guaranteed to terminate, leaving a collection of relations all of which are in 4NF.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-21 11:33:54
Duration: 15.942198753356934
Cummulative cost: $1.40e-02

Relationships Among Normal Forms

Section outline

Summarizes the hierarchical relationship among normal forms: 4NF implies BCNF, which in turn implies 3NF.
Compares the properties of decompositions into 3NF, BCNF, and 4NF in terms of redundancy elimination and dependency preservation.
Highlights that only 4NF decomposition eliminates redundancy caused by MVDs.
Notes that while 3NF decomposition preserves FDs, BCNF and 4NF decompositions do not guarantee it, and none guarantee MVD preservation.

#7

read: Section 3.6.6: Relationships Among Normal Forms (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Summarizes the hierarchical relationship among normal forms: 4NF implies BCNF, which in turn implies 3NF.
- Compares the properties of decompositions into 3NF, BCNF, and 4NF in terms of redundancy elimination and dependency preservation.
- Highlights that only 4NF decomposition eliminates redundancy caused by MVDs.
- Notes that while 3NF decomposition preserves FDs, BCNF and 4NF decompositions do not guarantee it, and none guarantee MVD preservation.

Hierarchy of Normal Forms

The normal forms are hierarchically related. A relation that is in Fourth Normal Form (4NF) is, by definition, also in Boyce-Codd Normal Form (BCNF). Similarly, any relation in BCNF is also in Third Normal Form (3NF). This relationship can be expressed as: \[ \text{4NF} \implies \text{BCNF} \implies \text{3NF} \] Thus, 4NF represents the most stringent condition among the three.
Redundancy Elimination
The primary goal of normalization is to eliminate redundancy.
- BCNF and 4NF decompositions eliminate all redundancies caused by functional dependencies (FDs). A 3NF decomposition may not eliminate all such anomalies.
- 4NF is the only normal form that guarantees the elimination of redundancy caused by multivalued dependencies (MVDs) that are not also FDs.
Dependency Preservation
An important property of a decomposition is whether it preserves the original dependencies.
- 3NF: The synthesis algorithm for 3NF guarantees a lossless-join decomposition that preserves all functional dependencies.
- BCNF and 4NF: Decompositions into BCNF or 4NF are always lossless-join but do not guarantee the preservation of all FDs.
- MVD Preservation: None of the decomposition algorithms for 3NF, BCNF, or 4NF guarantee the preservation of multivalued dependencies.
Summary of Properties

The properties of decompositions into each normal form are summarized as follows:

Property 3NF BCNF 4NF

Eliminates FD Redundancy No Yes Yes

Eliminates MVD Redundancy No No Yes

Preserves FDs Yes No No

Preserves MVDs No No No

Property	3NF	BCNF	4NF
Eliminates FD Redundancy	No	Yes	Yes
Eliminates MVD Redundancy	No	No	Yes
Preserves FDs	Yes	No	No
Preserves MVDs	No	No	No

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-21 11:35:09
Duration: 12.422275066375732
Cummulative cost: $1.35e-02