Functional Dependencies

Section outline

Introduces the design theory for relational databases, focusing on eliminating redundancy and anomalies.
Identifies functional dependencies (FDs) as a core concept, generalizing the idea of a key.
Explains that this theory leads to normalization, a process of decomposing relations to improve the schema.
Defines the core concepts of functional dependencies, keys, and superkeys.
Uses a running example of a Movies1 relation to illustrate the concepts.

#1

read: Section 3.1: Functional Dependencies (section)

model: pro
rules: bullet definition math presentation short

Write an introductory overview of this section.  Provide motivations of why we want to define FD and keys of relations.

This section introduces the formal theory for designing relational database schemas. The primary motivation is to address flaws in an initial database design, such as data redundancy, which can lead to update anomalies.

Identifying Design Flaws

An initial relational schema, whether derived from a high-level design or defined directly, often contains problems. A common issue is attempting to store too much information in a single relation, which leads to redundant data and complicates data maintenance. These problems are formally referred to as “anomalies.”
A Formal Framework for Design

A well-developed theory provides the tools to analyze and improve relational schemas. This theory is centered around the concept of data dependencies, which are constraints on the data that can be stored in a relation. By understanding these dependencies, we can systematically refine the database design.
Functional Dependencies as the Foundation

The cornerstone of this theory is the functional dependency (FD). An FD is a generalization of the concept of a key. It expresses a constraint where the value of one set of attributes determines the value of another set of attributes within any tuple of a relation.
Normalization and Decomposition

The practical application of this theory is a process called normalization. Normalization uses functional dependencies to guide the decomposition of a relation into two or more smaller relations. The goal of this decomposition is to eliminate redundancy and the associated anomalies, resulting in a cleaner, more robust database schema.

gemini-2.5-pro

Version: 2
Model: gemini-2.5-pro
Created: 2025-09-03 20:41:36
Duration: 10.5155189037323
Cummulative cost: $3.00e-02

Definition of Functional Dependency

Section outline

Provides the formal definition of a functional dependency (FD) on a relation R.
Explains the notation $A_1 A_2 ... A_n \rightarrow B_1 B_2 ... B_m$, meaning the A attributes functionally determine the B attributes.
Clarifies that an FD is a constraint on all possible instances of a relation, not just a property of one instance.
Notes that an FD with multiple attributes on the right side is equivalent to a set of FDs, each with a single attribute on the right.

#2

read: Section 3.1.1: Definition of Functional Dependency (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Provides the formal definition of a functional dependency (FD) on a relation R.
- Explains the notation $A_1 A_2 ... A_n \rightarrow B_1 B_2 ... B_m$, meaning the A attributes functionally determine the B attributes.
- Clarifies that an FD is a constraint on all possible instances of a relation, not just a property of one instance.
- Notes that an FD with multiple attributes on the right side is equivalent to a set of FDs, each with a single attribute on the right.

Definition: Functional Dependency (FD)

A functional dependency on a relation $R$ is a statement of the form \[ A_1 A_2 \dots A_n \rightarrow B_1 B_2 \dots B_m \] This statement holds if for any two tuples $t$ and $u$ in any valid instance of relation $R$, the following condition is met: If the tuples $t$ and $u$ agree on all of the attributes $A_1, A_2, \dots, A_n$, then they must also agree on all of the attributes $B_1, B_2, \dots, B_m$.

Notation

The expression $A_1 A_2 \dots A_n \rightarrow B_1 B_2 \dots B_m$ is read as “$A_1, A_2, \dots, A_n$ functionally determine $B_1, B_2, \dots, B_m$”. The set of attributes $\{A_1, \dots, A_n\}$ is the determinant, and $\{B_1, \dots, B_m\}$ is the dependent.
Constraint on a Relation Schema

An FD is a constraint on the relation schema. It is not a property of a single instance of the relation. For a relation $R$ to satisfy an FD, the FD must hold for all possible legal instances of $R$.
Equivalence

An FD with multiple attributes on the right side is equivalent to a set of FDs, each having a single attribute on the right side. The single FD $A_1 A_2 \dots A_n \rightarrow B_1 B_2 \dots B_m$ is equivalent to the set of FDs: \[ \begin{align*} A_1 A_2 \dots A_n &\rightarrow B_1 \\ A_1 A_2 \dots A_n &\rightarrow B_2 \\ &\vdots \\ A_1 A_2 \dots A_n &\rightarrow B_m \end{align*} \]

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-03 20:35:31
Duration: 14.950706958770752
Cummulative cost: $1.76e-02

Keys of Relations

Section outline

Defines a key for a relation R as a set of attributes that satisfies two conditions: (1) it functionally determines all other attributes, and (2) it is minimal (no proper subset has property 1).
Discusses the designation of a primary key when a relation has more than one key.

#3

read: Section 3.1.2: Keys of Relations (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Defines a key for a relation R as a set of attributes that satisfies two conditions: (1) it functionally determines all other attributes, and (2) it is minimal (no proper subset has property 1).
- Discusses the designation of a primary key when a relation has more than one key.

Definition: Key

A set of one or more attributes $K = \{A_1, A_2, \dots, A_n\}$ is a key for a relation $R$ if: 1. The attributes in $K$ functionally determine all other attributes of the relation. That is, for any two distinct tuples in $R$, they cannot have the same values for all attributes in $K$. 2. No proper subset of $K$ also functionally determines all other attributes of $R$. The key must be minimal.

Conditions for a Key
A set of attributes forms a key for a relation if it satisfies two essential properties.
- Functional Determination: The set of attributes must functionally determine all other attributes in the relation. If the set of all attributes is $U$, then the functional dependency $K \rightarrow (U-K)$ must hold.
- Minimality: The key must contain no redundant attributes. If any single attribute were removed from the key, the remaining attributes would no longer functionally determine all other attributes of the relation.
Primary Key

A relation can have more than one key. If so, it is common to designate one of these keys as the primary key. This choice may influence implementation issues, such as physical storage. The theory of functional dependencies, however, grants no special role to the primary key over any other key.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-03 20:36:06
Duration: 16.29501509666443
Cummulative cost: $1.46e-02

Superkeys

Section outline

Defines a superkey as a set of attributes that contains a key.
Notes that a superkey functionally determines all other attributes of the relation but does not need to satisfy the condition of minimality.

#4

read: Section 3.1.3: Superkeys (section)

model: pro
rules: bullet definition math presentation short


Present the following.  Use definition formal definition when it applies.

- Defines a superkey as a set of attributes that contains a key.
- Notes that a superkey functionally determines all other attributes of the relation but does not need to satisfy the condition of minimality.

Definition: Superkey

A set of attributes that contains a key is called a superkey. If $K$ is a key for a relation $R$, then any set of attributes $S$ such that $K \subseteq S$ is a superkey of $R$.

Every superkey functionally determines all other attributes of the relation.

This property follows from the definition of a key. Since a key functionally determines all other attributes, any superset of that key will also functionally determine all other attributes.
A superkey does not need to be minimal.

While a key must be a minimal set of attributes that provides unique identification, a superkey can contain additional, extraneous attributes. Every key is a superkey, but not every superkey is a key. For example, if {title, year, starName} is a key, then {title, year, starName, length} is a superkey, but it is not a key because it is not minimal.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-03 20:37:00
Duration: 12.31078028678894
Cummulative cost: $1.43e-02

Some definitions

# defs
model: pro
rules: bullet definition math presentation short

- Define what a candidate key is.  Give an example.
- Define what a prime attribute is.  Use the example above to illustrate a prime attribute vs a non-prime attribute.
- Define the difference between a full functional dependent on a key, vs partial dependent on a key. Use an example.

Candidate Key

A relation may have more than one key. Each such key is referred to as a candidate key. From the set of candidate keys, one is designated as the primary key.

Definition: Candidate Key
A set of attributes $K$ for a relation $R$ is a candidate key if it satisfies two conditions:
1. $K$ is a superkey for $R$. That is, $K$ functionally determines all attributes in $R$. Formally, $K \rightarrow R$.
2. $K$ is minimal. There is no proper subset $K' \subset K$ such that $K'$ is also a superkey for $R$.
For the relation Movies1(title, year, length, genre, studioName, starName), the set of attributes $\{title, year, starName\}$ is a candidate key. It functionally determines all other attributes, and no proper subset of it does.
Prime Attribute

An attribute that is part of any candidate key is called a prime attribute. All other attributes are non-prime.

Definition: Prime Attribute
An attribute $A$ of a relation schema $R$ is a prime attribute if $A$ is a member of some candidate key of $R$. Otherwise, $A$ is a non-prime attribute.
In our Movies1 example, the candidate key is $\{title, year, starName\}$.
- Prime attributes: title, year, starName.
- Non-prime attributes: length, genre, studioName.
Full and Partial Functional Dependency

These concepts describe how non-key attributes relate to a candidate key. They are crucial for defining normal forms, particularly the Second Normal Form (2NF).

Definition: Full Functional Dependency
A non-prime attribute $B$ is fully functionally dependent on a candidate key $K$ if it is functionally dependent on $K$ but not on any proper subset of $K$.
Let $X$ be a set of attributes and $A$ be an attribute. We say $A$ is fully functionally dependent on $X$ if:
1. $X \rightarrow A$
2. There is no proper subset $Z \subset X$ for which $Z \rightarrow A$.
Definition: Partial Dependency

A non-prime attribute $B$ has a partial dependency on a candidate key $K$ if it is functionally dependent on a proper subset of $K$.

In the Movies1 relation, the candidate key is $\{title, year, starName\}$. We have the functional dependency: \[ \{title, year\} \rightarrow \{length, genre, studioName\} \] Since $\{title, year\}$ is a proper subset of the candidate key, the non-prime attributes length, genre, and studioName are partially dependent on the candidate key. This partial dependency is a source of data redundancy and update anomalies.

gemini-2.5-pro

Version: 1
Model: gemini-2.5-pro
Created: 2025-09-08 00:02:21
Duration: 18.71535086631775
Cummulative cost: $2.20e-02