Subqueries
Subqueries
- A subquery is a query that is part of another query, used to help in its evaluation.
- Subqueries can return a single constant (scalar value) for comparison in a
WHEREclause. - Subqueries can return relations that are used in
WHEREclauses with operators likeEXISTS,IN,ALL, andANY. - Subqueries can appear in
FROMclauses, acting as derived relations.
#1
read: Section 6.3: Subqueries (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- A subquery is a query that is part of another query, used to help in its evaluation.
- Subqueries can return a single constant (scalar value) for comparison in a `WHERE` clause.
- Subqueries can return relations that are used in `WHERE` clauses with operators like `EXISTS`, `IN`, `ALL`, and `ANY`.
- Subqueries can appear in `FROM` clauses, acting as derived relations.
In SQL, a query can be nested within another to facilitate complex data retrieval. Such a nested query is termed a subquery.
A subquery is a SELECT-FROM-WHERE expression that is contained within another SQL query. The outer query is known as the main query. The result of the subquery is used by the main query to complete its evaluation.
Subqueries can be utilized in several distinct ways within a main query.
Scalar Subqueries
A subquery can be formulated to return a single atomic value, known as a scalar. This subquery must produce a relation with exactly one column and one row. The resulting single value can then be used in a
WHEREclause for direct comparison, much like a constant. For instance, it can be compared with an attribute’s value in the main query. A run-time error occurs if the subquery returns zero tuples or more than one tuple.Subqueries in
WHEREClause ConditionsSubqueries can return a relation (typically a single column) that is used as an operand for certain relational operators in the
WHEREclause. These operators evaluate to a boolean value.EXISTS R: True if the relationRreturned by the subquery is not empty.s IN R: True if the valuesis equal to one of the values in the unary relationR.s op ALL R: True if the comparisons op vis true for every valuevin the unary relationR. Theopcan be any standard comparison operator (e.g.,=,>,<).s op ANY R(ors op SOME R): True if the comparisons op vis true for at least one valuevin the unary relationR.
Subqueries in the
FROMClauseA subquery can be placed in the
FROMclause of a main query. The result of this subquery is treated as a temporary, or derived, relation. This derived relation must be given an alias (a tuple variable) so that it can be referenced in other parts of the main query, such as theSELECTlist or theWHEREclause. This allows for the construction of intermediate results that can then be joined with other relations.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:03:44
- Duration: 28.977590799331665
- Cummulative cost: $1.69e-02
Subqueries that Produce Scalar Values
- Explains how a select-from-where expression that returns a single value (a scalar) can be used like a constant, typically for comparison in a
WHEREclause.
#2
read: Section 6.3.1: Subqueries that Produce Scalar Values (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Explains how a select-from-where expression that returns a single value (a scalar) can be used like a constant, typically for comparison in a `WHERE` clause.
A subquery can be formulated to return a single atomic value, known as a scalar. Such a query is constrained to produce a relation with a single attribute and a single tuple.
Definition: Scalar SubqueryA scalar subquery is a
SELECTstatement, enclosed in parentheses, that returns a result set containing exactly one column and one row. The value in that single cell is the result of the subquery.The primary utility of a scalar subquery is its ability to be used in expressions where a constant value is expected. It is most frequently found in the
WHEREclause to provide a value for comparison.Consider the task of finding the producer of the film ‘Star Wars’. We require two relations:
MoviesandMovieExec. TheproducerC#for ‘Star Wars’ can be obtained from theMoviestable. This value can then be used to find the producer’s name in theMovieExectable.SELECT name FROM MovieExec WHERE cert# = (SELECT producerC# FROM Movies WHERE title = 'Star Wars');The subquery
(SELECT producerC# ...)is evaluated first. Its result, a single certificate number, is then used by the outer query’sWHEREclause, effectively resolving the outer query to:SELECT name FROM MovieExec WHERE cert# = 12345; -- Assuming 12345 is the resultA critical constraint is that a scalar subquery must return precisely one value. If the subquery returns zero tuples or more than one tuple, a run-time error will occur. The database system cannot resolve the comparison in the outer query’s
WHEREclause under such conditions.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:03:58
- Duration: 14.000209093093872
- Cummulative cost: $1.62e-02
Conditions Involving Relations
- Introduces SQL operators that can be applied to a relation (expressed as a subquery) to produce a boolean result.
- Defines
EXISTS R, which is true if relationRis not empty. - Defines
s IN R, which is true if scalarsis equal to one of the values in the unary relationR. - Defines
s > ALL Rands > ANY Rfor comparing a scalarswith all or at least one value in a unary relationR, applicable with all six comparison operators.
#3
read: Section 6.3.2: Conditions Involving Relations (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Introduces SQL operators that can be applied to a relation (expressed as a subquery) to produce a boolean result.
- Defines `EXISTS R`, which is true if relation `R` is not empty.
- Defines `s IN R`, which is true if scalar `s` is equal to one of the values in the unary relation `R`.
- Defines `s > ALL R` and `s > ANY R` for comparing a scalar `s` with all or at least one value in a unary relation `R`, applicable with all six comparison operators.
SQL provides several operators that can be applied to a relation, which must be expressed as a subquery, to produce a boolean result. These operators are fundamental for formulating complex conditions within a WHERE clause.
EXISTSThis operator tests whether a relation contains any tuples. It is used to check for the existence of rows in a subquery’s result.
Definition: EXISTSThe condition
EXISTS Ris true if and only if the relationRis not empty. \[ \text{EXISTS } R \iff R \neq \emptyset \] The negation,NOT EXISTS R, is true if and only if \(R\) is empty.INThis operator determines if a given value matches any value in a set of values returned by a subquery. The subquery must return a single column.
Definition: INFor a scalar value \(s\) and a unary relation \(R\), the condition
s IN Ris true if and only if \(s\) is equal to one of the values in \(R\). \[ s \text{ IN } R \iff s \in R \] The conditions NOT IN Ris the negation. The expressions = ANY Ris equivalent tos IN R.ALLThis operator compares a scalar value with every value in a single-column set returned by a subquery. It is used with a standard comparison operator.
Definition: ALLFor a scalar value \(s\), a unary relation \(R\), and a comparison operator \(\theta \in \{=, \neq, <, \leq, >, \geq\}\), the condition
s θ ALL Ris true if and only if the comparison \(s \ \theta \ t\) is true for every value \(t\) in \(R\). \[ s \ \theta \text{ ALL } R \iff \forall t \in R (s \ \theta \ t) \] For instance,s > ALL Rmeans that \(s\) is greater than every value in \(R\).ANYorSOMEThis operator compares a scalar value with at least one value in a single-column set returned by a subquery. It is also used with a standard comparison operator.
Definition: ANY / SOMEFor a scalar value \(s\), a unary relation \(R\), and a comparison operator \(\theta\), the condition
s θ ANY Ris true if and only if there exists at least one value \(t\) in \(R\) for which the comparison \(s \ \theta \ t\) is true.SOMEis a synonym forANY. \[ s \ \theta \text{ ANY } R \iff \exists t \in R (s \ \theta \ t) \] For example,s > ANY Rmeans that \(s\) is greater than at least one value in \(R\).
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:04:20
- Duration: 21.665996074676514
- Cummulative cost: $2.49e-02
Conditions Involving Tuples
- Describes the representation of tuples in SQL as a parenthesized list of scalar values, such as
(123, 'foo'). - Extends conditions like
INandANYto operate on tuples, comparing a tupletwith the tuples in a relationR.
#4
read: Section 6.3.3: Conditions Involving Tuples (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Describes the representation of tuples in SQL as a parenthesized list of scalar values, such as `(123, 'foo')`.
- Extends conditions like `IN` and `ANY` to operate on tuples, comparing a tuple `t` with the tuples in a relation `R`.
In SQL, a tuple is represented by a parenthesized list of scalar values.
The components of such a list can be constants, such as
(123, 'foo'), or attributes, such as(name, address, netWorth). A mixture of constants and attributes is also permitted.Relational comparison operators can be generalized to operate on tuples.
If a tuple
thas the same number of components as the tuples in a relationR, it is syntactically valid to form expressions such ast IN Rort <> ANY R. The comparison between the tupletand the tuples ofRproceeds component by component, based on the assumed standard order of attributes for the relationR.
IN)
Let \(t = (v_1, v_2, \dots, v_n)\) be a tuple of \(n\) scalar values. Let \(R\) be a relation of arity \(n\). The condition \(t \text{ IN } R\) is true if and only if there exists a tuple \(r \in R\) such that \(r = (v_1, v_2, \dots, v_n)\).
Nested subqueries can leverage tuple comparisons to link relations on multiple attributes simultaneously.
Consider the query to find the producers of movies in which ‘Harrison Ford’ starred. The query links the
Moviesrelation to theStarsInrelation using a composite key(title, year).SELECT name FROM MovieExec WHERE cert# IN (SELECT producerC# FROM Movies WHERE (title, year) IN (SELECT movieTitle, movieYear FROM StarsIn WHERE starName = 'Harrison Ford' ) );In this structure, the middle subquery’s
WHEREclause evaluates to true for aMoviestuple if its(title, year)pair is present in the set of title-year pairs returned by the innermost subquery.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:04:36
- Duration: 16.760929107666016
- Cummulative cost: $1.89e-02
Subqueries in FROM Clauses
- Demonstrates the use of a parenthesized subquery as a relation in a
FROMclause. - Notes that a subquery used in the
FROMclause must be given a tuple-variable alias.
#6
read: Section 6.3.5: Subqueries in FROM Clauses (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Demonstrates the use of a parenthesized subquery as a relation in a `FROM` clause.
- Notes that a subquery used in the `FROM` clause must be given a tuple-variable alias.
In SQL, a subquery can be used as a relation within the FROM clause. This allows for the creation of temporary, derived relations that can be referenced in the outer query, much like a stored table.
FROM Clause
A subquery may appear in a FROM clause in place of a relation name. Such a subquery must be enclosed in parentheses and must be given a tuple-variable alias. The result of the subquery is treated as a relation whose tuples are represented by this alias.
A parenthesized subquery can be used as a relation in a
FROMlist.The result of the subquery forms a temporary relation that exists only for the duration of the outer query’s execution. This is particularly useful for breaking down complex queries into more manageable, logical steps.
The subquery must be assigned a tuple-variable alias.
Since the relation generated by the subquery is not a stored table and has no intrinsic name, an alias is syntactically mandatory. This alias is used to refer to the tuples of the subquery’s result.
Tuples in the result of the subquery are referenced through its alias.
Consider a subquery with alias
Rthat produces a column namedA. This column would be referenced in the outer query asR.A.
Example
Let us find the producers of Harrison Ford’s movies. We can construct a subquery that first finds the certificate numbers of these producers and then join this result with the MovieExec relation.
SELECT name
FROM MovieExec, (SELECT producerC#
FROM Movies, StarsIn
WHERE title = movieTitle AND
year = movieYear AND
starName = 'Harrison Ford'
) AS Prod
WHERE cert# = Prod.producerC#;The subquery in the
FROMclause first joinsMoviesandStarsInto find all movies starring ‘Harrison Ford’ and selects their producers’ certificate numbers (producerC#). This resulting set of certificate numbers is given the aliasProd. The outer query then joins theMovieExecrelation with this temporaryProdrelation on the certificate number to retrieve the names of the producers.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:05:03
- Duration: 13.806691884994507
- Cummulative cost: $1.40e-02
SQL Join Expressions
- Introduces explicit join expressions which produce relations and can be used as subqueries.
- Describes
CROSS JOINas a synonym for Cartesian product. - Explains theta-joins using the
JOIN...ONsyntax, where the product of two relations is filtered by a specified condition.
#7
read: Section 6.3.6: SQL Join Expressions (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Introduces explicit join expressions which produce relations and can be used as subqueries.
- Describes `CROSS JOIN` as a synonym for Cartesian product.
- Explains theta-joins using the `JOIN...ON` syntax, where the product of two relations is filtered by a specified condition.
SQL provides explicit join expressions to construct relations. These expressions can stand as queries by themselves or be used as subqueries within the FROM clause of a SELECT-FROM-WHERE statement. They often serve as shorthands for more elaborate queries.
CROSS JOINThis operation is a synonym for the Cartesian product. The expression
R CROSS JOIN Scomputes the product of relationsRandS, denoted in relational algebra as \(R \times S\). The resulting relation contains all possible pairings of a tuple fromRand a tuple fromS.Movies CROSS JOIN StarsIn;The schema of the result includes all attributes of both
MoviesandStarsIn.JOIN ... ONThis syntax constructs a theta-join. A theta-join is a product of two relations followed by a selection based on a specified condition. The meaning of
R JOIN S ON Cis equivalent to applying a selection \(\sigma_C\) to the product \(R \times S\).Movies JOIN StarsIn ON Movies.title = StarsIn.movieTitle AND Movies.year = StarsIn.movieYear;The result is a relation where each tuple is formed by combining a tuple from
Moviesand a tuple fromStarsInthat satisfy the conditionCspecified in theONclause.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:05:14
- Duration: 10.927731037139893
- Cummulative cost: $1.31e-02
Natural Joins
- Defines the
NATURAL JOINoperator in SQL, which behaves like the relational algebra operator. - The join condition equates all pairs of attributes with a common name, and one of each pair is projected out of the result.
#8
read: Section 6.3.7: Natural Joins (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Defines the `NATURAL JOIN` operator in SQL, which behaves like the relational algebra operator.
- The join condition equates all pairs of attributes with a common name, and one of each pair is projected out of the result.
The SQL NATURAL JOIN operator provides a direct implementation of the natural join from relational algebra.
Implicit Join Condition
The join condition is implicitly established by equating all pairs of attributes from the two relations that share a common name. There are no other conditions in the join predicate.
Automatic Projection
From each pair of equated attributes, one is projected out of the final result. This ensures that the resulting relation does not contain duplicate columns for the join attributes.
Let \(R\) and \(S\) be two relations. Let the set of attribute names common to both schemas be \(\{A_1, A_2, \dots, A_k\}\). The natural join of \(R\) and \(S\), written in SQL as R NATURAL JOIN S, is equivalent to the relational algebra expression \(R \bowtie S\).
Formally, it is defined as a projection of a selection on the Cartesian product:
\[ R \bowtie S = \pi_{\text{schema}(R) \cup \text{schema}(S)} (\sigma_{R.A_1 = S.A_1 \land \dots \land R.A_k = S.A_k}(R \times S)) \]
The resulting schema consists of all attributes from \(R\) and all attributes from \(S\), with the common attributes appearing only once.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:05:28
- Duration: 13.812438249588013
- Cummulative cost: $1.59e-02
Outerjoins
- Introduces outerjoins to augment the result of a join with dangling tuples (those that do not match any tuple in the other relation), padded with
NULLvalues. - Explains
FULL OUTER JOINwhich pads and includes dangling tuples from both relations. - Explains
LEFT OUTER JOINandRIGHT OUTER JOINwhich include dangling tuples from only the left or right relation, respectively. - Shows that outerjoins can be combined with both
NATURAL JOINandJOIN...ONconditions.
#9
read: Section 6.3.8: Outerjoins (section)
model: pro
rules: bullet definition math presentation short
Present the following. Use definition formal definition when it applies.
- Introduces outerjoins to augment the result of a join with dangling tuples (those that do not match any tuple in the other relation), padded with `NULL` values.
- Explains `FULL OUTER JOIN` which pads and includes dangling tuples from both relations.
- Explains `LEFT OUTER JOIN` and `RIGHT OUTER JOIN` which include dangling tuples from only the left or right relation, respectively.
- Shows that outerjoins can be combined with both `NATURAL JOIN` and `JOIN...ON` conditions.
The outerjoin operator augments the result of a join by including the dangling tuples, which are those tuples in one relation that do not match any tuple in the other relation according to the join condition. These dangling tuples are padded with NULL values for the attributes of the other relation.
Let \(R\) and \(S\) be relations. The outerjoin extends the standard join operation to retain tuples that do not satisfy the join condition.
The full outerjoin (\(R \text{ NATURAL FULL OUTER JOIN } S\)) is the set of tuples that are in the natural join of \(R\) and \(S\), plus tuples for each dangling tuple in \(R\) padded with
NULLs, plus tuples for each dangling tuple in \(S\) padded withNULLs.The left outerjoin (\(R \text{ NATURAL LEFT OUTER JOIN } S\)) is the set of tuples that are in the natural join of \(R\) and \(S\), plus tuples for each dangling tuple in \(R\) padded with
NULLs.The right outerjoin (\(R \text{ NATURAL RIGHT OUTER JOIN } S\)) is the set of tuples that are in the natural join of \(R\) and \(S\), plus tuples for each dangling tuple in \(S\) padded with
NULLs.
FULL OUTER JOINThis operation retains all tuples from both relations. Tuples that match are joined, while dangling tuples from either relation are included in the result, padded with
NULLfor the attributes of the other relation. For example,MovieStar NATURAL FULL OUTER JOIN MovieExecproduces a relation of individuals who are stars, executives, or both.LEFT OUTER JOINThis operation retains all tuples from the left-hand relation. Dangling tuples from the left relation are included and padded with
NULLs. Dangling tuples from the right relation are discarded. For example,Movies LEFT OUTER JOIN StarsIn ON ...will list all movies, including those for which no stars are listed inStarsIn.RIGHT OUTER JOINThis operation is symmetric to the
LEFT OUTER JOIN. It retains all tuples from the right-hand relation, padding any dangling tuples from that relation withNULLs, while discarding dangling tuples from the left relation.Outerjoins with
JOIN ... ONThe outerjoin modifiers (
FULL,LEFT,RIGHT) can be applied to theta-joins specified with anONclause, in addition to natural joins. The logic for including and padding dangling tuples is based on the explicit condition provided in theONclause.
- Version: 1
- Model: gemini-2.5-pro
- Created: 2025-09-20 13:05:43
- Duration: 15.537811040878296
- Cummulative cost: $1.90e-02