# Set Operations-Introduction

The SQL operations **union**, **intersect**, and **except** operate on relations and correspond to the mathematical set-theory operations ∪, ∩, and −. We shall now construct queries involving the **union**, **intersect**, and **except** operations over two sets.

• The set of all courses taught in the Fall 2009 semester:

**select** *course_id*

**from** section

**where*** semester = ’Fall*’ **and** *year= 2009*;

• The set of all courses taught in the Spring 2010 semester:

**select** *course_id*

**from** *section*

**where** *semester = ’Spring’ ***and*** year= 2010;*

In our discussion that follows, we shall refer to the relations obtained as the result of the preceding queries as c1 and c2, respectively, and show the results when these queries are run on the *section* relation of Figure 2.6 in Figures 3.9 and 3.10.Observe that c2 contains two tuples corresponding to *course_id* CS-319, since two sections of the course have been offered in Spring 2010.

**1. The Union Operation**

To find the set of all courses taught either in Fall 2009 or in Spring 2010, or both,we write:

**(select*** course_id*

**from*** section*

**where*** semester = ’Fall’ ***and*** year= 2009)*

**union**

**(select*** course_id*

**from*** section*

**where*** semester = ’Spring’ ***and*** year= 2010);*

The **union** operation automatically eliminates duplicates, unlike the **select** clause.

Thus, using the *section* relation of Figure 2.6, where two sections of CS-319 are

offered in Spring 2010, and a section of CS-101 is offered in the Fall 2009 as well as

in the Fall 2010 semester, CS-101 and CS-319 appear only once in the result, shown

in Figure 3.11.

If we want to retain all duplicates, we must write union all in place of union:

(**select*** course_id*

**from*** section*

**where*** semester = ’Fall’ ***and*** year= 2009)*

**union*** all
(*

**select**

*course_id*

**from**

*section*

**where**

*semester = ’Spring’*

**and**

*year= 2010);*

The number of duplicate tuples in the result is equal to the total number of

duplicates that appear in both c1 and c2. So, in the above query, each of CS-319

and CS-101 would be listed twice. As a further example, if it were the case that 4

sections of ECE-101 were taught in the Fall 2009 semester and 2 sections of ECE-101 were taught in the Fall 2010 semester, then there would be 6 tuples with ECE-101 in the result.

**2. The Intersect Operation**

To find the set of all courses taught in the Fall 2009 as well as in Spring 2010 we

write:

(**select*** course_id*

**from*** section*

**where*** semester = ’Fall’ ***and*** year= 2009)*

**intersect**

*(***select*** course_id*

**from*** section*

**where*** semester = ’Spring’ ***and*** year= 2010*);

The *result* relation, shown in Figure 3.12, contains only one tuple with CS-101. The

**intersect** operation automatically eliminates duplicates. For example, if it were

the case that 4 sections of ECE-101 were taught in the Fall 2009 semester and 2

sections of ECE-101 were taught in the Spring 2010 semester, then there would be

only 1 tuple with ECE-101 in the result.

If we want to retain all duplicates, we must write **intersect all** in place of

**intersect**:

(**select*** course_id*

**from*** section*

**where** semester = ’Fall’ **and** year= 2009)

**intersect all**

(**select** course_id

**from** section

**where** semester = ’Spring’ **and** year= 2010);

The number of duplicate tuples that appear in the result is equal to the minimum

number of duplicates in both c1 and c2. For example, if 4 sections of ECE-101

were taught in the Fall 2009 semester and 2 sections of ECE-101 were taught in the

Spring 2010 semester, then there would be 2 tuples with ECE-101 in the result.

**3. The Except Operation**

To find all courses taught in the Fall 2009 semester but not in the Spring 2010

semester, we write:

(**select** *course_id*

**from** section

**where*** semester = ’Fall’ ***and*** year= 2009)*

**except**

*(***select*** course_id
from section
where semester = ’Spring’ and year= 2010);*

The result of this query is shown in Figure 3.13. Note that this is exactly relation

c1 of Figure 3.9 except that the tuple for CS-101 does not appear. The **except**

operation7 outputs all tuples from its first input that do not occur in the second

input; that is, it performs set difference. The operation automatically eliminates

duplicates in the inputs before performing set difference. For example, if 4 sections

of ECE-101 were taught in the Fall 2009 semester and 2 sections of ECE-101 were

taught in the Spring 2010 semester, the result of the **except** operation would not

have any copy of ECE-101.

If we want to retain duplicates, we must write **except all** in place of **except**:

(**select** *course_id*

**from** *section*

**where** *semester* = ’*Fall*’ **and*** year= 2009)*

**except all**

(**select** *course_id*

**from** *section*

**where*** semester = ’Spring*’ **and*** year= 2010);*

The number of duplicate copies of a tuple in the result is equal to the number of

duplicate copies in c1 minus the number of duplicate copies in c2, provided that

the difference is positive. Thus, if 4 sections of ECE-101 were taught in the Fall

2009 semester and 2 sections of ECE-101 were taught in Spring 2010, then there are 2 tuples with ECE-101 in the result. If, however, there were two or fewer sections

of ECE-101 in the the Fall 2009 semester, and two sections of ECE-101 in the Spring

2010 semester, there is no tuple with ECE-101 in the result.

**Frequently Asked Questions**

## Recommended Posts:

- Characteristics of the Database Approach
- View of Data
- Subclasses, Superclasses, and Inheritance
- what is database management system
- Database-System Applications
- Specialization and Generalization
- Constraints and Characteristics of Specialization and Generalization Hierarchies
- Modeling of UNION Types Using Categories
- A Sample UNIVERSITY EER Schema, Design Choices, and Formal Definitions
- Example of Other Notation: Representing Specialization and Generalization in UML Class Diagrams
- Data Abstraction, Knowledge Representation, and Ontology Concepts
- Using High-Level Conceptual Data Models for Database Design
- Using High-Level Conceptual Data Models for Database Design
- Using High-Level Conceptual Data Models for Database Design
- Using High-Level Conceptual Data Models for Database Design

**3/5**