Basic Structure of SQL Queries




The basic structure of an SQL query consists of three clauses: select, from, and
where. The query takes as its input the relations listed in the from clause, operates
on them as specified in the where and select clauses, and then produces a relation
as the result. We introduce the SQL syntax through examples, and describe the
general structure of SQL queries later.


1. Qeries on a Single Relation

Let us consider a simple query using our university example, “Find the names
of all instructors.” Instructor names are found in the instructor relation, so we 

Basic Structure of SQL Queries

putthat relation in the from clause. The instructor’s name appears in the name
attribute, so we put that in the select clause.

select name
from instructor;

The result is a relation consisting of a single attribute with the heading name. If
the instructor relation is as shown in Figure 2.1, then the relation that results from
the preceding query is shown in Figure 3.2.

Now consider another query, “Find the department names of all instructors,”
which can be written as:

select dept_name
from instructor;

Since more than one instructor can belong to a department, a department name
could appear more than once in the instructor relation. The result of the above
query is a relation containing the department names, shown in Figure 3.3.

In the formal, mathematical definition of the relational model, a relation is a
set. Thus, duplicate tuples would never appear in relations. In practice, duplicate
elimination is time-consuming. Therefore, SQL allows duplicates in relations as
well as in the results of SQL expressions. Thus, the preceding SQL query lists
each department name once for every tuple in which it appears in the instructor
relation.

In those cases where we want to force the elimination of duplicates, we insert
the keyword distinct after select. We can rewrite the preceding query as:

select distinct dept_name
from instructor;

Basic Structure of SQL Queries

if we want duplicates removed. The result of the above query would contain each
department name at most once.

SQL allows us to use the keyword all to specify explicitly that duplicates are
not removed:

select all dept_name
from instructor;

Since duplicate retention is the default, we shall not use all in our examples. To
ensure the elimination of duplicates in the results of our example queries, we
shall use distinct whenever it is necessary.

The select clause may also contain arithmetic expressions involving the op?erators +, −, ∗, and / operating on constants or attributes of tuples. For example,
the query:

select ID, name, dept_name, salary * 1.1
from instructor;

returns a relation that is the same as the instructorrelation, except that the attribute
salary is multiplied by 1.1. This shows what would result if we gave a 10% raise
to each instructor; note, however, that it does not result in any change to the
instructor relation.

SQL also provides special data types, such as various forms of the date type,
and allows several arithmetic functions to operate on these types. We discuss this
further in Section 4.5.1.

The where clause allows us to select only those rows in the result relation of
the from clause that satisfy a specified predicate. Consider the query “Find the
names of all instructors in the Computer Science department who have salary
greater than $70,000.” This query can be written in SQL as:

Basic Structure of SQL Queries

select name
from instructor

where dept_name = ’Comp. Sci.’ and salary > 70000;

If the instructor relation is as shown in Figure 2.1, then the relation that results
from the preceding query is shown in Figure 3.4.

SQL allows the use of the logical connectives and, or, and not in the where
clause. The operands of the logical connectives can be expressions involving
the comparison operators <, <=, >, >=, =, and <>. SQL allows us to use the
comparison operators to compare strings and arithmetic expressions, as well as
special types, such as date types.

We shall explore other features of where clause predicates later in this topic

2. Queries on Multiple Relations

So far our example queries were on a single relation. Queries often need to access
information from multiple relations. We now study how to write such queries.

An an example, suppose we want to answer the query “Retrieve the names
of all instructors, along with their department names and department building
name.”

Looking at the schema of the relation instructor, we realize that we can get
the department name from the attribute dept_name, but the department building
name is present in the attribute building of the relation department. To answer the
query, each tuple in the instructor relation must be matched with the tuple in
the department relation whose dept_name value matches the dept_name value of the instructor tuple.

In SQL, to answer the above query, we list the relations that need to be accessed
in the from clause, and specify the matching condition in the where clause. The
above query can be written in SQL as

select name, instructor.dept_name, building
from instructor, department
where instructor.dept_name= department.dept_name;

If the instructor and department relations are as shown in Figures 2.1 and 2.5
respectively, then the result of this query is shown in Figure 3.5.

Note that the attribute dept_name occurs in both the relations instructor and
department, and the relation name is used as a prefix (in instructor.dept_name, and department.dept_name) to make clear to which attribute we are referring. In contrast, the attributes name and building appear in only one of the relations, and therefore do not need to be prefixed by the relation name.

Basic Structure of SQL Queries

This naming convention requires that the relations that are present in the from
clause have distinct names. This requirement causes problems in some cases,
such as when information from two different tuples in the same relation needs to
be combined. In Section 3.4.1, we see how to avoid these problems by using the
rename operation.

We now consider the general case of SQL queries involving multiple relations.
As we have seen earlier, an SQL query can contain three types of clauses, the
select clause, the from clause, and the where clause. The role of each clause is as
follows:

• The select clause is used to list the attributes desired in the result of a query.

• The from clause is a list of the relations to be accessed in the evaluation of
the query.

• The where clause is a predicate involving attributes of the relation in the
from clause.

A typical SQL query has the form

select A1, A2,..., An
from r1, r2,...,rm
where P;

Each Ai represents an attribute, and each ri a relation. P is a predicate. If the where
clause is omitted, the predicate P is true.

Although the clauses must be written in the order select, from, where, the
easiest way to understand the operations specified by the query is to consider the
clauses in operational order: first from, then where, and then select.

The from clause by itself defines a Cartesian product of the relations listed
in the clause. It is defined formally in terms of set theory, but is perhaps best
understood as an iterative process that generates tuples for the result relation of
the from clause.

for each tuple t1 in relation r1

for each tuple t2 in relation r2

...
for each tuple tm in relation rm

Concatenate t1, t2,..., tm into a single tuple t
Add t into the result relation

The result relation has all attributes from all the relations in the from clause.
Since the same attribute name may appear in both ri and rj , as we saw earlier,
we prefix the the name of the relation from which the attribute originally came,
before the attribute name.

For example, the relation schema for the Cartesian product of relations instructor and teaches is:

(instructor.ID, instructor.name, instructor.dept_name, instructor.salary
teaches.ID, teaches.course_id, teaches.sec_id, teaches.semester, teaches.year)

With this schema, we can distinguish instructor.ID from teaches.ID. For those attributes that appear in only one of the two schemas, we shall usually drop the
relation-name prefix. This simplification does not lead to any ambiguity. We can
then write the relation schema as:

(instructor.ID, name, dept_name, salary
teaches.ID, course_id, sec_id, semester, year)

To illustrate, consider the instructor relation in Figure 2.1 and the teaches
relation in Figure 2.7. Their Cartesian product is shown in Figure 3.6, which
includes only a portion of the tuples that make up the Cartesian product result.2
The Cartesian product by itself combines tuples from instructor and teaches
that are unrelated to each other. Each tuple in instructor is combined with every
tuple in teaches, even those that refer to a different instructor. The result can be
an extremely large relation, and it rarely makes sense to create such a Cartesian
product.

Basic Structure of SQL Queries

Instead, the predicate in the where clause is used to restrict the combinations
created by the Cartesian product to those that are meaningful for the desired
answer. We would expect a query involving instructor and teaches to combine a
particular tuple t in instructor with only those tuples in teaches that refer to the
same instructor to which t refers. That is, we wish only to match teaches tuples with
instructor tuples that have the same ID value. The following SQL query ensures
this condition, and outputs the instructor name and course identifiers from such
matching tuples.

select name, course_id
from instructor, teaches
where instructor.ID= teaches.ID;

Note that the above query outputs only instructors who have taught some course.
Instructors who have not taught any course are not output; if we wish to output
such tuples, we could use an operation called the outer join.

If the instructor relation is as shown in Figure 2.1 and the teaches relation is
as shown in Figure 2.7, then the relation that results from the preceding query
is shown in Figure 3.7. Observe that instructors Gold, Califieri, and Singh, who
have not taught any course, do not appear in the above result.

If we only wished to find instructor names and course identifiers for instructors in the Computer Science department, we could add an extra predicate to the

where clause, as shown below.

select name, course_id
from instructor, teaches
where instructor.ID= teaches.ID and instructor.dept_name = ’Comp. Sci.’;

Note that since the dept_name attribute occurs only in the instructor relation, we
could have used just dept_name, instead of instructor.dept_name in the above query.

In general, the meaning of an SQL query can be understood as follows:

Basic Structure of SQL Queries

1. Generate a Cartesian product of the relations listed in the from clause

2. Apply the predicates specified in the where clause on the result of Step 1.

3. For each tuple in the result of Step 2, output the attributes (or results of
expressions) specified in the select clause.


The above sequence of steps helps make clear what the result of an SQL query
should be, not how it should be executed. A real implementation of SQL would
not execute the query in this fashion; it would instead optimize evaluation by
generating (as far as possible) only elements of the Cartesian product that satisfy
the where clause predicates.


When writing queries, you should be careful to include appropriate where
clause conditions. If you omit the where clause condition in the preceding SQL
query, it would output the Cartesian product, which could be a huge relation.
For the example instructor relation in Figure 2.1 and the example teaches relation
in Figure 2.7, their Cartesian product has 12 ∗ 13 = 156 tuples — more than we
can show in the text! To make matters worse, suppose we have a more realistic
number of instructors than we show in our sample relations in the figures, say 200
instructors. Let’s assume each instructor teaches 3 courses, so we have 600 tuples
in the teaches relation. Then the above iterative process generates 200 ∗ 600 =
120,000 tuples in the result.



Frequently Asked Questions

+
Ans: The set of relations in a database must be specified to the system by means of a data-definition language (DDL). view more..
+
Ans: IBM developed the original version of SQL, originally called Sequel, as part of the System R project in the early 1970s. view more..
+
Ans: An object database is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are table-oriented. Object-relational databases are a hybrid of both approaches. view more..
+
Ans: The basic structure of an SQL query consists of three clauses: select, from, and where. view more..
+
Ans: This chapter discusses techniques for securing databases against a variety of threats. It also presents schemes of providing access privileges to authorized users. view more..
+
Ans: The natural join operation operates on two relations and produces a relation as the result. view more..
+
Ans: Reason to rename a relation is a case where we wish to compare tuples in the same relation. view more..
+
Ans: SQL permits a variety of functions on character strings. Read to know about them. view more..
+
Ans: The company is organized into departments. Each department has a unique name, a unique number, and a particular employee who manages the department. We keep track of the start date when that employee began managing the department. A department may have several locations.  A department controls a number of projects, each of which has a unique name, a unique number, and a single location view more..
+
Ans: Entities and Their Attributes. The basic object that the ER model represents is an entity, which is a thing in the real world with an independent existence. An entity may be an object with a physical existence (for example, a particular person, car, house, or employee) view more..
+
Ans: A database usually contains groups of entities that are similar. For example, a company employing hundreds of employees may want to store similar information concerning each of the employees. These employee entities share the same attributes, but each entity has its own value(s) for each attribut view more..
+
Ans: An entity type DEPARTMENT with attributes Name, Number, Locations, Manager, and Manager_start_date. Locations is the only multivalued attribute. We can specify that both Name and Number are (separate) key attributes because each was specified to be unique view more..
+
Ans: There are several implicit relationships among the various entity types. In fact, whenever an attribute of one entity type refers to another entity type, some relationship exists. For example, the attribute Manager of DEPARTMENT refers to an employee who manages the department; the attribute view more..
+
Ans: Select clause uses. view more..
+
Ans: Entity types that do not have key attributes of their own are called weak entity types. In contrast,regular entity types that do have a key attribute—which include all the examples discussed so far—are called strong entity types view more..
+
Ans: If some cardinality ratio or dependency cannot be determined from the requirements, the users must be questioned further to determine these structural constraints view more..
+
Ans: The individual entity instances in an entity set and the individual relationship instances in a relationship set. In ER diagrams the emphasis is on representing the schemas rather than the instances. view more..
+
Ans: The UML methodology is being used extensively in software design and has many types of diagrams for various software design purposes. We only briefly present the basics of UML class diagrams here, and compare them with ER diagrams view more..




Rating - 3/5
534 views

Advertisements