# Structure of Relational Databases

A relational database consists of a collection of **tables,** each of which is assigned a

unique name. For example, consider the instructor table of Figure 2.1, which stores

information about instructors. The table has four column headers:* ID, name, dept_
name, *and

*salary*. Each row of this table records information about an

*instructor*,

consisting of the instructor’s

*ID, name, dept_name*, and

*salary*. Similarly, the

*course*

table of Figure 2.2 stores information about courses, consisting of a

*course_id, title,*

dept_name, and

dept_name

*credits,*for each course. Note that each instructor is identified by

the value of the column ID, while each course is identified by the value of the

column

*course_id*.

Figure 2.3 shows a third table, *prereq*,which stores the prerequisite courses for

each course. The table has two columns, *course_id* and *prereq_id*. Each row consists

of a pair of course identifiers such that the second course is a prerequisite for the

first course.

Thus, a row in the* prereq* table indicates that two courses are related in the

sense that one course is a prerequisite for the other. As another example, we

consider the table *instructor*, a row in the table can be thought of as representing

the relationship between a specified *ID* and the corresponding values for *name,
dept_name*, and

*salary*values.

In general, a row in a table represents a relationship among a set of values.

Since a table is a collection of such relationships, there is a close correspondence

between the concept of table and the mathematical concept of relation, fromwhich

the relational data model takes its name. In mathematical terminology, a ** tuple** is

simply a sequence (or list) of values. A relationship between n values is represented

mathematically by an

**of values, i.e., a tuple with**

*n-tuple**n*values, which

corresponds to a row in a table.

Thus, in the relational model the term **relation** is used to refer to a table, while

the term **tuple** is used to refer to a row. Similarly, the term **attribute** refers to a

column of a table.

Examining Figure 2.1,we can see that the relation* instructor* has four attributes:

*ID, name, dept_name, *and* salary.*

We use the term **relation instance** to refer to a specific instance of a relation,

i.e., containing a specific set of rows. The instance of *instructor* shown in Figure 2.1

has 12 tuples, corresponding to 12 instructors.

The order in which tuples appear in a relation is irrelevant, since a relation

is a set of tuples. Thus, whether the tuples of a relation are listed in sorted order,

as in Figure 2.1, or are unsorted, as in Figure 2.4, does not matter; the relations in

the two figures are the same, since both contain the same set of tuples. For ease

of exposition, we will mostly show the relations sorted by their first attribute.

For each attribute of a relation, there is a set of permitted values, called the

**domain** of that attribute. Thus, the domain of the* salary* attribute of the *instructor*

relation is the set of all possible salary values, while the domain of the name

attribute is the set of all possible instructor names.

We require that, for all relations *r*, the domains of all attributes of *r* be atomic.

A domain is **atomic** if elements of the domain are considered to be indivisible

units. For example, suppose the table *instructor* had an attribute *phone_number,*

which can store a set of phone numbers corresponding to the instructor. Then the

domain of *phone_number* would not be atomic, since an element of the domain is a

set of phone numbers, and it has subparts, namely the individual phone numbers

in the set.

The important issue is not what the domain itself is, but rather how we use

domain elements in our database. Suppose now that the phone number attribute

stores a single *phone_number*. Even then, if we split the value from the phone

number attribute into a country code, an area code and a local number, we would

be treating it as a nonatomic value. If we treat each phone number as a single

indivisible unit, then the attribute *phone_number* would have an atomic domain.

The **null** value is a special value that signifies that the value is unknown or

does not exist. For example, suppose as before that we include the attribute *phone_
number* in the instructor relation. It may be that an instructor does not have a

phone number at all, or that the telephone number is unlisted. We would then

have to use the null value to signify that the value is unknown or does not exist.

We shall see later that null values cause a number of difficulties when we access

or update the database, and thus should be eliminated if at all possible.

**Frequently Asked Questions**

## Recommended Posts:

- Characteristics of the Database Approach
- View of Data
- Subclasses, Superclasses, and Inheritance
- what is database management system
- Database-System Applications
- Specialization and Generalization
- Constraints and Characteristics of Specialization and Generalization Hierarchies
- Modeling of UNION Types Using Categories
- A Sample UNIVERSITY EER Schema, Design Choices, and Formal Definitions
- Example of Other Notation: Representing Specialization and Generalization in UML Class Diagrams
- Data Abstraction, Knowledge Representation, and Ontology Concepts
- Using High-Level Conceptual Data Models for Database Design
- Using High-Level Conceptual Data Models for Database Design
- Using High-Level Conceptual Data Models for Database Design
- Using High-Level Conceptual Data Models for Database Design

**3/5**