View of Data

A database system is a collection of interrelated data and a set of programs that allow users to access and modify these data. A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained.

1. Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for efficiency has led designers to use complex data structures to represent data in the database. Since many database-system users are not computer trained, developers hide the complexity from users through several levels of abstraction, to simplify users’ interactions with the system:
• Physical level. The lowest level of abstraction describes how the data are actually stored. The physical level describes complex low-level data structures in detail.
• Logical level. The next-higher level of abstraction describes what data are stored in the database, and what relationships exist among those data. The logical level thus describes the entire database in terms of a small number of relatively simple structures. Although implementation of the simple structures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity. This is referred to as physical data independence. Database administrators, who must decide what information to keep in the database, use the logical level of abstraction.
• View level. The highest level of abstraction describes only part of the entire database. Even though the logical level uses simpler structures, complexity remains because of the variety of information stored in a large database. Many users of the database system do not need all this information; instead, they need to access only a part of the database. The view level of abstraction exists to simplify their interaction with the system. The system may provide many views for the same database. Figure 1.1 shows the relationship among the three levels of abstraction. An analogy to the concept of data types in programming languages may clarify the distinction among levels of abstraction. Many high-level programming 


View of Data

languages support the notion of a structured type. For example, we may describe a record as follows: 1
type instructor = record
ID : char (5);
name : char (20);
dept name : char (20);
salary : numeric (8,2);

This code defines a new record type called instructor with four fields. Each field has a name and a type associated with it. A university organization may have several such record types, including
• department, with fields dept name, building, and budget
• course, with fields course id, title, dept name, and credits
• student, with fields ID , name, dept name, and tot cred

At the physical level, an instructor, department, or student record can be described as a block of consecutive storage locations. The compiler hides this level of detail from programmers. Similarly, the database system hides many of the lowest-level storage details from database programmers. Database administrators, on the other hand, may be aware of certain details of the physical organization of the data.

At the logical level, each such record is described by a type definition, as in the previous code segment, and the interrelationship of these record types is defined as well. Programmers using a programming language work at this level of abstraction. Similarly, database administrators usually work at this level of abstraction. Finally, at the view level, computer users see a set of application programs that hide details of the data types. At the view level, several views of the database are defined, and a database user sees some or all of these views. In addition
to hiding details of the logical level of the database, the views also provide a security mechanism to prevent users from accessing certain parts of the database. For example, clerks in the university registrar office can see only that part of the database that has information about students; they cannot access information about salaries of instructors.

2. Instances and Schemas
Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instanc of the database. The overall design of the database is called the database schema. Schemas are changed infrequently, if at all The concept of database schemas and instances can be understood by analogy
to a program written in a programming language. A database schema corresponds to the variable declarations (along with associated type definitions) in a program. Each variable has a particular value at a given instant. The values of the variables in a program at a point in time correspond to an instance of a database schema. Database systems have several schemas, partitioned according to the levels of abstraction. The physical schema describes the database design at the physical level, while the logical schema describes the database design at the logical level. A database may also have several schemas at the view level, sometimes called subschemas, that describe different views of the database. Of these, the logical schema is by far the most important, in terms of its effect on application programs, since programmers construct applications by using the logical schema. The physical schema is hidden beneath the logical schema, and can usually be changed easily without affecting application programs. Application programs are said to exhibit physical data independence if they do not depend
on the physical schema, and thus need not be rewritten if the physical schema changes We study languages for describing schemas after introducing the notion of data models in the next section.
3. Data Models
Underlying the structure of a database is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. A data model provides a way to describe the design of a database at the physical, logical, and view levels.

There are a number of different data models that we shall cover in the text. The data models can be classified into four different categories:
Relational Model. The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name. Tables are also know as relations. The relational model is an example of a record-based model. Record-based models are so named because the database is structured in
fixed-format records of several types. Each table contains records of a particular type. Each record type defines a fixed number of fields, or attributes. The columns of the table correspond to the attributes of the record type. The relational data model is the most widely used data model, and a vast majority of current database systems are based on the relational model. 

Entity-Relationship Model. The entity-relationship ( E-R ) data model uses a collection of basic objects, called entities, and relationships among these objects. An entity is a “thing” or “object” in the real world that is distinguishable from other objects. The entity-relationship model is widely used in database design.
Object-Based Data Model. Object-oriented programming (especially in Java, C++, or C#) has become the dominant software-development methodology. This led to the development of an object-oriented data model that can be seen as extending the E-R model with notions of encapsulation, methods (functions), and object identity. The object-relational data model combines
features of the object-oriented data model and relational data model. 
Semistructured Data Model. The semistructured data model permits the specification of data where individual data items of the same type may have different sets of attributes. This is in contrast to the data models mentioned earlier, where every data item of a particular type must have the same set of attributes. The Extensible Markup Language ( XML ) is widely used to
represent semistructured data.  

Historically, the network data model and the hierarchical data model preceded the relational data model. These models were tied closely to the underlying implementation, and complicated the task of modeling data. As a result they are used little now, except in old database code that is still in service in some places. They are outlined online in Appendices D and E for interested readers.


Frequently Asked Questions

Ans: Self-describing nature of a database system Insulation between programs and data, and data abstraction Support of multiple views of the data Sharing of data and multiuser transaction processing view more..
Ans: A database system is a collection of interrelated data and a set of programs that allow users to access and modify these data. A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained. view more..
Ans: The specialization relationship may also be referred to as a superclass-subclass relationship. Higher- and lower-level entity sets also may be designated by the terms superclass and subclass, respectively. The person entity set is the superclass of the employee and student subclasses. view more..
Ans: A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data. view more..
Ans: Databases are widely used in enterprises, banking and finance, universities, airlines, telecommunication, etc. view more..
Ans: Specialization is the process of defining a set of subclasses of an entity type; this entity type is called the superclass of the specialization. We use the term generalization to refer to the process of defining a generalized entity type from the given entity types. view more..
Ans: we discuss differences between specialization/generalization lattices (multiple inheritance) and hierarchies (single inheritance), and elaborate on the differences between the specialization and generalization processes during conceptual database schema design. We discuss constraints that apply to a single specialization or a single generalization. view more..
Ans: it is sometimes necessary to represent a single superclass/subclass relationship with more than one superclass, where the superclasses represent different entity types. In this case, the subclass will represent a collection of objects that is a subset of the UNION of distinct entity types. view more..
Ans: For our sample database application, consider a UNIVERSITY database that keeps track of students and their majors, transcripts, and registration as well as of the university’s course offerings. The database also keeps track of the sponsored research projects of faculty and graduate students. view more..
Ans: The basic notation for specialization/generalization is to connect the subclasses by vertical lines to a horizontal line, which has a triangle connecting the horizontal line through another vertical line to the superclass. A blank triangle indicates a specialization/generalization with the disjoint constraint, and a filled triangle indicates an overlapping constraint. view more..
Ans: The similarities and differences between conceptual modeling and knowledge representation, and introduces some of the alternative terminology and a few additional concepts.The goal of KR techniques is to develop concepts for accurately modeling some domain of knowledge by creating an ontology that describes the concepts of the domain and how these concepts are interrelated. view more..
Ans: Recovery from transaction failures usually means that the database is restored to the most recent consistent state just before the time of failure. To do this, the system must keep information about the changes that were applied to data items by the various transactions. view more..
Ans: Several types of locks are used in concurrency control. To introduce locking concepts gradually, first we discuss binary locks, which are simple, but are also too restrictive for database concurrency control purposes, and so are not used in practice. Then shared/exclusive locks - also known as read/write locks - which provide more general locking capabilities and are used in practical database locking systems. view more..
Ans: In this section we discuss the concepts of concurrent execution of transactions and recovery from transaction failures. view more..
Ans: Conceptual modeling is a very important phase in designing a successful database application. Generally, the term database application refers to a particular database and the associated programs that implement the database queries and updates. view more..
Ans: Data is converted into information, and information is then evaluated and organised so that it can be used purposefully as knowledge. view more..

Rating - 3/5