Chapter 1: Introduction
data about data (metadata), such as
- database schema. - integrity constraints. - authorization.
alternative ways of evaluating a given query
- equivalent expressions. - different algorithms for each operation.
storage manager tasks
- interaction with the file manager. - efficient storing, retrieving & updating of data.
Application programs generally access the database through one of:
- language extensions to allow embedded SQL. - application program interfaces that allow SQL queries to be sent to a database.
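As a minimal sketch of the second route, the following uses Python's built-in sqlite3 module as the application program interface. The instructor table, its columns, and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database; the instructor table and its rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO instructor (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Avi", 90000.0), (2, "Hana", 72000.0)],
)

# The application sends an SQL query through the API and receives rows back.
rows = conn.execute(
    "SELECT name FROM instructor WHERE salary > ? ORDER BY name", (80000.0,)
).fetchall()
print(rows)
conn.close()
```

The query text is passed to the database as a string and the threshold is passed as a parameter, so the application never touches the stored data directly.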
entity relationship model
- models an enterprise as a collection of entities & relationships. - represented diagrammatically by an entity-relationship diagram.
storage manager issues
- storage access. - file organization. - indexing and hashing.
database system has several subsystems
- storage manager. - query processor.
data model
a collection of conceptual tools for describing data, data relationships, data semantics, and data constraints.
storage manager
a program module that provides the interface between the low-level data stored in the database & the application programs and queries submitted to the system.
superkey
a set of one or more attributes that, taken collectively, allow us to identify uniquely a tuple in the relation.
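A rough sketch of the uniqueness test follows, assuming a relation instance is held as a list of Python dictionaries. The function name is_superkey and the sample rows are hypothetical. Note that a superkey is a property of the schema, so a single instance can only show that an attribute set is not a superkey; passing the test on one instance proves nothing about all instances.

```python
def is_superkey(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False  # two tuples share these values, so attrs cannot identify a tuple
        seen.add(key)
    return True

# Illustrative instance only.
instructors = [
    {"id": 1, "name": "Kim", "dept": "Physics"},
    {"id": 2, "name": "Kim", "dept": "History"},
    {"id": 3, "name": "Lee", "dept": "Physics"},
]

by_id = is_superkey(instructors, ["id"])
by_id_and_name = is_superkey(instructors, ["id", "name"])
by_name = is_superkey(instructors, ["name"])
print(by_id, by_id_and_name, by_name)
```

Here {id} and {id, name} both pass, while {name} fails because two tuples share the name "Kim".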
The management of data involves
both the definition of structures for the storage of information and the provision of mechanisms for the manipulation of information.
Database applications are typically
broken up into a front-end part that runs at client machines and a part that runs at the back end.
Minimal superkeys (superkeys for which no proper subset is also a superkey) are called
candidate keys.
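Minimality can be illustrated by enumerating attribute sets from smallest to largest and keeping only those superkeys that contain no smaller key. This is a sketch under the same caveat as any instance-based test: real keys are declared on the schema, and the helper names and sample data here are hypothetical.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))

def candidate_keys(rows, attributes):
    """Return the minimal attribute sets that are unique in this instance."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # A set that contains an already-found key is a superkey but not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_superkey(rows, combo):
                found.append(combo)
    return found

# Illustrative instance only.
students = [
    {"id": 1, "email": "a@example.org", "name": "Kim"},
    {"id": 2, "email": "b@example.org", "name": "Kim"},
    {"id": 3, "email": "c@example.org", "name": "Lee"},
]

keys = candidate_keys(students, ["id", "email", "name"])
print(keys)
```

In this instance {id} and {email} are reported, while {id, email} is skipped because it is a superkey that is not minimal.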
data manipulation language
language for accessing & manipulating the data organized by the appropriate data model.
The relational data model is
the most widely deployed model for storing data in databases. Other data models are the object-oriented model, the object-relational model, and semistructured data models.
the database system must provide for
the safety of the information stored, in the face of system crashes or attempts at unauthorized access.
If data are to be shared among several users
the system must avoid possible anomalous results.
transaction management component
ensures that the database remains in a consistent (correct) state despite system failures and transaction failures.
XML is a great way to
exchange data, not just documents.
Knowledge-discovery techniques
attempt to discover automatically statistical rules and patterns from data.
The entity-relationship (E-R) data model is
a widely used data model for database design. It provides a convenient graphical representation to view data, relationships, and constraints.
database applications
banking, airlines, universities, sales, online retailers, manufacturing.
Database systems can be
centralized, or client-server, where one server machine executes work on behalf of multiple client machines.
transaction
a collection of operations that performs a single logical function in a database application.
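A funds transfer is the usual example of such a single logical function: the credit and the debit must both happen or neither must. The sketch below assumes Python's built-in sqlite3 module; the account table, the CHECK constraint, and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # If this debit violates the CHECK constraint, the credit above is undone too.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 30)       # succeeds
failed = transfer(conn, "A", "B", 500)  # overdraws A, so the whole transfer is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

The credit is deliberately issued before the debit so that the failed transfer shows the rollback: account B keeps its earlier balance even though its update had already run.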
The field of data mining
combines knowledge discovery techniques invented by artificial intelligence researchers and statistical analysts, with efficient implementation techniques that enable them to be used on extremely large databases.
A database-management system
(DBMS) consists of a collection of interrelated data and a collection of programs to access that data. The data describe one particular enterprise.
the architecture of a database system is greatly influenced by the underlying computer system on which the database is running
- centralized. - client-server. - parallel (multi-processor). - distributed.
DBMS
- collection of interrelated data. - set of programs to access the data. - an environment that is both convenient & efficient to use.
data models: a collection of tools for describing
- data. - data relationships. - data semantics. - data constraints.
older data models
- network model. - hierarchical model.
query processing
- parsing and translation. - optimization. - evaluation.
levels of abstraction
- physical level. - logical level. - view level.
Two classes of query language
- procedural: user specifies what data is required & how to get those data. - declarative: user specifies what data is required without specifying how to get those data.
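The contrast can be sketched by answering the same question twice: once with an explicit loop that spells out how to find the rows, and once with an SQL query that only states which rows are wanted. The data and names below are hypothetical, and a Python loop stands in for a procedural query language.

```python
import sqlite3

# Illustrative data only.
data = [
    ("Kim", "Physics", 90000),
    ("Lee", "History", 60000),
    ("Roy", "Physics", 75000),
]

# Procedural style: state what is wanted and how to get it (scan, test, collect, sort).
procedural = []
for name, dept, salary in data:
    if dept == "Physics" and salary > 70000:
        procedural.append(name)
procedural.sort()

# Declarative style: state only what is wanted; the system chooses how to evaluate it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?)", data)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM instructor "
        "WHERE dept = 'Physics' AND salary > 70000 ORDER BY name"
    )
]

print(procedural, declarative)
```

Both approaches return the same names; only the declarative form leaves the access strategy to the database system.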
drawback of using file system to store data
- redundancy & inconsistency. - difficulty in accessing data. - data isolation, multiple files & formats. - integrity problems. - atomicity of updates. - concurrent accesses. - security problems.
data models
- relational model. - entity-relationship data model. - object-based data model. - semistructured data model.
SQL
- the most widely used query language. - widely used non-procedural language.
Transaction management
ensures that the database remains in a consistent (correct) state despite system failures. The transaction manager ensures that concurrent transaction executions proceed without conflicting.
DDL compiler generates
a set of table templates stored in a data dictionary.
Database systems are ubiquitous today
and most people interact, either directly or indirectly, with databases many times every day.
view level
application programs hide details of data types; views can also hide information for security purposes.
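One way to picture the view level is an SQL view that exposes only some columns of a table. The sketch below assumes SQLite through Python's sqlite3 module; the instructor table and the instructor_public view are hypothetical. SQLite has no user accounts, so in a multi-user system access would additionally be granted on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO instructor VALUES (1, 'Kim', 90000)")

# The view exposes id and name but leaves out the salary column.
conn.execute("CREATE VIEW instructor_public AS SELECT id, name FROM instructor")

cur = conn.execute("SELECT * FROM instructor_public")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

An application that queries only instructor_public never sees the salary values stored in the underlying table.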
concurrency-control manager
controls the interaction among the concurrent transactions, to ensure the consistency of the database.
metadata
data about data
logical schema
database design at the logical level.
physical schema
database design at the physical level.
database logical design
deciding on the database schema. Database design requires that we find a "good" collection of relation schemas.
physical design
deciding on the physical layout of the database.
physical level
describes how a record is stored.
logical level
describes the data stored in the database, & the relationships among the data.
XML originally intended as
a document markup language, not a database language.
Database systems can also be designed to
exploit parallel computer architectures.
XML
extensible markup language.
normalization theory
formalizes which designs are bad, & tests for them.
data-definition language (DDL)
is a language for specifying the database schema as well as other properties of the data.
data-manipulation language (DML)
is a language that enables users to access or manipulate data.
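The split between DDL and DML can be sketched with a few SQL statements sent through Python's sqlite3 module. The department table and its constraint are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema and other properties of the data (here, a CHECK constraint).
conn.execute(
    "CREATE TABLE department ("
    "dept_name TEXT PRIMARY KEY, "
    "budget INTEGER NOT NULL CHECK (budget >= 0))"
)

# DML: inserts, updates, deletes, and retrieves data under that schema.
conn.execute("INSERT INTO department VALUES ('Physics', 70000)")
conn.execute("INSERT INTO department VALUES ('History', 50000)")
conn.execute("UPDATE department SET budget = budget + 5000 WHERE dept_name = 'History'")
conn.execute("DELETE FROM department WHERE dept_name = 'Physics'")
remaining = conn.execute("SELECT dept_name, budget FROM department").fetchall()
print(remaining)
```

Only the CREATE TABLE statement changes the schema; every later statement changes or reads the instance.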
data dictionary contains
metadata
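Most systems let the data dictionary be queried like ordinary data. As one concrete case, SQLite keeps its schema in the sqlite_master table and reports column metadata through PRAGMA table_info. The department table below is hypothetical, and other database systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, budget INTEGER)")

# sqlite_master holds one row per schema object (tables, indexes, views, triggers).
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) for each column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(department)")]

print(catalog, columns)
```

The rows returned describe the table rather than its contents, which is what makes them metadata.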
XML a wide variety of tools is available for
parsing, browsing & querying XML documents/data.
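As one example of such tooling, Python's standard library module xml.etree.ElementTree can parse a document and run simple path queries over it. The university document below is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative XML data only.
document = """
<university>
  <department name="Physics"><budget>70000</budget></department>
  <department name="History"><budget>50000</budget></department>
</university>
"""

root = ET.fromstring(document)  # parsing

# Browsing: walk the department elements and read attributes and child text.
budgets = {d.get("name"): int(d.findtext("budget")) for d in root.findall("department")}

# Querying: keep only departments whose budget exceeds a threshold.
well_funded = [name for name, budget in budgets.items() if budget > 60000]

print(budgets, well_funded)
```

The same document serves as both a readable text file and a small structured data set, which is why XML is used for data exchange.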
DML is also known as
query language.
referential integrity
references constraint in SQL.
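A minimal sketch of the references constraint follows, assuming SQLite through Python's sqlite3 module. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection. The department and instructor tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE instructor ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT, "
    "dept_name TEXT REFERENCES department(dept_name))"
)

conn.execute("INSERT INTO department VALUES ('Physics')")
conn.execute("INSERT INTO instructor VALUES (1, 'Kim', 'Physics')")  # department exists

try:
    # 'Music' does not appear in department, so this row would dangle.
    conn.execute("INSERT INTO instructor VALUES (2, 'Lee', 'Music')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
print(rejected, count)
```

The second insert is refused because its dept_name value matches no row in department, so only the first instructor is stored.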
Distributed databases
span multiple geographically separated machines.
data definition language DDL
specification notation for defining the database schema.
Database systems are designed to
store large bodies of information.
query processor
subsystem that compiles and executes DDL and DML statements.
storage manager
subsystem that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
physical data independence
the ability to modify the physical schema without changing the logical schema.
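Adding an index is a common physical-schema change. The sketch below, which assumes SQLite through Python's sqlite3 module and a hypothetical instructor table, runs the same query text before and after creating an index to show that the logical schema and the application's query are untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [(1, "Kim", "Physics"), (2, "Lee", "History"), (3, "Roy", "Physics")],
)

query = "SELECT name FROM instructor WHERE dept = ? ORDER BY name"
before = conn.execute(query, ("Physics",)).fetchall()

# Physical change: a new index on dept. Tables, columns, and the query stay the same.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept)")

after = conn.execute(query, ("Physics",)).fetchall()
print(before, after)
```

The index may change how the query is evaluated, but the answer and the query text are identical.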
instance
the actual content of the database at a particular point in time.
In three-tier architectures
the back end part is itself broken up into an application server and a database server.
Database design mainly involves
the design of the database schema.
Different types of user interfaces have been designed for
the different types of users.
In two-tier architectures
the front end directly communicates with a database running at the back end.
schema
the logical structure of the data.
The architecture of a database system is greatly influenced by
the underlying computer system on which the database system runs.
primary key
a candidate key that is chosen by the database designer as the principal means of identifying tuples within a relation.
The primary goal of a DBMS is
to provide an environment that is both convenient and efficient for people to use in retrieving and storing information.
A major purpose of a database system is
to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained.
There are four different types of database-system users
differentiated by the way they expect to interact with the system.
business decision
what attributes should we record in the database.
computer science decision
what relation schemas should we have & how should the attributes be distributed among the various relation schemas.
Nonprocedural DMLs
which require a user to specify only what data are needed, without specifying exactly how to get those data, are widely used today.
XML defined by
the World Wide Web Consortium (W3C).