Database systems exam 1
OO data model is based on
- An object is an abstraction of a real-world entity - Attributes describe the properties of an object. For example, a PERSON object includes the attributes Name, Social Security Number, and Date of Birth. - Has all of the regular object functions: inheritance, methods etc.
Other things DBMS does
- DBMS stores definitions of data elements and DBMS stores definitions of data elements and relationships (metadata) in a data dictionary - DBMS looks up required data component structures and relationships - Changes automatically recorded in the dictionary - DBMS provides data abstraction and removes structural and data dependency
Things DBMS does
- DBMS transforms data entered to conform to required data structures - DBMS transforms pyhsically retrieved data to conform to user's logical expectations • Security management - DBMS creates a security system that enforces user security and data privacy - Security rules determine which users can access the database, which items can be accessed, etc.
DBMS functions
- Data dictionary management. The DBMS stores definitions of the data elements and their relationships (metadata) in a data dictionary - Data storage management. The DBMS creates and manages the complex structures required for data storage - Security management. The DBMS creates a security system that enforces user security and data privacy - Multiuser access control - Backup and recovery management. The DBMS provides backup and data recovery to ensure data safety and integrity.
Integrity
- Entity Integrity: All primary key entries are unique, and no part of a primary key may be null. Each row will have a unique identity, and foreign key values can properly reference primary key values. - Reference integrity: All primary key entries are unique, and no part of a primary key may be null. Purpose Each row will have a unique identity, and foreign key values can properly reference primary key values. Example No invoice can have a duplicate number, nor can it be null. In short, all invoices are uniquely identified by their invoice number.
Five major parts of a database system:
- Hardware: all the system's physical devices - Software - People - Procedures - Data
Two questions to identify the relationship type?
- How many instances of B are related to one instance of A? - How many instances of A are related to one instance of B?
Naming conventions
- Naming occurs during translation of business rules to data model components • Names should make the object unique and distinguishable from other objects • Names should also be descriptive of objects in the environment and be familiar to users
NoSql databases
- Not based on the relational model, hence the name NoSQL - Supports distributed database architectures - Provides high scalability, high availability, and fault tolerance - Supports very large amounts of sparse data - Geared toward performance rather than transaction consistency
translating business rules into data components
- Nouns translate into entities • Verbs translate into relationships among entities • Relationships are bidirectional
(Nosql) Sparse data
- Number of attributes is very large - Number of actual data instances is low
Three types of software required
- Operating system software - DBMS software DBMS software - Application programs and utility software
Data dictionary
- Provides detailed accounting of all tables found within the user/designer-created database - Contains ( ) at least) all the attribute names and characteristics for each table in the system - Contains metadata: data about data
Who are the people in database system
- System and database administrators - Database designers - Systems analysts and programmers Systems analysts and programmers - End users
Why are business rules important?
- They help to standardize the company's view of data. - They can be a communications tool between users and designers. - They allow the designer to understand the nature, role, and scope of the data.
(No sql) key/value model
- Two data elements: key and value • Every key has a corresponding value or set of values
Left outer join
- Yields all of the rows in the left table - Including those that do not have a matching value in the right table
Right outer join
- Yields all of the rows in the right table - Including those that do not have matching values in the left table
Disadvantages of DBMS
- increased cost: require complex hardware and software and personnel - management complexity - maintaining concurrency: keep system current - DBMS vendors frequently update their software
Problems with file systems
- lengthy development times: simple data task require a lot of programming - difficult to get quick answers - lack of security and limited data sharing - structurally dependent
types of databases
-Centralized database: data located at a single site • Distributed database: data distributed across several different sites several different sites • Operational database: supports a company's day-to-day operations - Transactional or production database
types of relationships
1:1 1:Many Many:Many
The evolution of database models
1st generation: File system 2nd generation: Hierarchical and network 3rd: relational 4th: Object oriented/object relational Next gen: XML, hybrid DBMS
Data warehouse
: stores data used for tactical or strategic decisions
a foreign key
A foreign key (FK) is an attribute whose values match the primary key values in the another table
Candidate key
A superkey without unnecessary attributes
Nulls can also represent
An unknown attribute value. A known, but missing, attribute value. A "not applicable" condition can create problems when functions such as COUNT, AVERAGE, and SUM
key attribute
Any attribute that is part of a key Any attribute that is part of a key
super key
Any key that uniquely identifies each row
Object oriented model
Both data and their relationships are contained in a single structure known as an object.
composite key
Composed of more than one attribute
Network Model
Created to represent complex data relationships more effectively than the hierarchical model: Improves database performance - Collection of records in 1:M relationships Set composed of two record types: • Owner • Member
Why is database design important?
Designer must identify database's expected use • Well-designed database: designed database: - Facilitates data management - Generates accurate and valuable information • Poorly designed database: - Causes difficult-to-trace errors
entity integrity
Each row (entity instance) in the table has its own unique identity
Key
Each row in a table must be uniquely identifiable Key: one or more attributes that determine other attributes i.e. if i have student id i can find student name, number,
File systems evolved
Initially, computer file systems resembled manual systems • As number of files increased, file systems evolved - Each file used its own application program to store, retrieve, and modify data - Each file was owned by individual or department that commissioned its creation
Nulls
No data entry - Not permitted in primary key - Should be avoided in other attributes
Produce an alphabetic listing on employees by last name
SELECT FROM *EMPLOYEE ORDER BY EMPLOYEE_LASTNAME
types of databases
Single user database: one user at a time Multi-user database: Workgroup database: small number of users
relational model
Table (relations) - Matrix consisting of row/column intersections - Each row in a relation is called a tuple
controlled redundancy
Tables within the database share common attributes that enable the tables to be linked together
External model
The external model is the end users' view of the data environment
physical independence
When you can change the physical model without affecting the internal model, you have physical independence.
Why a Spreadsheet Is Not a Database?
While a spreadsheet allows for the creation of multiple tables, it does not support even the most basic database functionality such as support for self-documentation through metadata, enforcement of data types or domains to ensure consistency of data within a column, defined relationships among tables, or constraints to ensure consistency of data across related tables. Most users lack the necessary training to recognize the limitations of spreadsheets for these types of tasks.
field
a character or group of characters which has specific meaning > used to define and store data
file
a collection of related records
conceptual model
a global view of the entire database as viewed by the entire organization
internal model
a global view of the entire database as viewed by the entire organization
a secondary key
a key that is used strictly for data retrieval purposes. Suppose customer data are stored in a CUSTOMER table in which the customer number is the primary key. Do you suppose that most customers will remember their numbers?
record
a logically connected set of fields used to describe a person place or thing
constraints
a restriction placed on data
Unified Modeling Language (UML)
a set of standards used to graphically model a system (OO)
adhoc query
a spur of the moment query
SQL based relational database contains three parts
a user interface, a set of tables stored in the database the SQL "engine."
Data anomalies:
abnormalities when all changes in redundant data are not made changes in redundant data are not made correctly
structurally dependent
access to files is dependent of structure of data
DBMS handles
all request and hides the complexity of the database also allow database to be used by many users
JOIN
allows information to be combined from two or more tables
SELECT,
also known as RESTRICT, yields values for all rows found in a table that satisfy a given condition
Database system: defines
and regulates the collection, storage, management, use of data
Hybrid DBMSs
are emerging that retain the advantages of the relational model and at the same time provide programmers with an object-oriented view of the underlying data
Hierarchical system
basically and upside down tree that contains levels and segments: a set of 1:M between a parent and its children
flags
can be used instead of nulls to represent the absence of a value i.e. -999
ERD (Entity relationship diagram)
chen model and crows feet model
UNION
combines all rows from two tables, excluding duplicate rows. The tables must have the same attribute characteristics (the columns and domains must be compatible) to be used in the UNION
tables are related through
common attributes
Database system
consists of logically related data stored in a single logical data repository - May be physically distributed among multiple storage facilities - DBMS eliminates most of file system's problems - Current generation stores data structures, relationships between structures, and access paths
meta data
data about data: the characteristics and relationships that link data i.e. the name of each data element and the type of data stored
data vs. information
data is raw facts and information is data in context
DBMS
database management system basically manages a database and controls access: an intermediary between the user and the database
Relational algebra
defines the theoretical way of manipulating table contents using the eight relational operators: SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT, and DIVIDE
SQL engine
executes all queries
levels of abstraction
external conceptual internal
Good data modeling
facilitates communication between the designer, user, and the developer
advantages of DBMS
improved data sharing improved data security: framework for managing access minimized data inconsistency improved data access: adhoc queries quick and fast increased decision making increased end user productivity
entity
is a anything about which data can be collected (person, place, thing etc.)
attribute
is a characteristic of an entity i.e. LAST NAME is an attribute of customer
database management
is a field that focuses on the proper generation, storage, and retrieval of data
An inner join
is a join that only returns matched records from the tables that are being joined
data model
is a model of real world data structures: is an abstraction
Database
is a shared integrate computer structure that store a collection of end user data and meta data
relationship
is an association among entities
A natural join
links tables by selecting only the rows with common values in their common attribute(s)
equijoin
links tables on the basis of an equality condition that compares specified columns of each table: If any other comparison operator is used, the join is called a theta join
logical data form vs. physical data form
logical: human view vs. physical: the was it actually stored
physical level
lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes
referential integrity
means that if the foreign key contains a value, that value refers to an existing valid tuple (row) in another relation
RDBMS
relational database management system Arguably the most important advantage of the RDBMS is its ability to hide the complexities of the relational model from the user. The RDBMS manages all of the physical details, while the user sees the relational database as a collection of tables in which data are stored. The user can manipulate and query the data in a way that seems intuitive and logical.
data redundancy
same data stored in different places - decreases data security - data incosistency
reasons for the rise of relational databases
structural independence: pwoerfull query language
query result set
the answer the database sends back
Outer join
the matched pairs would be retained, and any unmatched values in the other table would be left null
schema
which is the conceptual organization of the entire database as viewed by the database administrator
PRODUCT
yields all possible pairs of rows from two tables—also known as the Cartesian product
DIFFERENCE
yields all rows in one table that are not found in the other table; that is, it subtracts one table from the other
PROJECT
yields all values for selected attributes. In other words, PROJECT yields a vertical subset of a table
INTERSECT
yields only the rows that appear in both tables. As was true in the case of UNION, the tables must be union-compatible to yield valid results
a data model should contain
• A description of the data structure that will store the end-user data. • A set of enforceable rules to guarantee the integrity of the data. • A data manipulation methodology to support the real-world data transformations.
A summary of data models
• Common characteristics: - Conceptual simplicity with semantic completeness - Represent the real world as closely as 35 possible - Real-world transformations must comply with consistency and integrity characteristics • Each new data model capitalized on the shortcomings of previous models • Some models better suited for some tasks
reasons to study file systems
• Reasons for studying file systems: - Complexity of database design is easier to understand - Understandingyp p file system problems helps to avoid problems with DBMS systems - Knowledge of file system is useful for converting file system to database system