Database Systems - Chapter 1
Data Quality
A comprehensive approach to ensuring the accuracy, validity and timeliness of data.
Data Independence
A condition in which data access is unaffected by changes in the physical data storage characteristics
Data Inconsistency
A condition in which different versions of the same data yield different (inconsistent) results. For example, departments stores salesman name as Bill Brown and another department stores the same salesman's name a William Brown.
Key Points About Data & Information
-Data constitutes the building blocks of information -Information is produced by processing data -Information is used to reveal the meaning of data -Accurate, relevant and timely information is the key to good decision making -Good decision making is the key to organizational survival in a global environment
Data Anomaly
A data abnormality in which inconsistent changes have been made to a database. For example, an employee moves, but the address change is not corrected in all files in the database.
Transactional Database
see operational database
From a general management point of view, the database system is composed of five major parts:
1. Hardware- all of the systems physical devices 2. Software- Operating System Software, DBMS, and Application Programs and Utility Software 3. People- System Administrator, DBA, Database Designers, System Analysts and Programmers, and End Users 4. Procedures- The instructions and rules that govern the design and use of the database system. 5. Data- The collection of facts stored in the database
What Advantages Does DBMS Provide?
1. Improved Data Sharing- Better access to more and better managed data = end users respond quickly to changes. 2. Improved Data Security- DBMS provides a framework for better enforcement of data privacy and security policies. 3. Better Data Integration- Much easier to see how actions in one segment of the company affect another segment. 4. Minimized Data Inconsistency- Properly designed database = reduced probability of data inconsistency. 5. Improved Data Access- DBMS makes it possible to produce quick answers to ad hoc queries. 6. Improved decision making- 7. Increased end-user productivity-
Database System Disadvantages:
1. Increased Costs 2. Management Complexity 3. Maintaining Currency 4. Vendor Dependence 5. Frequent Upgrade/Replacement Cycles
Problems Associated with File Systems
1. Lengthy Development Time- Even the simplest data retrieval task requires extensive programming. 2. Difficulty of Getting Quick Answers- The need to write programs to produce the simplest reports makes ad hoc queries impossible. 3. Complex System Administration- Even a simple file system requires creating and maintaining several file management program. 4. Lack of Security and Limited Data Sharing- Sharing data among multiple geographically dispersed users introduces a lot of security risks. 5. Extensive Programming- Any change to a file structure, no matter how minor, forces modifications in all of the programs that use the data in that file.
Uncontrolled Data Redundancy sets the stage for the following:
1. Poor Data Security 2. Data Inconsistency 3. Data Entry Errors 4. Data Integrity Problems
A Critique of the File System Method Serves TWO Major Purposes:
1. Understanding the shortcomings of the database enables you to understand the development of modern databases. 2. Many of the problems are not unique to file systems. Failure to understand such problems is likely to lead to their duplication in a database environment, even though database technology makes it easy to avoid them.
Data Dictionary
A DBMS component that stores metadata- data about the data. The data dictionary contains data definitions as well as data characteristics and relationships. May also include data that is external to the DBMS.
DBMS Functions
A DBMS performs several important functions that guarantee the integrity and consistency of the data in the database 1. Data Dictionary Management- Any changes made in a database structure are automatically recorded in the data dictionary, freeing you from having to modify all the programs that access the changed structure 2. Data Storage Management- DBMS creates and manages complex structures required for data storage, relieving you of defining and programming physical data characteristics. 3. Data Transformation and Presentation- DBMS transforms entered data to conform to required data structures. relieves you of distinguishing between logical and physical data formats. 4. Security Management- DBMS enforces user security and data privacy 5. Multiuser Access Control- DBMS uses sophisticated algorithms to ensure multiple users can access database concurrently without compromising Integrity 6. Backup and Recovery Management- DBMS provides backup and data recovery to ensure data safety and integrity 7. Data Integrity Management- DBMS promotes and enforces integrity rules, minimizing data redundancy and maximizing data consistency 8. Database Access Languages and Application Programming Interfaces- Provides data access through a query language: SQL 9. Database Communication Interfaces- DBMS accepts end-user requests via multiple, different network environments
Structural Independence
A data characteristic in which a change in the database schema does not affect data access
Structural Dependence
A data characteristic in which change in the database schema affects data access, thus requiring changes in all access programs. A file system exhibits structural dependence.
Operational Database
A database designed primarily to support a company's day to day operations. AKA a Transactional Database, OLTP Database, or Production Database.
Analytical Database
A database focused primarily on storing historical data and business metrics used for tactical or strategic decision making
Centralized Database
A database located at a single site
XML Database
A database system that stores and manages semistructured XML data.
General-Purpose Database
A database that contains a wide variety of data used in multiple disciplines.
Discipline Specific Database
A database that contains data focused on specific subject areas.
Cloud Database
A database that is created and maintained using cloud services, such as Microsoft Azure and Amazon AWS
Multiuser Database
A database that supports multiple concurrent users.
Single User Database
A database that supports only one user at a time
Distributed Database
A logically related database that is stored in two or more physically independent sites
XML: Extensible Markup Language
A metalanguage used to represent and manipulate data elements. Unlike other markup languages XML permits the manipulation of a document's data elements.
Workgroup Database
A multiuser database that usually supports fewer than 50 users or is used for a specific department in an organization.
NoSQL (Not Only SQL)
A new generation of database management system that is not based on the traditional relational database model. NoSQL databases are designed to handle an unprecedented volume of data, variety of data types and structures, and velocity of data operations that are characteristic of new business requirements (Facebook, Twitter, Instagram...).
SQL: Structured Query Language
A powerful and flexible relational database language composed of commands that enable users to create database and table structures , perform various types of data manipulation and data administration and query the database to extract useful information.
Data Management
A process that focuses on data collection, storage and retrieval. Common data management functions include addition, deletion, modification and listing.
Query
A question or task asked by an end user of a database in the form of SQL code. A specific request for data manipulation issued by the end user or the application to the DBMS
Business Intelligence
A set of tools and processes used to capture, collect, integrate, store and analyze data to support business decision making.
OLAP: Online Analytical Processing
A set of tools that provide advanced data analysis for retrieving, processing, and modeling data from the data warehouse.
Database
A shared integrated computer structure that houses a collection of related data. A database contains two types of related data: -End-user data, that is raw facts of interest to the end user -Metadata, that is data about the data.
Desktop Database
A single user database that runs on a personal computer
Data Warehouse
A specialized database that stores historical and aggregated data in a format optimized for decision support
Why Is Database Design Important?
A well designed database facilitates data management and generates accurate and valuable information. A poorly designed database is likely to become a breeding ground for difficult-to-trace errors that may lead to poor decision making- and poor decision making can lead to failure of an organization.
Performance Tuning
Activities that make a database perform more efficiently in terms of storage and access speed.
Database System
An organization of components that defines and regulates the collection, storage, management and use of data in a database environment.
Relational Database Model
Background: E. F. Codd (IBM) Current industry standard Basic Structure : - Table - Tuples - Database - Key: unique identifier - integrated information is gathered through the linking of common fields in different table. Database Models: method of database storage so the database can be maintained and queried correctly and efficiently.
Metadata
Data about data; that is, data about data characteristics and relationships. For example, the type of values stored on each data element and whether the data element can be left empty.
Chapter 1 Summary: One
Data consists of raw facts. Information is the result of processing data to reveal its meaning. Accurately, timely and relevant information is the key to good decision making, and good decision making is the key to organizational survival in a global market.
Chapter 1 Summary: Two
Data is usually stored in a database. To implement a database and to manage its contents, you need a database management system (DBMS). The DBMS serves and the intermediary between the user and the database. The database holds the data that you have collected and data about data known and "metadata."
Unstructured Data
Data that exists in its original, raw state; that is, in the format in which it was collected.
Semistructured Data
Data that has already been processed to some extent
Structured Data
Data that has been formatted to facilitate storage, use, and information generation
Data, Record, Field or Attribute, Table
Data- Every piece on info on the table. Record = Rows in table Field or Attribute = Columns in table Table = "the spreadsheet"
Chapter 1 Summary: Three
Database design defines the database structure. A well designed database facilitates data management and generates accurate and valuable information. A poorly designed database can lead to poor decision making and poor decision making can lead to the failure of an organization
Chapter 1 Summary: Seven
Database management systems were developed to address the file system's inherent weaknesses. Rather than depositing data in independent files, a DBMS presents the database to the end user as a single data repository. This arrangement promotes data sharing, thus eliminating the potential problem of islands of information. In addition, the DBMS enforces data integrity, eliminates redundancy and promotes data security.
Chapter 1 Summary: Four
Databases can be classified according to the number of users supported, where the data is located, the type of data stored, the intended data usage and the degree to which the data is structured.
Chapter 1 Summary: Five
Databases evolved from manual and then computerized file systems. In a file system, data is stored in independent files, each requiring its own data management programs. Although this method of data management is largely outmoded, understanding its characteristics makes database design easier to comprehend.
Data Redundancy
Exists when the same data is stored unnecessarily at different places
Data Processing (DP) Specialist
In a file system, the person responsible for developing and managing a computerized file processing system.
Data Integrity
In a relational database, a condition in which the data in the database complies with all entity and referential integrity constraints. Data Integrity means that: -Data is accurate: No data inconsistencies -Data is verifiable: Data will always yield consistent results
Insertion Anomalies, Modification Anomalies, Deletion Anomalies
Insertion Anomalies: when you insert data and there was a mistake Modification Anomalies: mistake is made our brought in when you update the database Deletion Anomalies: errors brought in when you perform Deletion
Data
Raw facts, or facts that have not yet been processed to reveal their meaning to the end user. Keep in mind that raw data must be properly formatted for storage, processing, and presentation.
Chapter 1 Summary: Six
Some limitations of a file system data management are that it requires extensive programming, system administration can be complex and difficult, making changes to existing structures is difficult, and security features are likely to be inadequate. Also, independent files tend to contain redundant data, leading to problems of structural and data dependence.
Knowledge
The body of information and facts about a specific subject. Knowledge implies familiarity, awareness and understanding of information as it applies to an environment. A key characteristic is the new knowledge can be derived from old knowledge. Data is the foundation of Information and Information is the bedrock of knowledge!
Query Result Set
The collection of data rows returned by a query
DBMS: Database Management System
The collection of programs that manages the database structure and controls access to the data stored in the database. The DBMS serves as the intermediary between the user and the database. The database structure itself is stored as a collection of files and the only way to access the data in those files is through the DBMS
Island of Information
The file system promotes the storage of the same basic data in different locations. Pools of independent, often duplicated and inconsistent data created and managed by different departments
Insert, Update, Delete
The only three ways for data to change: INSERT: add new record(s) to the table UPDATE: make changes to existing record(s) DELETE: remove record(s). This is deleting the entire record
Enterprise Database
The overall company data representation which provides support for present and expected future needs
Database Design
The process that yields the description of the database structure and determines the database components. The second phase of the Database Life Cycle.
Information
The result of processing raw data to reveal its meaning. Information consists of transformed data and facilitates decision making. To reveal meaning, information requires context.
Physical Data Format
The way a computer "sees" (stores) data
Logical Data Format
The way a person views data within the context of a problem domain
Social Media
Web and mobile technologies that enable "anytime, anywhere, always on" human interactions.
Ad Hoc Query
a "spur-of-the-moment" question
OLTP: Online Transaction Processing Database
see operational database
Production Database
see operational database