Chapter 1: Database Systems
Database Consultant
Help companies leverage database technologies to improve business processes and achieve specific goal Skills: Database fundamentals, data modeling, database design, SQL, DBMS, hardware, vendor-specific technologies, etc.
To reveal meaning, information requires _______________.
context
Information in produced by processing _________________.
data
In-memory databases
primary memory (RAM), IBM's solidDB and Oracle's TimesTen
NoSQL (Not only SQL)
is a new generation of DBMS that is not based on the tradition relationship database model; are designed to handle the unprecedented volume of data, variety of data types and structures, and velocity of data operations that are characteristic of these new business requirements
Data management
is a process that focuses on data collection, storage, and retrieval
Database
is a shared, integrated computer structure that houses a collection of related data. It contains two types of data: end-user data (raw facts) and metadata
Metadata
is data about data, through which the end-user data is integrated and managed. It describes the data characteristics and the set of relationships that links the data found within the database (numeric, dates, or text). It presents a more complete picture of the data in the database
Database management (DBMS)
is the collection of programs that manages the database structure and controls access to the data stored. It resembles a very well-organized filing cabinet in which powerful software helps manage the cabinet's contents
Information
is the result of processing raw data to reveal its meaning. It consists of transformed data and facilitates decision making
Operating system software
manages all hardware components and makes it possible for all other software to run on the computers. Examples: Microsoft Windows, Linux, Mac OS, UNIX, and MVS.
database administrator
manages and maintain DBMS and databases; Skills: Database fundamentals, SQL, vendor course
DBMS software
manages the database within the database system. Examples: Microsoft's SQL Server, Oracle Corporation's Oracle, Oracle's MySQL, and IBM's DB2.
three types of software are needed to make the database system function fully
operating system software, DBMS software, and application programs and utilities
physical data format
The way a computer "sees" (stores) data.
Workgroup database
A multiuser database that usually supports fewer than 50 users or is used for a specific department in an organization.
logical data format
The way a person views data within the context of a problem domain.
query language
A nonprocedural language that is used by a DBMS to manipulate its data. An example of a query language is SQL.
Advantages of the DBMS
(1) Improved data sharing (2) Improved data security (3) Better data integration (4) Minimized data inconsistency (5) Improved data access (6) Improved decision making (7) Increased end-user productivity
Problems with file System Data Processing
(1) Lengthy development times (2) Difficulty of getting quick answers (3) Complex system administration (4) Lack of security and limited data sharing (5) Extensive programming
Uncontrolled data redundancy sets the stage for the following:
(1) Poor data security (2) Data inconsistency (3) Data-entry errors (4) Data integrity problems
The data anomalies are commonly defined as follows:
(1) Update anomalies (2) Insertion anomalies (3) Deletion anomalies
Complex database technologies include:
(1) Very large databases (VLDB) (2) Big Data databases (3) In-memory databases (4) Cloud databases
Procedures
(1) are the instructions and rules that govern the design and use of the database system (2) a critical, although occasionally forgotten, component of the system (3) play an important role in a company because they enforce the standards by which business is conducted within the organization and with customers (4) also help to ensure that companies have an organized way to monitor and audit the data that enter the database and the information generated from those data
Roles of the DBMS
(1) serves as the intermediary between the user and the database (2) receives all application requests and translates them into the complex operations required to fulfill those requests (3) hides much of the database's internal complexity from the application programs and users
DBMS Functions
- data dictionary management - data storage management - Data transformation and presentation - Security management - Multiuser access control - Backup and recovery management - Data integrity management - Database access languages and application programming interfaces - Database communication interfaces
Disadvantages of database systems
1. Increased Costs 2. Management Complexity 3. Maintaining Currency 4. Vendor Dependence 5. Frequent Upgrade/Replacement Cycles
data dictionary
A DBMS component that stores metadata—data about data. The data dictionary contains data definitions as well as data characteristics and relationships. May also include data that is external to the DBMS
data independence
A condition in which data access is unaffected by changes in the physical data storage characteristics.
data inconsistency
A condition in which different versions of the same data yield different (inconsistent) results.
data anomaly
A data abnormality in which inconsistent changes have been made to a database. For example, an employee moves, but the address change is not corrected in all files in the database.
structural dependence
A data characteristic in which a change in the database schema affects data access, thus requiring changes in all access programs.
structural independence
A data characteristic in which changes in the database schema do not affect data access.
data dependence
A data condition in which data representation and manipulation are dependent on the physical data storage characteristics.
Analytical database
A database focused primarily on storing historical data and business metrics used for tactical or strategic decision making.
Centralized database
A database located at a single site.
XML database
A database system that stores and manages semistructured XML data.
General-purpose database
A database that contains a wide variety of data used in multiple disciplines, such as census databases.
Discipline-specific database
A database that contains data focused on specific subject areas.
Cloud database
A database that is created and maintained using cloud services, such as Microsoft Azure or Amazon AWS.
Multiuser database
A database that supports multiple concurrent users.
Single-user database
A database that supports only one user at a time.
distributed database
A logically related database that is stored in two or more physically independent sites.
Structured Query Language (SQL)
A powerful and flexible relational database language composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information.
desktop database
A single-user database that runs on a personal computer.
database system
An organization of components that defines and regulates the collection, storage, management, and use of data in a database environment.
______________ databases focuses primarily on storing historical data and business metrics used for tactical or strategic decision making.
Analytical
Data Scientist
Analyze large amounts of varied data to generate insights, relationships, and predictable behaviors Skills: Data analysis, statistics, advanced mathematics, SQL, programming, data mining, machine learning, data visualization
Big Data databases
Cassandra (Facebook) and BigTable (Google)
Database Developer
Create and maintain database based applications, Skills: programming, database fundamentals, SQL
____________________ exists when different versions of the same data appear in the different places.
Data inconsistency
unstructured data
Data that exists in its original, raw state; that is, in the format in which it was collected.
semistructured
Data that has already been processed to some extent.
structured data
Data that has been formatted to facilitate storage, use, and information generation
Cloud databases
Database that reside on the Internet, usually accessed through data services.
data type
Defines the kind of values that can be used or stored. Also, used in programming languages and database systems to determine the operations that can be applied to such data.
Cloud Computing Data Architect
Design and implement the infrastructure for next-generation cloud database systems Skills: internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc
Database Architect
Design and implementation of database environments (conceptual, logical, and physical) Skills: DBMS fundamentals, data modeling, SQL, hardware knowledge, etc.
Database Designer
Design and maintain databases; Skills: system design, database design, SQL
Database Analyst
Develop databases for decision support reporting Skills: SQL, query optimization, data warehouses
data redundancy
Exists when the same data is stored unnecessarily at different places.
data integrity
In a relational database, refers to a condition in which the data in the database is in compliance with all entity and referential integrity constraints.
islands of information
In the old file system environment, pools of independent, often duplicated, and inconsistent data created and managed by different departments.
_____________ is the result of processing raw data to reveal its meaning
Information
__________________ is data about data through which the end-user data are integrated and managed.
Metadata
very large databases (VLDBs)
Oracle Exadata, IBM's Netezza, HP's Vertica, and Teradata; now being superseded by Big Data databases
System administrators
Oversee the database system's general operations.
enterprise database
The overall company data representation, which provides support for present and expected future needs.
Data processing (DP) specialist
The person responsible for developing and managing a computerized file processing system.
database design
The process that yields the description of the database structure and determines the database components. The second phase of the database life cycle.
People
This component includes all users of the database system. On the basis of primary job functions, five types of users can be identified in a database system: system administrators, database administrators, database designers, system analysts and programmers, and end users
End users
are the people who use the application programs to run the organization's daily operations. Examples: sales clerks, supervisors, managers, and directors are all classified by this type of user. Many employ the information obtained from the database to make tactical and strategic business decisions.
Application programs and utility software
are used to access and manipulate data in the DBMS and to manage the computer environment in which data access and manipulation take place
_______________________ is a special language used to represent and manipulate data elements in a textual
XML (Extensible Markup Language)
Field
a character or group of characters (alphabetic or numeric) that has a specific meaning; used to define and store data
File
a collection of related records. For example, a file might contain data about the students currently enrolled a Gigantic University
Data quality
a comprehensive approach to promoting the accuracy, validity, and timeliness of the data
Transactional database
a data base designed primarily to support a company's day-to-day operations. Also known as a OLTP, operational database, or production database
Operational database
a data base designed primarily to support a company's day-to-day operations. Also known as a transactional database, OLTP data base, or production database
Online transaction processing (OLTP) database
a data base designed primarily to support a company's day-to-day operations. Also known as a transactional database, operational database, or production database
Record
a logically connected set of one or more fields that describes a person, place, or thing. Examples: customer's name, address, phone number, date of birth, credit limit, and unpaid balance
Extendable Markup Language (XML)
a metalanguage used to represent and manipulate data elements in a textual format. Unlike other markup languages, it permits the manipulation of a document's data elements Examples: MS SQL server, IBM DB2, MySQL, Oracle RDBMS
Business intelligence
a set of tools and processes used to capture collect, integrate, store, and analyze data to support business decision making
Online analytical processing (OLAP)
a set of tools that provide advanced data analysis for retrieving, processing, and modeling data form the data warehouse
Data warehouse
a specialized database that stores historical and aggregated data in a format optimized for decision support
query
a specific request issued to the DBMS for data manipulation—for example, to read or update the data; A question or task asked by an end user of a database in the form of SQL code
Ad hoc query
a spur-of-the-moment question
An ______________ is a spur-of-the-moment question.
ad hoc query
Database administrators
also known as DBAs, manage the DBMS and ensure that the database is functioning properly
System analysts and programmers
design and implement the application programs. They design and create the data-entry screens, reports, and procedures through which end users access and manipulate the database's data
Database designers
design the database structure. They are, in effect, the database architects. If the database design is poor, even the best application programmers and the most dedicated DBAs cannot produce a useful database environment. Because organizations strive to optimize their data resources, they's job description has expanded to cover new dimensions and growing responsibilities.
From a general management point of view, the database system is composed of the five major parts
hardware, software, people, procedures, and data
Database Security Office
implement security policies for data administration Skills: DBMS fundamentals, database administration, SQL, data security technologies, etc.
Data is the foundation of...
information, which is the bedrock of knowledge
A __________ is a specific request issued to the DBMS for data manipulation.
query
Hardware
refers to all of the system's physical devices, including computers (PCs, tablets, workstations, servers, and supercomputers), storage devices, printers, network devices (hubs, switches, routers, fiber optics), and other devices (automated teller machines, ID readers, and so on)
Knowledge
the body of information and facts about a specific subject. It implies familiarity, awareness, and understanding of information as it applies to an environment
query results set
the collection of data rows returned by a query
Data
the raw facts, or facts that have not yet been processed to reveal their meaning to the end user; constitutes the building blocks of information; such as a telephone number, a birth date, a customer name, and a year-to-date (YTD) sales value. It has little meaning unless it has been organized in some logical manner
Social media
web and mobile technologies that enable "anywhere, anytime, always on" human interactions. Websites such as Google, Facebook, Twitter, and LinkedIn capture vast amounts of data bout end users and consumers