Chapter 3: Databases and Data Warehouses
ETL (extraction, transformation, and loading)
(1) Extracting needed data from its sources (2) transforming the data into a standardized format (3) loading the transformed data into a data warehouse
Application Generation Subsystem
Contains facilities to help develop transaction-intensive applications
CRUD
Create, Read, Update, Delete; is a part of security management, which allows you to control what type of access that different people have in the database
Query-and-Reporting Tools
are used to generate simple queries and reports
Physical View
deals with how information is physically arranged, stored, and accessed on some type of storage device such as a hard disk
Artificial Intelligence
includes tools such as neural networks and fuzzy logic to form the basis of "information discovery" and build business intelligence in OLAP; represents the growing convergence of various IT tools for working with information
Report Generators
helps quickly define formats of reports and what information you want to see in a report
Data Mining Tools
the software tools you use to query information in a data warehouse; support the concept of OLAP
Foreign Key
a primary key of one file that appears in another file; creates the logical ties between the tables/files
Backup
A copy of the information stored on a computer
Characteristics of Data Warehouses- Notes
Data Warehouses: (1) are multdimensional; they contain layers of columns and rows (2) they support decision making (OLAP), NOT transaction processing (OLTP)
Examples of Data Mining Tools
Examples of Data Mining Tools: (1) Query-and-Reporting Tools (2) Artificial Intelligence Multidimensional Analysis Tools (3) Digital Dashboards (4) Statistical Tools Data-mining tools are to data warehouse users to what data manipulation subsystem tools are to database users
(1) DBMS Engine (2) Data Definition Subsystem (3) Data Manipulation Subsystem (4) Application Generation Subsystem (5) Data Administration Subsystem
What are the 5 important software components of a DBMS?
Online Transaction Processing and Online Analytical Processing
What are the two types of information processing?
Database
a collection of information that you organize and access according to the logical structure of that information
Primary Key
a field (or group of fields in some cases) that uniquely describes each record; this field cannnot be blank
Data Warehouse
a logical collection of information- collected from multiple operational databases- used to create business intelligence that supports business analysis activities and decision making tasks
Structured Query Language (SQL)
a standardized fourth-generation query language found in most DBMSs; the most basic form of an SQL statement is SELECT (fields of info); FROM (what logical relationships (tables) to use); WHERE (specify the selection of criteria)
Data Mart
a subset of a data warehouse in which only a focused portion of the data warehouse information is kept; example: a data mart for the merchandising department that contains only information on merchandising
DBMS Engine
accepts logical requests from the various other DBMS subsystems, converts them into their physical equivalent, and accesses the database and data dictionary as they exist on a storage device
View
allows you to see the contents of a database file, make whatever changes you want, perform simple sorting, and query to find the location of specific information; provides each file in the form of a spreadsheet workbook
Multidimensional Analysis (MDA) Tools
are slice-and-dice techniques that allow you to view multidimensional information from different perspectives; using this tool, you can easily "slice" the data warehouse cube and bring the information that you are looking for to the front of the cube for viewing (values of information are not affected)
Business Intelligence (BI)
collective information that gives you the ability to make effective, important, and often strategic business decisions; is a resource/component of the overall framework or field of analytics
Data Dictionary
contains the logical structure for the information in a database; contains important information (or logical properties) about your information
Operational Databases
contains valuable information that forms the basis for business intelligence; databases that support OLTP
Relation
describes each two-dimensional table or file in the relational model (connects the tables to each other)
Digital Dashboard
displays key information gathered from several sources on a computer screen in a format tailored to the needs and wants of an individual knowledge worker
Logical View
focuses on how you as a knowledge worker need to arrange and access information to meet your particular business needs
Statistical Tools
help you apply various mathematical models to the information stored in a data warehouse to discover new information
Data Manipulation Subsystem
helps add, change, and delete information in a databse and query it for valuable information
Data Definition Subsystem
helps create and maintain the data dictionary and define the structure of the files in a database
Query-by-Example (QBE)
helps graphically design the answer to a question; makes it so that you can just enter in one number/constraint for a specific question, and the query will give the correct answer
Data Administration Subsystem
helps manage the overall database environmentby providing facilities for backup and recovery, security management, query optimization, concurrency control, and change management; is used by an administrator- someone responsible for assuring that the database environment meets the entire information needs of an organization
Database Management System (DBMS)
helps specify the logical organitzation for a database and access and use the information within a database
Database Administration
is the function in an organization that is responsible for the more technical and operational aspects of managing the information contained in organizational information repositories
Integrity Constraint
rules that help ensure the quality of the information (ex. you can't create an order for a customer that doesn't exist)
Data Administration
the function in an organization that plans for, oversees the development of, and monitors the information resource
Online Transaction Processing (OLTP)
the gathering of input information, processing that information, and updating existing informatoin to reflect the gathered and processed information; are supported by databases and DBMSs; can be used to query basic information
Online Analytical Processing (OLAP)
the manipulation of information to support decision making; is supported by a data warehouse that contains data mining tools; uses information from multiple databases
Key Performance Indicators (KPIs)
the most essential quantifiable measures used in analytics initiatives to monitor success of a business activity; example: a sales manager would have these related to the amount of sales for that month
Recovery
the process of reinstalling the backup information in the event the information was lost
Analytics
the science of fact-based decision making; is a growing field of study, research, and career opportunities that focuses on the integrated use of technology tools and statistical techniques to create real time, high quality, fact-based business intelligence in support of decision making
Relational Database
this uses a series of logically related two-dimensional table or file in the relational model; is composed of two distinct parts: (1) the information itself and (2) the logical structure of that information; to find information you only need to know the field name of the column of information and its logical row, not physical row