Chapter 13 CM
explanatory analytics
Data analysis that provides ways to discover relationships, trends, and patterns among data.
Predictive Analytics
Data analytics that use advanced statistical and modeling techniques to predict future business outcomes with great accuracy.
very large databases (VLDBs)
A database that contains huge amounts of data; gigabyte, terabyte, and petabyte ranges are not unusual.
Online Analytical Processing (OLAP)
Decision support system (DSS) tools that use multidimensional data analysis techniques. OLAP creates an advanced data analysis environment that supports decision making, business modeling, and operations research.
roll up
(1) To aggregate data into summarized components, that is, higher levels of aggregation. (2) In SQL, an OLAP extension used with the GROUP BY clause to aggregate data by different dimensions. Rolling up the data is the exact opposite of drilling down the data.
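Example (a minimal sketch, not from the chapter): the first sense of roll up can be shown in plain Python by summarizing city-level rows into region subtotals and a grand total. The `sales` rows and the `roll_up` helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, city, units_sold)
sales = [
    ("East", "Boston", 10),
    ("East", "New York", 25),
    ("West", "Seattle", 7),
    ("West", "Portland", 5),
]

def roll_up(rows):
    """Aggregate city-level rows to region subtotals and a grand total."""
    by_region = defaultdict(int)
    for region, _city, units in rows:
        by_region[region] += units
    grand_total = sum(by_region.values())
    return dict(by_region), grand_total

region_totals, grand_total = roll_up(sales)
print(region_totals)  # {'East': 35, 'West': 12}
print(grand_total)    # 47
```

Going from `grand_total` back to `region_totals`, and from there to the individual city rows, is the reverse operation (drilling down).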
Business Intelligence (BI)
A comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store, and analyze data with the purpose of generating and presenting information to support business decision making.
Star Schema
A data modeling technique used to map multi-dimensional decision support data into a relational database. The star schema represents data using a central table known as a fact table in a 1:M relationship with one or more dimension tables.
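Example (a sketch under assumptions): a tiny star schema built in an in-memory SQLite database through Python's `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_location`) and the sample rows are illustrative, not taken from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (the points of the star); names are illustrative.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, region TEXT);
    -- Fact table: one row per sale, with a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        units INTEGER,
        revenue REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_location VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES (1, 1, 10, 100.0), (1, 2, 5, 50.0), (2, 1, 3, 90.0);
""")

# Classify the revenue fact by the location dimension.
revenue_by_region = conn.execute("""
    SELECT l.region, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    GROUP BY l.region
    ORDER BY l.region
""").fetchall()
print(revenue_by_region)  # [('East', 190.0), ('West', 50.0)]
```

Each dimension row can be referenced by many fact rows, which is the one-to-many relationship between a dimension table and the fact table.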
multidimensional database management system
A database management system that uses proprietary techniques to store data in matrix-like arrays of n dimensions known as cubes.
Materialized view
A dynamic table that not only contains the SQL query command to generate rows but stores the actual rows. The materialized view is created the first time the query is run and the summary rows are stored in the table. The materialized view rows are automatically updated when the base tables are updated.
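Example (an emulation, not a real materialized view): SQLite has no native materialized views, so this sketch imitates the idea with a stored summary table plus a trigger that rebuilds it when the base table changes. The names `sales`, `sales_summary`, and `refresh_summary` are assumptions for illustration; a DBMS that supports materialized views has its own syntax and refresh rules.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('East', 100.0), ('West', 50.0);

    -- Stored summary rows standing in for the materialized view.
    CREATE TABLE sales_summary AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Rebuild the summary whenever a base-table row is inserted.
    CREATE TRIGGER refresh_summary AFTER INSERT ON sales
    BEGIN
        DELETE FROM sales_summary;
        INSERT INTO sales_summary
            SELECT region, SUM(amount) FROM sales GROUP BY region;
    END;
""")

db.execute("INSERT INTO sales VALUES ('East', 25.0)")
summary = db.execute(
    "SELECT region, total FROM sales_summary ORDER BY region"
).fetchall()
print(summary)  # [('East', 125.0), ('West', 50.0)]
```

Queries against `sales_summary` read the stored rows directly instead of re-running the aggregation over `sales`.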
data mart
A small, single-subject data warehouse subset that provides decision support to a small group of people.
data warehouse
A specialized database that stores historical and aggregated data in a format optimized for decision support. An integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making.
data analytics
A subset of business intelligence functions that encompasses a wide range of mathematical, statistical, and modeling techniques with the purpose of extracting knowledge from data.
attribute hierarchy
A top-down data organization that is used for two main purposes: aggregation and drill-down/roll-up data analysis.
snowflake schema
A type of star schema in which dimension tables can have their own dimension tables. The snowflake schema is usually the result of normalizing dimension tables.
data visualization
Abstracting data to provide information in a visual format that enhances the user's ability to effectively comprehend the meaning of the data.
Decision Support System (DSS)
An arrangement of computerized tools used to assist managerial decision making within a business.
relational online analytical processing (ROLAP)
Analytical processing functions that use relational databases and familiar relational query tools to store and analyze multidimensional data.
Metrics
In a data warehouse, numeric facts that measure a business characteristic of interest to the end user.
Dimension tables
In a data warehouse, tables used to search, filter, or classify facts within a star schema.
facts
In a data warehouse, the measurements (values) that represent a specific business aspect or activity. For example, sales figures are numeric measurements that represent product or service sales. Facts commonly used in business data analysis include units, costs, prices, and revenues.
fact table
In a data warehouse, the star schema table that contains facts linked and classified through their common dimensions. A fact table is in a one-to-many relationship with each associated dimension table.
data cube
The multidimensional data structure used to store and manipulate data in a multidimensional DBMS. The location of each data value in the data cube is based on its x-, y-, and z-axes. Data cubes are static, meaning they must be created before they are used, so they cannot be created by an ad hoc query.
Partitioning
The process of splitting a table into subsets of rows or columns.
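Example (a minimal sketch): splitting by rows is often called horizontal partitioning and splitting by columns vertical partitioning. The `customers` table and both helper functions below are hypothetical.

```python
# Hypothetical table as a list of row dictionaries.
customers = [
    {"id": 1, "name": "Ana", "region": "East", "balance": 120.0},
    {"id": 2, "name": "Ben", "region": "West", "balance": 0.0},
    {"id": 3, "name": "Cy", "region": "East", "balance": 35.5},
]

def partition_rows(rows, key):
    """Horizontal partitioning: split rows into subsets by a column's value."""
    parts = {}
    for row in rows:
        parts.setdefault(row[key], []).append(row)
    return parts

def partition_columns(rows, pk, column_groups):
    """Vertical partitioning: split columns into subsets, repeating the key."""
    return [
        [{c: row[c] for c in (pk, *cols)} for row in rows]
        for cols in column_groups
    ]

by_region = partition_rows(customers, "region")
profile, finance = partition_columns(
    customers, "id", [("name", "region"), ("balance",)]
)
print(sorted(by_region))  # ['East', 'West']
print(finance[0])         # {'id': 1, 'balance': 120.0}
```

The key column is repeated in every vertical partition so the original rows can be reassembled with a join.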
Extraction, transformation, and loading (ETL)
In a data warehousing environment, the integrated processes of getting data from original sources into the data warehouse. ETL includes retrieving data from original data sources (extraction), manipulating the data into an appropriate form (transformation), and storing the data in the data warehouse (loading).
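Example (a sketch under assumptions): the three ETL steps written as three small Python functions. The inline CSV text stands in for an original data source and an in-memory SQLite table stands in for the data warehouse; all names and the cleaning rules are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; a real source might be a file or an operational database.
SOURCE_CSV = "order_id,amount,country\n1, 19.99 ,us\n2,5.00,US\n3,,ca\n"

def extract(text):
    """Extraction: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: drop rows with no amount, fix types, standardize codes."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        clean.append(
            (int(row["order_id"]), float(amount), row["country"].strip().upper())
        )
    return clean

def load(conn, rows):
    """Loading: store the cleaned rows in the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute("SELECT * FROM orders ORDER BY order_id").fetchall()
print(loaded)  # [(1, 19.99, 'US'), (2, 5.0, 'US')]
```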
dimensions
In a star schema design, qualifying characteristics that provide additional perspectives to a given fact.
master data management (MDM)
In business intelligence, a collection of concepts, techniques, and processes for the proper identification, definition, and management of data elements within an organization.
dashboard
In business intelligence, a web-based system that presents key business performance indicators or information in a single, integrated view with clear and concise graphics.
Key performance indicators (KPIs)
In business intelligence, quantifiable numeric or scale-based measurements that assess a company's effectiveness or success in reaching strategic and operational goals. Examples of KPIs are product turnovers, sales by promotion, sales by employee, and earnings per share.
Governance
In business intelligence, the methods for controlling and monitoring business health and promoting consistent decision making.
cube cache
In multidimensional OLAP, the shared, reserved memory area where data cubes are held. Using the cube cache assists in speeding up data access.
sparsity
In multidimensional data analysis, a measurement of the data density held in the data cube.
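Example (a minimal sketch): one common way to put a number on this is the share of cube cells that actually hold a value (density) and its complement (the share that are empty). The cube contents are hypothetical, and the formula `sparsity = 1 - density` is an assumption about how the measurement is expressed, not something the definition above specifies.

```python
# Hypothetical 3-D cube stored as a mapping from (product, region, quarter) to units sold.
products = ["Widget", "Gadget"]
regions = ["East", "West"]
quarters = ["Q1", "Q2", "Q3", "Q4"]

cube = {
    ("Widget", "East", "Q1"): 10,
    ("Widget", "West", "Q2"): 5,
    ("Gadget", "East", "Q1"): 3,
    ("Gadget", "East", "Q4"): 8,
}

total_cells = len(products) * len(regions) * len(quarters)  # 2 * 2 * 4 = 16
density = len(cube) / total_cells   # share of cells that hold a value
sparsity = 1 - density              # share of cells that are empty
print(density, sparsity)  # 0.25 0.75
```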
Portal
In terms of business intelligence, a unified, single point of entry for information distribution.
Periodicity
Information about the time span of data stored in a table, usually expressed as current year only, previous years, or all years.
data mining
A process that employs automated tools to analyze data in a data warehouse and other sources and to proactively identify possible relationships and anomalies.
Multidimensional online analytical processing (MOLAP)
An extension of online analytical processing to multidimensional database management systems.
slice and dice
The ability to focus on slices of a data cube (drill down or roll up) to perform a more detailed analysis.
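Example (a sketch, with a made-up cube): fixing one dimension value selects a slice of the cube, and fixing values on more than one dimension narrows it to a smaller sub-cube. The `AXES` tuple and `slice_cube` helper are hypothetical names.

```python
# Hypothetical cube: (product, region, quarter) -> units sold.
cube = {
    ("Widget", "East", "Q1"): 10,
    ("Widget", "West", "Q1"): 5,
    ("Gadget", "East", "Q1"): 3,
    ("Gadget", "East", "Q2"): 8,
}
AXES = ("product", "region", "quarter")

def slice_cube(cube, **fixed):
    """Keep only the cells whose coordinates match every fixed dimension value."""
    positions = {AXES.index(name): value for name, value in fixed.items()}
    return {
        coords: value
        for coords, value in cube.items()
        if all(coords[i] == v for i, v in positions.items())
    }

q1 = slice_cube(cube, quarter="Q1")                      # one slice
east_q1 = slice_cube(cube, region="East", quarter="Q1")  # a smaller sub-cube
print(sum(q1.values()))  # 18
print(sorted(east_q1))   # [('Gadget', 'East', 'Q1'), ('Widget', 'East', 'Q1')]
```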
Replication
The process of creating and managing duplicate versions of a database. Replication is used to place copies in different locations and to improve access time and fault tolerance.
drill down
To decompose data into more atomic components, that is, data at lower levels of aggregation. This approach is used primarily in a decision support system to focus on specific geographic areas, business types, and so on.