Chapter 9: Data Warehousing
Reconciled Data
Detailed, current data intended to be the single, authoritative source for all decision support applications
Data Governance
High-level organizational groups and processes that oversee data stewardship across the organization. It usually guides data quality initiatives, data architecture, data integration, master data management, data warehousing and business intelligence, and other data-related matters
Multidimensional OLAP (MOLAP)
OLAP tools that load data into an intermediate structure, usually a three-or-higher-dimensional array
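As a rough illustration of the "intermediate structure" idea, the sketch below loads a few fact rows into a dense three-dimensional array and then reads a slice of it. The fact rows, the dimension names, and the helper `slice_total` are all hypothetical; real MOLAP engines use far more compact storage than nested Python lists.

```python
# Hypothetical fact rows: (product, region, month, units_sold)
facts = [
    ("widget", "east", "jan", 10),
    ("widget", "west", "jan", 7),
    ("gadget", "east", "feb", 4),
    ("widget", "east", "feb", 3),
]

# One sorted member list per dimension; list position is the array index.
products = sorted({f[0] for f in facts})
regions = sorted({f[1] for f in facts})
months = sorted({f[2] for f in facts})

# Pre-allocate a dense 3-D array (product x region x month), zero-filled.
cube = [[[0 for _ in months] for _ in regions] for _ in products]

# Load step: each fact row is added into exactly one cell of the array.
for product, region, month, units in facts:
    p, r, m = products.index(product), regions.index(region), months.index(month)
    cube[p][r][m] += units

def slice_total(product):
    """Sum every cell in the slice of the cube for one product."""
    p = products.index(product)
    return sum(sum(row) for row in cube[p])

widget_total = slice_total("widget")
```

Once the array is loaded, a question such as "total units for one product" is answered by summing a slice of the array rather than by querying the source tables again.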
Relational OLAP (ROLAP)
OLAP tools that view the database as a traditional relational database, in either a star schema or other normalized or denormalized set of tables
Enterprise Data Warehouse (EDW)
a centralized, integrated data warehouse that is the control point and single source of all data made available to end users for decision support applications
Logical Data Mart
a data mart created by a relational view of a data warehouse
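A minimal sketch of a logical data mart, using Python's built-in `sqlite3` module with an in-memory database. The table `warehouse_sales` and the view `east_sales_mart` are hypothetical names chosen for illustration; the point is only that the mart stores no data of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table holding sales for every region.
    CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO warehouse_sales VALUES
        ('east', 'widget', 100.0),
        ('west', 'widget', 250.0),
        ('east', 'gadget', 75.0);

    -- The logical data mart: a view, not a separate physical copy.
    CREATE VIEW east_sales_mart AS
        SELECT product, amount FROM warehouse_sales WHERE region = 'east';
""")

# Users of the mart query the view as if it were a table.
mart_rows = conn.execute(
    "SELECT product, amount FROM east_sales_mart ORDER BY product"
).fetchall()
```

Because the view is evaluated against the warehouse table at query time, new warehouse rows appear in the mart without a separate load step.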
Dependent Data Mart
a data mart filled exclusively from an enterprise data warehouse and its reconciled data
Independent Data Mart
a data mart filled with data extracted from the operational environment, without the benefit of a data warehouse
Data Mart
a data warehouse that is limited in scope, whose data are obtained by selecting and summarizing data from a data warehouse or from separate extract, transform, and load processes from source data systems
Data Steward
a person assigned the responsibility of ensuring that organizational applications properly support the organization's enterprise goals for data quality
Star Schema
a simple database design in which dimensional data are separated from fact or event data. A dimensional model is another name for a star schema
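The separation of dimensional data from fact data can be sketched with Python's built-in `sqlite3` module. All table and column names here (`dim_product`, `dim_date`, `fact_sales`, and so on) are hypothetical, and the data is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes, one row per member.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table: numeric measures keyed by foreign keys to the dimensions.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        units INTEGER,
        PRIMARY KEY (product_key, date_key)
    );

    INSERT INTO dim_product VALUES (1, 'widget', 'hardware'), (2, 'gadget', 'hardware');
    INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO fact_sales VALUES (1, 20240101, 10), (1, 20240201, 3), (2, 20240201, 4);
""")

# A typical star query: join the fact table to a dimension and aggregate.
units_by_month = conn.execute("""
    SELECT d.month, SUM(f.units)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.month
    ORDER BY d.month
""").fetchall()
```

The fact table sits at the center and each dimension table joins to it directly, which is what gives the design its star shape.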
Data Warehouse
a subject-oriented, integrated, time-variant, nonupdateable collection of data used in support of management decision-making processes
Informational System
a system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications
Operational System
a system that is used to run a business in real time, based on current data. Also called a system of record
Real-time Data Warehouse
an enterprise data warehouse that accepts near-real-time feeds of transactional data from the systems of record, analyzes warehouse data, and in near-real time relays business rules to the data warehouse and systems of record so that immediate action can be taken in response to business events
Snowflake Schema
an expanded version of a star schema in which dimension tables are normalized into several related tables
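To make the normalization step concrete, the sketch below splits a product dimension into two related tables, so category names are stored once instead of being repeated on every product row. It uses Python's built-in `sqlite3` module; every table and column name is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The category attribute is moved out of the product dimension
    -- into its own table, which the product dimension references.
    CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        name TEXT,
        category_key INTEGER REFERENCES dim_category(category_key)
    );
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        units INTEGER
    );

    INSERT INTO dim_category VALUES (1, 'hardware'), (2, 'software');
    INSERT INTO dim_product VALUES (1, 'widget', 1), (2, 'gadget', 1), (3, 'app', 2);
    INSERT INTO fact_sales VALUES (1, 10), (2, 4), (3, 6), (1, 3);
""")

# Reaching the category now takes two joins instead of one.
units_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.units)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
```

The trade-off visible here is less redundancy in the dimension at the cost of an extra join in each query that needs the category.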
Big Data
an ill-defined term applied to databases whose volume, velocity, and variety strain the ability of commonly used relational DBMSs to capture, manage, and process the data within a tolerable elapsed time
Operational Data Store (ODS)
an integrated, subject-oriented, continuously updateable, current-valued (with recent history), enterprise-wide, detailed database designed to serve operational users as they do decision support processing
Transient Data
data in which changes to existing records are written over previous records, thus destroying the previous data content
Periodic Data
data that are never physically altered or deleted once they have been added to the store
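The contrast between transient and periodic data can be sketched in a few lines of plain Python. The store names, the `record` helper, and the customer data are hypothetical; the sketch only shows overwrite versus append-only behavior.

```python
# Transient store: one slot per key, so an update overwrites the old value.
transient = {}

# Periodic store: append-only list of (key, value, as_of) records.
periodic = []

def record(key, value, as_of):
    """Apply the same change to both stores to show the difference."""
    transient[key] = value                # previous content is destroyed
    periodic.append((key, value, as_of))  # earlier records are left intact

record("cust-1", "Boston", "2024-01-01")
record("cust-1", "Denver", "2024-06-01")

# Only the periodic store can still answer "where was cust-1 before?"
history = [value for key, value, _ in periodic if key == "cust-1"]
```

After both updates, the transient store holds only the latest city, while the periodic store retains both values along with the date each took effect.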
Derived Data
data that have been selected, formatted, and aggregated for end-user decision support applications
Data Mining
knowledge discovery, using a sophisticated blend of techniques from traditional statistics, artificial intelligence, and computer graphics
Conformed Dimension
one or more dimension tables associated with two or more fact tables for which the dimension tables have the same business meaning and primary key with each fact table
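One way to picture a conformed dimension is a single date dimension shared by two fact tables, which lets measures from both be lined up on the same key. The sketch uses Python's built-in `sqlite3` module; the tables `dim_date`, `fact_sales`, and `fact_returns` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One date dimension, with the same key and meaning for both fact tables.
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month INTEGER);
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        units_sold INTEGER
    );
    CREATE TABLE fact_returns (
        date_key INTEGER REFERENCES dim_date(date_key),
        units_returned INTEGER
    );

    INSERT INTO dim_date VALUES (1, 1), (2, 2);
    INSERT INTO fact_sales VALUES (1, 10), (2, 7);
    INSERT INTO fact_returns VALUES (1, 1), (2, 2);
""")

# Because both fact tables share dim_date, their measures can be
# reported side by side for each month.
sold_vs_returned = conn.execute("""
    SELECT d.month,
           (SELECT SUM(units_sold) FROM fact_sales s WHERE s.date_key = d.date_key),
           (SELECT SUM(units_returned) FROM fact_returns r WHERE r.date_key = d.date_key)
    FROM dim_date d
    ORDER BY d.month
""").fetchall()
```

If each fact table had its own incompatible date dimension, this side-by-side comparison would first require reconciling the two sets of keys.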
NoSQL
short for "Not only SQL", NoSQL is a class of database technology used to store and access textual and other unstructured data using more flexible structures than the row-and-column format of relational databases
Grain
the level of detail in a fact table, determined by the intersection of all the components of the primary key, including all foreign keys and any other primary key elements
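A small sketch of grain, in plain Python with made-up data: the dictionary key plays the role of the fact table's primary key, so its components fix the level of detail. Rolling up to a coarser grain is a simple aggregation, whereas detail below the stored grain cannot be recovered.

```python
from collections import defaultdict

# Hypothetical fact rows at the grain "one row per product per day".
# The (product, day) tuple stands in for the composite primary key.
daily_facts = {
    ("widget", "2024-01-01"): 10,
    ("widget", "2024-01-02"): 5,
    ("gadget", "2024-01-01"): 4,
}

# Coarser grain: one row per product per month (date truncated to YYYY-MM).
monthly_facts = defaultdict(int)
for (product, day), units in daily_facts.items():
    monthly_facts[(product, day[:7])] += units
```

Here the three daily rows collapse into two monthly rows; going the other way, from monthly totals back to daily values, is not possible without the original detail.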
Data Visualization
the representation of data in graphical and multimedia formats for human analysis
Online Analytical Processing (OLAP)
the use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques