Bus 491 Midterm
Multidimensional OLAP (MOLAP)
OLAP tools that load data into an intermediate structure, usually a three- or higher-dimensional array
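A minimal sketch of the idea, assuming a made-up set of sales facts and three hypothetical dimensions (product, region, quarter). It loads the rows into a dense 3-D array and then reads a slice from it; real MOLAP engines use far more compact storage.

```python
# Hypothetical sales facts: (product, region, quarter, units).
facts = [
    ("widget", "east", "Q1", 10),
    ("widget", "west", "Q1", 4),
    ("gadget", "east", "Q2", 7),
    ("widget", "east", "Q2", 3),
]

# Each dimension maps member names to array positions.
products = ["widget", "gadget"]
regions = ["east", "west"]
quarters = ["Q1", "Q2"]

# The intermediate structure: a dense 3-D array, cube[product][region][quarter].
cube = [[[0 for _ in quarters] for _ in regions] for _ in products]
for product, region, quarter, units in facts:
    p = products.index(product)
    r = regions.index(region)
    q = quarters.index(quarter)
    cube[p][r][q] += units


def slice_total(product):
    """Sum every cell for one product (one 'slice' of the cube)."""
    p = products.index(product)
    return sum(sum(row) for row in cube[p])
```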
Relational OLAP (ROLAP)
OLAP tools that view the database as a traditional relational database in either a star schema or other normalized or denormalized set of tables
NoSQL
Short for "Not only SQL", NoSQL is a class of database technology used to store and access textual and other unstructured data, using more flexible structures than the rows and columns format of relational databases
derived data
data that have been selected, formatted, and aggregated for end-user decision support applications
Operational Metadata
describe the data in the various operational systems (as well as the external data) that feed the enterprise data warehouse. Operational metadata typically exist in a number of different formats and unfortunately are often of poor quality
Data mart metadata
describe the derived data layer and the rules for transforming reconciled data to derived data.
reconciled data
detailed current data intended to be the single authoritative source for all decision support applications
The Need for Data Warehousing
1. A business requires an integrated, company-wide view of high-quality information. 2. The information systems department must separate informational from operational systems to improve performance dramatically in managing company data
Data Mart
A data warehouse that is limited in scope, whose data are obtained by selecting and summarizing data from a data warehouse or from separate extract, transform, and load processes from source data systems
Data Warehouse
A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
Data Mining (DM)
Advanced methods for exploring and modeling relationships in large amounts of data.
Real-time data warehouse
An enterprise data warehouse that accepts near-real-time feeds of transactional data from the systems of record, analyzes warehouse data, and in near-real-time relays business rules to the data warehouse and systems of record so that immediate action can be taken in response to business events
Data Propagation
Duplicates data across databases, usually with near-real-time delay. Typically event-driven: updates are pushed to the duplicate sites as they occur
Response Modeling
Improve response rates by identifying prospects who are more likely to respond to a direct solicitation.
Data Mining
Knowledge discovery using a blend of statistical, AI, and computer graphics techniques
Data mining
Knowledge discovery, using a sophisticated blend of techniques from traditional statistics, artificial intelligence, and computer graphics
Data Federation
Provides a virtual view of integrated data without actually bringing the data all into one physical centralized database
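A rough sketch of federation, assuming two hypothetical source systems (`crm` and `billing`) that each keep their own database. The integrated result is assembled at query time and never stored in a central database; the table and function names are illustrative only.

```python
import sqlite3

# Two hypothetical source systems, each with its own physical database.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoices (customer_id INTEGER, amount REAL)")
billing.executemany(
    "INSERT INTO invoices VALUES (?, ?)", [(1, 40.0), (1, 60.0), (2, 25.0)]
)


def customer_totals():
    """Build the integrated view on demand; nothing is stored centrally."""
    names = dict(crm.execute("SELECT customer_id, name FROM customers"))
    totals = billing.execute(
        "SELECT customer_id, SUM(amount) FROM invoices GROUP BY customer_id"
    )
    return {names[cid]: total for cid, total in totals}
```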
Business Analytics/Intelligence
The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions.
Grain
The level of detail in a fact table, determined by the intersection of all the components of the primary key, including all foreign keys and any other primary key elements
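As an illustration, assuming a hypothetical `sales_fact` table whose grain is one row per product, per store, per day. The composite primary key enforces that grain, and a `GROUP BY` on fewer keys rolls the data up to a coarser one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table: the primary key is the intersection of its foreign keys.
conn.execute("""
    CREATE TABLE sales_fact (
        date_key    TEXT    NOT NULL,
        product_key INTEGER NOT NULL,
        store_key   INTEGER NOT NULL,
        units_sold  INTEGER NOT NULL,
        PRIMARY KEY (date_key, product_key, store_key)
    )
""")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", 1, 10, 5),
        ("2024-01-01", 1, 20, 2),
        ("2024-01-02", 1, 10, 4),
    ],
)

# A second row at the same key intersection violates the declared grain.
try:
    conn.execute("INSERT INTO sales_fact VALUES ('2024-01-01', 1, 10, 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Rolling up to a coarser grain (product per day) drops store_key from the key.
daily = conn.execute(
    "SELECT date_key, product_key, SUM(units_sold) FROM sales_fact "
    "GROUP BY date_key, product_key ORDER BY date_key"
).fetchall()
```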
Data visualization
The representation of data in graphical and multimedia formats for human analysis
Online Analytical Processing (OLAP)
The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
enterprise data warehouse
a centralized, integrated data warehouse that is the control point and single source of all data made available to end users for decision support applications
logical data marts
a data mart created by a relational view of a data warehouse
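A small sketch, assuming a hypothetical `warehouse_sales` table standing in for the warehouse. The "mart" here is just a view, so no data is copied out of the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse table holding detailed data.
conn.execute("CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [("east", "widget", 100.0), ("east", "gadget", 50.0), ("west", "widget", 70.0)],
)

# The logical data mart: a relational view that selects and summarizes.
conn.execute("""
    CREATE VIEW east_sales_mart AS
    SELECT product, SUM(amount) AS total_amount
    FROM warehouse_sales
    WHERE region = 'east'
    GROUP BY product
""")

mart_rows = conn.execute(
    "SELECT product, total_amount FROM east_sales_mart ORDER BY product"
).fetchall()
```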
dependent data mart
a data mart filled exclusively from an enterprise data warehouse and its reconciled data
independent data mart
a data mart filled with data extracted from the operational environment, without the benefit of a data warehouse
subject-oriented data warehouse
a data warehouse is organized around the key subjects or high-level entities of the enterprise. Major subjects may include customers, patients, students, products, and time.
Event
a database action (create/update/delete) that results from a transaction
star schema
a simple database design in which dimensional data are separated from fact or event data. A dimensional model is another name for a star schema
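A minimal sketch of a star schema, assuming hypothetical `product_dim`, `store_dim`, and `sales_fact` tables. The fact table carries the measures and one foreign key per dimension, so any descriptive attribute is one join away.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE store_dim   (store_key   INTEGER PRIMARY KEY, city TEXT);

    -- The fact table holds measures plus a foreign key to each dimension.
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product_dim(product_key),
        store_key   INTEGER REFERENCES store_dim(store_key),
        units_sold  INTEGER
    );

    INSERT INTO product_dim VALUES (1, 'widget', 'hardware'), (2, 'manual', 'books');
    INSERT INTO store_dim   VALUES (10, 'Boston'), (20, 'Denver');
    INSERT INTO sales_fact  VALUES (1, 10, 5), (1, 20, 3), (2, 10, 1);
""")

# One join from the fact table reaches any attribute of a dimension.
units_by_category = conn.execute("""
    SELECT p.category, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```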
Informational system
a system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications
Operational system
a system that is used to run a business in real time, based on current data; also called a system of record
Snowflake Schema
an expanded version of a star schema in which dimension tables are normalized into several related tables
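A sketch of the same idea in code, assuming hypothetical table names. The product dimension is normalized so that the category lives in its own `category_dim` table, which costs an extra join at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The category is normalized out of the product dimension into its own table.
    CREATE TABLE category_dim (category_key INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE product_dim (
        product_key  INTEGER PRIMARY KEY,
        name         TEXT,
        category_key INTEGER REFERENCES category_dim(category_key)
    );
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product_dim(product_key),
        units_sold  INTEGER
    );

    INSERT INTO category_dim VALUES (1, 'hardware'), (2, 'books');
    INSERT INTO product_dim  VALUES (100, 'widget', 1), (101, 'gadget', 1), (200, 'manual', 2);
    INSERT INTO sales_fact   VALUES (100, 5), (101, 2), (200, 1);
""")

# Reaching the category name now takes two joins from the fact table.
units_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN product_dim  AS p ON p.product_key  = f.product_key
    JOIN category_dim AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
```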
Big Data
an ill-defined term applied to databases whose size strains the ability of commonly used relational DBMSs to capture, manage, and process the data within a tolerable elapsed time
Operational data store (ODS)
an integrated, subject-oriented, continuously updateable, current-valued (with recent history), enterprise-wide, detailed database designed to serve operational users as they do decision support processing
Enterprise Data Warehouse Metadata
are derived from (or at least consistent with) the enterprise data model. EDW metadata describe the reconciled data layer as well as the rules for extracting, transforming, and loading operational data into reconciled data
nonupdateable data warehouse
data in the data warehouse are loaded and refreshed from operational systems but cannot be updated by end users
Time-variant data warehouse
data in the data warehouse contain a time dimension so that they may be used to study trends and change
transient data
data in which changes to existing records are written over previous records, thus destroying the previous data content
periodic data
data that are never physically altered or deleted once they have been added to the store
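The contrast between transient and periodic data can be sketched as follows, assuming a hypothetical price feed. The transient store overwrites in place, while the periodic store only appends, so history survives.

```python
from datetime import date

# Transient store: one slot per key, so an update destroys the old value.
transient = {}

# Periodic store: append-only list of (product, price, effective_date) records.
periodic = []


def record_price(product, price, effective):
    """Apply the same change to both stores to show the difference."""
    transient[product] = price                    # overwrite in place
    periodic.append((product, price, effective))  # never alter, only add


record_price("widget", 9.99, date(2024, 1, 1))
record_price("widget", 11.49, date(2024, 6, 1))


def price_history(product):
    """Return every price recorded for a product, oldest first."""
    return [(eff, price) for p, price, eff in periodic if p == product]
```

After both calls, `transient` holds only the latest price, while `price_history("widget")` still returns both.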
Conformed dimension
one or more dimension tables associated with two or more fact tables for which the dimension tables have the same business meaning and primary key with each fact table
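A brief sketch, assuming two hypothetical fact tables (`sales_fact` and `returns_fact`) that share one `product_dim` with the same key and meaning. Because the dimension is conformed, results from both fact tables can be lined up side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One dimension, shared with the same primary key by both fact tables.
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product_dim(product_key),
        units_sold  INTEGER
    );
    CREATE TABLE returns_fact (
        product_key    INTEGER REFERENCES product_dim(product_key),
        units_returned INTEGER
    );

    INSERT INTO product_dim  VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO sales_fact   VALUES (1, 5), (1, 3), (2, 4);
    INSERT INTO returns_fact VALUES (1, 1);
""")

# Aggregate each fact table separately, then align on the shared dimension key.
sold_vs_returned = conn.execute("""
    SELECT p.name,
           (SELECT COALESCE(SUM(s.units_sold), 0)
              FROM sales_fact AS s WHERE s.product_key = p.product_key),
           (SELECT COALESCE(SUM(r.units_returned), 0)
              FROM returns_fact AS r WHERE r.product_key = p.product_key)
    FROM product_dim AS p
    ORDER BY p.name
""").fetchall()
```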
Integrated data warehouse
the data housed in the data warehouse are defined using consistent naming conventions, formats, encoding structures, and related characteristics gathered from several internal systems of record and also often from sources external to the organization. This means that the data warehouse holds the one version of "the truth"