Data Warehouse: Test 1

Snowflake Schema

A common form of dimensional model. In a snowflake schema, the hierarchies within a dimension can be split out into their own dimension tables, so a single dimension can span more than one table and dimension tables can join to other dimension tables. Because the model is more normalised, queries are typically slower: they must join dimension to dimension before reaching the fact table. In return, the facts can be described in more detail, e.g. products and product categories held in separate tables.
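As a rough illustration of the dimension-to-dimension join, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`dim_category`, `dim_product`, `fact_sales`) and the sample rows are hypothetical, chosen only to mirror the products and product categories example.

```python
import sqlite3

# Hypothetical snowflake layout: the product dimension is split into
# a product table and a separate category table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (
    category_key INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_category VALUES (1, 'Beverages'), (2, 'Snacks');
INSERT INTO dim_product VALUES (10, 'Tea', 1), (11, 'Coffee', 1), (12, 'Chips', 2);
INSERT INTO fact_sales VALUES (10, 5.0), (11, 7.5), (12, 3.0), (10, 2.5);
""")

# Two joins are needed: fact -> product dimension -> category dimension.
snowflake_rows = conn.execute("""
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
JOIN dim_category c ON c.category_key = p.category_key
GROUP BY c.category_name
ORDER BY c.category_name
""").fetchall()
print(snowflake_rows)
```

The extra join through `dim_product` to reach `dim_category` is the cost of the added normalisation.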

Star Schema

A common form of dimensional model. In a star schema, each dimension is represented by a single dimension table, and dimension tables join ONLY to the fact table. Queries are typically faster because fewer joins are needed. The trade-off is that each dimension is described by just ONE table, so hierarchy levels such as product and product category are not broken out into separate tables.
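A minimal sketch of the single-join pattern, again using Python's built-in `sqlite3` module. The table and column names and the sample rows are hypothetical. The category is assumed to be stored as a plain column of the one product dimension table.

```python
import sqlite3

# Hypothetical star layout: one product dimension table that carries
# the category as an ordinary column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category_name TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_product VALUES
    (10, 'Tea', 'Beverages'),
    (11, 'Coffee', 'Beverages'),
    (12, 'Chips', 'Snacks');
INSERT INTO fact_sales VALUES (10, 5.0), (11, 7.5), (12, 3.0), (10, 2.5);
""")

# One join is enough: fact -> product dimension.
star_rows = conn.execute("""
SELECT p.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY p.category_name
ORDER BY p.category_name
""").fetchall()
print(star_rows)
```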

Conformed Dimension

A dimension that has exactly the same meaning and content when being referred to from different fact tables.

Why you need Business Intelligence

A goal of every business is to make better business decisions than their competitors. That is where business intelligence (BI) comes in. BI turns the massive amount of data from operational systems into a format that is easy to understand, current, and correct so decisions can be made on the data. You can then analyze current and long-term trends, be instantly alerted to opportunities and problems, and receive continuous feedback on the effectiveness of your decisions.

Dimensional Data Model

A type of data modeling suited to data warehousing and commonly used in data warehousing systems. A dimensional model has two types of tables: dimension tables and fact tables. A dimension table records information about one dimension, and a fact table records the "facts", or measures. The two common schema types are the star schema and the snowflake schema. The Dimensional Modelling phase produces the dimensional models (star and snowflake schemas) for each data mart.

Fact table

A type of table in the dimensional model. A fact table typically includes two types of columns: fact columns and foreign keys to the dimensions.

Data Warehouse

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process (as defined by Bill Inmon).

Father of Data Warehousing

Bill Inmon. Data first: consider the data, then the requirements, then build the BI.

data mart

Data marts have the same definition as the data warehouse (see Data Warehouse), but data marts have a more limited audience and/or data content.

KPIs

Key Performance Indicators. A KPI is a measurable value that demonstrates how effectively a company is achieving its key business objectives. Organizations use KPIs to evaluate their success at reaching targets.

The approach you are applying to your data warehouse design is the .... methodology.

Kimball

OLTP

On-Line Transaction Processing. OLTP databases handle day-to-day transactions.

OLAP

On-Line Analytical Processing. OLAP should be designed to provide end users a quick way of slicing and dicing the data.

Aggregation

One way of speeding up query performance. Facts are summed up for selected dimensions from the original fact table. The resulting aggregate table will have fewer rows, thus making queries that can use them go faster.
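A small sketch of building an aggregate table with Python's built-in `sqlite3` module. The table names (`fact_sales`, `agg_sales_month`), the columns and the sample rows are hypothetical. The facts are summed by month, and the product detail is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical detailed fact table: one row per sale.
CREATE TABLE fact_sales (sale_date TEXT, product_key INTEGER, amount REAL);
INSERT INTO fact_sales VALUES
    ('2024-01-05', 10, 5.0),
    ('2024-01-17', 10, 2.5),
    ('2024-01-20', 11, 7.5),
    ('2024-02-02', 10, 4.0),
    ('2024-02-09', 11, 1.0);

-- Aggregate table: amounts summed per month (first 7 chars of the date).
CREATE TABLE agg_sales_month AS
SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total_amount
FROM fact_sales
GROUP BY substr(sale_date, 1, 7);
""")

fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
agg_rows = conn.execute(
    "SELECT month, total_amount FROM agg_sales_month ORDER BY month"
).fetchall()
print(fact_count, agg_rows)
```

A monthly report can read the two-row aggregate table instead of scanning all five fact rows. With real data volumes the difference in row counts is far larger.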

ETL

Stands for Extraction, Transformation, and Loading. The movement of data from one area to another.

Dimension

A category of related information. For example, year, month, day, and week are all part of the Time Dimension.

drill down

A common OLAP operation. Drill-down is the opposite of roll-up: it moves to more detailed data by stepping down a dimension hierarchy or adding a dimension.

roll up

A common OLAP operation. Roll-up allows us to look at coarser, "big picture" data by dropping one or more dimensions or climbing up along the dimension hierarchies.
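Roll-up and drill-down can be sketched as the same totals computed at two levels of a Time hierarchy. The sample rows and the helper `total_by` below are hypothetical, written in plain Python purely for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, amount).
sales = [
    (2023, 11, 4.0),
    (2023, 12, 6.0),
    (2024, 1, 5.0),
    (2024, 1, 2.5),
    (2024, 2, 4.0),
]

def total_by(rows, key):
    """Sum the amount of each row, grouped by key(row)."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[2]
    return dict(totals)

# Detailed level of the hierarchy: totals per (year, month).
by_month = total_by(sales, lambda r: (r[0], r[1]))

# Roll-up: climb the hierarchy from month to year.
by_year = total_by(sales, lambda r: r[0])

print(by_month)
print(by_year)
```

Going from `by_year` back to `by_month` is the drill-down direction: the same facts, shown at a finer level of detail.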

Slice and dice

To slice and dice is to break a body of information down into smaller parts or to examine it from different viewpoints so that you can understand it better.
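In OLAP usage, a slice usually fixes one dimension to a single value, and a dice restricts two or more dimensions to chosen values. A minimal sketch in plain Python, using hypothetical sample rows and dimension names:

```python
# Hypothetical fact rows described by three dimensions.
sales = [
    {"year": 2023, "region": "North", "product": "Tea", "amount": 4.0},
    {"year": 2023, "region": "South", "product": "Coffee", "amount": 6.0},
    {"year": 2024, "region": "North", "product": "Tea", "amount": 5.0},
    {"year": 2024, "region": "North", "product": "Coffee", "amount": 2.5},
    {"year": 2024, "region": "South", "product": "Tea", "amount": 4.0},
]

# Slice: fix one dimension (year) to a single value.
slice_2024 = [r for r in sales if r["year"] == 2024]
slice_total = sum(r["amount"] for r in slice_2024)

# Dice: restrict several dimensions (region and product) at once.
dice = [
    r for r in sales
    if r["region"] in {"North"} and r["product"] in {"Tea"}
]
dice_total = sum(r["amount"] for r in dice)

print(slice_total, dice_total)
```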

Inmon's approach to DW design is to consider the .... first, requirements and then build the Business Intelligence reporting.

data

data mart vs. data warehouse

A data warehouse is like the distributor in a supply chain, whereas a data mart is like a retail store in a supply chain.

Kimball focuses on a ..... oriented approach to design, with the business intelligence requirements first, then the data.

function

The most important ingredient to a BI solution

it must include a data warehouse.

Business intelligence

The information that is available for the enterprise to make decisions on. A data warehousing (or data mart) system is the backend, or infrastructural, component for achieving business intelligence. Business intelligence also includes the insight gained from data mining analysis, as well as unstructured data (thus the need for content management systems). For our purposes here, we will discuss business intelligence in the context of using a data warehouse infrastructure.

OLTP vs OLAP Data Model

There are two kinds of system: transactional (OLTP) and analytical (OLAP). In general we can assume that OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.
- OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis is on very fast query processing and on maintaining data integrity in multi-access environments; effectiveness is measured by the number of transactions per second. An OLTP database holds detailed, current data, and the schema used to store transactional databases is the entity model (usually 3NF).
- OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations, and response time is the measure of effectiveness. Data mining techniques make wide use of OLAP applications. An OLAP database holds aggregated, historical data, stored in multi-dimensional schemas (usually a star schema).

metric

A measured value. For example, "Total Sales" is a metric.

Attribute

Attributes represent a single type of information in a dimension. For example, year is an attribute in the Time dimension.

Data Warehousing Guru & one of the original architects of Data Warehousing

Ralph Kimball. Functionality-oriented: BI first, then data.

Comparison between Relational & Data Warehouse Data Models

Relational models: relations; related via a PK and FK; normalised data. Data warehouse models: facts, which relate to dimensions via a surrogate key and hold aggregated data; dimensions, which describe the facts and allow for drill down and roll up (slice and dice).

Data Warehouse vs DBMS

DW: OLAP. Typically de-normalised with fewer tables, using star or snowflake schemas. DBMS: OLTP. Normalised with many tables.

Metadata

Data about data. For example, the number of tables in the database is a type of metadata.

ETL: Extract

Data needs to be extracted from the data sources, which are identified in the Planning and Data Mart / Data Warehouse Design phases.

ETL: Load

Data needs to be loaded into the data warehouse regularly; the frequency depends on the requirements of the organisation. The tools used to load the data depend on the data warehouse implementation, e.g. SQL Server uses SSDT and SSIS to extract, transform and load data. The load should be scheduled at regular intervals to keep the data warehouse up to date: once the SSIS packages are deployed, SQL Server jobs can run the loads defined in them.

ETL: Transform

The data from the data sources needs to match the data warehouse. The transform step must ensure the data has: all the required attributes, the correct data types, the correct data format, and only the attributes defined in the data model for the data warehouse.
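The checks above can be sketched as a small transform function in plain Python. The target columns (`TARGET_COLUMNS`), the source date format (DD/MM/YYYY) and the sample row are all assumptions made for illustration. A real warehouse would take these from its own data model.

```python
from datetime import datetime

# Hypothetical warehouse data model: attribute name -> required type.
TARGET_COLUMNS = {"customer_id": int, "order_date": str, "amount": float}

def transform(source_row):
    """Return a row that matches the (assumed) warehouse data model."""
    # All required attributes must be present.
    missing = [name for name in TARGET_COLUMNS if name not in source_row]
    if missing:
        raise ValueError(f"missing attributes: {missing}")

    # Keep only modeled attributes and cast each to the correct type.
    row = {name: cast(source_row[name]) for name, cast in TARGET_COLUMNS.items()}

    # Correct the data format: assumed source dates are DD/MM/YYYY,
    # assumed warehouse dates are ISO 8601 (YYYY-MM-DD).
    parsed = datetime.strptime(row["order_date"], "%d/%m/%Y")
    row["order_date"] = parsed.date().isoformat()
    return row

raw = {
    "customer_id": "42",
    "order_date": "05/01/2024",
    "amount": "19.90",
    "notes": "call back",  # not in the data model, so it is dropped
}
clean = transform(raw)
print(clean)
```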

