Database Exam 4 Ch. 13

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

Performance tuning the Star Schema: Facts tables

*Denormalize* to improve data access performance and save data storage space

Performance tuning the Star Schema: Dimension tables

*Normalized* to achieve semantic simplicity and facilitate end-user navigation through the dimensions

Data-mining tools functions

1. Analyze data 2. Uncover problems or opportunities hidden in data relationships 3. Form computer models based on their findings 4. Use models to predict business behavior

Data-mining phases

1. Data preparation (identify, clean, and integrate data set) 2. Data analysis and classification (Classification analysis, clustering and sequence analysis, link analysis, trend and deviation analysis) 3. Knowledge acquisition (Select and apply algorithms, decision trees, clustering, neural networks, etc.) 4. Prognosis (modeling, forecasting, and prediction)

Decision Support Database Requirements

1. Database schema - Complex data representations -Aggregated and summarized data -Queries extract multidimensional time slices 2. Data extraction and loading -Supports different data sources (flat files, hierarchical, network, and relational databases) -Checks for inconsistent data 3. Database size -DBMS must support very large databases (VLDBs)

The Data Warehouse

1. Integrated 2. Subject-oriented 3. Time-variant 4. Nonvolate collection of data (never deleted) for improved decision making *Usually only a read-only DB optimized for data analysis and query processing

4 Techniques to optimize data warehouse design

1. Normalizing dimensional tables (achieves semantic simplicity and allows end-user navigation through the dimensions) 2. Maintaining multiple fact tables to represent different aggregation levels 3. Denormalizing fact tables (improves data access performance and saves data storage space) 4. Partitioning and replicating tables

*Focus of Business Intelligence Architecture

1. Strategic, 2. Tactical use of information Example: Using KPI's *MULTIPLE tools from different vendors can be integrated into a SINGLE BI framework

Benefit of Data Mining

Automates the analysis of operational data to find previously unknown data characteristics, relationships, dependencies, and trends

How does Business Intelligence transform data, information, and knowledge?

BI Transforms: Data -> information Information -> knowledge Knowledge -> wisdom

Decision Support Data

Cannot rely on operational data for decision support tasks, must reformat into decision support data. **BI effectiveness depends on QUALITY of data gathered at OPERATIONAL level

*Star Schemas components

Components: 1. Facts 2. Dimensions 3. Attributes 4. Attribute hierarchies

Primary functions of Business Intelligence

Comprehensive, cohesive, integrated tools and processes which: Capture, collect, integrate, store, and analyze data

Fact table

Contains facts linked through dimensions. Found in the center of the star *Primary key of fact table is a composite primary key; many fact rows relate directly to each dimension row.

Areas of use for Predictive Analytics Models

Customer relationships, customer service, customer retention, fraud detection, targeted marketing, and optimized pricing (security and marketing things)

Storage location of attribute hierarchies

DBMS's data dictionary.

**Decision Support Data vs. operational data differences

DSS Differs from operational data in: 1. time span 2. granularity 3. dimensionality

*Data Warehouse vs. data mart

Data marts: 1. Contain more manageable data set than data warehouses 2. Typically lower cost and lower implementation time

Business Intelligence Architecture composed of:

Data, people, processes, technology, and management of components

*Star Schemas

Data-modeling technique that maps multidimensional decision support data into relational database

Predictive Analytics

Employs mathematical and statistical algorithms, neural networks, artificial intelligence, and other advanced modeling tools *Create actionable predictive models based on available data generated in the data-mining phase

Metrics

Facts computed at run time, opposite of stored facts

True/False: There is a mathematical limit to the number of dimensions

False: No mathematical limit exists to the number of dimensions

Slice and dice

Focus on slices of the data cube for more (diced) detailed analysis

*OLAP: Online Analytical Processing

HISTORICAL data for DAY TO DAY operations; accurate up to one point in time. Three Main characteristics: 1. Multidimensional data analysis techniques (Most distinctive feature) 2. Advanced database support 3. Easy-to-use interfaces

Data Analysis provides:

Information about short-term tactical evaluations and strategies

Benefit of Business Intelligence

Main goal: Improved decision making Other benefits: Improved organizational performance, a common user interface for data reporting and analysis, and a common data repository

Replication

Makes COPY of table and places it in different location

KPI: Key Performance Indicators

Measurements that assess company's effectiveness or success in reaching goals *Must be quantifiable!

*OLTP: Online Transaction Processing

Must be BLAZINGLY FAST and 100% reliable. REAL TIME, current data, always up-to-date. Example: Storage of credit card sales

*Star Schema FACT component

Numerical measurements that represent specific business aspects or activities (quantifiable information) *Normally stored in fact table Fact table = center of star Ex: Sales figures, units, costs, prices, and revenues

Strategies should be developed to meet:

Organizational goals using OPERATIONAL databases

Primary function of Decision Support Database

Primary function of DSD: Specialized DBMS tailored to provide fast answers to complex queries over a long historical period of time

Two main purposes of attribute hierarchies

Provide top-down data organization for: 1. Aggregation 2. Drill-down/roll-up data analysis Determines how the data is extracted and represented Used by OLAP tool to access warehouses properly

Benefit of data analytics

Provides advanced data analysis tools to extract knowledge from business data

*Star Schema DIMENSIONS component

Qualifying characteristics/categories that provide additional perspectives to given facts *Study facts through dimensions *Dimensions are stored in dimension tables which are smaller than fact tables. **Links to the fact table Ex: Sales can be compared by product from region to region from one time period to the next. Fact = sales #'s. Dimensions = product, location, and/or time.

Partitioning

SPLITS TABLE into subsets of rows or columns *Note the difference from replication

Data Marts

Small, single-subject data warehouses. Provides decision support to SMALL GROUP of people

Operational data- location and primary use?

Stored in relational database Optimized to support transactions representing daily operations, constantly updated

Requirements of the Data Warehouse

To create a Data Warehouse, requires: 1. Time 2. Money 3. Considerable managerial effort

True/False: Decision support data can be viewed in relation to other data

True: Decision support data is ALMOST ALWAYS viewed in relation to other data *This is why dimensions are of significance

True/False: Fact tables are bigger than dimension tables

True: Dimension tables are smaller than fact tables

True/False: Operational data is not suited for decision support

True: Operational data cannot be used for decision support

*Star Schema ATTRIBUTE component

Used to search, filter, and classify facts. *Dimensions provide descriptive characteristics about the facts through their attributes. Example attributes for location: Region, state, city, store, etc. *Must be able to slice and dice


Ensembles d'études connexes

Signal Transduction I: Ion Channel-Linked Receptors and G-Protein Coupled Receptors

View Set

Ch.1 Introduction to Nursing PrepU

View Set

Mastering Math Facts Multiplication Set Q (5x8, 8x5, 4x8, 8x4)

View Set

Commonly Missed Questions on the STAAR Released Test

View Set

MAT 120 Section 9.3 Measures of Regression

View Set

homework for econ exam 1, MICRO ECONOMICS, Chapter 2 (unit 1- obj. 10), Econ 1000 Chapter 3, 2.1-2.4, ECON Chapter 2 Homework, ECN101 Chapter 5 Key Terms, Econ 102 Exam, ECON102 CH. 4, 2.4 Gains from Trade, Microeconomics ch3, ECON E 201 Assignment 3...

View Set