ITP 487 - Exam II

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

Dimension

- a group of characteristics which belong to the same business object (semantically) - each characteristics may contain additional attributes which are relevant for data analysis - or not!! - display attributes are InfoObjects for which data analysis does not make sense - navigational attribute are InfoObjects which are used to perform data analysis - time-dependent attributes - time-independent attributes

Transformation Rules

- algebraic rules - + - * / - string rules - relate to alphanumeric data - data and time rules - change time and date formats - programmatic or algorithm-based rules - use transformation programming languages

InfoProviders

- are data storage meta objects inside SAP BW - May store data records physically (persistent) - may also be used to access data stored outside SAP BW virtually (transparent views) - May be accessed with front end tools such as SAP BusinessObjects Analysis

Data load errors

- common cause is mismatched data - data package key and data package dimension assist in recovery when errors occur - roll back to data state prior to load - delete data load, repair rogue data, and reload

InfoCubes

- consists of a number of relational tables arranged together - fact table (max 233 key figures) - dimension tables - dimension (max 16 dimensions & 248 characteristics per dimension)

fact table

- consists of dimensional keys and key figures (facts) - values that can be added, subtracted, divided, multiplied, etc.

Data harmonization

- data are from multiple sources they need to be made consistent - involves mapping fields from source to target systems

Extraction

- data source is a collection of fields from the source system - extractors are programs that pull the data from the source system and bring them into the DW system - extracted data are staged in a PSA

Loading

- delta loads vs. full loads - data transfer process (DTP)

SAP BW Schema

- developed as an extended (snowflake) star schema as a response to problems experienced with the classic star schema - the enhancement comes from the fact that the dimension tables do not contain master data information - master data is stored in separate tables - master data tables can be used in multiple InfoCubes

Shortcomings of star schemas

- duplication of dimensional data - no support for multiple languages - slow joins due to alphanumeric keys - no historization (changing dimensional data) - hierarchies can cause anomalies

snowflake instead of star

- further normalization and expansion of the dimension tables in star schema results in a snowflake design - this reduces data redundancy which storage space but increases the number of joins - low-cardinality columns in the dimension have been removed to separate normalized tables that then link back into the original dimension table

Types of InfoObjects

- key figures = numeric measures - characteristics = master data - units = units of measure for key figures

Virtual cubes

- only represent logical views of a dataset - data resides in data source

Standard cube

- physically contains data in the database - they are data target - a type of InfoProvider - BW objects are called InfoProviders when queries can be executed based on them - Technically optimized for read access

Queries

- prefilters - presorts - restricted key figures - calculated key figures - conditions - predefined rule

dimension tables

- represent master data - provides a more detailed view of the fact

InfoObjects

- represent the structure that allows data to be stored in a BW system - used to describe business processes and information requirements (fields): asset, quantity on hand - either characteristics or key figures

Master data

- represents dimensional data - independent of any fact table - reusable in multiple fact tables - designed to supports multi-lingual systems - designed to support time dependent data - designed to improve query

fact table

- stores measures/ key figures - represents transactional data

InfoCube

- the central objects of the multi-dimensional model in SAP BW - reports and analyses are based on these - self-enclosed dataset for a business area from a reporting viewpoint - queries can be defined and executed on these

When to snowflake

- the dimension table consists of two or more sets of attributes which define information at different grains (detail) - the sets of attributes of the same dimension table are being populated by different source systems (very common in DW)

Snowflake schema

- the dimension tables are split into more tabes - reduce data redundancy - supports multiple languages, currencies & UoM - quicker joins via conversion of text keys to numbers - support historization - support cube administration

text tables

- whereas non-textual master data are stored in attribute tables, textual master data are stored in text tables - time-dependent - time-independent - language-dependent - language-independent - hierarchy

Fact Table with Currency Key

1,000 EUR 1,000 USD 1,000 ZWD EUR Euro USD US Dollar ZWD Zimbabwe Dollar

Dashboarding Process

1. identify end-user needs 2. specify the function of the dashboard 3. identify data sources 4. formulate the logic 5. create the user interface 6. bind the data 7. apply themes 8. add interactivity 9. test and deploy

Authoring Reports

1. identifying the needs of the report user 2. identifying data sources 3. building the layout for readability 4. binding analytical components to data sources 5. report structure 6. adding prompts for end-users 7. deployment

Fact Table with Unit of Measure Key

10 Kg 15 Dz 30 Gal 12 Ea Kg Kilogram Dz Dozen Gal Gallon Ea Each

InfoPackage

A "container" for selection criteria/condition Contains info about: - when - from which datasource - with which selection - full or delta

Data Store Object

A DSO is a two-dimensional storage unit which mainly stores consolidated and cleansed transaction data or master data on a lower granularity

Data Source

A collection of fields that have been extracted from the source system

Step 6

Activate data in DSO

conditional formatting

An Excel feature that enables you to specify how cells that meet one or more given conditions should be displayed.

Analysis for Microsoft office

Conduct multi-dimensional analysis of OLAP sources

Step 3

Create InfoPackage and load data into PSA

Step 5

Create and start Data Transfer Process to DSO

Step 8

Create and start data transfer process to cube

Step 2

Create dataSource

Step 4

Create transformation to DSO

Step 7

Create transformation to cube

Data transfer process

DTP is an object that determines how data is transferred between two persistent objects in SAP Netweaver BW

Transformation

Data Mapping & Data Harmonization

Data Harmonization

Data consolidation, data cleansing, data reformatting

ETL Process

Draw it!

Extraction Transformation Loading

ETL

Key figures

Fact

Historization

High End Bike (1/1/00 - 12/31/09) Deluxe Touring Bike (1/1/10-12/31/14) D-lux Touring Bike (1/1/15-12/31/9999)

Data Mapping

Mapping source fields to target fields

Persistant Storage Area

PSA

Interval-dependent hierarchy

Product DXTR1000- PRTR1000- DXTR2999 PRTR9999 1000-1999 1000-4999 2000-2999 5000-9999

SAP Crystal Reports

Provides users with a number of alternatives for report creation including wizards. These wizards guide the user through a number of input screens to create a report.

time-dependent hierarchies

Sales Territory US Spain Smith Patel Smith (1/1/12- (1/1/12- (1/1/15- 12/31/14) 12/31/99) 12/31/99)

Version-dependent hierarchy

Sales Territory (Version 1) North South Canada US Latin America Sales Territory (Version 2) North South Canada US S.America Mexico

Step 1

Select source system

Data flow in reporting

Source Systems - ERP - Data Warehouses - Databases - Flat Files - Legacy Systems - Web services - Senior data Semantic Layer - SAP Business Objects Universe Authoring - Report-authoring tool - Report design tool Report server - Report deployment End-user - Web - Mobile - Print

ETL Process

Source Systems => (Extraction) => Data Source/Staging => (Transformation) => (Load) => Cube

SAP Bex Query Designer

Tools in multidimensional analysis allow us to pre filter both key figures and characteristics for our cross tabs so that the data brought in for analysis is already targeted. Also smaller data sets are easier to analyze.

Dimension table

acts as a link between master data and fact table records

data reformatting

changing the format of data from data source to target requirements

Data consolidation

combining and integrating fata from data sources - realigning mismatched data - harmonize keys and records - combine or split data

InfoCube

consists of fact and dimensional tables

characteristic

dimension attributes

dimension

dimension table

Fact table

fact table

Report wizard

guides the user through choosing a data source and linking database tables. It also helps the user add fields and specify the grouping, summarization (totals), and sorting criteria to be used. The wizard also assists in the creation of a chart and record selection.

OLAP cube report wizard

guides the user to create a report in which OLAP data is displayed as a grid object which includes dimensions from the OLAP data source

mailing label report wizard

guides the user to create a report that is formatted to print on any size mailing label.

Cross-tab report wizard

guides users through the creation of a report in which the data is displayed as a cross-tab object

Star Schema

includes 1 fact table and multiple dimension tables

data cleansing

removing noisy data, replacing missing values, realigning formats - identification of signal and noise - getting rid of noise and "dirty" or missing data to avoid GIGO - null values can be misleading - identification and elimination of outliers - elimination of redundant data - reformatting of data

master data tables

text, attributes, hierarchy

Loading

the movement of data from source systems to their intermediate or final destinations

Extraction

the process of identifying data sources and source fields and acquiring, or sourcing, the data required for analysis

Transformation

the process of mapping and harmonizing - that is, making certain the data are consistent, cleansed and reliable - from their sources to the targets

Surrogate ID

used to convert textual keys to numeric keys to create links and provide for faster queries


Kaugnay na mga set ng pag-aaral

AS Biology Paper 1 June 2017 Mistakes

View Set

Sociology of the Family Chapter 3

View Set

Chapter 4 - Activity - Based Costing

View Set