ITP 487 Exam 1

¡Supera tus tareas y exámenes ahora con Quizwiz!

What is a key?

A combination of one or more columns that is used to identify rows in a relation

What are the shortcomings of Star Schmas?

- Duplication of dimensional data - No support for multiple languages - No historization (changing dimensional data) - Hierarchies can cause anomalies

What are the advantages of the Snowflake Schema?

- Reduces data redundancy - Supports multiple languages, currencies, and units of measure - Supports historization - Supports cube administration

How do you Snowflake?

- Extracting attributes from dimension tables and placing them in to new tables (master data)

What is data staging and what does it ensure?

- It is the process whereby data are organized and prepared for analysis.

What are the five main InfoObject Types?

- Key figures (revenue, sales amount, net value, etc.) - Characteristics (Customer, region, zip code, etc.) - Units (Unit of measure, currency) - Time (Calender year, fiscal year, month, day, etc.) - Some technical stuff (DataPackageID, etc.)

In a Star Schema, what are the two measures of data, and what type of data is it?

- Measures and Key Figures - Transactional data

What are the major improvements of the Snowflake over the Star Schema (4 main ways)?

- Reduces data redundancy - Supports multiple languages, currencies & units of measure - Supports historization - Supports cube administration

Why is SAP HANA so efficient?

- SAP's is an in-memory database - Columnar data storage - Data compression (avoids redundancy through dictionary compression, prefix encoding, etc.)

Advantages of Star Schema

- The star-like layout is easy to understand and implement - The one-to-many relationship between dimension tables and the fact table is straightforward. - Has only a single level of JOIN for any query. The fact table is joined to one or more dimension tables, so the JOINs are easy to build.

What are some reasons for the recent explosion of data?

- There are more users of technology - There is far more usere-generated content - Mobility (i.e. how the culture at golf tournaments is changing because of mobile phones)

What are the primary dimension tables in a Star Schema?

- Time dimension - Customer dimension - Product dimension - Sales Org dimension - Region dimension

What are the primary dimension tables in a Star Schema?

- Time dimesnion - Customer dimension - Product dimension - Sales Org dimension - Region dimension

What is data harmonizing?

- When data from multiple sources need to be made consistent

In a star schema, what is the fact table?

A fact table contains facts about the business; it contains a list of historical transactions with data bout key figures and measures

What is a primary key?

A candidate key selected as the primary means of identifying rows in a relation. It never changes.

Define Data Warehouse

A data warehouse is a process and architecture that requires robust planning to implement a platform, which consists of transformation, consolidation, cleansing and mapping of data from multiple operational data sources to a target DBMS that supports decision-making.

What is a Dimension inside of SAP BW?

A group of characteristics which belong to the same business object (semantically)

What is a composite key?

A key that consists of two or more columns

What is a foreign key?

A primary key of one relation that is placed in another relation to form a link between the relations

What is a surrogate key?

A primary key which the database creates

In SAP BW what does Process Chain mean?

A sequence of processes that get triggered in a specific order

What is a star schema and what is its purpose?

A simple relational model that is easy to understand and that represents business data in a lucid manner

What is the Referential Integrity Constraint (RIC)?

A statement that limits the values of the foreign key to those already existing as primary key values in the corresponding relation

What is a relation?

A two-dimensional table

Data mapping

Allows for connections between two systems. This connection allows for data initially captured for one purpose to be translated and used for another purpose. One system in a map is identified as the source while the other is the target.

What is a characteristic?

Each attribute in a dimension table is called a characteristic (i.e. In the Customer table, the customer's name, address, etc. are characteristics)

What is a canidate key?

Any key that could possibly be a primary key

What are the common Data Load Errors?

Common cause is mismatched data (i.e. text field in the source and numeric field in the target)

Define decisions

Course of action, monitoring, and correction

What does DBMS stand for?

Data Base Management System

What is the flow from data to decision?

Data -> Information -> Knowledge -> Wisdom -> Decision

What does DTP stand for? What does DTP do?

Data Transfer Process. It triggers data transfer between a source and its target.

What data do rows contain?

Data about entities

What data do columns contain?

Data about the attributes of the entry

Define data

Data can be defined as raw figures

What type of data do OLTP data systems use and who are the primary users of this data?

Data which is a current representation of the state of the business. The primary users of this data are Clerks, salespersons, and administrators.

Data which is most used in a DB can be considered what? (and vice versa)

Data which is used constantly is considered "Hot Data". Data which is rarely used in considered "Cold Data"

What does ETL stand for?

Extract, Transform, Load

What are two examples of InfoProviders?

I. aDSO's II. Composite Providers

What can a Composite Provider do?

I can merge data from different BW InfoProviders and / or HANA views.

What are the transformation rules when data cleansing?

I. Algebraic rules (i.e. converging weights or currencies) II. String rules (i.e. combine two names into one) III. Data and time rules (i.e. change date and time formats) IV. Programmatic or algorithm-based rules (i.e. use transformation programming languages)

What are the "four rules" of relations?

I. All entries in a column have to be of the same kind II. Each column has to have a unique name III. No two rows ma be identical IV. Cells of a table must hold a single value

What are the steps to building a SAP BW/4HANA Snowflake Schema?

I. Build fact table by creating key figures II. Build "dimensions" (folders in BW/4HANA) III. Build characteristics (attributes of dimensions) IV. Further the snowflake characteristics by establishing their text, attributes, hierarchies, and geo information

What are the five essential components of data warehosuing?

I. Data source II. Data conversion and extraction III. Data warehouse DBMS IV. Data warehouse administration V. BI tools (BI - Business Intelligence)

When do you Snowflake (2 Conditions)?

I. Dimension table consists of two or more sets of attributes which define information at different grains (detail) II. Sets of attributes of the same dimension table are being populated by different source systems (very common in data warehouses)

What are the five drivers of Business Intelligence?

I. Globalization II. Decentralization III. Smaller Margins IV. Faster reaction to the market V. Increasing importance of service

Why is data warehousing needed?

I. Integrated, company-wide view of high-quality information (from heterogeneous databases) II. Separation of operational and informational systems and data (for improved performance)

What are the two types of data that are stored in the sales fact table?

I. Measures II. Key figures

If a data load error occurs, what are the three ways that data transfer system can address these errors?

I. No update, no reporting II. Update valid records, no reporting III. Update valid records, reporting possible

When dealing with InfoProviders in BW on AnyDB, what are the two types of data, and some examples.

I. Physical data - Infocubes - Data Store II. Virtual data - MultiProviders: to create views - Virtual Providers

What two things do InfoProviders do in BW/4HANA

I. Physical data: aDSOs; can optimize data for fast multi-dimensional analysis or fast update II. Virtual data: Composite Providers

What are the Seven data mining techniques which were talked about in class?

I. Regression II. Decision Trees III. Neural Networks IV. Association Analysis V. Clustering VI. Classification VII. Genetic Algorithms

What are two more ways to fix data load errors?

I. Roll back to data state prior to load II. Delete data load, repair rogue data, and reload

What are the six objecties of data warehouses?

I. Single source of truth II. Standardized structures and representation for all enterprise information III. Easy access (single point of access to enterprise information) IV. Self service, high quality business reporting and analysis on all levels V. High performance environment fed from heterogeneous sources VI. Unburden systems and IT resources in the transactional environment

What are the three main types of master data?

I. Text II. Attributes III. Hierarchies

Given a data set, what do you do (4 steps)?

I. Understand the data (data dictionary, create a star schema) II. Design the Snowflake (Master tables, data types, lengths) III. Model in SAP BW/4HANA (Implement design, key figures, characteristics) IV. Load data set (master data, transactional data)

What is an entity?

Identifiable things that users want to track (customers, products, sales orders, vendors, etc.)

What are InfoObjects? What do they help us build?

InfoObjects are the "bricks" to build data structures inside of SAP BW, they represent business objects from an end user's view.

In SAP BW what does data source mean?

Interface object to source system representing the source data structure

In a Snowflake Schema, what is the purpose of the dimension table

It acts as a link between master data and fact table records

What does an aDSO consist of?

It can consists of InfoObjects and / or ordinary database fiels. Can be used in different layers depending on property sets

What is the ALPHA conversion routine?

It converts from internal to external length

What is SAP BW/4HANA?

It is SAP's version of Enterprise Data Warehouse, but it only runs on SAP HANA

What is the purpose of data mining?

It is important that the results provide enterprises with insights not available through traditional techniques or predefined relationships (i.e. relational tables)

How is Business Intelligence efficient?

It is user-centered to improve overall decision making

What does OLTP stand for and what is its purpose?

OLTP stands for Online Transaction Processing System and its primary purpose is to run the business on a current basis (operational systems).

Define wisdom

Reveals ideas principles, biases, and insights. Usually where policies are made

Define information

Reveals relationships between entities (the WHAT behind analytics)

Define knowledge

Reveals the trends and patterns (the WHY behind analytics)

How does data flow in SAP BW?

Source system --> data source --> aDSO

What is a composite entity

The database will not allow you to create a grade without having a student to assign the grade to

What type of data do OLAP data systems use and who are the primary users of this data?

The primary type of data used is historical point-in-time and prediction data. This is primarily used by managers, business analysts, and customers.

What does loading mean in ETL?

The movement of data from source systems to their intermediate or final destinations

Define data mining

The process of discovering meaningful correlations, patterns, and trends by sifting through large amounts of data stored in repositories

What does transformation mean in ETL?

The process of mapping and harmonizing data from their sources to the targets

What does extract mean in ETL?

The process of sourcing the data for analysis

What are InfoProviders?

They are the data containers inside of SAP BW. They may store data records physically (persistent) or virtually.

In SAP BW what does transformation mean

Transforms the data according to defined rules

In SAP BW what does Data Transfer Process (DTP) do?

Triggers the data transfer between a source and target

What is a transitive dependency?

When a column depends on another column to be defined (i.e.If A --> B and B --> C; therefore A -->C)


Conjuntos de estudio relacionados

28. Antbióticos betalactámicos e inhibidores.

View Set

35 Facts about President Barack Obama you may not know

View Set

PSYO 317 Midterm Study Guide Questions

View Set