BI & Analytics Quiz

Company data isn't always in one location. It's usually found across:

CRM programs, marketing automation systems, social media platforms

In-memory systems use RAM instead of hard drives to

execute queries, increasing application performance

Hierarchies let you drill down into data to

explore interesting patterns and anomalies

A multidimensional data model is organized around a central theme, for example, sales. This theme is represented by a

fact table.

Structured data resides in a

fixed form and labeled

Ad hoc analysis, virtually any report can be

formatted multidimensionally (pivoting and nesting dimensions); anyone can be taught

Core operational database functionality:

gather data, update data, store data, retrieve data, archive data

Data visualization is a

graphic display of the results of data mining or analytics, often in real-time

With descriptive analytics, raw data can be

grouped into easily digestible pieces, such as the number of unique page-views, or the sales numbers for a specific department
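
The page-view example above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical log of (visitor, page) pairs; the names `raw_views` and `unique_page_views` are made up for the example.

```python
from collections import defaultdict

# Hypothetical raw page-view log: (visitor_id, page) pairs.
raw_views = [
    ("v1", "/home"),
    ("v2", "/home"),
    ("v1", "/home"),      # repeat visit by the same visitor
    ("v3", "/pricing"),
    ("v1", "/pricing"),
]

def unique_page_views(views):
    """Group raw events by page and count distinct visitors per page."""
    visitors_by_page = defaultdict(set)
    for visitor, page in views:
        visitors_by_page[page].add(visitor)
    return {page: len(visitors) for page, visitors in visitors_by_page.items()}

summary = unique_page_views(raw_views)
print(summary)
```

The raw log has five rows, but the summary has one number per page, which is the "easily digestible piece" a report would show.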

MapReduce can save

huge amounts of network bandwidth and resources

Transform: in order to be properly analyzed, data must be

in the same format
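
As a rough illustration of putting data "in the same format", the sketch below normalizes dates from two source systems into one representation. The two input formats and the row layouts are assumptions made for the example, not something the source specifies.

```python
from datetime import datetime

# Hypothetical date formats used by two source systems.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def normalize_date(text):
    """Return the date as an ISO 8601 string (YYYY-MM-DD)."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"unrecognized date format: {text!r}")

# Hypothetical extracted rows from a CRM and a point-of-sale system.
crm_rows = [{"customer": "Acme", "date": "2024-03-05"}]
pos_rows = [{"customer": "Acme", "date": "03/07/2024"}]

unified = [
    {**row, "date": normalize_date(row["date"])}
    for row in crm_rows + pos_rows
]
print(unified)
```

Once every row uses the same date format, rows from both systems can be sorted, compared, and queried together.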

Hadoop

infrastructure for storing and processing large sets of data across multiple servers

The general idea of this approach (data cube) is to

materialize certain expensive computations that are frequently inquired.

OLAP pivot table creates an

MDX query that is sent to the OLAP cube; the data requested by the MDX query is returned from the cube to the OLAP pivot table

Facts are numerical measures. Thus, the fact table contains

measures (such as units_sold) and keys to each of the related dimensional tables.
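
One way to picture a fact table with its keys and measures is a small in-memory SQLite database. The table names (`fact_sales`, `dim_item`, `dim_location`) and the sample rows are hypothetical; only `units_sold` and the item attributes `item_name`, `brand`, and `type` come from the cards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe each dimension further.
    CREATE TABLE dim_item (
        item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT
    );
    CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT);

    -- The fact table holds the measure plus a key to each dimension table.
    CREATE TABLE fact_sales (
        item_key INTEGER REFERENCES dim_item(item_key),
        location_key INTEGER REFERENCES dim_location(location_key),
        units_sold INTEGER
    );

    INSERT INTO dim_item VALUES (1, 'Cola', 'FizzCo', 'soda'), (2, 'Chips', 'Crunchy', 'snack');
    INSERT INTO dim_location VALUES (1, 'Austin'), (2, 'Boston');
    INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 4), (2, 1, 7);
""")

# Total units sold per item: join the fact table to a dimension table.
rows = conn.execute("""
    SELECT i.item_name, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_item AS i ON i.item_key = f.item_key
    GROUP BY i.item_name
    ORDER BY i.item_name
""").fetchall()
print(rows)
```

The fact table itself stores only numbers and keys; the descriptive text lives in the dimension tables and is pulled in by the join.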

Data marts are essentially smaller, more focused warehouses. Instead of aggregating data across a company, a data mart

might store the information of just a single department

OLAP Cube

multidimensional structure that stores and maintains discrete intersection values. Some OLAP systems let cubes intersect with each other.

OLAP systems organize data into

multidimensional structures

OLTP systems are everywhere:

order tracking, invoicing, credit card processing, retail POS, banking, airline reservations

Hierarchy

organizes data by levels

Descriptive analytics is the base upon which

other types of analytics are built

Descriptive programs analyze

past data and identify trends and relationships

The term business intelligence grew out of technology called

decision support

Attribute

descriptive, non-hierarchical information. Examples: model number, size, list price, color, flavor, street address

The measures to be analyzed depend on the

purpose of the OLAP system

MapReduce allows data to be

queried and processed on the server where it resides, instead of transporting the data across the network to be analyzed on the user's computer
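
The idea of processing data where it resides can be simulated in a single process. This is only a sketch: the `partitions` dictionary stands in for records stored on separate servers, and none of this uses the actual Hadoop API. A real MapReduce job also has a shuffle phase between map and reduce, which is collapsed here.

```python
from collections import Counter
from functools import reduce

# Hypothetical partitions: each list stands in for sales records on one server.
partitions = {
    "server_a": [("soda", 3), ("chips", 1), ("soda", 2)],
    "server_b": [("chips", 4), ("soda", 1)],
}

def map_local(records):
    """Map step, run where the data lives: summarize one partition."""
    partial = Counter()
    for item, units in records:
        partial[item] += units
    return partial

def reduce_partials(left, right):
    """Reduce step: merge two partial summaries into one."""
    return left + right

# Only the small partial summaries would need to cross the network.
partials = [map_local(records) for records in partitions.values()]
totals = reduce(reduce_partials, partials, Counter())
print(dict(totals))
```

Each partition is reduced to a handful of numbers before anything is combined, which is where the bandwidth saving comes from.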

A multidimensional data model is organized around a central theme, like sales transactions.

A fact table represents this theme.

Data can be grouped or combined in multidimensional matrices called

Data Cubes.

OLAP was named by

IBM's E. F. Codd (inventor of the relational database model)

Load

data is transferred into the central warehouse or data mart

Argued that all the "intelligence" in business intelligence results from

data mining

A multidimensional model views data in the form of a

data-cube.

Raw transactional data

not really useful for business intelligence

Roll-up or consolidation refers to data aggregation and computation in

one or more dimensions.

Ad hoc analysis

point-and-click drill-down is made usable by OLAP's rapid response model, which lets managers and analysts perform ad hoc analysis

Key OLTP characteristics

processes a transaction according to rules; performs all elements of a transaction in real time; continually processes multiple transactions

OLTP systems gather

raw data used for multidimensional analysis; raw data has to be converted into something suitable for analysis; converting raw data to something useful isn't easy

Structured data is easy for computers to

read and query such information, because the data is already standardized

The fact table contains the names of the facts (measures), as well as keys to each of the

related dimensional tables.

Predictive analytics

searches for a correlation between a single unit or factor, and the features that pertain to it

Slicing refers to

selecting a subset of the cube by choosing a single value for one of its dimensions and creating a smaller cube with one fewer dimension.
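
A slice can be sketched with a cube stored as a dictionary keyed by (product, time, location). The cube contents and the helper name `slice_cube` are hypothetical.

```python
# Hypothetical cube: (product, time, location) -> sales
cube = {
    ("p1", "t1", 1): 25,
    ("p1", "t2", 1): 8,
    ("p2", "t1", 1): 15,
    ("p1", "t1", 2): 30,
    ("p2", "t2", 2): 12,
}

def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }

# Slice on location = 1: the result has only product and time left.
location_1 = slice_cube(cube, axis=2, value=1)
print(location_1)
```

The input has three dimensions and the output has two, which matches the "one fewer dimension" part of the definition.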

Text analytics is useful for analyzing the

sentiment of social media posts, or online customer feedback

Companies now have access to smartphone metadata, internet usage records, social media activity. Business intelligence platforms

sift through this data to find patterns and trends.

Using a process known as extract, transform, and load (ETL), warehouses

standardize data across systems, which allows it to be queried

Extract is the step where unstructured data (such as notes, or author information) is

tagged with metadata to make it easier to find

Roll-up or consolidation: for instance,

the cube with cities is rolled up to regions to depict the data with respect to time (in periods) and item (material descriptions).
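
The city-to-region roll-up described above might look like the following. The city names, the `CITY_TO_REGION` mapping, and the sales figures are invented for the example.

```python
from collections import defaultdict

# Hypothetical hierarchy on the location dimension: city -> region.
CITY_TO_REGION = {"Austin": "South", "Dallas": "South", "Boston": "East"}

# Hypothetical cube cells: (city, period, item) -> units sold.
city_sales = {
    ("Austin", "Q1", "soda"): 10,
    ("Dallas", "Q1", "soda"): 6,
    ("Boston", "Q1", "soda"): 4,
    ("Austin", "Q2", "soda"): 9,
}

def roll_up(cells, mapping):
    """Aggregate the location dimension from cities up to regions."""
    rolled = defaultdict(int)
    for (city, period, item), units in cells.items():
        rolled[(mapping[city], period, item)] += units
    return dict(rolled)

region_sales = roll_up(city_sales, CITY_TO_REGION)
print(region_sales)
```

The time and item dimensions are untouched; only the location dimension moves one level up its hierarchy.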

Transform

the data is normalized

The analysis gap

the large gap between data businesses collect and the information that decision makers require

With Hadoop, only the question (the query) is

transferred across the network. The analysis is done on the server. The answer is brought back to the computer.

A good rule of thumb is that 80% of all data produced is

unstructured (messages, comments)

Text analytics software combs through

unstructured textual data to find patterns

The more often load is done, the more

up to date analytic reports will be

Data cubes can have

very large numbers of members

OLAP-based ad hoc analysis lets

virtually any question be answered quickly

The data cube method has a few alternative names or a few variants, such as

"Multidimensional databases," "materialized views," and "OLAP (On-Line Analytical Processing)."

Dimensions let you

"slice and dice" multidimensional data

Slice locid =

1 is shown

2 dimensions and

1 measure

Business intelligence systems have grown more powerful and comprehensive, mainly due to:

1) Increased data collection 2) Greater storage capacity

MapReduce is the arm of

Hadoop

OLTP systems can be used to

answer transactional questions

Drill-down operation helps users navigate through the

data details.

Dicing generates a

subcube by picking two or more values from multiple dimensions of the cube. The cube is rotated independent of its dimension, therefore users can analyze data from different viewpoints.
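
Dicing can be sketched the same way as slicing, except that several values are allowed on more than one dimension and no dimension is dropped. The cube contents and the helper `dice_cube` are hypothetical.

```python
# Hypothetical cube: (product, time, location) -> sales
cube = {
    ("p1", "t1", 1): 25,
    ("p1", "t2", 1): 8,
    ("p1", "t3", 1): 11,
    ("p2", "t1", 2): 30,
    ("p2", "t2", 3): 12,
    ("p2", "t3", 2): 5,
}

def dice_cube(cube, allowed):
    """Keep cells whose coordinate on each listed axis is in the allowed set."""
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }

# Dice: times t1 or t2 (axis 1) and locations 1 or 2 (axis 2).
subcube = dice_cube(cube, {1: {"t1", "t2"}, 2: {1, 2}})
print(subcube)
```

The result is still three-dimensional; it is simply a smaller block cut out of the original cube.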

Operational database

supports the day-to-day operations of a company. Example: lots of individual shoppers buying soda, with each transaction stored in a database designed to store checkout transactions

Four important properties of a measure:

1. Always a quantity or expression that yields a quantity
2. Can take any quantitative format
3. Can be derived from any original data source or calculation
4. At least one measure is required to perform OLAP analysis

Two tests for dimensionality:

1. Can data about members be compared? Example: sales numbers of one product compared to sales numbers of another product
2. Can data from members be aggregated into summaries? Example: Jan, Feb, and Mar aggregate together as Q1

Packaged systems have 2 big limitations:

1. Can only report on their own data, creating "silos" of data (e.g., sales, marketing, accounting, finance)
2. Don't really support multidimensional analysis

There are three main forms of business analytics:

1. Descriptive 2. Predictive 3. Decision

Data mining can be used to:

1. Group sets of data 2. Find outliers 3. Draw connections

All OLAP systems have to meet three key criteria:

1. Must support multidimensional analysis ("by" dimensions)
2. Fast retrieval times ("infinite question syndrome")
3. Calculation engine that can handle specialized multidimensional math (simple formulas)

With business analytics, by analyzing and drawing connections between data, companies can:

1. Predict future trends 2. Gain competitive advantages 3. Reveal unknown inefficiencies

The three basic operations in OLAP are:

1. Roll-up (Consolidation) 2. Drill-down 3. Slicing and dicing.

Data comes in three main forms:

1. Structured 2. Semi structured 3. Unstructured

The OLAP approach is used to analyze multidimensional data from

multiple sources and perspectives.

OLTP (online transaction processing)

Capturing and storing data from ERP, CRM, and POS systems; day-to-day business transactions; the main focus is on efficiency of routine tasks

When the OLAP pivot table wants to get information from the OLAP cube, it uses a language called

MDX

Relational database ->

OLAP Cube -> OLAP Pivot Table

The analysis gap between raw data and BI can be bridged by combining

OLTP systems with BI systems

Tabular representation (Think Hershey's Chocolate Bar)

On the top of the bar: Prid, Timeid, Locid, Sales

Modern BI systems are designed to follow the

Online Analytical Processing (OLAP) model

Multidimensional representation (Think Rubik's Cube)

Pid, Timeid, Locid

Extract

Raw data is extracted from a source program (such as CRM or ERP software)

Dimensions are the perspectives or entities concerning which an organization keeps records. For example,

a shop may create a sales data warehouse to keep records of the store's sales for the dimension time, item, and location. These dimensions allow the storage to keep track of things, for example, monthly sales of items and the locations at which the items were sold.

OLAP systems provide

ad hoc analysis, slicing and dicing, pivoting dimensions, and drilling down through hierarchies
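
Pivoting can be pictured as choosing which dimension runs down the rows and which runs across the columns of a cross-tab. The sketch below is a plain-Python stand-in for what an OLAP pivot table does; the `cells` data and the helper `crosstab` are hypothetical.

```python
# Hypothetical two-dimensional cells: (item, location) -> units sold.
cells = {
    ("soda", "Austin"): 10,
    ("soda", "Boston"): 4,
    ("chips", "Austin"): 7,
}

def crosstab(cells, row_axis, col_axis):
    """Arrange cells as {row_member: {column_member: total}}."""
    table = {}
    for key, value in cells.items():
        row = table.setdefault(key[row_axis], {})
        row[key[col_axis]] = row.get(key[col_axis], 0) + value
    return table

by_item = crosstab(cells, row_axis=0, col_axis=1)      # items as rows
by_location = crosstab(cells, row_axis=1, col_axis=0)  # pivoted: locations as rows
print(by_item)
print(by_location)
```

Both tables hold the same numbers; pivoting only changes the viewpoint from which they are read.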

Each level in the hierarchy is the

aggregate of the levels beneath it
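
A time hierarchy makes this concrete: quarters are sums of their months, and the level above is the sum of its quarters. The monthly figures below are invented, and only half a year is shown to keep the example short.

```python
# Hypothetical monthly sales (lowest level of the time hierarchy).
monthly_sales = {"Jan": 100, "Feb": 120, "Mar": 90, "Apr": 110, "May": 95, "Jun": 130}

# Which quarter each month belongs to (the next level up).
QUARTER_OF = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}

# Each quarter is the aggregate of the months beneath it.
quarterly_sales = {}
for month, amount in monthly_sales.items():
    quarter = QUARTER_OF[month]
    quarterly_sales[quarter] = quarterly_sales.get(quarter, 0) + amount

# The top level is the aggregate of the quarters beneath it.
half_year_total = sum(quarterly_sales.values())

print(quarterly_sales, half_year_total)
```

Reading the same structure from the top down is drill-down: starting at the total, then the quarters, then the individual months.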

Data mining is the

analysis of large sets of data in order to find patterns and correlations

Drilling down, for example, enables users to

analyze data in the 5 steps (virtual day) of the first period separately. The data can be divided with respect to DC, months (time) and item (material descriptions).

Decision analytics is the software that helps companies

analyze future industries and market spaces

A measure is the data that's being

analyzed across multiple dimensions

Decision analytics looks at a company's internal data, then

analyzes external conditions (such as manufacturing trends, or predicted supply shortages) to recommend the best course of action for a company

Measure

any quantitative expression contained in an OLAP system

A data cube is created from a subset of

attributes in the database.

Specific attributes are chosen to be measure attributes, i.e., the

attributes whose values are of interest.

Unstructured data is information that

can't be easily read by computers

Data in operational databases

can't easily be analyzed

OLTP systems can't be used to answer most analytics questions:

can't search, sort, and summarize large numbers of records; can't handle required calculations; negative impact on OLTP system performance

Data marts limit the complexity of databases, and are

cheaper to implement than full warehouses

Instead of centralized files, Hadoop uses a

cluster system that allows files to be stored on multiple servers

Data Warehouses are used to

consolidate disparate data in a central location

Highest level of OLAP structure is a

dimension: categorically consistent view of data

Each dimension has a table related to it, called a

dimensional table, which describes the dimension further.

View "by" qualifiers are usually

dimensions

A data cube enables data to be modeled and viewed in multiple dimensions. It is defined by

dimensions and facts (measures).

OLAP consists of

dimensions and measures

Other attributes are selected as

dimensions or functional attributes.

The measure attributes are aggregated according to the

dimensions.
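
Aggregating a measure attribute according to the dimensions can be sketched by summing it for every combination of dimensions, which is roughly what materializing a data cube means. The rows, the `DIMENSIONS` tuple, and the helper `build_cube` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("item", "location")   # dimension attributes
rows = [                            # units_sold is the measure attribute
    {"item": "soda", "location": "Austin", "units_sold": 10},
    {"item": "soda", "location": "Boston", "units_sold": 4},
    {"item": "chips", "location": "Austin", "units_sold": 7},
]

def build_cube(rows, dimensions, measure):
    """Sum the measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for group in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in group)] += row[measure]
            cube[group] = dict(totals)
    return cube

cube = build_cube(rows, DIMENSIONS, "units_sold")
print(cube[("item",)])       # totals by item
print(cube[("location",)])   # totals by location
print(cube[()])              # grand total
```

Because every grouping is computed once up front, later questions such as "units by location" become lookups instead of fresh scans of the raw rows.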

The insights from analytics reports influence company

direction, product lineups, and even hiring decisions

Load process can occur

every week, day, hour, or even minute

OLAP provides tools for users to

examine/filter dimensional data

The goal of predictive analytics is to

find the same correlation across different data sets, which would allow companies to infer future patterns from past trends

Dashboards are the

interfaces that represent specific analyses; there is no command-line interface

The first step in BI is taking

inventory of the data your company produces

Data analysis is the reason companies

invest in BI

Roll-up or consolidation

is actually performed on an OLAP cube

Hadoop can be complex to implement and run, and

is not well suited for ad hoc queries

Unstructured data is difficult to organize in traditional databases, because

it can't be stored in rows or columns

For example, a dimensional table for an item may contain the attributes:

item_name, brand, and type.

OLTP is optimized for managing

low-level business data

Hadoop is best suited for companies that produce

massive volumes of data

Stories are at the center of SAP Analytics Cloud (SAC), and the underlying data lies in the

measures and dimensions defined in the multi-dimensional data model of your data

In BI, measures are known by different names depending on the application:

metric or key performance indicator (KPI), benchmark, ratio

A data cube enables data to be

modeled and viewed in multiple dimensions.

Hierarchy example

months, quarters, years
