INFS 247 Quiz 3 Lectures 6-10 Prof. Eddington

¡Supera tus tareas y exámenes ahora con Quizwiz!

Data warehouses are used by different (____ ______) set of users than operational databases, such as data analysts and decision makers

much smaller

Data mining

"the process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques"

Why have a separate analytics DB?

1. Time Horizon Difference 2. Performance Reasons 3. Difference in Data Structuring

big data encompasses _____ _____ of stored data

80-90%

recency-frequency-monetary value (RFM)

A CRM metric and method for analyzing customer value. Recency: When was the most recent purchase by the customer? Frequency: How often does the customer purchase over a period of time? Monetary value: What is the monetary value of the customer's purchase over a period of time?

Business Intelligence

A broad term encompassing analytical use of data. Incorporates various methods (including database management, data warehousing, and data mining) that give companies ability to discover and utilize information they already own and turn it into the knowledge that directly impacts corporate performance.

Data Mart

A data store based on the same principle as a data warehouse, but with a more limited space.

Data Warehouse (EDW)

A system used for reporting and data analysis and is considered a core component of business intelligence

Slice and Dice

Adds, replaces, or eliminates specified dimension attributes (or particular values) from the already displayed result

Data Warehouses:

Are modeled and structured differently than operational databases. Use different types of technologies for storage and retrieval than operational databases.

SCM Automation

Enabled by the use of digital information and other technological advances

ETL includes the following tasks:

Extracting analytically useful data from the operational data sources Transforming such data so that it conforms to the structure of the target data warehouse model, while ensuring the quality of the transformed data Loading the transformed and quality assured data into the target data warehouse

Extraction-Transformation-Load (ETL)

Facilitates the retrieval of data from operational databases into the data warehouses

Example OLAP Query

For each individual store, show separately for male and female shoppers the number of product units sold for each product category

Analytical Information Examples

Information reflecting trends, sales, product statistics, and future growth productions

Operational (transactional) Information Examples

Information reflecting withdrawing cash from an ATM, making an airline reservation, and purchasing stocks

Executive dashboards

Intended for use by higher level decision makers within an organization contains an organized, easy-to-read display of a number of critically important queries describing the performance of the organization in general, the usage of executive dashboards should require little or no effort in training

Difference in data structuring

It is often impossible to structure a database which can be used in an efficient manner for both operational and analytical purposes

examples of sales support metrics and KPIs

Leads-to-conversions rate, average profit per customer in a time period, upsell revenues in a time period

Source systems can include external data sources, such as...

Market research data, census data, stock market data, weather data, geospatial data

Data warehouses and data mining

Mining transactional (operational) databases containing data related to current day-today organizational activities can be of use in certain situations, but generally isn't ideal. The enterprise-wide corporate data warehouse contains all the information from the operational data sources that has analytical value.

Front-end/BI applications

Most users who need to access analytical data in data warehouses and data marts can not engage in direct access (as they do not know how to use OLAP/BI tools). Instead, they use front-end/BI applications

examples of customer service metrics and KPIs

Number of complaints in a time period, customer ratings of service, average time to complete problem resolution

DW front-end (BI) applications

Often referred to as BI (Business Intelligence) applications. Provide access to the data warehouse for users who are engaging in indirect use.

Operational Databases collect...

Operational (transactional) Information

Source systems

Operational databases that provide analytically useful information for the data warehouse's subjects of analysis

MapReduce

Parallel computing divides complex tasks into a sequence of smaller tasks that are performed in parallel on multiple computers. Using multiple computers at the same time (parallel computing) vastly reduces the time needed for processing. Traditional parallel approaches are specialized (expensive) computers. Map reduce technique utilizes regular commodity (cheap) computers.

Association rules are expressed in the form:

People who buy X tend to buy Y

Mythical association rule

People who buy diapers tend to buy beer

OLAP/BI tool features

Slice and Dice Pivot (rotate) Drill Up/Drill Down

EDW components

Source systems (operational databases) Extraction-transformation-load (ETL) infrastructure Data warehouse (analytical database) Front-end (BI) applications

Time horizon difference

The difference between how long the data is useful for transactional vs. analytical needs.

Every operational database that is used as a source system for the data warehouse has two purposes:

The original operational purpose. As a source system for the data warehouse.

Performance Reasons

The performance of operational day-to-day tasks involving data use can be severely diminished if such tasks must compete for computing resources with analytical tasks

Online Analytical Processing (OLAP)/BI

The process of retrieving data from data warehouses and/or data marts for analytical purposes.

Online Transaction Processing (OLTP)

The process of updating (i.e. inserting, modifying, and deleting) and retrieving data from databases for operational purposes

Support measures...

The significance of the rule. We are interested in rules with relatively high support.

True or False: The same fact can have both an operational and analytical purpose

True

True or False: big data is not modeled up front for a pre-determined operational and/or analytical queries (retrievals)

True

supply chain

a network between a company, its suppliers, and its customers whose purpose is to produce and distribute the company's products and/or services

Data mining patterns should be:

accurate, meaningful, understandable, actionable

enable

actively and jointly manage all SCM-related processes (plan, make, source, deliver, return) and elements (such as business rules, contracts, regulations, risks, human resources), while measuring and monitoring SCM performance through usage of data and information systems

a supply chain consists of...

all of the steps it takes to get products and/or services to the customer

OLAP/BI tools provide _____ access to data in data warehouses and data marts

direct

Big data methods ___ ___ replace database and data warehousing approaches

do not

Business intelligence software

enables business users to see and use (analyze) large amounts of complex data: - database reporting tools (for retrieving data from databases) - OLAP/BI tools (for retrieving data from data warehouses and or/data marts) - data mining tools (for data mining)

Market basket analysis

finds groups of items that appear together in transactions

the bullwhip effect

occurs when product demand information passes throughout the supply chain in a distorted (delayed) way. can result in excessive or insufficient inventory.

supply chain logistics

often used as a synonym for "transportation" or "shipping" or "delivering." the part of the supply chain process that plans, implements, and controls the efficient, effective forward and reverse flow and storage of goods, services, and related information between the point of origin and the point of consumption

Analytical Information is based on...

operational (transactional) information

transactional structured data

operational databases - data modeled/structured and stored for anticipated pre-determined operational use

SCM levels of decision making

operational, tactical, strategic

return rate

percent of shipped items that are returned by customers. calculated by dividing the number of returned items by the number of all items sold. the goal is to decrease the rate. investigating this metric includes itemizing the reasons for returned items (unwanted, defective, damaged, late arriving) this serves to identify trends and address the underlying issues in order to reduce the rate of return.

back order rate

percentage of orders that cannot be filled at the time a customer places them. calculated by dividing the number of back orders by the number of all orders

examples of marketing support metrics and KPIs

promotional mail/e-mail response rate, increase in sales during the marketing campaign, increase in number of customers during the marketing campaign

Executive dashboards provide a _____ _____ to data in data warehouses and data marts

quick glance

Pivot (rotate)

reorganizes the values displayed in the original query result by moving values of a dimension column from one axis to another

Test promotions to subsets of different cells to determine...

response rates

source

schedule and execute necessary steps for obtaining materials, goods, and services from suppliers that are needed to produce a company's products and services

make

schedule and execute the necessary steps for creating a company's products and services

deliver

schedule and execute the necessary steps for delivering a company's products and services to their customers

return

schedule and execute the necessary steps for receiving returned products from their customers and for cancelled services by their customers

A data warehouse is created within an organization as a...

separate data store whose primary purpose is data analysis

Companies/organizations tend to store a lot of big data knowing that...

some of it may be of use later

Based on the point-and-click actions by the user of the OLAP/BI tool, the tool...

writes and executes the code in the language of the DBMS that hoses the data warehouse or data mart that is being queried

association rule

x → y

Data Warehouses keep...

years' worth of data in order to enable long-term analysis

We can't keep data in transaction-oriented databases for longer periods of time because...

it would cause a majority of database operations to deal with unnecessarily large amounts of old, no longer used data

Data lake

large data pool in which the schema and data requirements are not defined until the data is queried.

Operational databases can be used by a ____ _______ of employees, partners, and customers

large number

Processes run ____ _______ when performance deteriorates

less efficiently

customer relationship management (SCM)

leveraging existing customer data to - drive loyalty and engagement - maximize profitability CRM entails integration of - sales management - marketing - customer service

Regular transaction-oriented databases store data for a...

limited period of time, then the data loses its immediate usefulness and is archived

strategic

longer-term decisions. example: should we consider using overseas suppliers in the next 3 to 5 years

CRM objectives

maintain, enhance, and improve relationships with customers and potential customers. analyze customer behavior with the goal of: - increasing customer retention - driving sales growth and revenue - segmenting customer base - understanding seasonality - identifying potential new customers

Drill Up

makes the granularity of the data in the query result coarser

Drill Down

makes the granularity of the data in the query result finer

OLAP/BI tools allow users to retrieve needed data from data warehouses and data marts by using...

simple point-and-click mechanisms

while a data warehouse stores data in a predefined structure designed for analysis, a data lake is a...

storage repository that holds a vast amount of raw data in its native (original) format until it is needed.

each three-digit code is a...

cell

Data warehouses are...

central repositories of integrated data from one or more disparate sources

Data lake often serves as a...

corporate big data repository

OLAP/BI tools are designed for...

analysis of data in data warehouses and data marts

Analytical databases (data warehouses) collect...

analytical information

Big data methods allow organizations to...

analyze and get insight from the big data

unstructured/semi-structured, un-modeled data

big data

Standard database and data warehousing techniques ______ adequately deal with the diversity and volume of big data.

cannot

OLAP/BI tools analytical functionalities

creating and examining calculated data determining comparative or relative differences performing exception analysis, trend analysis, forecasting, and regression analysis data visualization functionalities

CRM software examples

customer service support, marketing and sales support (keeping track of birthdays and anniversaries), automation of customer service, sales, and marketing (automated email on your birthday)

Big data analysis can be performed on data in...

data lakes or directly on data in original sources

analytical structured data

data warehouses and data marts - data modeled/structured and stored for anticipated pre-determined analytical use

Data mining is an intersection of...

database management, artificial intelligence (machine learning), and statistics.

operational

day-to-day decsions. example: how much inventory to order today based on today's events

plan

develop and communicate plans for sourcing, making, delivering, and returning/cancelling a company's products/services

supply chain metrics and KPIs

examples: back order rate, turn over, return rate. calculated based on data in an organization's information systems. provided in reports, dashboards, etc.

OLAP drill up/drill down example

for each individual store, show separately for male and female shoppers the number of product units sold for each individual product in each category

OLAP slice and dice example

for stores 1 and 2, show separately for male and female shoppers the number of product units sold for each product category

supply chain operations reference (SCOR) model

framework for supply chain management. includes the following processes: - plan - source - make - deliver - return - enable

OLAP/BI tool data visualization

graphically visualizing the answers/analytics results

marketing collateral

hard dopy or electronic document informing potential customers about their products or services

Front-end/BI applications provide _____ access to data in data warehouses and data marts

indirect

Analytical Information

information collected and used in support of analytical tasks

Operational Information (Transactional Information)

information collected and used in support of day-to-day operational needs in businesses and other organizations

big data

massive volumes of diverse and rapidly growing data that are not formally modeled for efficient retrieval. big data is heterogeneous originating from various sources such as smart devices, social media, sensors. big data analytics is a part of overall data strategy, not a separate isolated initiative.

inventory turnover rate

measures how many times a year a company sells its entire inventory. calculated by dividing the cost of goods sold in a period by value of average inventory. higher rates indicates fewer days to sell the inventory. generally, the goal is to increase the rate.

tactical

medium-term decisions. example: what terms to include in the annual contract with our supplier for the upcoming year

cost of back orders

much higher rate of cancellations for back ordered products backorder notification cost loss of goodwill and future sales to current customers lost cost of initial customer acquisition

A typical data warehouse ________ retrieves selected analytically useful data from the operational data sources

periodically

Data-driven customer interaction

personalized marketing collateral: - form fields pull relevant information from database and populate marketing collateral. - standard personalization: name, location, age - advanced personalization: customized promotion based on customer behavior - used in personal selling, direct mail, online marketing - dependent on timeliness and availability of data

The data warehouse is sometimes referred to as the _____ _______ to indicate the fact that it is a destination for the data from the source systems

target system

Data mining leads to...

the discovery of novel and interesting patterns in large amounts of data and is concerned with predictive analytics (using past data to predict future events)

Big data methods allow organizations to analyze and get insight from...

the kinds of data that are not suited for regular database and data warehouse technologies

Upstream Supply Chain

the portion of the supply chain from raw materials to the production facility

downstream supply chain

the portion of the supply chain from the production facility to the end-customer

Confidence measures...

the strength of the correlation. Rules with low confidence are not meaningful, even if their support is high.

Each customer has a...

three-digit RFM code.

Assigning RFM score to customers

top 20% assigned a value of "5" next 20% assigned a value of "4" each customer gets an R, F, and M score between 1 (lowest) and 5 (highest)

Types of data stored in corporations:

transactional structured data, analytical structured data, unstructured/semi-structured, un-modeled data

Correlations are expressed in the form:

transactions that contain X are likely to contain Y


Conjuntos de estudio relacionados

Ch.2 Instructions: Language of the Computer

View Set

KCC-ZOO141 CH.4 Chapter Review Questions Fall17

View Set

The Process of Technological Change

View Set