Chapter 8: Understanding Big Data & Its Impact on Business

Business Focus Areas of Big Data

-Data mining -Data analysis -Data visualization

text analysis

-analyzes unstructured data to find trends & patterns in words & sentences -text mining a firm's customer support email might identify which customer service representative is best able to handle the question, allowing the system to forward it to the right person

big data

-collection of large, complex data sets, including structured & unstructured data -cannot be analyzed using traditional database methods & tools

1. Variety

-different forms of structured & unstructured data -data from spreadsheets & databases as well as from email, videos, photos, & PDFs, all of which have to be analyzed

Data-Mining Techniques

-estimation analysis -affinity grouping analysis -cluster analysis -classification analysis

speech analysis

-process of analyzing recorded calls to gather info; brings structure to customer interactions & exposes info buried in customer contact center interactions w/ an enterprise -heavily used in the customer service department to help improve processes by identifying angry customers & routing them to the appropriate customer service representative

unstructured data examples

-satellite images -photographic data -video data -social media data -text messages -voice mail data

3. Volume

-scale of data -includes enormous volumes of data generated daily -massive volume created by machines & networks -big data tools necessary to analyze zettabytes & brontobytes

structured data examples

-sensor data -Weblog data -financial data -click-stream data -point of sale data -accounting data

regression model

-statistical process for estimating the relationships among variables -includes many techniques for modeling & analyzing several variables when the focus is on the relationship btw a dependent variable & one or more independent variables
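
As a rough illustration of the dependent/independent relationship, here is a minimal sketch of the simplest regression technique, a least-squares straight line with one independent variable. The function name fit_line and the ad-spend/sales numbers are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: ad spend (independent) vs. sales (dependent).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

The slope estimates how much the dependent variable changes per unit of the independent variable.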

forecasting model

-time-series info is time stamped info collected at a particular frequency -forecasts are predictions based on time-series info allowing users to manipulate the time series for forecasting activities
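
One simple way to turn time-series info into a forecast is a moving average. This sketch assumes evenly spaced observations; the function name, window size, and monthly sales figures are illustrative only, and the card itself does not name a specific forecasting method.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("need at least `window` observations")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical time series: one sales figure per month, oldest first.
monthly_sales = [100, 110, 120, 130]
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```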

Virtualization examples

-traditional computing environment: application, operating system, server -virtualized computing environment: multiple applications, operating system, server

2. Veracity

-uncertainty of data, including biases, noise, & abnormalities -uncertainty or untrustworthiness of data -data must be meaningful to the problem being analyzed -must keep data clean & implement processes to keep dirty data from accumulating in systems

Data-Mining Process Model Overview

1. Business understanding 2. Data understanding 3. Data preparation 4. Data modeling 5. Evaluation 6. Deployment

4 Common Characteristics of Big Data

1. Variety 2. Veracity 3. Volume 4. Velocity

Three Elements of Data Mining

1. data 2. discovery 3. deployment

Classification analysis example

Age -Young->student->yes/no -Old->credit score->yes/no
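
The tree above can be sketched as nested conditions. The card does not say which branch leads to yes or no, so this sketch assumes (hypothetically) that young students and older people with a good credit score get "yes".

```python
def classify(age_group, is_student=False, credit_score="fair"):
    """Walk the two-level tree: split on age, then on student status or credit."""
    if age_group == "young":
        # Young branch: the next question is student status.
        return "yes" if is_student else "no"
    if age_group == "old":
        # Old branch: the next question is credit score (assumed labels).
        return "yes" if credit_score == "good" else "no"
    raise ValueError("age_group must be 'young' or 'old'")


print(classify("young", is_student=True))
print(classify("old", credit_score="fair"))
```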

IoT

Internet of Things

recommendation engine

a data mining algorithm that analyzes a customer's purchases & actions on a website & then uses the data to recommend complementary products
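
A minimal sketch of the idea, assuming the engine only counts which products show up in the same order as the product being viewed. Real engines use far richer signals; the order history and function name here are hypothetical.

```python
from collections import Counter


def recommend(orders, item, top_n=2):
    """Return the products most often bought together with `item`."""
    together = Counter()
    for order in orders:
        if item in order:
            together.update(p for p in order if p != item)
    return [product for product, _ in together.most_common(top_n)]


# Hypothetical order history; each inner list is one customer's order.
orders = [
    ["camera", "sd card", "tripod"],
    ["camera", "sd card"],
    ["camera", "bag"],
    ["laptop", "bag"],
]
print(recommend(orders, "camera", top_n=1))
```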

2. Data understanding

analysis of all current data along w/ identifying any data quality issues & activities include: -gather data -describe data -explore data -verify data quality

Techniques used by data scientist to perform big data advanced analytics

analytics include: -behavioral analysis -correlation analysis -exploratory data analysis -pattern recognition analysis -social media analysis -speech analysis -text analysis -web analysis

5. Evaluation

analyze the trends & patterns to assess the potential for solving the business problem & activities include: -evaluate results -review process -determine next steps

social media analysis

analyzes text flowing across the internet, including unstructured text from blogs & messages

web analysis

analyzes unstructured data associated w/ websites to identify consumer behavior & website navigation

fast data

application of big data analytics to smaller data sets in near-real time or real time in order to solve a problem or create business value

4. Data modeling

apply mathematical techniques to identify trends & patterns in the data & activities include: -select modeling technique -design tests -build models

data artist

business analytics specialist who uses visual tools to help ppl understand complex data

pattern recognition analysis

classification/labeling of an identified pattern in the machine learning process

cube

common term for the representation of multidimensional info

virtualization

creation of a virtual version of computing resources, such as an operating system, a server, a storage device, or network resources

outlier

data value that is numerically distant from most of the other data points in a set of data
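
"Numerically distant" needs a yardstick. One common choice, assumed here, is the z-score: how many standard deviations a value sits from the mean. The threshold of 2 and the sample data are arbitrary picks for illustration.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds `threshold`."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, so nothing is distant
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical daily order counts with one suspicious spike.
daily_orders = [10, 12, 11, 13, 12, 11, 10, 95]
print(find_outliers(daily_orders))
```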

6. Deployment

deploy the discoveries to the organization for work in everyday business & activities include: -plan deployment -monitor deployment -analyze results -review final reports

data visualization

describes technologies that allow users to see/visualize data to transform info into a business perspective

correlation analysis

determines a statistical relationship btw variables, often for the purpose of identifying predictive factors among the variables
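
A standard measure of such a relationship is the Pearson correlation coefficient, which runs from -1 (perfect inverse) to +1 (perfect direct). The sketch below computes it by hand; the card does not name a specific measure, so treating correlation as Pearson's r is an assumption, and the data is invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical: store visits vs. purchases move together perfectly here.
visits = [1, 2, 3, 4]
purchases = [2, 4, 6, 8]
print(pearson_r(visits, purchases))
```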

estimation analysis

determines values for an unknown continuous variable behavior or estimated future value

market basket analysis

evaluates such items as websites & checkout scanner info to detect customers' buying behavior & predict future behavior by identifying affinities among customers' choices of products & services
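
Affinities among products are often quantified with support (how often items appear together) and confidence (how often the second item appears given the first). These two measures are a common convention rather than something the card specifies, and the baskets below are invented.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= set(basket)) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also hold `consequent`."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0  # antecedent never occurs, so no rule can be measured
    return support(baskets, set(antecedent) | set(consequent)) / base


# Hypothetical checkout-scanner data: one set per customer basket.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]
print(support(baskets, {"bread", "butter"}))
print(confidence(baskets, {"bread"}, {"butter"}))
```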

data scientist

extracts knowledge from data by performing statistical analysis, data mining, & advanced analytics on big data to identify trends, market changes, & other relevant info

1. data

foundation for data-directed decision making

1. Business understanding

gain a clear understanding of the business problem that must be solved & how it impacts the company & activities include: -identify business goals -situation assessment -define data-mining goals -create project plan

3. Data preparation

gather & organize data in the correct formats & structures for analysis & activities include: -select data -cleanse data -integrate data -format data

exploratory data analysis

identifies patterns in data, including outliers, uncovering the underlying structure to understand relationships btw the variables

infographics

information graphics-present the results of data analysis, displaying the patterns, relationships, & trends in a graphical format

M2M

machine-to-machine communication

algorithms

mathematical formulas placed in software that performs an analysis on a data set

data visualization tools

move beyond Excel graphs & charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, & more

analysis paralysis

occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision/action is never taken

Data mining modeling techniques for predictions

prediction models (3) 1. optimization model 2. forecasting model 3. regression model

data mining

process of analyzing data to extract info not offered by the raw data alone & uncovers patterns & trends for business analysis such as -analyzing customer buying patterns to predict future marketing & promotion campaigns -building budgets & other financial info -detecting fraud by identifying deceptive spending patterns -finding the best customers who spend the most money -keeping customers from leaving/migrating to competitors -promoting & hiring employees to ensure success for both the company & the individual

data profiling

process of collecting statistics & info about data in an existing source
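
A small sketch of what profiling one column might collect. The choice of statistics (row count, missing values, distinct values, min, max) and the use of None for a missing value are assumptions made for this example.

```python
def profile_column(values):
    """Collect basic statistics about one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical column of customer ages with one missing entry.
print(profile_column([34, None, 51, 34]))
```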

2. discovery

process of identifying new patterns, trends, & insights

anomaly detection

process of identifying rare/unexpected items/events in a data set that do not conform to other items in the data set

3. deployment

process of implementing discoveries to drive success

classification analysis

process of organizing data into categories or groups for its most effective & efficient use (e.g., groups by political affiliation or charity donors)

data replication

process of sharing info to ensure consistency btw multiple data sources

distributed computing

processes & manages algorithms across many machines in a computing environment

affinity grouping analysis

reveals the relationship btw variables along w/ the nature & frequency of the relationships

analytics

science of fact-based decision making

Distributed Computing Environment

servers connect to the internet<-->distributed computing environment<-->computer desktops

prediction

statement abt what will happen/might happen in the future, for ex, predicting future sales/employee turnover

optimization model

statistical process that finds the way to make a design, system, or decision as effective as possible, for ex, finding the values of controllable variables that determine maximal productivity or minimal waste
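
To make "finding the values of controllable variables" concrete, this sketch tries every candidate value and keeps the one with the highest score. Exhaustive search is only one simple approach, and the productivity formula is entirely hypothetical.

```python
def best_setting(objective, candidates):
    """Return the candidate value that maximizes `objective`."""
    return max(candidates, key=objective)


def productivity(shift_hours):
    """Hypothetical model: output peaks at an 8-hour shift, then declines."""
    return 100 - (shift_hours - 8) ** 2


# Controllable variable: shift length, tried from 0 to 12 hours.
best = best_setting(productivity, range(0, 13))
print(best)
```

To minimize waste instead, pass an objective that returns the negative of the waste.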

cluster analysis

technique used to divide an info set into mutually exclusive groups such that the members of each group are as close together as possible to one another & the diff groups are as far apart as possible
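
One widely used clustering technique is k-means, sketched here for one-dimensional data. The card does not name an algorithm, so k-means is a swapped-in example; the starting centers and customer-spend values are invented.

```python
def kmeans_1d(points, centers, iterations=10):
    """Assign each point to its nearest center, then move centers to group means."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # An empty group keeps its old center so the division never fails.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers


# Hypothetical monthly spend per customer: a low group and a high group.
spend = [1, 2, 3, 10, 11, 12]
groups, centers = kmeans_1d(spend, centers=[0.0, 5.0])
print(groups, centers)
```

Each point lands in exactly one group, which matches the "mutually exclusive" requirement in the definition.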

Data Mining Process Model Activities

the 6 phases

business intelligence dashboards

track corporate metrics such as critical success factors & key performance indicators & include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis

data mining tools

use a variety of techniques to find patterns & relationships in large volumes of info that predict future behavior & guide decision making

behavioral analysis

using data abt people's behaviors to understand intent & predict future actions

