Bus Adm Exam 2 Chapter 5

variety, veracity, volume, velocity

4 common characteristics of big data

estimation analysis, affinity grouping analysis, cluster analysis, classification analysis

4 common data-mining techniques used to perform advanced analytics such as Netflix's Cinematch:

social media analysis

A big data advanced analytics technique. Analyzes text flowing across the Internet, including unstructured text from blogs and messages

web analysis

A big data advanced analytics technique. Analyzes unstructured data associated with websites to identify consumer behavior and website navigation

text analysis

A big data advanced analytics technique. Analyzes unstructured data to find trends and patterns in words and sentences. Text mining a firm's customer support email might identify which customer service representative is best able to handle the question, allowing the system to forward it to the right person.
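
The email-routing idea above can be sketched with simple keyword counting. This is only an illustration: the ROUTES table, the team names, and the route_email function are hypothetical, and real text mining would use far richer language processing.

```python
import re
from collections import Counter

# Hypothetical keyword lists per support team (an assumption for this sketch).
ROUTES = {
    "billing": {"invoice", "refund", "charge"},
    "technical": {"error", "crash", "login"},
}

def route_email(text):
    """Return the team whose keywords appear most often in the email text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {team: sum(words[w] for w in keywords)
              for team, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    # Fall back to a general queue when no keyword matches at all.
    return best if scores[best] > 0 else "general"

team = route_email("The app shows an error after login")
```

Here team is "technical" because two of that team's keywords appear and none of the billing keywords do.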

correlation analysis

A big data advanced analytics technique. Determines a statistical relationship between variables, often for the purpose of identifying predictive factors among the variables
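
One common way to measure a statistical relationship between two variables is the Pearson correlation coefficient. The sketch below is a minimal illustration; the function name pearson_r and the sample figures are assumptions, not part of the course material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up data: sales rise in lockstep with ad spend (sales = 2 * spend + 5).
ad_spend = [10, 20, 30, 40, 50]
weekly_sales = [25, 45, 65, 85, 105]
r = pearson_r(ad_spend, weekly_sales)
```

A value of r near 1 suggests the two variables move together, near -1 that they move in opposite directions, and near 0 that there is little linear relationship. Correlation alone does not show that one variable causes the other.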

exploratory data analysis

A big data advanced analytics technique. Identifies patterns in data, including outliers, uncovering the underlying structure to understand relationships between the variables

pattern recognition analysis

A big data advanced analytics technique. The classification or labeling of an identified pattern in the machine learning process

speech analysis

A big data advanced analytics technique. The process of analyzing recorded calls to gather information; brings structure to customer interactions and exposes information buried in customer contact center interactions with an enterprise. This is heavily used in the customer service department to help improve processes by identifying angry customers and routing them to the appropriate customer service representative.

behavioral analysis

A big data advanced analytics technique. Using data about people's behaviors to understand intent and predict future actions

false

Anomaly detection does not help to identify outliers in the data that can cause problems with mathematical modeling.

classification analysis vs cluster analysis

Classification analysis is similar to cluster analysis because it segments data into distinct segments called classes; however, unlike cluster analysis, a classification analysis requires that all classes are defined before the analysis begins. Cluster analysis is exploratory, whereas classification analysis is much less exploratory and focuses on assigning records to predefined groups.

data

Data mining element. Foundation for data-directed decision making.

discovery

Data mining element. Process of identifying new patterns, trends, and insights.

deployment

Data mining element. Process of implementing discoveries to drive success.

analyzing customer buying patterns and future marketing and promotion campaigns, building budgets and other financial information, detecting fraud by identifying deceptive spending patterns, finding the best customers who spend the most money, keeping customers from leaving or migrating to competitors, promoting and hiring employees to ensure success for both the company and the individual.

Data mining uncovers patterns and trends for business analysis such as:

distributed computing and virtualization

The 2 primary computing models that have shaped the collection of big data:

data, discovery, deployment

The 3 elements of data mining include:

optimization model, forecasting model, regression model

The 3 primary data mining modeling techniques for predictions

Business understanding, data understanding, data preparation, data modeling, evaluation, deployment

The 6 primary phases in the data-mining process

velocity

The analysis of streaming data as it travels around the Internet. Analysis is necessary of social media messages spreading globally.

to quickly gather and mine structured and unstructured data so action can be taken

The goal of fast data

False - decrease

The increase in price of storage and computer memory allows companies to leverage data that would have been inconceivable to collect only 10 years ago.

true

With big data, information is multidimensional, meaning it contains layers of columns and rows.

data artist

a business analytics specialist who uses visual tools to help people understand complex data

big data

a collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools. One of the latest trends emerging from the convergence of technological factors.

recommendation engine

a data-mining algorithm that analyzes a customer's purchases and actions on a website and then uses the data to recommend complementary products

dimension

a particular attribute of information

prediction

a statement about what will happen or might happen in the future.

outlier

data value that is numerically distant from most of the other data points in a set of data

data visualization

describes technologies that allow users to see or visualize data to transform information into a business perspective. It is a powerful way to simplify complex data sets by placing data in a format that is easily grasped and understood far quicker than the raw data alone.

petabyte

equivalent to 20 million four-drawer file cabinets filled with text files or 14 years of HDTV content

satellite images, photographic data, video data, social media data, text messages, voice mail data

examples of unstructured data:

data scientist

extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information

infographics

present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format. They are engaging and quickly convey a story users can understand.

distributed computing

processes and manages algorithms across many machines in a computing environment. Allows individual computers to be networked together across geographical areas and work together to execute a workload or computing processes as if they were a single computing environment.

multidimensional analysis

slice-and-dice techniques for viewing multidimensional information
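
Slicing can be illustrated with a tiny cube stored as a dictionary. Everything here is an assumption made for the sketch: the three dimensions (product, region, month), the unit counts, and the helper slice_total.

```python
# Each cell of the hypothetical cube is keyed by (product, region, month).
DIMENSIONS = ("product", "region", "month")
cube = {
    ("laptop", "east", "jan"): 10,
    ("laptop", "west", "jan"): 7,
    ("phone", "east", "jan"): 12,
    ("phone", "east", "feb"): 9,
}

def slice_total(cube, **fixed):
    """Sum the cells whose dimensions match every fixed value."""
    total = 0
    for key, units in cube.items():
        cell = dict(zip(DIMENSIONS, key))
        if all(cell[dim] == value for dim, value in fixed.items()):
            total += units
    return total

east_units = slice_total(cube, region="east")                     # fix one dimension
phone_jan_units = slice_total(cube, product="phone", month="jan") # fix two dimensions
```

Fixing one dimension gives a slice of the cube; fixing several at once narrows the view further, which is the "dice" part of slice-and-dice.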

behavioral analysis, correlation analysis, exploratory data analysis, pattern recognition analysis, social media analysis, speech analysis, text analysis, web analysis

techniques a data scientist will use to perform big data advanced analytics

fast data

the application of big data analytics to smaller data sets in near-real-time or real-time in order to solve a problem or create business value. Often associated with business intelligence.

virtualization

the creation of a virtual (rather than actual) version of computing resources, such as an operating system, a server, a storage device, or network resources.

data mining

the process of analyzing data to extract information not offered by the raw data alone. Can also begin at a summary information level (coarse granularity) and progress through increasing levels of detail (drilling down) or the reverse (drilling up). Companies use these techniques to compile a complete picture of their operations, all within a single view, allowing them to identify the trends and improve forecasts.

anomaly detection

the process of identifying rare or unexpected items or events in a data set that do not conform to other items in the data set.
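
A minimal sketch of anomaly detection on a single numeric column, using the modified z-score (based on the median and the median absolute deviation). The choice of this particular method, the 3.5 threshold, and the function name find_outliers are assumptions for illustration; many other detection methods exist.

```python
from statistics import median

def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # at least half the values equal the median; nothing to scale by
    return [v for v in values
            if abs(0.6745 * (v - med) / mad) > threshold]

daily_orders = [52, 48, 50, 51, 49, 50, 250]
suspicious = find_outliers(daily_orders)
```

With this made-up data only the value 250 is flagged, so it can be reviewed before it distorts a model built on the rest of the data.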

analytics

the science of fact-based decision making. Uses software-based algorithms and statistics to derive meaning from data.

business intelligence dashboards

track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis

data mining tools

use a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making.

advanced analytics

uses data patterns to make forward-looking predictions to explain to the organization where it is headed.

regression model

A data mining modeling technique for predictions. A statistical process for estimating the relationships among variables. These models include many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.
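
The simplest regression model has one independent variable and fits a straight line by ordinary least squares. The sketch below assumes that case; fit_line and the sample numbers are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Independent variable: years of experience. Dependent variable: units sold.
years = [1, 2, 3, 4]
units = [3, 5, 7, 9]
slope, intercept = fit_line(years, units)
predicted_for_year_5 = slope * 5 + intercept
```

The fitted slope estimates how much the dependent variable changes for each one-unit change in the independent variable.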

optimization model

A data mining modeling technique for predictions. A statistical process that finds the way to make a design, system, or decision as effective as possible, for example, finding the values of controllable variables that determine the maximal productivity or minimal waste.
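
As a toy illustration of finding the values of controllable variables that maximize an outcome, the sketch below tries every feasible product mix for a bakery with limited ingredients. The recipe quantities, profits, and the function best_product_mix are all made up; real optimization models typically rely on dedicated solvers rather than brute force.

```python
def best_product_mix(flour, sugar):
    """Return (profit, cakes, cookies) for the most profitable feasible mix.

    Assumed recipe: a cake uses 2 flour and 1 sugar and earns 5;
    a cookie batch uses 1 flour and 1 sugar and earns 3.
    """
    best = (0, 0, 0)
    for cakes in range(flour // 2 + 1):
        for cookies in range(sugar + 1):
            flour_used = 2 * cakes + cookies
            sugar_used = cakes + cookies
            if flour_used <= flour and sugar_used <= sugar:
                profit = 5 * cakes + 3 * cookies
                if profit > best[0]:
                    best = (profit, cakes, cookies)
    return best

profit, cakes, cookies = best_product_mix(flour=10, sugar=6)
```

With 10 units of flour and 6 of sugar, the best mix under these assumptions is 4 cakes and 2 cookie batches for a profit of 26.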

forecasting model

A data mining modeling technique for predictions. Time-series information is time-stamped information collected at a particular frequency. Forecasts are predictions based on time-series information allowing users to manipulate the time series for forecasting activities.
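
One very simple way to forecast from time-series information is a moving average of the most recent periods. This is a sketch under that assumption; the window size, the sample sales figures, and the function name are illustrative.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    if window < 1 or len(series) < window:
        raise ValueError("series must contain at least `window` observations")
    return sum(series[-window:]) / window

# Time-stamped information collected at a monthly frequency (made-up values).
monthly_sales = [100, 120, 110, 130, 150]
next_month = moving_average_forecast(monthly_sales)
```

Here the forecast is the average of the last three months, 130.0. Changing the window is one way a user might manipulate the time series to see how the forecast responds.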

cluster analysis

A data mining technique. A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible. Identifies similarities and differences among data sets. (ex: target marketing based on zip code)
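
The grouping idea can be sketched with k-means on a single numeric attribute. The algorithm choice, the starting centroids, and the customer ages are assumptions made for illustration; the flashcard does not name a specific clustering algorithm.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Group numbers around the nearest centroid, then recompute centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return clusters, centroids

customer_ages = [18, 19, 21, 22, 45, 47, 50]
groups, centers = kmeans_1d(customer_ages, centroids=[18, 50])
```

No group labels are supplied in advance: the two segments (roughly "around 20" and "around 47") emerge from the data, which is what makes cluster analysis exploratory.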

Affinity grouping analysis

A data mining technique. Reveals the relationship between variables along with the nature and frequency of the relationships. Many people refer to these algorithms as association rule generators because they create rules to determine the likelihood of events occurring together at a particular time or following each other in a logical progression. Percentages usually reflect the patterns of these events (55% of the time, events A and B occurred together.)
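
The percentages mentioned above can be computed directly from transaction data. In this sketch the basket contents and the two helper functions are hypothetical; "support" and "confidence" are the usual names for these two measures in association rule mining.

```python
def support(transactions, a, b):
    """Share of all transactions that contain both items."""
    both = sum(1 for t in transactions if a in t and b in t)
    return both / len(transactions)

def confidence(transactions, a, b):
    """Of the transactions containing a, the share that also contain b."""
    with_a = [t for t in transactions if a in t]
    if not with_a:
        return 0.0
    return sum(1 for t in with_a if b in t) / len(with_a)

baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
]
together = support(baskets, "bread", "butter")               # 2 of 4 baskets
butter_given_bread = confidence(baskets, "bread", "butter")  # 2 of 3 bread baskets
```

In this made-up data, bread and butter occur together in 50% of baskets, and about 67% of baskets with bread also contain butter.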

classification analysis

A data mining technique. The process of organizing data into categories or groups for its most effective and efficient use. The primary goal of this is not to explore data to find interesting segments, but to decide the best way to classify records.
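
A minimal sketch of the key point that classes are defined before the analysis begins. The tier names and spending thresholds below are invented for illustration; real classification models usually learn their rules from labeled examples rather than fixed cutoffs.

```python
# Classes are fixed up front (hypothetical tiers), highest minimum first.
CLASSES = [("gold", 1000), ("silver", 500), ("bronze", 0)]

def classify(annual_spend):
    """Assign a customer record to one of the predefined classes."""
    for label, minimum in CLASSES:
        if annual_spend >= minimum:
            return label
    raise ValueError("annual_spend must be non-negative")

labels = [classify(spend) for spend in (1200, 500, 20)]
```

Every record lands in one of the three classes decided in advance; no new segments are discovered along the way.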

data profiling

A data-mining analysis technique. The process of collecting statistics and information about data in an existing source. Insights extracted using this method can determine how easy or difficult it will be to use existing data for other purposes along with providing metrics on data quality.
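
Collecting basic statistics about an existing source might look like the sketch below, which counts rows, missing values, and distinct values per column. The sample records and the profile function are assumptions; dedicated profiling tools report many more metrics.

```python
def profile(rows):
    """Per-column row count, missing-value count, and distinct-value count."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing (an assumption of this sketch).
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report

customers = [
    {"id": 1, "city": "Austin"},
    {"id": 2, "city": ""},
    {"id": 3, "city": "Austin"},
    {"id": 4, "city": None},
]
report = profile(customers)
```

For this made-up data the report shows that half of the city values are missing, a data quality metric that hints at how hard the source would be to reuse.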

data replication

A data-mining analysis technique. The process of sharing information to ensure consistency between multiple data sources.

data understanding

A data-mining phase. Analysis of all current data along with identifying any data quality issues. Activities: gather data, describe data, explore data, verify data quality

evaluation

A data-mining phase. Analyze the trends and patterns to assess the potential for solving the business problem. Activities: evaluate results, review process, determine next steps

deployment

A data-mining phase. Deploy the discoveries to the organization for work in everyday business. Activities: plan deployment, monitor deployment, analyze results, review final reports

business understanding

A data-mining phase. Gain a clear understanding of the business problem that must be solved and how it impacts the company. Activities: identify business goals, situation assessment, define data-mining goals, create project plan

data preparation

A data-mining phase. Gather and organize the data in the correct formats and structures for analysis. Activities: select data, cleanse data, integrate data, format data

data modeling

A data-mining phase. Apply mathematical techniques to identify trends and patterns in the data. Activities: select modeling technique, design tests, build models

estimation analysis

A data-mining technique. Determines values for an unknown continuous variable behavior or estimated future value. These models can predict numeric outcomes based on historical data. (ex: the percentage of students that will graduate high school based on income level.)

false

Big data is one of the least promising technology trends occurring today.

data mining, data analysis, data visualization

Business focus areas of big data:

variety

Different forms of structured and unstructured data. Data from spreadsheets and databases as well as from email, videos, photos, and PDFs, all of which must be analyzed.

true

Each layer in big data represents information according to an additional dimension.

False

Estimation analysis is one of the most expensive modeling techniques.

web visits per hour, sales per month, customer service calls per day

Examples of forecasting model:

determine which products to produce given a limited amount of ingredients, choose a combination of projects to maximize overall earnings.

Examples of optimization model:

sensor data, weblog data, financial data, click-stream data, point of sale data, accounting data

Examples of structured data:

True

Improvements in network speed and network reliability have removed the physical limitations of being able to manage massive amounts of data at an acceptable pace.

true

One of the key advantages of performing advanced analytics is to detect anomalies in the data to ensure they are not used in models creating false results.

market basket analysis

One of the most common forms of association detection analysis. Evaluates such items as websites and checkout scanner information to detect the customers' buying behavior and predict future behavior by identifying affinities among consumers' choices of products and services. Frequently used to develop marketing campaigns for cross-selling products and services and for inventory control, shelf-product placement, and other retail and marketing applications.

volume

The scale of data. Includes enormous volumes of data generated daily. Massive volume created by machines and networks. Big data tools necessary to analyze zettabytes and brontobytes.

veracity

The uncertainty of data, including biases, noise, and abnormalities. Uncertainty or untrustworthiness of data. Data must be meaningful to the problem being analyzed. Must keep data clean and implement processes to keep dirty data from accumulating in the system.

Time-series information

Time-stamped information collected at a particular frequency

true

Traditional bar graphs and pie charts are boring and at best confusing and at worst misleading.

cube

common term for the representation of multidimensional information

algorithms

mathematical formulas placed in software that perform an analysis on a data set

data visualization tools

move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more. Can help uncover correlations and trends in data that would otherwise go unrecognized.

analysis paralysis

occurs when the user goes into an emotional state of over-analysis (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome. This is a growing problem in the time of big data.

