MIS Exam #1 Chapter 8
data artist
a business analytics specialist who uses visual tools to help people understand complex data
big data
a collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools
recommendation engine
a data-mining algorithm that analyzes a customer's purchases and actions on a website and then uses the data to recommend complementary products
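A minimal sketch of the idea, assuming a simple "bought together" count stands in for the data-mining algorithm. The `recommend` function and the sample baskets are hypothetical and only illustrate the definition above.

```python
from collections import Counter

def recommend(purchases, item, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    co_counts = Counter()
    for basket in purchases:
        if item in basket:
            # Count every other product that shares a basket with `item`
            co_counts.update(p for p in set(basket) if p != item)
    # Highest count first; ties broken alphabetically for a stable order
    ranked = sorted(co_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

# Hypothetical purchase histories, one list per customer visit
purchases = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "stove"],
    ["lantern", "batteries"],
]
print(recommend(purchases, "tent"))
```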
prediction
a statement about what will happen or might happen in the future
regression model
a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables
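As one possible illustration, the sketch below fits a straight line by ordinary least squares with a single independent variable. The function name `fit_line` and the ad-spend and sales figures are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: ad spend (independent) and sales (dependent)
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```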
optimization model
a statistical process that finds the way to make a design, system, or decision as effective as possible, for example, finding the values of controllable variables that determine maximal productivity or minimal waste.
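A rough sketch of finding the value of a controllable variable that maximizes an outcome. It assumes a made-up linear demand curve and simply tries every candidate price; real optimization models typically use more efficient solvers.

```python
def profit(price, unit_cost=4):
    """Profit at a given price under a hypothetical demand curve."""
    demand = max(0, 100 - 5 * price)  # assumed: demand falls as price rises
    return (price - unit_cost) * demand

# Controllable variable: the price, searched over whole-dollar values
candidate_prices = range(1, 21)
best_price = max(candidate_prices, key=profit)
best_profit = profit(best_price)
print(best_price, best_profit)
```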
cluster analysis
a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible
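One common way to form such groups is k-means, sketched here for one-dimensional data. The starting centers and the sample values are assumptions chosen for the example, and `kmeans_1d` is a hypothetical helper rather than a library function.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move centers to group means."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical customer ages and two assumed starting centers
values = [1, 2, 3, 20, 21, 22]
centers, groups = kmeans_1d(values, centers=[0, 10])
print(centers, groups)
```

Each value lands in exactly one group, which matches the "mutually exclusive" part of the definition.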
behavioral analysis, correlation analysis, exploratory data analysis, pattern recognition analysis, social media analysis, speech analysis, text analysis, web analysis
advanced data analytics
data understanding
analysis of all current data along with identifying any data quality issues
evaluation
analyze the trends and patterns to assess the potential for solving the business problem
social media analysis
analyzes text flowing across the Internet, including unstructured text from blogs and messages
web analysis
analyzes unstructured data associated with websites to identify consumer behavior and website navigation
text analysis
analyzes unstructured data to find trends and patterns in words and sentences
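A very small sketch of text analysis, assuming that counting word frequency is enough to surface a pattern. The `top_words` helper and the sample review text are hypothetical.

```python
import re
from collections import Counter

def top_words(text, n=3):
    """Return the n most frequent words in a piece of unstructured text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

# Hypothetical customer review
review = "Great battery. Battery life is great, screen is dim."
print(top_words(review))
```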
data modeling
apply mathematical techniques to identify trends and patterns in the data
outlier
data value that is numerically distant from most of the other data points in a set of data
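One simple way to flag outliers, assuming a rule of "more than two standard deviations from the mean". The threshold of 2 and the sample values are choices made for this sketch, not part of the definition.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    """Return values whose distance from the mean exceeds threshold standard deviations."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # all values identical, so nothing is distant
    return [v for v in values if abs(v - center) / spread > threshold]

# Hypothetical daily order counts with one unusual day
orders = [10, 12, 11, 13, 12, 11, 50]
print(find_outliers(orders))
```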
deployment
deploy the discoveries to the organization for work in everyday business
correlation analysis
determines a statistical relationship between variables, often for the purpose of identifying predictive factors among the variables
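The statistical relationship is often summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand; the variable names and sample numbers are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: store visits and purchases move together exactly
visits = [1, 2, 3, 4, 5]
buys = [2, 4, 6, 8, 10]
print(pearson_r(visits, buys))
```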
estimation analysis
determines values for an unknown continuous variable behavior or estimated future value
variety
different forms of structured and unstructured data
market basket analysis
evaluates such items as websites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services
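Affinities among products are commonly measured with support (how often items appear together) and confidence (how often the second item appears given the first). The sketch below assumes those two measures; the baskets are invented checkout data.

```python
def support(baskets, items):
    """Share of baskets that contain every item in `items`."""
    hits = sum(1 for b in baskets if items <= b)
    return hits / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also contain `consequent`."""
    return support(baskets, antecedent | consequent) / support(baskets, antecedent)

# Hypothetical checkout scanner data, one set per transaction
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
]
pair_support = support(baskets, {"bread", "butter"})
bread_to_butter = confidence(baskets, {"bread"}, {"butter"})
print(pair_support, bread_to_butter)
```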
data scientist
extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information
variety, veracity, volume, velocity
four common characteristics of big data
business understanding
gain a clear understanding of the business problem that must be solved and how it impacts the company
data preparation
gather and organize the data in the correct formats and structures for analysis
exploratory data analysis
identifies patterns in data, including outliers, uncovering the underlying structure to understand relationships between the variables
distributed computing
processes and manages algorithms across many machines in a computing environment
affinity grouping analysis
reveals the relationship between variables along with the nature and frequency of the relationships
sensor data, weblog data, financial data, click-stream data, point of sale data, accounting data
structured data
velocity
the analysis of streaming data as it travels around the internet
fast data
the application of big data analytics to smaller data sets in near-real or real-time in order to solve a problem or create business value
pattern recognition analysis
the classification or labeling of an identified pattern in the machine learning process
data mining
the process of analyzing data to extract information not offered by the raw data alone
speech analysis
the process of analyzing recorded calls to gather information; brings structure to customer interactions and exposes information buried in customer contact center interactions with an enterprise
data profiling
the process of collecting statistics and information about data in an existing source
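A small sketch of what profiling a single column might collect, assuming missing values are stored as `None`. The `profile_column` helper and the sample ages are hypothetical.

```python
def profile_column(values):
    """Collect basic statistics about one column of an existing data source."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }

# Hypothetical "age" column with one missing entry
ages = [34, 41, None, 34, 29]
print(profile_column(ages))
```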
anomaly detection
the process of identifying rare or unexpected items or events in a data set that do not conform to other items in the data set
classification analysis
the process of organizing data into categories or groups for its most effective and efficient use
data replication
the process of sharing information to ensure consistency between multiple data sources
volume
the scale of data
analytics
the science of fact-based decision making
veracity
the uncertainty of data, including biases, noise, and abnormalities
data (foundation for data-directed decision making), discovery (process of identifying new patterns, trends, and insights), deployment (process of implementing discoveries to drive success)
three elements of data mining
forecasting model
makes predictions (forecasts) based on time-series information, which is time-stamped information collected at a particular frequency, and allows users to manipulate the time series for forecasting activities
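As a simple illustration, a moving average predicts the next period from the mean of the most recent observations. The three-period window and the monthly sales figures are assumptions for this sketch.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    recent = series[-window:]
    return sum(recent) / window

# Hypothetical monthly sales, collected at a fixed frequency
monthly_sales = [100, 110, 120, 130]
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```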
business intelligence dashboards
track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis
satellite images, photographic data, video data, social media data, text messages, voice mail data
unstructured data
behavioral analysis
using data about people's behaviors to understand intent and predict future actions
optimization model, forecasting model, and regression model
data mining modeling techniques
estimation analysis, affinity grouping analysis, cluster analysis, and classification analysis
data mining techniques
cube
common term for the representation of multidimensional information
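One way to picture a cube is a table of values indexed by several dimensions at once. The sketch below assumes three dimensions (product, region, quarter) stored as dictionary keys; `roll_up` is a hypothetical helper that totals the values along one dimension.

```python
from collections import defaultdict

# Hypothetical sales cube keyed by (product, region, quarter)
sales_cube = {
    ("laptop", "east", "Q1"): 10,
    ("laptop", "west", "Q1"): 7,
    ("phone", "east", "Q1"): 5,
    ("phone", "east", "Q2"): 8,
}

def roll_up(cube, axis):
    """Total the cube's values for each member of one dimension (0, 1, or 2)."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)

print(roll_up(sales_cube, 0))  # totals by product
print(roll_up(sales_cube, 1))  # totals by region
```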
virtualization
creation of a virtual version of computing resources, such as an operating system, a server, a storage device, or network resource
algorithms
mathematical formulas placed in software that performs an analysis on a data set
data visualization tools
move beyond Excel graphs and charts into sophisticated analysis techniques such as pie charts, controls, instruments, maps, time series graphs, and more
analysis paralysis
occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome
infographics (information graphics)
present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format