ISOM210_6B
Data warehouses go even a step further by standardizing information. Gender, for instance, can be referred to in many ways (Male, Female, M/F, 1/0), but a data warehouse should standardize it so that every data element that stores gender uses one common form (M/F).
Standardization of data elements allows for greater accuracy, completeness, and consistency and increases the quality of the information in making strategic business decisions.
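As a minimal sketch of the gender example above, the mapping below collapses several spellings into the single M/F form. The mapping table and the function name are hypothetical, and reading 1 as male and 0 as female is an assumption made only for illustration; a real source system would document its own coding.

```python
# Hypothetical lookup table: which raw spellings map to which standard code.
# Treating 1 as male and 0 as female is an assumption for this sketch.
GENDER_MAP = {
    "male": "M", "m": "M", "1": "M",
    "female": "F", "f": "F", "0": "F",
}

def standardize_gender(value):
    """Return 'M' or 'F' for a recognized value, or None if it is unknown."""
    key = str(value).strip().lower()
    return GENDER_MAP.get(key)

raw_values = ["Male", "F", 1, "female", "0", "m", "unknown"]
standardized = [standardize_gender(v) for v in raw_values]
```

Values that cannot be mapped come back as None rather than being guessed, so they can be reviewed separately.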
Data mart
contains a subset of data warehouse information.
Machine-generated data
is created by a machine without human intervention. Machine-generated structured data includes sensor data, point-of-sale data, and web log data.
Data visualization
describes technologies that allow users to see or visualize data to transform information into a business perspective.
Data scientist
extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information.
Advanced analytics
focuses on forecasting future trends and producing insights using sophisticated quantitative methods, including statistics, descriptive and predictive data mining, simulation, and optimization.
Structured data
has a defined length, type, and format and includes numbers, dates, or strings such as Customer Address. Structured data is typically stored in a traditional system such as a relational database or spreadsheet and accounts for about 20 percent of the data that surrounds us.
Data artist
is a business analytics specialist who uses visual tools to help people understand complex data.
Big data
is a collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools. The four common characteristics of big data are detailed in Figure 6.28. Big data requires sophisticated tools to analyze all the unstructured information from millions of customers, devices, and machine interactions.
Data warehouse
is a logical collection of information, gathered from many operational databases, that supports business analysis activities and decision-making tasks.
Extraction, transformation, and loading (ETL)
is a process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse. The data warehouse then sends portions (or subsets) of the information to data marts.
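The three ETL steps and the hand-off to a data mart can be sketched roughly as follows. The two source lists, their field names, and the in-memory "warehouse" list are all hypothetical stand-ins for real databases; this is an illustration of the flow, not a production pipeline.

```python
# Hypothetical source systems with inconsistent field names and codings.
sales_db = [{"cust": "Ann", "gender": "Female", "amount": "120.50"}]
support_db = [{"customer_name": "Bob", "sex": "M", "amount": 80}]

def extract():
    """Extract: pull raw rows from each source, tagged with their origin."""
    rows = [("sales", r) for r in sales_db]
    rows += [("support", r) for r in support_db]
    return rows

def transform(source, row):
    """Transform: map a raw row onto one common set of definitions."""
    name = row.get("cust") or row.get("customer_name")
    gender = str(row.get("gender") or row.get("sex")).strip().upper()[:1]
    return {
        "customer": name,
        "gender": gender,                 # standardized to a single letter
        "amount": float(row["amount"]),   # standardized to a number
        "source": source,
    }

def load(rows, warehouse):
    """Load: append the transformed rows to the warehouse."""
    warehouse.extend(rows)

warehouse = []
load([transform(src, row) for src, row in extract()], warehouse)

# A data mart receives only a subset of the warehouse information.
sales_mart = [r for r in warehouse if r["source"] == "sales"]
```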
Information cleansing or scrubbing
is a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
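A small sketch of cleansing, assuming hypothetical customer records with a name and an email: fixable problems (stray spaces, inconsistent capitalization) are repaired, while incomplete, invalid, or duplicate records are discarded. The rules here are illustrative assumptions, not a standard.

```python
def cleanse(records):
    """Return (clean, discarded): fix what can be fixed, discard the rest."""
    clean, discarded, seen_emails = [], [], set()
    for rec in records:
        # Fix: trim whitespace and normalize capitalization.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()

        # Discard: incomplete (no name) or incorrect (no "@" in email).
        if not name or "@" not in email:
            discarded.append(rec)
            continue
        # Discard: a duplicate of a record already kept.
        if email in seen_emails:
            discarded.append(rec)
            continue

        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean, discarded

records = [
    {"name": "  ann lee ", "email": "Ann@Example.com"},
    {"name": "Ann Lee", "email": "ann@example.com"},   # duplicate
    {"name": "", "email": "x@example.com"},            # incomplete
    {"name": "Bob Ray", "email": "not-an-email"},      # incorrect
]
clean, discarded = cleanse(records)
```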
Human-generated data
is data that humans, in interaction with computers, generate. Human-generated structured data includes input data, click-stream data, or gaming data.
Dirty data
is erroneous or flawed data.
Unstructured data
is not defined, does not follow a specified format, and is typically free-form text such as emails, Twitter tweets, and text messages. Unstructured data accounts for about 80 percent of the data that surrounds us.
Information cube
is the common term for the representation of multidimensional information.
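One way to picture a cube is as totals keyed by several dimensions at once. The sketch below assumes three hypothetical dimensions (product, region, quarter) and made-up sales figures; the helper name `slice_total` is also an assumption.

```python
from collections import defaultdict

# Hypothetical facts: (product, region, quarter, sales).
facts = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 200),
    ("Widget", "East", "Q2", 50),
]

# Each cell of the cube is addressed by one value per dimension.
cube = defaultdict(float)
for product, region, quarter, sales in facts:
    cube[(product, region, quarter)] += sales

def slice_total(cube, product=None, region=None, quarter=None):
    """Sum the cells that match the given dimensions; None means 'all'."""
    total = 0.0
    for (p, r, q), value in cube.items():
        if product in (None, p) and region in (None, r) and quarter in (None, q):
            total += value
    return total

widget_sales = slice_total(cube, product="Widget")
east_q1_sales = slice_total(cube, region="East", quarter="Q1")
```

Fixing one dimension and summing over the others is what lets the same data answer questions by product, by region, or by time period.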
Data mining
is the process of analyzing data to extract information not offered by the raw data alone.
Data visualization tools
move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more.
Analysis paralysis
occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome.
Infographics
present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format.
Distributed computing
processes and manages algorithms across many machines in a computing environment.
Data quality audits
are performed by organizations to determine the accuracy and completeness of their data. Most organizations determine a percentage of accuracy and completeness high enough to make good decisions at a reasonable cost, such as 85 percent accurate and 65 percent complete.
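A rough sketch of such an audit, assuming hypothetical records and a simple five-digit ZIP check as the accuracy rule. The 85 and 65 percent thresholds come from the definition above; everything else (field names, the `audit` and `zip_ok` helpers) is illustrative.

```python
def audit(records, required_fields, is_accurate):
    """Return the percentage of records that are complete and accurate."""
    total = len(records)
    if total == 0:
        return {"complete_pct": 0.0, "accurate_pct": 0.0}
    complete = sum(
        1 for r in records
        if all(r.get(f) not in (None, "") for f in required_fields)
    )
    accurate = sum(1 for r in records if is_accurate(r))
    return {
        "complete_pct": 100 * complete / total,
        "accurate_pct": 100 * accurate / total,
    }

def zip_ok(record):
    """Assumed accuracy rule: the ZIP code is exactly five digits."""
    z = record.get("zip") or ""
    return len(z) == 5 and z.isdigit()

records = [
    {"name": "Ann", "zip": "30301"},
    {"name": "Bob", "zip": ""},       # incomplete
    {"name": "Cy", "zip": "3030"},    # complete but inaccurate
    {"name": "Di", "zip": "94105"},
]
report = audit(records, ["name", "zip"], zip_ok)

# Compare against the example targets: 85% accurate, 65% complete.
meets_targets = report["accurate_pct"] >= 85 and report["complete_pct"] >= 65
```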
Business intelligence dashboards
track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis.
Data-mining tools
use a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making.
Here are a few examples of how managers can use BI to answer tough business questions:
• Where has the business been? Historical perspective offers important variables for determining trends and patterns.
• Where is the business now? Looking at the current business situation allows managers to take effective action to solve issues before they grow out of control.
• Where is the business going? Setting strategic direction is critical for planning and creating solid business strategies.