Chapter 6


Attributes

(also called columns or fields) are the data elements associated with an entity.

entity

(also referred to as a table) stores information about a person, place, thing, transaction, or event.

INFORMATION CLEANSING OR SCRUBBING

A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
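
A minimal sketch of what cleansing might look like in practice, assuming hypothetical customer records with `name` and `email` fields. The rules here (normalize formatting, discard records with a missing name or malformed email, discard duplicates) are illustrative assumptions, not a standard procedure.

```python
def cleanse(records):
    """Fix what can be fixed and discard what cannot (hypothetical rules)."""
    cleaned, seen = [], set()
    for rec in records:
        # Fix inconsistent formatting: stray whitespace and mixed case.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        # Discard incomplete or clearly incorrect records.
        if not name or "@" not in email:
            continue
        # Discard duplicates that differed only in formatting.
        if email in seen:
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


raw = [
    {"name": " ada lovelace ", "email": "ADA@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com"},   # duplicate
    {"name": "", "email": "nobody@example.com"},            # incomplete
    {"name": "Alan Turing", "email": "alan.example.com"},   # incorrect
]
result = cleanse(raw)
print(result)
```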

Prediction

A statement about what will happen or might happen in the future, for example, predicting future sales or employee turnover.

Regression

A statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.
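
As an illustration, the simplest case (one independent variable, ordinary least squares) can be sketched as follows. The variable names `ad_spend` and `sales` and the data are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


ad_spend = [1.0, 2.0, 3.0, 4.0]   # independent variable (hypothetical)
sales = [3.0, 5.0, 7.0, 9.0]      # dependent variable (hypothetical)
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

Here the slope estimates how much the dependent variable changes for each unit change in the independent variable.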

Optimization

A statistical process that finds the way to make a design, system, or decision as effective as possible, for example, finding the values of controllable variables that determine maximal productivity or minimal waste

Cluster analysis

A technique used to divide information sets into mutually exclusive groups such that the members of each group are as close as possible to one another and the different groups are as far apart as possible. Cluster analysis segments customer information to help organizations identify customers with similar behavioral traits, such as clusters of best customers or onetime customers. Cluster analysis also can uncover naturally occurring patterns in information. A great example of using cluster analysis in business is to create target-marketing strategies based on zip codes. Evaluating customer segments by zip code allows a business to assign a level of importance to each segment. Zip codes offer valuable insight into such things as income levels, demographics, lifestyles, and spending habits. With target marketing, a business can decrease its costs while increasing the success rate of the marketing campaign.
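
One common clustering method is k-means; the definition above does not name a specific algorithm, so the following is only a sketch of the idea. It groups hypothetical customer spending figures into two clusters, starting from assumed initial centers.

```python
def kmeans_1d(values, centers, rounds=10):
    """Tiny one-dimensional k-means: assign each value to its nearest
    center, then move each center to the mean of its group."""
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


spend = [12, 15, 14, 95, 102, 98]   # hypothetical annual spend per customer
centers, groups = kmeans_1d(spend, centers=[0.0, 50.0])
print(groups)
```

Each value lands in exactly one group, which matches the "mutually exclusive groups" requirement in the definition.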

Market basket analysis

Analyzes such items as websites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services. Market basket analysis is frequently used to develop marketing campaigns for cross-selling products and services (especially in banking, insurance, and finance) and for inventory control, shelf product placement, and other retail and marketing applications.

Social media analytics

Analyzes text flowing across the Internet, including unstructured text from blogs and messages.

Web analytics

Analyzes unstructured data associated with websites to identify consumer behavior and website navigation

Text analytics

Analyzes unstructured data to find trends and patterns in words and sentences. Text mining a firm's customer support email might identify which customer service representative is best able to handle the question, allowing the system to forward it to the right person.

CLASSIFICATION

Assigns records to one of a predefined set of classes

DATA MART

Contains a subset of data warehouse information

Easy to store large amounts of data

Data-driven websites can keep large volumes of information organized. Website owners can use templates to implement changes for layouts, navigation, or website structure. This improves website reliability, scalability, and performance.

Easy to eliminate human errors

Data-driven websites trap data-entry errors, eliminating inconsistencies while ensuring that all information is entered correctly

ESTIMATION

Determines values for an unknown continuous variable behavior or estimated future value

AFFINITY GROUPING

Determines which things go together

Inconsistent Data Definitions

Every department had its own method for recording data, so when departments tried to share information, the data did not match and users did not get the data they really needed.

Lack of Data Standards

Managers needed to perform cross-functional analysis using data from all departments, which differed in granularities, formats, and levels

Ineffective Direct Data Access

Most data stored in operational databases did not allow users direct access; users had to wait to have their queries or questions answered by MIS professionals who could code SQL

Association detection

Reveals the relationship between variables along with the nature and frequency of the relationships. Many people refer to association detection algorithms as association rule generators because they create rules to determine the likelihood of events occurring together at a particular time or following each other in a logical progression. Percentages usually reflect the patterns of these events. For example, "55 percent of the time, events A and B occurred together," or "80 percent of the time that items A and B occurred together, they were followed by item C within three days."
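
The percentages in such rules are often expressed as support (how often items occur together overall) and confidence (how often B appears given A). A minimal sketch, using made-up shopping baskets:

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk"},
]

item_counts = Counter(item for b in baskets for item in b)
pair_counts = Counter(
    pair for b in baskets for pair in combinations(sorted(b), 2)
)


def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(a, b):
    """Of the baskets containing a, the fraction that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]


print(support("bread", "milk"), confidence("bread", "milk"))
```

The same counting approach underlies market basket analysis, where the "events" are products bought together.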

CLUSTERING

Segments a heterogeneous population of records into a number of more homogeneous subgroups

Poor Data Quality

The data, if available, were often incorrect or incomplete. Therefore, users could not rely on the data to make decisions

DATA MINING

The process of analyzing data to extract information not offered by the raw data alone

Speech analytics

The process of analyzing recorded calls to gather information; brings structure to customer interactions and exposes information buried in customer contact center interactions with an enterprise. Speech analytics is heavily used in the customer service department to help improve processes by identifying angry customers and routing them to the appropriate customer service representative.

Forecasting

Time-series information is time-stamped information collected at a particular frequency. Formally defined, forecasts are predictions based on time-series information. Examples of time-series information include web visits per hour, sales per month, and calls per day. Forecasting data-mining tools allow users to manipulate the time series for forecasting activities
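
As a rough illustration, one very simple forecasting technique is a moving average over the most recent observations. The window size and the sales figures below are assumptions made for the example.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window


monthly_sales = [100, 110, 120, 130, 140, 150]   # hypothetical sales per month
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```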

Inadequate Data Usefulness

Users could not get the data they needed; what was collected was not always useful for intended purposes

Easy to manage content

Website owners can make changes without relying on MIS professionals; users can update a data-driven website with little or no training

data artist

a business analytics specialist who uses visual tools to help people understand complex data

Big data

a collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools

record

a collection of related data elements

Structured data

has a defined length, type, and format and includes numbers, dates, or strings such as Customer Address

primary key

a field (or group of fields) that uniquely identifies a given record in a table

data warehouse

a logical collection of information, gathered from many operational databases, that supports business analysis activities and decision-making tasks.

Information integrity

a measure of the quality of information

foreign key

a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
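
A small sketch of primary and foreign keys using Python's built-in `sqlite3` module. The `customer` and `orders` tables and their columns are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement

conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE orders (
           order_id    INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           total       REAL NOT NULL
       )"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

# The foreign key provides the logical link used to join the two tables.
rows = conn.execute(
    "SELECT o.order_id, c.name FROM orders o "
    "JOIN customer c ON c.customer_id = o.customer_id"
).fetchall()

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```

Rejecting the orphaned order is an example of a relational integrity constraint at work.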

Extraction, transformation, and loading (ETL)

a process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse
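
The three steps can be sketched as follows, assuming two hypothetical source systems that record the same kind of order with different field names, units, and date formats. The "warehouse" here is just a Python list standing in for a real data warehouse.

```python
from datetime import datetime

# Extract: rows as they might arrive from two hypothetical sources.
sales_system = [{"cust": "ada", "amount_usd": "25.00", "date": "2024-01-05"}]
web_system = [{"customer_name": "Alan", "cents": 1999, "ordered": "01/06/2024"}]


# Transform: map each source onto one common set of definitions.
def transform_sales(row):
    return {
        "customer": row["cust"].title(),
        "amount": float(row["amount_usd"]),
        "order_date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
    }


def transform_web(row):
    return {
        "customer": row["customer_name"].title(),
        "amount": row["cents"] / 100,
        "order_date": datetime.strptime(row["ordered"], "%m/%d/%Y").date(),
    }


# Load: append the uniform rows to the (stand-in) warehouse.
warehouse = []
warehouse.extend(transform_sales(r) for r in sales_system)
warehouse.extend(transform_web(r) for r in web_system)

for row in warehouse:
    print(row)
```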

relational database management system

allows users to create, read, update, and delete data in a relational database

dynamic catalog

an area of a website that stores information about products in a database

data-driven website

an interactive website kept constantly updated and relevant to the needs of its customers using a database

structured query language (SQL)

asks users to write lines of code to answer questions against a database
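
For example, the question "what are total sales per region?" might be written in SQL as below. The `sales` table and its data are hypothetical, and the query is run through Python's built-in `sqlite3` module so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 50.0)],
)

# The SQL statement itself is the "lines of code" the user writes.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```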

data dictionary

compiles all of the metadata about the data elements in the data model

Machine-generated data

created by a machine without human intervention. Machine-generated structured data includes sensor data, point-of-sale data, and web log data

database management system (DBMS)

creates, reads, updates, and deletes data in a database while controlling access and security.

Dynamic information

data that change based on user actions.

Human-generated data

data that humans, in interaction with computers, generate. Human-generated structured data includes input data, click-stream data, or gaming data.

physical view of information

deals with the physical storage of information on a storage device

business rule

defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer

Data visualization

describes technologies that allow users to see or visualize data to transform information into a business perspective

Metadata

details about data

data quality audits

determine the accuracy and completeness of an organization's data

Business-critical integrity constraints

enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints.

Dirty data

erroneous or flawed data

data scientist

extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information

Advanced analytics

focuses on forecasting future trends and producing insights using sophisticated quantitative methods, including statistics, descriptive and predictive data mining, simulation, and optimization

logical view of information

focuses on how individual users logically access information to meet their own particular business needs

query-by-example (QBE) tool

helps users graphically design the answer to a question against a database

Real-time information

immediate, up-to-date information

Static information

includes fixed data incapable of change in the event of a user action

Information redundancy

is the duplication of data, or the storage of the same data in multiple places.

Data models

logical data structures that detail the relationships among data elements by using graphics or pictures

database

maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)

Data visualization tools

move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more

Unstructured data

is not defined, does not follow a specified format, and is typically free-form text such as emails, Twitter tweets, and text messages.

Analysis paralysis

occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome

Infographics

present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format

Distributed computing

processes and manages algorithms across many machines in a computing environment.

Real-time systems

provide real-time information in response to requests

Relational integrity constraints

rules that enforce basic and fundamental information-based constraints.

Integrity constraints

rules that help ensure the quality of information

relational database model

stores information in the form of logically related two-dimensional tables.

information cube

the common term for the representation of multidimensional information

Data governance

the overall management of the availability, usability, integrity, and security of company data

content creator

the person responsible for creating the original website content

content editor

the person responsible for updating and maintaining website content

Master data management (MDM)

the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems.

data element (or data field)

the smallest or basic unit of information

Information granularity

refers to the extent of detail within the information (fine and detailed or coarse and abstract)

Business intelligence dashboards

track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis

Data-mining tools

use a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making.

Information integrity issues

when a system produces incorrect, inconsistent, or duplicate data

Information inconsistency

when the same data element has different values
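
A minimal sketch of how such inconsistencies might be detected, assuming a hypothetical customer phone number stored in two systems (`crm` and `billing`):

```python
from collections import defaultdict

# (customer id, source system, phone number); all values are made up.
records = [
    ("C1", "crm", "555-0100"),
    ("C1", "billing", "555-0199"),
    ("C2", "crm", "555-0111"),
    ("C2", "billing", "555-0111"),
]

# Collect the distinct values recorded for each customer's phone number.
values = defaultdict(set)
for customer_id, _system, phone in records:
    values[customer_id].add(phone)

# A customer with more than one distinct value is inconsistent.
inconsistent = sorted(cid for cid, phones in values.items() if len(phones) > 1)
print(inconsistent)
```

Customer `C2` is stored redundantly but consistently; only `C1` has conflicting values.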

