MIS chapter 6
The four primary reasons for low-quality information are:
- Online customers intentionally enter inaccurate information to protect their privacy.
- Different systems have different information entry standards and formats.
- Data-entry personnel enter abbreviated information to save time or erroneous information by accident.
- Third-party and external information contains inconsistencies, inaccuracies, and errors.
regression model
A statistical process for estimating the relationships among variables. Regression models include many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.
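As a rough illustration of the definition above, the sketch below fits a straight line by ordinary least squares, one of the many regression techniques. The variable names and the numbers (ad_spend, sales) are made up for the example and are not from the chapter.

```python
# Minimal sketch: simple linear regression by ordinary least squares.
# All data here is hypothetical and chosen only for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (not divided by n; the ratio is the same).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

ad_spend = [1.0, 2.0, 3.0, 4.0]  # independent variable (hypothetical)
sales = [3.0, 5.0, 7.0, 9.0]     # dependent variable (hypothetical)

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

The slope estimates how much the dependent variable changes for each one-unit change in the independent variable.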
optimization model
A statistical process that finds the way to make a design, system, or decision as effective as possible, for example, finding the values of controllable variables that determine maximal productivity or minimal waste.
Data-driven decision management
An approach to business governance that values decisions that can be backed up with verifiable data.
what is the relationship between entities and attributes
Each record in an entity occupies one row in its respective table.
forecasting model
Time-series information is time-stamped information collected at a particular frequency. Forecasts are predictions based on time-series information allowing users to manipulate the time series for forecasting activities.
identity management
a broad administrative area that deals with identifying individuals in a system (such as a country, a network, or an enterprise) and controlling their access to resources within that system by associating user rights and restrictions with the established identity
data artist
a business analytics specialist who uses visual tools to help people understand complex data
data broker
a business that collects personal information about consumers and sells that information to other organizations
repository
a central location in which data is stored and managed
record
a collection of related data elements
recommendation engine
a data-mining algorithm that analyzes a customer's purchases and actions on a website and then uses the data to recommend complementary products
primary key
a field that uniquely identifies a given record in a table
data warehouse
a logical collection of information—gathered from many different operational databases—that supports business analysis activities and decision-making tasks
foreign key
a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
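To make the primary key and foreign key definitions concrete, here is a small sketch using Python's built-in sqlite3 module. The customer and orders tables and their columns are hypothetical; the point is only that orders.customer_id must match an existing customer.customer_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# customer_id is the primary key: it uniquely identifies each customer record.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# orders.customer_id is a foreign key: the primary key of customer appearing in orders.
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1)")

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key provides the logical relationship used to join the tables.
rows = conn.execute(
    "SELECT o.order_id, c.name FROM orders o"
    " JOIN customer c ON c.customer_id = o.customer_id"
).fetchall()
print(rejected, rows)
```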
extraction, transformation, and loading (ETL)
a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse. The data warehouse then sends subsets of the information to data marts
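The three ETL steps can be sketched in a few lines. Everything below is an assumption made for illustration: the two source systems, their field names, and the common format (uppercase SKU, amount in cents, lowercase region) are hypothetical, and an in-memory SQLite table stands in for the data warehouse.

```python
import sqlite3

# Extract: rows pulled from two hypothetical sources with different formats.
store_sales = [{"sku": "A1", "amount_usd": "19.99", "region": "west"}]
web_sales = [
    {"product_code": "a1", "total_cents": 2500, "region": "WEST"},
    {"product_code": "b2", "total_cents": 1000, "region": "East"},
]

def transform(store_rows, web_rows):
    """Convert both sources to one common definition: (sku, amount_cents, region)."""
    unified = []
    for r in store_rows:
        cents = round(float(r["amount_usd"]) * 100)
        unified.append((r["sku"].upper(), cents, r["region"].lower()))
    for r in web_rows:
        unified.append((r["product_code"].upper(), r["total_cents"], r["region"].lower()))
    return unified

# Load: write the unified rows into the stand-in warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (sku TEXT, amount_cents INTEGER, region TEXT)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", transform(store_sales, web_sales))

# A data mart receives a subset of the warehouse, here only the west region.
west_mart = warehouse.execute(
    "SELECT sku, amount_cents FROM sales WHERE region = 'west' ORDER BY amount_cents"
).fetchall()
print(west_mart)
```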
Information cleansing or scrubbing
a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
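A minimal sketch of cleansing, assuming hypothetical customer records with name and email fields: fixable problems (stray spaces, inconsistent capitalization) are repaired, while incomplete, clearly incorrect, or duplicate records are discarded. Real cleansing rules would be far more thorough than the simple checks used here.

```python
def cleanse(records):
    """Fix what can be fixed, then discard incomplete, incorrect, or duplicate records."""
    seen_emails = set()
    clean = []
    for rec in records:
        # Fix inconsistent formatting.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        # Discard incomplete or clearly incorrect records (very rough email check).
        if not name or "@" not in email:
            continue
        # Discard duplicates of a record already kept.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean

raw = [
    {"name": "  ada lovelace ", "email": "ADA@Example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},   # duplicate after fixing
    {"name": "", "email": "nobody@example.com"},             # incomplete
    {"name": "Grace Hopper", "email": "not-an-email"},       # incorrect
]
cleaned = cleanse(raw)
print(cleaned)
```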
prediction
a statement about what will happen or might happen in the future, for example, predicting future sales or employee turnover
data lake
a storage repository that holds a vast amount of raw data in its original format until the business needs it
data map
a technique for establishing a match, or balance, between the source data and the target data warehouse
cluster analysis
a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible
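One common way to perform cluster analysis is k-means, which is not named in the chapter and is used here only as an example technique. The sketch below clusters one-dimensional values; the customer ages and the starting centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the nearest center, then move each center to its group mean."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to exactly one group: the one with the closest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Recompute each center; keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

ages = [1, 2, 3, 20, 21, 22]  # hypothetical data
groups, centers = kmeans_1d(ages, centers=[1.0, 22.0])
print(groups, centers)
```

Each value lands in exactly one group (mutually exclusive), members sit close to their own center, and the two centers end up far apart.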
dynamic catalog
an area of a website that stores information about products in a database.
data-driven website
an interactive website kept constantly updated and relevant to the needs of its customers using a database
data set
an organized collection of data
structured query language (SQL)
asks users to write lines of code to answer questions against a database
comparative analysis
can compare two or more data sets to identify patterns and trends
data dictionary
compiles all of the metadata about the data elements in the data model
data mart
contains a subset of data warehouse information.
outlier
data value that is numerically distant from most of the other data points in a set of data
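One simple way to flag such values is a z-score check: measure how many standard deviations each value lies from the mean. The threshold of 2 and the sample numbers below are assumptions chosen for the example, not a rule from the chapter.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, so nothing is distant
    return [v for v in values if abs(v - mean) / spread > threshold]

daily_orders = [10, 12, 11, 13, 12, 11, 95]  # hypothetical data
outliers = find_outliers(daily_orders)
print(outliers)
```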
physical view of information
deals with the physical storage of information on a storage device
business rule
defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer.
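Because a business rule typically results in a yes/no answer, it maps naturally onto a function that returns True or False. The return policy below (receipt required, 30-day window) is a hypothetical rule invented for this sketch.

```python
def return_allowed(days_since_purchase, has_receipt, max_days=30):
    """Hypothetical business rule: returns are accepted with a receipt within max_days."""
    return has_receipt and days_since_purchase <= max_days

print(return_allowed(10, True))   # within the window, receipt present
print(return_allowed(45, True))   # too late
print(return_allowed(10, False))  # no receipt
```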
data visualization
describes technologies that allow users to see or visualize data to transform information into a business perspective
estimation analysis
determines values for an unknown continuous variable behavior or estimated future value. Estimation models predict numeric
variety
different forms of structured and unstructured data
transactional information
encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support daily operational tasks
analytical information
encompasses all organizational information, and its primary purpose is to support the performing of managerial analysis tasks
Business-critical integrity constraints
enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints
dirty data
erroneous or flawed data (see Figure 6.15). The complete removal of dirty data from a source is impractical or virtually impossible
market basket analysis
evaluates such items as websites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services
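A bare-bones version of the idea is to count how often pairs of products show up in the same basket. The baskets below are hypothetical checkout data, and "support" (the share of baskets containing both items) is a standard measure used here as an illustration rather than a term from the chapter.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout-scanner data: one set of products per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

# Count every pair of products bought together.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support: fraction of all baskets that contain both items of the pair.
support = {pair: count / len(baskets) for pair, count in pair_counts.items()}

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count, support[top_pair])
```

A pair with high support, such as bread and butter here, suggests an affinity that could inform product placement or recommendations.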
data scientist
extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information
three terms that represent the data elements associated with an entity
fields, attributes, columns
three terms associated with entities
fields, columns, attributes
logical view of information
focuses on how individual users logically access information to meet their own particular business needs
creates, reads, updates, deletes
four functions that a database management system can perform on data in a database
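The four functions can be seen in a short sketch with Python's built-in sqlite3 module. The product table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# Create: add a new record.
conn.execute("INSERT INTO product (product_id, name, price) VALUES (?, ?, ?)", (1, "widget", 9.99))

# Read: retrieve the record.
after_create = conn.execute(
    "SELECT name, price FROM product WHERE product_id = ?", (1,)
).fetchone()

# Update: change a value in the existing record.
conn.execute("UPDATE product SET price = ? WHERE product_id = ?", (12.5, 1))
after_update = conn.execute(
    "SELECT price FROM product WHERE product_id = ?", (1,)
).fetchone()

# Delete: remove the record.
conn.execute("DELETE FROM product WHERE product_id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

print(after_create, after_update, remaining)
```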
query-by-example
helps users graphically design the answer to a question against a database
source data
identifies the primary location where data is collected
dynamic information
includes data that change based on user actions
static information
includes fixed data incapable of change in the event of a user action
data validation
includes the tests and evaluations used to determine compliance with data governance policies to ensure correctness of data. Data validation helps to ensure that every data value is correct and accurate.
The Four Primary Traits of the Value of Information
information type, information timeliness, information quality, information governance
information integrity
is a measure of the quality of information
Information redundancy
is the duplication of data, or the storage of the same data in multiple places
data models
logical data structures that detail the relationships among data elements using graphics or pictures
database
maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
algorithms
mathematical formulas placed in software that performs an analysis on a data set
real-time information
means immediate, up-to-date information
data visualization tools
move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more. They can help uncover correlations and trends in data that would otherwise go unrecognized
Information integrity issues
occur when a system produces incorrect, inconsistent, or duplicate data.
data gap analysis
occurs when a company examines its data to determine if it can meet business expectations, while identifying possible data gaps or where missing data might exist.
Information inconsistency
occurs when the same data element has different values
analysis paralysis
occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome. In the time of big data, analysis paralysis is a growing problem
Infographics (information graphics)
present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format
distributed computing
processes and manages algorithms across many machines in a computing environment
real-time systems
provide real-time information in response to requests
metadata
provides details about data. For example, metadata for an image could include its size, resolution, and date created
information granularity
refers to the extent of detail within the information (fine and detailed or coarse and abstract)
data governance
refers to the overall management of the availability, usability, integrity, and security of company data.
data steward
responsible for ensuring the policies and procedures are implemented across the organization and acts as a liaison between the MIS department and the business
affinity grouping analysis
reveals the relationship between variables along with the nature and frequency of the relationships
Relational integrity constraints
rules that enforce basic and fundamental information-based constraints
integrity constraints
rules that help ensure the quality of information
entity
stores information about a person, place, thing, transaction, or event
relational database model
stores information in the form of logically related two-dimensional tables
velocity
the analysis of streaming data as it travels around the internet
fast data
the application of big data analytics to smaller data sets in near-real or real-time in order to solve a problem or create business value
data aggregation
the collection of data from various sources for the purpose of data processing
virtualization
the creation of a virtual (rather than actual) version of computing resources, such as an operating system, a server, a storage device, or network resources
attributes
the data elements associated with an entity
data stewardship
the management and oversight of an organization's data assets to help provide business users with high-quality data that is easily accessible in a consistent manner.
content creator
the person responsible for creating the original website content
content editor
the person responsible for updating and maintaining website content
Master data management (MDM)
the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems
data mining
the process of analyzing data to extract information not offered by the raw data alone
data profiling
the process of collecting statistics and information about data in an existing source
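A minimal sketch of profiling one column of an existing source: count the rows, the missing values, the distinct values, and the most frequent value. The rows and the "state" column are hypothetical, and real profiling tools collect many more statistics.

```python
from collections import Counter

def profile(rows, column):
    """Collect basic statistics about one column in a list of dict records."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(1)[0] if present else None,
    }

# Hypothetical source data with a blank value and a missing field.
customers = [{"state": "CO"}, {"state": "CO"}, {"state": "NY"}, {"state": ""}, {}]
stats = profile(customers, "state")
print(stats)
```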
anomaly detection
the process of identifying rare or unexpected items or events in a data set that do not conform to other items in the data set
classification analysis
the process of organizing data into categories or groups for its most effective and efficient use
volume
the scale of data
data element
the smallest or basic unit of data
data element
the smallest or basic unit of information
data latency
the time it takes for data to be stored or retrieved
veracity
the uncertainty of data, including biases, noise, and abnormality
Time-series information
time-stamped information collected at a particular frequency.
business intelligence dashboards
track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis
T/F: Business-critical integrity constraints tend to mirror the very rules by which an organization achieves success.
true
T/F: databases today scale to exceptional levels, allowing all types of users and programs to perform information processing and information-searching tasks
true
data mining tools
use a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making
why does a database offer increased information security
various security features of databases ensure that individuals have only certain types of access to certain types of information
competitive monitoring
when a company keeps tabs on its competitors' activities on the web using software that automatically tracks all competitor website activities such as discounts and new products