CIS Chapter #6

five common characteristics of high quality information

1. accuracy 2. completeness 3. consistency 4. timeliness 5. uniqueness

collect information from multiple systems in a common location that uses a universal querying tool

A key idea within data warehousing is to:

optimization model

A statistical process that finds the way to make a design, system, or decision as effective as possible; for example, finding the values of controllable variables that determine maximal productivity or minimal waste.

social media analysis

Analyzes text flowing across the Internet, including unstructured text from blogs and messages

web analysis

Analyzes unstructured data associated with websites to identify consumer behavior and website navigation.

text analysis

Analyzes unstructured data to find trends and patterns in words and sentences

outliers

Anomaly detection helps to identify ___________ in the data that can cause problems with mathematical modeling

1. increased flexibility 2. increased information integrity 3. increased scalability and performance 4. increased information security 5. reduced information redundancy

Business Advantages of a Relational Database

hierarchical, network, relational (most important)

DBMS use three primary data models for organizing information:

What is the difference between data governance and data stewardship?

Data governance focuses on enterprisewide policies and procedures, while data stewardship focuses on the strategic implementation of the policies and procedures

coarse granularity; drilling down; drilling up

Data mining can also begin at a summary information level (_____________________) and progress through increasing levels of detail (_____________________) or the reverse (______________________)

large

Data-driven capabilities are especially useful when a firm needs to offer ________ amounts of information, products, or services

https://html1-cluster-e.mheducation.com/smartbook2/data/156737/highlighted_epubmhe/OPS/img/chapter06/bal04716_0607.png

IMPT CHART

age, profession, or income (can include totals, counts, averages, and the like)

One example of a data aggregation is to gather information about particular groups based on specific variables such as:
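
A minimal sketch of the aggregation described above, grouping records by one variable and computing counts, totals, and averages. The sample records, field names, and the `aggregate_by` helper are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical customer records; field names are assumptions for illustration.
customers = [
    {"profession": "nurse", "age": 34, "income": 62000},
    {"profession": "nurse", "age": 41, "income": 70000},
    {"profession": "teacher", "age": 29, "income": 48000},
]


def aggregate_by(records, key, value):
    """Return count, total, and average of `value` for each distinct `key`."""
    groups = defaultdict(list)
    for row in records:
        groups[row[key]].append(row[value])
    return {
        name: {
            "count": len(vals),
            "total": sum(vals),
            "average": sum(vals) / len(vals),
        }
        for name, vals in groups.items()
    }


# Income summarized per profession.
summary = aggregate_by(customers, "profession", "income")
```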

only one

One primary goal of a database is to eliminate information redundancy by recording each piece of information in __________________ place in the database.

unstructured

Organizational data includes far more than simple structured data elements in a database; the set of data also includes _______________________ data such as voice mail, customer phone calls, text messages, and video clips, along with numerous new forms of data, such as tweets from Twitter

impossible

The complete removal of dirty data from a source is impractical or virtually ______________

costs

The more complete and accurate a company wants its information to be, the more it ________.

speech analysis

The process of analyzing recorded calls to gather information; brings structure to customer interactions and exposes information buried in customer contact center interactions with an enterprise

think of data warehouses as having a more organizational focus and data marts as having a functional focus

To distinguish between data warehouses and data marts:

flat architecture

While a traditional data warehouse stores data in files or folders, a data lake uses a _____________ to store data

managers; MIS professionals

________________ typically interact with QBE tools, and _____________________ have the skills required to code SQL

data artist

a business analytics specialist who uses visual tools to help people understand complex data

data broker

a business that collects personal information about consumers and sells that information to other organizations

big data

a collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools

recommendation engine

a data mining algorithm that analyzes a customer's purchases and actions on a website and then uses the data to recommend complementary products; e.g., Netflix uses Cinematch, its movie recommendation system, to analyze each customer's film-viewing habits and provide recommendations for other customers

dimension

a particular attribute of information

extraction, transformation, and loading (ETL)

a process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse. The data warehouse then sends portions (or subsets) of the information to data marts
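
A rough sketch of the three ETL steps, using Python's built-in `sqlite3` module as a stand-in for a data warehouse. The two sources, their field names, and the table names (`warehouse_sales`, `west_mart`) are hypothetical; a real ETL pipeline would read from actual databases.

```python
import sqlite3

# Two hypothetical sources that record the same kind of sale differently.
source_a = [{"cust": "Ann", "amt_usd": "120.50", "region": "west"}]
source_b = [{"customer_name": "Bob", "amount_cents": 9900, "region": "EAST"}]


def extract():
    """Extract: pull raw rows from each source."""
    return source_a, source_b


def transform(a_rows, b_rows):
    """Transform: apply one common definition (name, dollars, upper-case region)."""
    rows = [(r["cust"], float(r["amt_usd"]), r["region"].upper()) for r in a_rows]
    rows += [
        (r["customer_name"], r["amount_cents"] / 100, r["region"].upper())
        for r in b_rows
    ]
    return rows


def load(conn, rows):
    """Load: write the standardized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE warehouse_sales (customer TEXT, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(*extract()))

# The warehouse then feeds a subset of its information to a data mart.
conn.execute(
    "CREATE TABLE west_mart AS SELECT * FROM warehouse_sales WHERE region = 'WEST'"
)
mart_rows = conn.execute("SELECT customer, amount FROM west_mart").fetchall()
```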

information cleansing/scrubbing

a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
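
One possible way to sketch cleansing in code: fix what can be fixed (stray spaces, inconsistent capitalization) and discard what cannot (missing names, malformed emails, duplicates). The `cleanse` function, its rules, and the sample records are assumptions, not a standard procedure.

```python
def cleanse(records):
    """Fix formatting problems; discard incomplete, incorrect, or duplicate rows."""
    cleaned, seen = [], set()
    for rec in records:
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        if not name or "@" not in email:
            continue  # incomplete or incorrect: discard
        if email in seen:
            continue  # duplicate of an earlier record: discard
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical dirty input.
raw = [
    {"name": " ann lee ", "email": "Ann@Example.com"},
    {"name": "Ann Lee", "email": "ann@example.com "},
    {"name": "", "email": "x@example.com"},
    {"name": "Bob", "email": "not-an-email"},
]
clean = cleanse(raw)
```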

prediction

a statement about what will happen or might happen in the future; for example, predicting future sales or employee turnover

regression model

a statistical process for estimating the relationships among variables
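
As an illustration, the simplest regression model fits a straight line between two variables using ordinary least squares. The `fit_line` helper and the ad-spend and sales figures below are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend vs. sales (arbitrary units).
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]
slope, intercept = fit_line(ad_spend, sales)
```

With these sample values the fitted line is sales = 2 × ad_spend + 1, since the points lie exactly on that line.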

data lake

a storage repository that holds a vast amount of raw data in its original format until the business needs it

data map

a technique for establishing a match, or balance, between the source data and the target data warehouse; identifies data shortfalls and recognizes data issues; can also alert managers to inconsistencies or help determine the cause and effects of enterprise-wide business decisions

cluster analysis

a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible; it groups similar attributes together to discover segments or clusters and then examines the attributes and values that define the clusters or segments

1. web browsers are much easier to use than directly accessing the database by using a custom-query tool
2. the web interface requires few or no changes to the database model
3. it costs less to add a web interface in front of a DBMS than to redesign and rebuild the system to support changes
4. easy to manage content because website owners can make changes without relying on MIS professionals and users can update a data-driven website with little or no training
5. easy to store large amounts of data because data-driven websites can keep large volumes of information organized. Website owners can use templates to implement changes for layouts, navigation, or website structure
6. easy to eliminate human errors because data-driven websites trap data-entry errors, eliminating inconsistencies while ensuring that all information is entered correctly

advantages to using data-driven websites

data-driven decision management

an approach to business governance that values decisions that can be backed up with verifiable data; the success of this approach is reliant upon the quality of the data gathered and the effectiveness of its analysis and interpretation

data point

an individual item on a graph or a chart

data set

an organized collection of data

encompasses all organizational information, and its primary purpose is to support the performance of managerial analysis tasks

analytical information

the data elements associated with an entity

attributes (columns, fields)

defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer

business rule

enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints

business-critical integrity constraints

comparative analysis

can compare two or more data sets to identify patterns and trends; employees can base their decisions on data sets, experience, or knowledge and, preferably, a combination of all three

information cube

common term for the representation of multi-dimensional information

data mart

contains a subset of data warehouse information

the person responsible for creating the original website content

content creator

the person responsible for updating and maintaining website content

content editor

compiles all of the metadata about the data elements in the data model

data dictionary

the smallest or basic unit of information e.g. customer's name, address, email, discount rate, preferred shipping method, product name, quantity ordered

data element (data field)

occurs when a company examines its data to determine if it can meet business expectations, while identifying possible data gaps or where missing data might exist

data gap analysis

refers to the overall management of the availability, usability, integrity, and security of company data

data governance

the time it takes for data to be stored or retrieved

data latency

logical data structures that detail the relationships among data elements using graphics or pictures

data models

responsible for ensuring the policies and procedures are implemented across the organization and acts as a liaison between the MIS department and the business

data steward

the management and oversight of an organization's data assets to help provide business users with high-quality data that is easily accessible in a consistent manner

data stewardship

includes the tests and evaluations used to determine compliance with data governance policies to ensure correctness of data

data validation

outlier

data value that is numerically distant from most of the other data points in a set of data
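
One common way to flag values that are "numerically distant" is a z-score test: measure how many standard deviations each value lies from the mean. This is only a sketch; the threshold of 2.0 and the sample order sizes are assumptions, and other rules (such as interquartile range) are also widely used.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing is distant
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical order sizes with one suspicious entry.
order_sizes = [10, 12, 11, 13, 12, 95]
outliers = find_outliers(order_sizes)
```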

a logical collection of information - gathered from many different operational databases - that supports business analysis activities and decision-making tasks

data warehouse

an interactive website kept constantly updated and relevant to the needs of its customers using a database

data-driven website

maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)

database

creates, reads, updates, and deletes data in a database while controlling access and security

database management system (DBMS)

data visualization

describes technologies that allow users to see or visualize data to transform information into a business perspective

correlation analysis

determines a statistical relationship between variables, often for the purpose of identifying predictive factors among the variables
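
A statistical relationship between two variables is often summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand; the `pearson` helper and the sample numbers are illustrative assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: store visits vs. purchases, then visits vs. complaints.
visits = [1, 2, 3, 4]
purchases = [2, 4, 6, 8]
complaints = [8, 6, 4, 2]
r_positive = pearson(visits, purchases)
r_negative = pearson(visits, complaints)
```

A value near 1 suggests the variables rise together, a value near -1 suggests one falls as the other rises, and a value near 0 suggests no linear relationship.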

variety

different forms of structured and unstructured data

an area of a website that stores information about products in a database

dynamic catalog

includes data that change based on user actions

dynamic information

stores information about a person, place, thing, transaction, or event

entity (table)

dirty data

erroneous or flawed data

market basket analysis

evaluates such items as websites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services
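
A very small sketch of the idea: count how often pairs of products appear in the same basket, then express that as a share of all baskets (often called support). The baskets below are invented, and real market basket analysis typically uses richer measures than pair counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout-scanner baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"milk"},
]

# Count every unordered pair of items bought together.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support: the fraction of all baskets that contain the pair.
support = {pair: count / len(baskets) for pair, count in pair_counts.items()}
```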

data scientist

extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information

a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables

foreign key
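
To make the link between the two tables concrete, here is a sketch using Python's built-in `sqlite3` module. The `customer` and `orders` tables are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# customer_id is the primary key of customer ...
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# ... and appears in orders as a foreign key linking each order to a customer.
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           total REAL
       )"""
)

conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")  # valid: customer 1 exists

try:
    # Customer 999 does not exist, so the database should refuse this row.
    conn.execute("INSERT INTO orders VALUES (101, 999, 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```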

1. variety 2. veracity 3. volume 4. velocity

four common characteristics of big data

1. information type 2. information timeliness 3. information quality 4. information governance

four primary traits that determine the value of information

classification analysis

groups similar attributes together to discover segments or clusters and then examines the attributes and values that define the clusters or segments

Where has the business been? Historical perspective offers important variables for determining trends and patterns. Where is the business now? Looking at the current business situation allows managers to take effective action to solve issues before they grow out of control. Where is the business going? Setting strategic direction is critical for planning and creating solid business strategies.

how managers can use BI to answer tough business questions:

Exploratory Data Analysis

identifies patterns in data, including outliers, uncovering the underlying structure to understand relationships between the variables.

source data

identifies the primary location where data is collected e.g. invoices, spreadsheets, time sheets, transactions, and electronic sources such as other databases

a broad administrative area that deals with identifying individuals in a system (such as a country, a network, or an enterprise) and controlling their access to resources within that system by associating user rights and restrictions with the established identity

identity management

refers to the extent of detail within the information (fine and detailed or coarse and abstract)

information granularity

occurs when the same data element has different values

information inconsistency

a measure of the quality of information

information integrity

occur when a system produces incorrect, inconsistent, or duplicate data

information integrity issues

the duplication of data, or the storage of the same data in multiple places

information redundancy

rules that help ensure the quality of information; the database design needs to consider these

integrity constraints

focuses on how individual users logically access information to meet their own particular business needs

logical view of information

the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems

master data management (MDM)

algorithms

mathematical formulas placed in software that performs an analysis on a data set

provides details about data; e.g., metadata for an image could include size, resolution, and date created

metadata

data visualization tools

move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more

analysis paralysis

occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome

deals with the physical storage of information on a storage device

physical view of information

forecasting model

predictions based on time-series information allowing users to manipulate the time series for forecasting activities
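
One of the simplest forecasting techniques on time-series information is a moving average: predict the next value as the mean of the most recent observations. This is only a sketch; the `forecast_next` helper, the window of 3, and the monthly sales figures are assumptions.

```python
def forecast_next(series, window=3):
    """Forecast the next value as the average of the last `window` observations."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical monthly sales, oldest first.
monthly_sales = [100, 110, 120, 130]
next_month = forecast_next(monthly_sales)
```

Changing `window` is one way a user might "manipulate the time series": a larger window smooths out noise, while a smaller one reacts faster to recent changes.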

infographics (information graphics)

presents the results of data analysis, displaying the patterns, relationships, and trends in a graphical format

a field (or group of fields) that uniquely identifies a given entity in a table

primary key

to combine information, more specifically, strategic information, throughout an organization into a single repository in such a way that the people who need that information can make decisions and undertake business analysis

primary purpose of a data repository:

distributed computing

processes and manages algorithms across many machines in a computing environment

helps users graphically design the answer to a question against a database

query-by-example (QBE) tool

immediate, up-to-date information

real-time information

provide real-time information in response to requests

real-time systems

a collection of related data elements

record

allows users to create, read, update, and delete data in a relational database

relational database management system

stores information in the form of logically related two-dimensional tables

relational database model

rules that enforce basic and fundamental information-based constraints

relational integrity constraints

a central location in which data is stored and managed

repository

1. business understanding 2. data understanding 3. data preparation 4. data modeling 5. evaluation 6. deployment

six primary phases in the data mining process

Data warehouses go even a step further by __________________ information (give an example)

standardizing; gender, for instance, can be referred to in many ways (Male, Female, M/F, 1/0), but it should be standardized in a data warehouse with one common way of referring to each data element that stores gender (M/F).
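
The gender example can be sketched as a lookup table that maps every source variant to the one warehouse code. Which of 1 or 0 means male is not stated in the card, so the mapping below (1 = M, 0 = F) is purely an assumption, as is the `standardize_gender` name.

```python
# Assumed mapping from source-system variants to the warehouse standard (M/F).
# The meaning of 1 and 0 is a guess for illustration; check the source system.
GENDER_CODES = {
    "male": "M", "m": "M", "1": "M",
    "female": "F", "f": "F", "0": "F",
}


def standardize_gender(value):
    """Return 'M' or 'F', or None when the value is not recognized."""
    return GENDER_CODES.get(str(value).strip().lower())


standardized = [standardize_gender(v) for v in ["Male", "F", 1, 0, " m "]]
```

Returning `None` for unrecognized values leaves the decision about unknown codes to a later cleansing step instead of guessing.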

includes fixed data incapable of change in the event of a user action

static information

business rule

stating that merchandise returns are allowed within 10 days of purchase is an example of a ________________________.
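
Because a business rule typically results in a yes/no or true/false answer, it maps naturally onto a function that returns a boolean. A minimal sketch of the 10-day return rule follows; the function name and the choice to count day 10 as still allowed are assumptions.

```python
from datetime import date


def return_allowed(purchase_date, return_date, window_days=10):
    """True when the return falls within `window_days` of the purchase."""
    elapsed = (return_date - purchase_date).days
    return 0 <= elapsed <= window_days


on_time = return_allowed(date(2024, 1, 1), date(2024, 1, 11))   # 10 days later
too_late = return_allowed(date(2024, 1, 1), date(2024, 1, 12))  # 11 days later
```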

asks users to write lines of code to answer questions against a database

structured query language (SQL)
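
For a sense of what "lines of code to answer questions" looks like, the sketch below runs one SQL query through Python's built-in `sqlite3` module. The `product` table and its rows are hypothetical; the question asked is "which furniture products do we sell, most expensive first?"

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("pen", "office", 1.5), ("desk", "furniture", 150.0), ("lamp", "furniture", 30.0)],
)

# The SQL statement itself: select columns, filter rows, sort the result.
query = """
    SELECT name, price
    FROM product
    WHERE category = ?
    ORDER BY price DESC
"""
rows = conn.execute(query, ("furniture",)).fetchall()
```

A QBE tool would let a manager build the same question graphically, without writing the `SELECT` statement by hand.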

velocity

the analysis of streaming data as it travels around the internet

fast data

the application of big data analytics to smaller data sets in near-real or real-time in order to solve a problem or create business value

pattern recognition analysis

the classification or labeling of an identified pattern in the machine learning process

data aggregation

the collection of data from various sources for the purpose of data processing

virtualization

the creation of a virtual (rather than actual) version of computing resources, such as an operating system, a server, a storage device, or network resources

data mining

the process of analyzing data to extract information not offered by the raw data alone

data profiling

the process of collecting statistics and information about data in an existing source

anomaly detection

the process of identifying rare or unexpected items or events in a data set that do not conform to other items in the data set

data replication

the process of sharing information to ensure consistency between multiple data sources

volume

the scale of data

analytics

the science of fact-based decision making; uses software-based algorithms and statistics to derive meaning from data

veracity

the uncertainty of data, including biases, noise, and abnormalities

1. optimization model 2. forecasting model 3. regression model

three data mining modeling techniques for predictions:

data, discovery, deployment

three elements of data mining:

1. data mining 2. data analysis 3. data visualization

three focus areas businesses are using to dissect, analyze, and understand organizational data

time-series information

time-stamped information collected at a particular frequency

business intelligence dashboards

track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis

encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support daily operational tasks

transactional information

transactional and analytical

two primary types of information

1. relational 2. business critical

two types of integrity constraints

data mining tools

use a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making

data quality audits

used to determine the accuracy and completeness of a firm's data

behavioral analysis

using data about people's behaviors to understand intent and predict future actions.

competitive monitoring

where a company keeps tabs on its competitors' activities on the web using software that automatically tracks all competitor website activities such as discounts and new products
