Chapter 9 MIS


cluster analysis

Unsupervised data mining using statistical techniques to identify groups of entities that have similar characteristics. A common use for cluster analysis is to find groups of similar customers in data about customer orders and customer demographics.
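
As a rough illustration of grouping similar customers, the sketch below uses k-means, which is one common clustering technique and not necessarily the one the course has in mind. The spend figures, the starting centers, and the function name `k_means_1d` are all hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around `centers` by repeated assign-and-average steps."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        # Assign each value to its nearest current center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical annual spend per customer.
annual_spend = [100, 120, 110, 900, 950, 1000]
centers, groups = k_means_1d(annual_spend, centers=[0, 1000])
print(centers, groups)
```

No labels are supplied in advance; the two groups (low spenders and high spenders) emerge from the data alone, which is what makes the technique unsupervised.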

MapReduce

A two-step technique for harnessing the power of thousands of computers working in parallel: a massive data set is broken into smaller pieces, the pieces are analyzed independently, and the results are then combined. Map phase: for example, a Google search log is broken into thousands of pieces, and hundreds or thousands of independent processors search these pieces for something of interest. Reduce phase: the results are combined; here the result is a list of all the terms searched for on a given day and the count of each. Implemented by Hadoop, which is written in Java.
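
The two phases can be imitated on a single machine. This is only a sketch of the idea: the search log, the number of pieces, and the function names are hypothetical, and a real job would run on a framework such as Hadoop, not in one Python process.

```python
from collections import Counter
from functools import reduce

# Hypothetical search log; each entry is one search term.
search_log = ["hadoop", "pig", "hadoop", "olap", "pig", "hadoop"]

def split_into_pieces(log, n_pieces):
    """Break the log into pieces, one per (simulated) processor."""
    return [log[i::n_pieces] for i in range(n_pieces)]

def map_phase(piece):
    """Each processor counts the terms in its own piece independently."""
    return Counter(piece)

def reduce_phase(partial_counts):
    """Combine the per-piece counts into one overall count per term."""
    return reduce(lambda a, b: a + b, partial_counts, Counter())

pieces = split_into_pieces(search_log, 3)
term_counts = reduce_phase(map(map_phase, pieces))
print(dict(term_counts))
```

Because each call to `map_phase` touches only its own piece, the calls could run on separate machines; only `reduce_phase` needs to see all the partial results.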

"Any" device

computers, mobile devices, office and other applications, cloud services; in short, anything

project management

create a partnership program between PRIDE competitors and local health clubs expand geographically

Acquire Data

The first step in the BI process: obtain, cleanse, organize and relate, and catalog source data.

KM benefits

improve process quality and increase team strength

BigData Analysis

involve both reporting and data mining techniques

KM goals

Enable employees to use organization's collective knowledge. BUT: Not many companies can afford

RFM Example

How recently (R) a customer has ordered How frequently (F) a customer ordered How much money (M) the customer has spent

confidence

In market-basket terminology, the conditional probability estimate that a customer will purchase one item given that the customer has purchased another.

informing

In what ways are clients using the new system? How do sales compare to our sales forecast?

Hadoop

Open-source program supported by the Apache Foundation that manages thousands of computers and implements MapReduce. Written in Java; can be run on server farms or in the cloud; supported by Amazon.com as part of its EC2 cloud offering. Includes a query language called Pig. Technical skills are needed to run and use it.

Components of BI

Operational DBMS, social data, purchased data, and employee knowledge feed the business intelligence application, which delivers business intelligence to knowledge workers.

Email or collaboration tool

Report type: static. Push options: manual. Skill level needed: low.

BI Software

A Crowded Space •Epicor •Sisense •SalesForce Analytics Cloud •Birst •QlikView •Looker •Datameer •Board All-in-One •Infor •IQMS •DOMO •Pentaho •Yellowfin •MicroStrategy •TIBCO Plus IBM Oracle Microsoft SAP

clickstream data

Data collected about user behavior and browsing patterns by monitoring users' activities when they visit a Web site.

BI application

RFM application, OLAP, other reports, market basket, decision tree, other data mining, context indexing, RSS feed, expert system

variety

different forms of data

Content Management System (CMS)

Information systems that support the management and delivery of documents, including reports, web pages, and other expressions of employee knowledge. Typical users are companies that sell complicated products and want to share their knowledge of those products with employees and customers.

Publish results

The last activity in the BI process: delivering business intelligence to the knowledge workers who need it, via print, web servers, report servers, or automation. Divided into push publishing and pull publishing. It can mean placing BI results on servers for publication to knowledge workers over the Internet or other networks, making results available via a web service for use by other applications, creating PDFs or PowerPoint presentations for communicating to colleagues or management, or reporting results to management in a team meeting.

5 basic reporting operations

1. Sorting 2. Filtering 3. Grouping 4. Calculating 5. Formatting. None is particularly sophisticated, and all can be accomplished using SQL and basic HTML or a simple report-writing tool, yet together they can produce complex and highly useful reports, e.g. RFM analysis and online analytical processing (OLAP).
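
The five operations can be walked through on a tiny data set. The order records and the filter threshold below are hypothetical, and plain Python stands in for the SQL and HTML mentioned above.

```python
from collections import defaultdict

# Hypothetical order records: (customer, region, amount).
orders = [
    ("Ajax", "West", 120.0),
    ("Bloom", "East", 80.0),
    ("Ajax", "West", 40.0),
    ("Crane", "East", 200.0),
]

# 1. Sorting: largest orders first.
sorted_orders = sorted(orders, key=lambda o: o[2], reverse=True)

# 2. Filtering: keep orders of at least 50.
filtered = [o for o in sorted_orders if o[2] >= 50]

# 3. Grouping: collect amounts by region.
by_region = defaultdict(list)
for _, region, amount in filtered:
    by_region[region].append(amount)

# 4. Calculating: total per region.
totals = {region: sum(amounts) for region, amounts in by_region.items()}

# 5. Formatting: one readable line per region.
report_lines = [f"{region}: ${total:,.2f}" for region, total in sorted(totals.items())]
print("\n".join(report_lines))
```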

disadvantages of expert systems

1. They are difficult and expensive to develop, requiring many labor hours from experts in domain under study and designers of expert systems 2. Difficult to maintain. 3. Unable to live up to their high expectations.

unsupervised data mining

Apply a data mining application to the data, observe the results, and create hypotheses AFTER the analysis to explain the patterns found. Findings are obtained solely by data analysis. Examples: cluster analysis, market-basket analysis, decision trees.

static reports

BI documents that are fixed at the time of creation and do not change; mostly published as PDF documents. Little skill is needed: the publisher just creates the content and attaches it to an email or puts it on the web or a SharePoint site.

dynamic reports

BI documents that are updated at the time they are requested; publishing requires the BI application to access a database or other data source at the time the report is delivered to the user, requiring high skills

Entertainment

BI produced from data on certain habits determines what people actually want Video - Legendary Pictures, "Persuade-ables", ie Godzilla & Analytics

Knowledge Management (KM)

Creating value from intellectual capital and sharing that knowledge with those who need it, such as employees, customers, and other partners. Includes preserving organizational memory. In hyper-social organizations, the scope of KM is the same as that of SM. Enables employees to better achieve an organization's strategy, solve problems more quickly, and accomplish work with less time and fewer resources.

Defining elements

Data sets are at least a petabyte in size, and usually larger. Data is generated rapidly (constantly and from many sources). Data consists of structured data, free-form text, log files, graphics, audio, and video.

lift

In market-basket terminology, the ratio of confidence to the base probability of buying an item. Lift shows how much the base probability changes when other products are purchased. If the lift is greater than 1, the change is positive; if it is less than 1, the change is negative
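
The three market-basket measures can be computed directly from a list of baskets. This is a minimal sketch: the baskets are made up, `probability` is a hypothetical helper, and confidence is taken to mean the conditional probability of buying one item given that another was bought.

```python
# Hypothetical transactions; each set is one customer's basket.
baskets = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask"},
    {"fins"},
    {"tank"},
]

def probability(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for b in baskets if items <= b) / len(baskets)

support = probability({"mask", "fins"}, baskets)       # P(mask and fins)
confidence = support / probability({"mask"}, baskets)  # P(fins | mask)
lift = confidence / probability({"fins"}, baskets)     # confidence / P(fins)
print(support, confidence, lift)
```

Here the lift comes out above 1, meaning that in this made-up data a customer who buys a mask is more likely than the average customer to buy fins.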

Acquire, Analyze, Publish

Process of obtaining, cleaning, organizing, relating and cataloging source data

BI as a publishing challenge

Publish Process of delivering BI to those who need it Data Visualization

Web Server

Report type: static/dynamic. Push options: alert/RSS. Skill level needed: low for static, high for dynamic.

SharePoint

Report type: static/dynamic. Push options: alert/RSS, workflow. Skill level needed: low for static, high for dynamic.

If/Then Rules

Statements that specify that if a particular condition exists, then a particular action should be taken. Used in different ways, by both expert systems and decision tree data mining.

BI publishing alternatives

Static reports and dynamic reports. Pull options are the same for every alternative; push options vary. Alternatives: email or collaboration tool, web server, SharePoint, BI server.

BI analysis

The process of creating business intelligence. The four fundamental categories of BI analysis are reporting, data mining, BigData, and knowledge management.

deciding

Which competitions generate the most ad revenue? (Develop more of the best competitions.) Which drones and related equipment are in need of maintenance?

reporting application

a BI application that inputs data from one or more sources and applies reporting operations to that data to produce business intelligence

BI server

A web server application that is purpose-built for the publishing of business intelligence. Report type: dynamic. Push options: alert/RSS, subscription. Skill level needed: high.

Data Marts

A data collection, smaller than the data warehouse, that addresses the needs of a particular department or functional area of the business. A subset of the data warehouse: a summarized or highly focused portion of the firm's data for use by a specific population of users, typically focused on a single subject or line of business. Divided into data and BI tools for analysis and management. Analogous to a retail store: data mart users obtain data pertaining to a particular business function from the data warehouse; they do not have the data management expertise that data warehouse employees have, but they are knowledgeable analysts for a given business function.

Preserving organizational memory

capturing and storing lessons learned and best practices of key employees

Using Business Intelligence to find candidate parts

Create a team to examine past sales data, identify which part designs qualify to be sold, and compute how much revenue potential those parts represent. Obtain an extract of sales data from the IS department, then create five criteria for parts that might qualify for this new program.

data acquisition

the process of obtaining, cleaning, organizing, relating, and cataloging source data

cross-selling

the sale of related products to customers based on salesperson knowledge, market-basket analysis, or both

business intelligence application

The software component of a BI system; analyzes data through reporting, data mining, BigData, and knowledge management. Divided into BI data source, BI application, and BI application result.

too many data points

too many rows of data = NOT HELPFUL! In order to meaningfully analyze such data, we need to reduce the amount of data! One good solution to this problem is statistical sampling.
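
A simple random sample is one way to cut the number of rows. The sketch below assumes a hypothetical data set of one million row identifiers and an arbitrary sample size of 1,000; it uses the standard library's `random.Random.sample`, which draws without replacement.

```python
import random

# Hypothetical data set: one million row identifiers.
rows = range(1_000_000)

rng = random.Random(42)           # fixed seed so the sample is repeatable
sample = rng.sample(rows, 1_000)  # simple random sample without replacement

# Any analysis now runs on 1,000 rows instead of 1,000,000.
sample_mean = sum(sample) / len(sample)
print(len(sample), sample_mean)
```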

veracity

uncertainty of data

subscriptions

user requests for particular BI results on a particular schedule or in response to particular events

The "V's" of Big Data

volume, variety, velocity, veracity

drill down

with an OLAP report, to further divide the data into more detail
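
Drilling down amounts to regrouping the same measure by an additional dimension. The sales rows and the helper `total_by` below are hypothetical and only illustrate the idea; an OLAP product would do this interactively.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, city, sales). Sales is the measure;
# region and city are dimensions.
sales = [
    ("West", "Seattle", 100),
    ("West", "Portland", 50),
    ("East", "Boston", 70),
    ("East", "Boston", 30),
]

def total_by(rows, key):
    """Sum the sales measure for each distinct value of key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[2]
    return dict(totals)

by_region = total_by(sales, lambda r: r[0])
# Drill down: divide each region's total into more detail by city.
by_region_city = total_by(sales, lambda r: (r[0], r[1]))
print(by_region, by_region_city)
```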

reporting applications

•Create meaningful information from disparate data sources. •Deliver information to user on time.

loan portfolio

a group of loans

velocity

analysis of streaming data

use of BI

project management, problem solving, deciding, informing

Too many attributes

too many columns; can be problematic

Perform Analysis

reporting, data mining, Big Data, and knowledge management

Resistance to Knowledge Sharing

Employees can be reluctant to exhibit their ignorance: out of fear of appearing incompetent, they may not submit entries to blogs or discussion groups. Such reluctance can sometimes be reduced by the attitude and posture of managers. One strategy in this situation is to provide private media that can be accessed only by a smaller group of people who have an interest in a specific problem and who can discuss the issue in a less inhibiting forum. A second source of resistance is employee competition.

Hyper-organization theory

-framework for understanding KM -focus shifts from knowledge and content to fostering authentic relationships among knowledge creators and users

Components of a Data Warehouse

A physical storage location for the data (the warehouse); software to copy original databases and transfer them to the warehouse; interactive software to allow processing of inquiries; and a directory for the categories of information kept in the warehouse. Data flow: operational databases, other internal data, and external data feed the data extraction/cleaning/preparation programs, which feed the data warehouse DBMS. The DBMS stores the prepared data in the data warehouse database, works with the data warehouse metadata (which describes the data, e.g. its source, format, assumptions, and constraints), and extracts and provides data to business intelligence tools, which in turn interact with business intelligence users.

Drawbacks

1. Difficult and expensive to develop: labor intensive; ties up domain experts. 2. Difficult to maintain: changes cause unpredictable outcomes; constantly need expensive changes. 3. Don't live up to expectations: can't duplicate the diagnostic abilities of humans.

dimension

A characteristic of an OLAP measure. Purchase date, customer type, customer location, and sales region are examples of dimensions.

market basket analysis

A data mining technique for determining sales patterns. A market-basket analysis shows the products that customers tend to buy together. Can estimate the probability that a customer will purchase an item

Online Analytical Processing (OLAP)

A dynamic type of reporting system that provides the ability to sum, count, average, and perform other simple arithmetic operations on groups of data. Such reports are dynamic because users can change the format of the reports while viewing them. has measures and dimensions

OLAP cube

A presentation of an OLAP measure with associated dimensions; same as an OLAP report. The term arises because some products show these displays using three axes, like a cube in geometry. For example, data taken from a sample database provided with SQL Server can be displayed in many ways with Excel: users can alter the format, change the order of dimensions, drill down into the data, and view the data from different perspectives. This flexibility comes at a cost, including the substantial computing power needed to do the calculating, grouping, and sorting for dynamic displays. Standard commercial DBMS products do have the functions and features required to create OLAP reports, but they are not designed for such work; they are instead designed to provide rapid response to transaction-processing applications.

expert system shell

A program in an expert system that processes a set of rules, typically many times, until the values of the variables no longer change, at which point the system reports the results. It processes the IF side of the rules and reports the values of all variables; the rules encode knowledge gathered from human experts.
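
The loop described above can be sketched in a few lines. The inventory rules, the variable names, and `run_shell` are hypothetical; commercial shells are far more elaborate.

```python
# Hypothetical rules: (IF-side conditions that must all hold, variable to set, value).
rules = [
    ({"reorder_needed": True}, "notify_buyer", True),
    ({"stock_low": True, "demand_high": True}, "reorder_needed", True),
]

def run_shell(facts, rules):
    """Apply the rules repeatedly until no variable changes, then report."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for conditions, variable, value in rules:
            if_side_holds = all(facts.get(k) == v for k, v in conditions.items())
            if if_side_holds and facts.get(variable) != value:
                facts[variable] = value
                changed = True
    return facts

result = run_shell({"stock_low": True, "demand_high": True}, rules)
print(result)
```

The first rule cannot fire until the second has set `reorder_needed`, so the shell needs more than one pass over the rules before the values stop changing.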

RFM analysis

A technique readily implemented with basic reporting operations to analyze and rank customers according to their purchasing patterns: recency, frequency, and monetary value. To produce an RFM score: sort customer purchase records by date of most recent (R) purchase; divide the sorted records into quintiles; give customers a score of 1 to 5 according to their quintile; repeat the process for frequency (F) and money (M, the amount spent on orders).
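
The scoring steps can be sketched as follows. The customer figures and the helper `quintile_scores` are hypothetical, and the sketch assumes that a score of 1 marks the best quintile, a convention the card itself does not state.

```python
def quintile_scores(values, descending):
    """Map each customer to a 1-5 score; 1 is the best 20% (assumed convention).

    `values` maps customer -> number; `descending` says whether larger is better.
    """
    ranked = sorted(values, key=values.get, reverse=descending)
    n = len(ranked)
    return {cust: (i * 5) // n + 1 for i, cust in enumerate(ranked)}

# Hypothetical customers: (days since last order, number of orders, total spent).
customers = {
    "A": (3, 20, 5000),
    "B": (40, 2, 150),
    "C": (10, 12, 900),
    "D": (200, 1, 50),
    "E": (90, 5, 2000),
}

recency = {c: v[0] for c, v in customers.items()}
frequency = {c: v[1] for c, v in customers.items()}
money = {c: v[2] for c, v in customers.items()}

r = quintile_scores(recency, descending=False)   # fewest days since last order is best
f = quintile_scores(frequency, descending=True)  # most orders is best
m = quintile_scores(money, descending=True)      # most money spent is best

rfm = {c: (r[c], f[c], m[c]) for c in customers}
print(rfm)
```

Under that convention, customer A scores (1, 1, 1): ordered most recently, most often, and spent the most.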

regression analysis

A type of supervised data mining that estimates the values of parameters in a linear equation. Used to determine the relative influence of variables on an outcome and also to predict future values of that outcome.
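
For a single input variable, the parameters of the linear equation can be estimated with ordinary least squares. The ad-spend and sales figures are hypothetical and chosen to lie exactly on a line; `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: ad spend (input) and sales (outcome).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # predict the outcome for a new input
print(slope, intercept, predicted)
```

The slope indicates how strongly the input influences the outcome, and the fitted equation gives the prediction for a value not in the data.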

the Singularity

According to Ray Kurzweil, the point at which computer systems become sophisticated enough that they can create and adapt their own software and hence adapt their behavior without human assistance.

Data Brokers

Acxiom "Database contains information on about 500M consumers...with about 1500 data points per person"

Challenges of Content Management

Databases are huge. Content is dynamic. Documents do not exist in isolation from each other: documents refer to one another, and when one changes, others must change as well, so the CMS must maintain linkages among documents so that content dependencies are known and used to maintain document consistency. Contents are perishable: documents become obsolete and need to be altered, removed, or replaced. Content may be provided in many languages.

Functions of a data warehouse

Extract data from operational, internal, and external databases; cleanse the data; organize and relate the data; catalog the data using metadata.

data warehouse

A facility for managing an organization's BI data; includes data purchased from outside sources, which is not unusual or concerning from a privacy standpoint. Analogous to a distributor: it takes data from the data manufacturers (operational systems and other sources), cleans and processes the data, and locates the data. The data analysts who work there are experts at data management, data cleaning, data transformation, data relationships, etc., but are not usually experts in a given business function.

supervised data mining

A form of data mining in which data miners develop a model prior to the analysis and apply statistical techniques to the data to estimate the values of the model's parameters. Examples: regression analysis and neural networks. A regression tool creates the equation, but considerable skill is required to interpret the model's quality, which depends on statistical factors.

decision tree

A hierarchical arrangement of criteria that predict a classification or a value; an unsupervised data mining technique: the analyst sets up the computer program and provides the data to analyze, and the decision tree program produces the tree. Example: classifying loans by likelihood of default. Organizations analyze data from past loans to produce a decision tree that can be converted to loan-decision rules. A financial institution could use such a tree to assess the default risk on a new loan; institutions that sell groups of loans to one another could use the results of a decision tree program to evaluate the risk of a given loan portfolio before purchasing it. Decision trees are easy to understand and to implement using decision rules, can work with many types of variables, and deal well with partial data. Organizations can use decision trees by themselves, combine them with other techniques, or, in some cases, use them to select variables that are then used by other types of data mining tools.
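
Once a tree has been produced, converting it to decision rules is mostly a matter of writing each path as an if/then statement. In the sketch below, the criteria, the thresholds, and the function `classify_loan` are entirely hypothetical; a real tree would come from a decision tree program run on past loan data.

```python
def classify_loan(percent_past_due, credit_score, loan_to_value):
    """Walk a hand-written, hypothetical loan tree from the root to a leaf."""
    # Hypothetical first split: how much of the loan is already past due.
    if percent_past_due >= 0.5:
        return "default likely"
    # Hypothetical second split: borrower credit score and loan-to-value ratio.
    if credit_score > 570 and loan_to_value < 0.94:
        return "default unlikely"
    return "needs review"

# Evaluate a small hypothetical portfolio (a group of loans).
portfolio = [
    (0.6, 700, 0.50),
    (0.1, 650, 0.80),
    (0.1, 500, 0.80),
]
risk = [classify_loan(*loan) for loan in portfolio]
print(risk)
```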

neural networks

a popular supervised data mining technique used to predict values and make classifications, such as "good prospect" or "poor prospect"

BigData

A term used to describe data collections that are characterized by huge volume, rapid velocity, and great variety, and that far exceed those of traditional reporting and data mining. "A massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques." Defining elements: data sets are at least a petabyte in size, and usually larger; data is generated rapidly; data includes structured data, free-form text, log files, and possibly graphics, audio, and video.

rich directory

An employee directory that includes not only the standard name, email, phone, and address, but also organizational structure and expertise. With it, it is possible to determine where in the organization an individual works, who is the first common manager between two people, what past projects and expertise an individual has, and, for international organizations, which languages the individual speaks. Particularly useful in large organizations where people with particular expertise are unknown.

Data Visualization

Any effort to help people understand the significance of data by placing it in a visual context. Patterns, trends, and correlations that might go undetected in text-based data can be exposed and recognized more easily.

Hyper-social KM alternative media

Blogs (public or private; best for defender of belief); discussion groups, including FAQs (public or private; best for problem solving); wikis (public or private; best for either); surveys (public or private; best for problem solving); rich directories, e.g. Active Directory (private; best for problem solving); standard SM such as Facebook and Twitter (public; best for defender of belief); YouTube (public; best for either).

Data Warehouses Versus Data Marts

Data producers deliver data to the data warehouse DBMS, which interacts with the data warehouse metadata and the data warehouse database and feeds different types of data marts, which in turn support work such as design features, analysis, or layout.

BI Primary Activities

Data sources, then acquire data, perform analysis, and publish results. Results are pushed to knowledge workers or pulled by them, and feedback on results flows back to the data sources.

push publishing

delivers business intelligence to users without any request from the users; BI results delivered according to schedule or as a result of an event or particular data condition

Possible Problems with Source Data

Dirty (problematic) data; missing values; inconsistent data (i.e. values difficult to obtain; these occur from the nature of business activity); data not integrated; wrong granularity (too fine or not fine enough); too much data (too many attributes, too many data points). Although data that is critical for successful operations must be complete and accurate, marginally necessary data need not be.

Data Mining Techniques

Emerged from the combined disciplines of statistics, mathematics, artificial intelligence, and machine learning. Can be sophisticated and difficult to use well, but they are valuable to organizations, and some business professionals have become expert in their use. Divided into unsupervised and supervised techniques.

how BI is Used

Identifying changes in purchasing patterns; entertainment; just-in-time medical reporting.

Purchasing patterns

Important life events cause customers to change what they buy and, for a short interval, to form new loyalties to new brands. Examples: Amazon and predictive behavior (many YouTube videos); Target video.

support

in market-basket terminology, the probability that two items will be purchased together

content management alternatives

in-house custom, off-the-shelf, public search engine

knowledge workers

Individuals valued for their ability to interpret and analyze information; they include analysts in the home office as well as operations and field personnel who use BI to approve loans, order goods, and decide when to prescribe.

Business Intelligence Systems

Information systems that process operational, social, and other data to identify patterns, relationships, and trends for use by business professionals and other knowledge workers. They have the five standard components: hardware, software, data, procedures, and people. Their boundaries are blurry.

Two Functions of a BI Server

Management and delivery. The BI application provides results to the BI server, which works with metadata and delivers results to "any" device through push or pull for BI users. The BI server maintains metadata about the authorized allocation of BI results to users: it tracks what results are available, which users are authorized to view those results, and the schedule upon which the results are provided to the authorized users, and it adjusts allocations as available results change and users come and go. All management data needed by any BI server is stored in metadata; the amount and complexity of such data depends on the functionality of the BI server. BI servers use metadata to determine what results to send to which users and, possibly, on which schedule; users expect BI results to be delivered to "any" device.

Consumer data that can be purchased

name, address, phone; age; gender; ethnicity; religion; income; education; voter registration; home ownership; vehicles; magazine subscriptions; hobbies; catalog orders; marital status, life stage; height, weight, hair and eye color; spouse name, birth date; children's names and birth dates

BI data source

operational data, data warehouse, data mart, content material, human interviews

data sources

operational databases, social data, purchased data, employee knowledge

problem solving

A problem is a perceived difference between what is and what ought to be; BI can be used to determine what is as well as what should be. Examples: How can we save money by rerouting drone flights? How can we increase ad revenue from competitions?

Five Criteria

1. Provided by certain vendors (starting with just a few vendors that had already agreed to make part design files available for sale). 2. Purchased by larger customers (individuals and small companies would be unlikely to have 3D printers or the needed expertise to use them). 3. Frequently ordered (popular products). 4. Ordered in small quantities (3D printing is not suited for mass production). 5. Simple in design (easier to 3D print; difficult to evaluate since the company doesn't store data on part complexity per se).

Just-in-Time Medical Reporting

Provides injection notification services to doctors during exams: data is entered, the software analyzes patient records, and it recommends injection prescriptions when needed.

Pig

Query language platform for large data set analysis. Easy to master and extensible; automatically optimizes queries at the MapReduce level.

Granularity

Refers to the level of detail in the model or the decision-making process; can be too fine or too coarse; data that is too fine can be made coarser by summing and combining.

pull publishing

requires the user to request BI results

Expert Systems

Rule-based systems that encode human knowledge in the form of if/then rules; created by interviewing human experts in the domain of interest.

volume

scale of data

analyze data

second step of BI process; combines data into single table; filters criteria diversely; helps answer certain questions from business intelligence

decision support system

some authors define business intelligence (BI) systems as supporting decision making only, in which case they use this older term as a synonym for decision-making BI systems

business intelligence users

specialists in data analysis

report servers

specialized web servers

Hyper-social knowledge management

The application of social media and related applications for the management and delivery of organizational knowledge resources. Open airing of product-use issues may make traditional marketing personnel uncomfortable, but this KM technique does insert the company in the middle of customer conversations about possible product problems; while the organization does lose control, it is at least a party to those conversations.

data mining

The application of statistical techniques to find patterns and relationships among data for classification and prediction; also known as knowledge discovery in databases.

measures

the data item of interest on an OLAP report. It is the item that is to be summed, averaged, or otherwise processed in the OLAP cube. Ex. Total sales, average sales, and average cost

curse of dimensionality

the more attributes there are, the easier it is to build a model that fits the sample data but that is worthless as a predictor

Business Intelligence

The patterns, relationships, and trends identified by BI systems; also called analytics. "Business Intelligence is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insights and decision-making." Used everywhere, especially in the realm of digital marketing, where it is estimated to grow from $12B in 2014 to $120B by 2026 and to exceed IT budgets.

