Chapter 9: Business Intelligence Systems


difference between BI users and knowledge workers

-BI users: specialists in data analysis
-knowledge workers: nonspecialist users of BI results

problems with operational data

-dirty data
-missing values
-inconsistent data
-nonintegrated data
-wrong granularity (level of detail)
-too much data (too many attributes/data points)

3 primary activities in the BI process

1. acquire data
2. perform analysis
3. publish results

3 disadvantages of expert systems

1. difficult and expensive to develop
2. difficult to maintain
3. unable to live up to the high expectations set by their name
*note: few expert systems have been successful

2 reasons for resistance to hyper-social KS

1. employees can be reluctant to exhibit their ignorance (e.g., may not submit blog entries out of fear of appearing incompetent)
2. employee competition
*note: endorsement can be effective in curbing these types of resistance, especially if followed up by strong positive feedback

3 alternatives for content management applications

1. in-house custom (expensive)
2. off-the-shelf (more functionality, less expensive)
3. public search engine (e.g., Google, Bing; not everything is publicly accessible)

challenges of CMS

1. most databases are huge
2. CMS content is dynamic
3. documents do not exist in isolation from each other (they refer to each other; when one changes, the others must change as well)
4. document contents are perishable (become obsolete, need changing)
5. content is provided in many languages

functions of a data warehouse

1. obtain data
2. cleanse data
3. organize and relate data
4. catalog data

organizational use of BI (hierarchy; top of pyramid is #1)

1. project management
2. problem solving
3. deciding
4. informing
note: informing is needed to decide; deciding is needed prior to problem solving; problem solving is necessary for project management

static reports

BI documents that are fixed at the time of creation and do not change; static content requires only low skill, e.g., PDF documents

dynamic reports

BI documents that are updated at the time they are requested; publishing them requires the BI application to access a database or other data source at the time the report is delivered to the user; dynamic content requires more skill, e.g., a sales report that is current at the time it is accessed

content management systems (CMS)

information systems that support the management and delivery of documents, including reports, Web pages, and other expressions of employee knowledge; typical users are companies that sell complicated products and want to share their knowledge of those products with employees and customers

Hadoop is written in ______ and originally ran on ______

Java; Linux

synonym for OLAP report

OLAP cube

Hadoop includes a query language titled ______

Pig

T or F: business intelligence is used to predict purchasing patterns & changes in purchasing patterns

T

T or F: placing BI applications on operational servers can dramatically reduce systems performance

T

T or F: publishing media include print as well as online content delivered via Web servers, specialized Web servers known as "report servers," and BI results that are sent via automation to other programs

T

T or F: in hyper-organization theory, the focus moves from knowledge and content to fostering authentic relationships among the creators and the users of that knowledge

T; in other words, they move from controlled processes to messy ones

reminder: the most important part of an IS is...

YOU! the "people" portion

BI server

a Web server application that is purpose-built for the publishing of business intelligence, e.g., Microsoft SQL Server Report Manager

data mart

a data collection, smaller than a data warehouse, that addresses the needs of a particular department or functional area of the business

data warehouse

a facility for managing an organization's BI data; may include data purchased from outside sources

MapReduce

a technique for harnessing the power of thousands of computers working in parallel; BigData is broken into pieces and hundreds or thousands of independent processors search these pieces for something of interest
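
The split-then-combine idea can be sketched on a single machine. This is only an illustration, not Hadoop: the function names, the thread pool standing in for independent processors, and the sample search-log pieces are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one piece of the data."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_phase(pairs):
    """Reduce step: combine the pairs from all pieces into one count per word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def map_reduce(chunks, workers=4):
    # Each piece is mapped independently of the others; that independence is
    # what allows the real technique to spread work across many computers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    all_pairs = [pair for pairs in mapped for pair in pairs]
    return reduce_phase(all_pairs)


# Hypothetical pieces of a search log, invented for illustration.
chunks = ["flu symptoms", "flu shot near me", "cold symptoms"]
counts = map_reduce(chunks)
print(counts)
```

Here the "something of interest" is simply how often each word appears across all pieces.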

rich directory

an employee directory that includes not only the standard name, email, phone, and address but also organizational structure and expertise

unsupervised data mining

analysts do not create a model or hypothesis before running the analysis; apply a data mining application to the data and observe the results; hypothesis created AFTER the analysis

data mining

application of statistical techniques to find patterns and relationships among data for classification and prediction; also called knowledge discovery in databases (KDD)

dimension

a characteristic of a measure

neural networks

common supervised application used to predict values and make classifications such as "good prospect" or "poor prospect" customers; complicated set of possibly nonlinear equations

regression analysis

common supervised technique which measures the effect of a set of variables on another variable
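
A minimal sketch of the simplest case, one explanatory variable fitted by ordinary least squares. The function name and the advertising/sales figures are hypothetical; real analyses usually involve several variables and a statistics package.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one explanatory variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend versus sales, invented for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

The slope is the estimated effect of the explanatory variable on the other variable; the model (a straight line) is chosen before the analysis, which is what makes the technique supervised.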

cluster analysis

common unsupervised technique where statistical techniques identify groups of entities that have similar characteristics

confidence

conditional probability estimate that considers additional probabilities, such as the proportion of customers who bought a swim mask who also bought fins; expressed as a decimal number

supervised data mining

data miners develop a model PRIOR to the analysis and apply statistical techniques to data to estimate parameters of the model

organizations use ______________________ to select variables that are then used by other types of data mining tools

decision trees

push publishing

delivers BI to users without any request from the users; BI results are delivered according to a schedule or as a result of an event or particular data condition

It is better to have too __________ a granularity than too ____________

fine; coarse

market-basket analysis

first typical data mining tool; an unsupervised data mining technique for determining sales patterns; shows products that customers tend to buy together

drill down

further divide the data into more detail

BI system's 5 components

hardware, software, data, procedures, people

cross-selling opportunity

idea that customers who buy product X also tend to buy product Y; related to market-basket analysis

business intelligence (BI) systems

information systems that process operational, social, and other data to identify patterns, relationships, and trends for use by business professionals

2 major functions of BI servers

management and delivery
management: maintains metadata about the authorized allocation of BI results to users

all management data needed by any of the BI servers is stored in ___________

metadata

Hadoop

open source program supported by the Apache Foundation that implements MapReduce on potentially thousands of computers; began as part of the Apache Nutch project; deep technical skills/experts needed to use this

business intelligence

patterns, relationships, trends, and predictions in a BI system

predictive policing

police departments analyze data on past crimes, including location, date, time, day of week, type of crime, and related data, to predict where crimes are likely to occur; they then station police personnel in the best locations for preventing those crimes

publish results

process of delivering business intelligence to the knowledge workers who need it; last activity in the BI process

expert system shells

programs that process a set of rules; typically, a shell processes rules until no value changes
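
The "until no value changes" loop can be sketched as follows. This is an assumed toy design, not any real shell: each rule is a hypothetical (condition, variable, value) triple, and the medical-style rules and facts are invented for illustration.

```python
def run_rules(facts, rules, max_passes=100):
    """Apply If/Then rules repeatedly until a full pass changes no value."""
    facts = dict(facts)  # work on a copy so the caller's facts are untouched
    for _ in range(max_passes):
        changed = False
        for condition, variable, value in rules:
            # IF the condition holds THEN set the variable to the value.
            if condition(facts) and facts.get(variable) != value:
                facts[variable] = value
                changed = True
        if not changed:
            return facts
    # Guard against rule sets that keep overwriting each other forever.
    raise RuntimeError("rules did not settle")


# Hypothetical rules, listed out of order on purpose so several passes are needed.
rules = [
    (lambda f: f.get("suspect_flu"), "action", "refer to physician"),
    (lambda f: f.get("fever") and f.get("cough"), "suspect_flu", True),
    (lambda f: f.get("temperature_f", 0) > 100.4, "fever", True),
]
result = run_rules({"temperature_f": 101.2, "cough": True}, rules)
print(result)
```

The first pass only derives the fever, the second derives the suspected flu, the third sets the action, and a final pass confirms that nothing changes.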

online analytical processing (OLAP)

provides the ability to sum, count, average, and perform other simple arithmetic operations on groups of data; has measures and dimensions
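
A minimal sketch of the idea: sum one measure over groups defined by one or more dimensions. The function name, field names, and sales rows are hypothetical; real OLAP servers do this over far larger data.

```python
from collections import defaultdict


def olap_sum(rows, measure, dimensions):
    """Sum one measure over every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Hypothetical sales rows, invented for illustration.
sales = [
    {"region": "West", "store_type": "Mall", "net_sales": 100.0},
    {"region": "West", "store_type": "Outlet", "net_sales": 50.0},
    {"region": "East", "store_type": "Mall", "net_sales": 70.0},
    {"region": "West", "store_type": "Mall", "net_sales": 30.0},
]

# One dimension: net sales (the measure) by region.
by_region = olap_sum(sales, "net_sales", ["region"])

# Adding a second dimension divides the same totals into more detail.
by_region_and_type = olap_sum(sales, "net_sales", ["region", "store_type"])
print(by_region)
print(by_region_and_type)
```

Adding the second dimension is the same move as drilling down: the West total is split into its Mall and Outlet parts.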

Push and pull options for static/dynamic reports

pull: same for each of the servers
push: varies by server type, e.g., email/collaboration is manual, while Web servers/SharePoint may create alerts and RSS feeds to have a server push content when content is created/changed (see "subscriptions")

4 fundamental categories of BI analysis

reporting, data mining, BigData, knowledge management

pull publishing

requires the user to request BI results

expert systems

rule-based systems that encode human knowledge in the form of If/Then rules

decision tree

second typical data mining tool; hierarchical arrangement of criteria that predict a classification or a value; easy to understand and implement using decision rules; work with many types of variables as well as partial data

If/Then rules

statements that specify that if a particular condition exists, then some action is to be taken

OLAP reports often require....

substantial computing power

decision support systems

synonym for "decision-making BI systems"; not used in the rest of the chapter

RFM analysis

technique, readily implemented with basic reporting operations, used to analyze and rank customers according to their purchasing patterns; order of importance: 1. recent 2. frequent 3. money (amount) spent
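
One way such a ranking might be computed, as a sketch under assumptions: each customer gets a 1 to 5 score per criterion by 20 percent bands with 1 as the best, and M is taken as total amount spent. The scoring scheme, function names, and order data are all hypothetical choices for the example.

```python
from datetime import date


def band_scores(values, best_is_high):
    """Score each customer 1-5 by rank; 1 marks the best 20 percent.

    Ties are broken by the order in which customers first appear.
    """
    ranked = sorted(values, key=values.get, reverse=best_is_high)
    n = len(ranked)
    return {cust: (position * 5) // n + 1 for position, cust in enumerate(ranked)}


def rfm(orders, today):
    """Return {customer: (R, F, M)} from (customer, order_date, amount) rows."""
    last, freq, money = {}, {}, {}
    for cust, when, amount in orders:
        last[cust] = max(last.get(cust, when), when)
        freq[cust] = freq.get(cust, 0) + 1
        money[cust] = money.get(cust, 0.0) + amount
    days_since = {cust: (today - when).days for cust, when in last.items()}
    r = band_scores(days_since, best_is_high=False)  # fewer days is better
    f = band_scores(freq, best_is_high=True)         # more orders is better
    m = band_scores(money, best_is_high=True)        # more spent is better
    return {cust: (r[cust], f[cust], m[cust]) for cust in last}


# Hypothetical order history, invented for illustration.
orders = [
    ("Ann", date(2024, 6, 28), 500.0),
    ("Ann", date(2024, 6, 1), 300.0),
    ("Bob", date(2024, 6, 20), 100.0),
    ("Cat", date(2024, 6, 25), 50.0),
    ("Cat", date(2024, 5, 25), 50.0),
    ("Cat", date(2024, 4, 25), 50.0),
    ("Dan", date(2024, 1, 15), 20.0),
    ("Eve", date(2024, 3, 1), 900.0),
]
scores = rfm(orders, today=date(2024, 6, 30))
print(scores)
```

Only sorting and counting are involved, which is why the technique counts as reporting rather than data mining.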

hyper-social knowledge management

the application of social media and related applications for the management and delivery of organizational knowledge resources; provides a framework for understanding KM

measures

the data item of interest

curse of dimensionality

the more attributes there are, the easier it is to build a model that fits the sample data but that is worthless as a predictor

the Singularity

the point at which computer systems become sophisticated enough that they can adapt and create their own software and hence adapt their behavior without human assistance

support

the probability that two items will be purchased together

BI analysis

the process of creating business intelligence

knowledge management (KM)

the process of creating value from intellectual capital and sharing that knowledge with employees, managers, suppliers, customers, and others who need that capital; goal is to prevent problems

data acquisition

the process of obtaining, cleansing, organizing, relating, and cataloging source data

lift

the ratio of confidence to the base probability of buying an item
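
Support, confidence, and lift can be computed together from a list of transactions. A minimal sketch, assuming each transaction is a set of item names; the function name and the dive-shop baskets are invented for illustration.

```python
def market_basket_metrics(transactions, item_a, item_b):
    """Return (support, confidence, lift) for 'buyers of item_a also buy item_b'."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if item_a in t)
    count_b = sum(1 for t in transactions if item_b in t)
    count_both = sum(1 for t in transactions if item_a in t and item_b in t)
    if n == 0 or count_a == 0 or count_b == 0:
        raise ValueError("both items must appear in at least one transaction")
    support = count_both / n            # probability the two are bought together
    confidence = count_both / count_a   # probability of item_b given item_a
    lift = confidence / (count_b / n)   # confidence over base probability of item_b
    return support, confidence, lift


# Hypothetical dive-shop baskets, invented for illustration.
transactions = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask"},
    {"fins"},
    {"tank"},
]
support, confidence, lift = market_basket_metrics(transactions, "mask", "fins")
print(support, confidence, lift)
```

A lift above 1 means buying the first item raises the likelihood of buying the second; a lift below 1 means it lowers it.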

BigData

used to describe data collections that are characterized by huge volume, velocity, and variety
-at least a petabyte in size
-generated rapidly
-has structured data, free-form text, log files, graphics, audio, video

subscriptions

user requests for particular BI results on a particular schedule or in response to particular events, e.g., a daily sales report

could a value of zero in the analysis stage be problematic?

yes; such problematic data is common in data extracts

