Business Intelligence Midterm

a user interface

(e.g., dashboard)

•Enterprise data warehouse (EDW)

- A data warehouse for the enterprise.

business performance management

-(BPM) for monitoring and analyzing performance

•Centralized System

-1 super computer, several 'dumb' terminals -Limited geographical range

•Data warehouse

-A pool of data produced to support decision making -A repository of current and historical data of potential interest to managers throughout the organization

•Dimensional Modeling

-A retrieval-based system that supports high-volume query access

•Operational data stores (ODS)

-A type of database often used as an interim area for a data warehouse that tends to provide fairly recent information -Good for short term mission critical decisions

•Important criteria in selecting an ETL tool

-Ability to read from and write to an unlimited number of data sources/architectures -Automatic capturing and delivery of metadata -A history of conforming to open standards -An easy-to-use interface for the developer and the functional user

•Snowflake schema

-An extension of star schema where the diagram resembles a snowflake in shape

•OLTP (online transaction processing)

-Capturing and storing data from ERP, CRM, POS, ... -The main focus is on efficiency of routine tasks

•OLAP (Online analytical processing)

-Converting data into information for decision support -Data cubes, drill-down / rollup, slice & dice, ... -Requesting ad hoc reports -Conducting statistical and other analyses -Developing multimedia-based applications
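
Roll-up and drill-down can be illustrated with a small sketch. The sales rows and the `aggregate` helper below are hypothetical and only meant to show the idea: roll-up summarizes at a coarser level (year), while drill-down breaks the same totals out at a finer level (year and quarter).

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, amount)
SALES = [
    (2023, "Q1", "East", 100),
    (2023, "Q1", "West", 150),
    (2023, "Q2", "East", 120),
    (2024, "Q1", "East", 130),
    (2024, "Q1", "West", 170),
]


def aggregate(rows, key_indexes):
    """Sum the amount (last field) grouped by the chosen dimension columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[-1]
    return dict(totals)


# Roll-up: summarize at the coarser "year" level.
by_year = aggregate(SALES, [0])

# Drill-down: break the same totals out by year and quarter.
by_year_quarter = aggregate(SALES, [0, 1])
```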

Kimball Model

-Data mart approach (bottom-up)

•The major components of data warehouse concept

-Data sources: -Data extraction and transformation -Data loading -Comprehensive database -Metadata -Middleware tools

•Issues affecting the purchase of an ETL tool

-Data transformation tools are expensive -Data transformation tools may have a long learning curve -It is difficult to measure how the IT organization is doing until it has learned to use the data transformation tools

•Degree of Structuredness (Simon, 1977)

-Decisions are classified as •Highly structured (a.k.a. programmed) •Semi-structured •Highly unstructured (i.e., nonprogrammed)

•Need for current data (1970s)

-Direct access in storage medium server -Data independence -Concurrent access

Inmon Model

-EDW approach (top-down)

•Stone tablets

-First known "data record" dated back to around 4000 BC -Tablets used to record assets and taxes -Archives

•Motivations for Distributed Databases

-Global competition: immediate and efficient data access -Locate data where it is gathered and/or shared, but also facilitate data sharing to other locations -Data accessibility and reliability

•Sponsorship Level

-Level of sponsorship for the data warehousing initiative

•Punch card systems

-Origins around 1800s -US Census of 1890s -Great for counting, tabulating, etc. -Eventually became more generic

•Types of Control (Anthony, 1965)

-Strategic planning (top-level, long-range) -Management control (tactical planning) -Operational control

•Perceived ability of IT staff

-The extent of the perceived ability of the IT staff in terms of technical skills, successful experiences and confidence in developing data warehouses

•Resource Constraints

-The extent to which IT personnel, business unit personnel, and monetary resources were unavailable for building the data warehouse

•Strategic View

-The extent to which implementing a data warehouse was viewed as being important to supporting strategic initiatives

•Information Interdependence

-The extent to which tasks and their outcomes are contingent upon information from one or more other organizational units

•Urgency

-The extent to which there was an urgent need to build the data warehouse

•Task Routineness

-The extent to which users' jobs required non-routine data analyses

•Star schema

-The most commonly used and the simplest style of dimensional modeling -Contains a fact table surrounded by and connected to several dimension tables
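
A minimal sketch of a star schema, using Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_region`) and the sample rows are assumptions made up for illustration; the point is only that one fact table holds the measures and points to each dimension table by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "by what" of each measurement.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);

-- The fact table sits in the middle and references every dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    amount     REAL
);

INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
INSERT INTO fact_sales  VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 75.0);
""")

# A typical query joins the fact table to a dimension and aggregates.
sales_by_product = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```

A snowflake schema would go one step further and split a dimension such as `dim_product` into additional normalized tables.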

•Issues to consider when deciding which architecture to use:

-Which database management system (DBMS) should be used? -Will parallel processing and/or partitioning be used? -Will data migration tools be used to load the data warehouse? -What tools will be used to support data retrieval and analysis?

business analytics

-a collection of tools for manipulating, mining, and analyzing the data in the data warehouse

Output

-attainment of goals

Time

-daily, weekly, monthly, quarterly, or yearly

Measures

-money, sales volume, head count, inventory profit, actual versus forecast

Measure of success

-outputs / inputs
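
The ratio can be written as a one-line function. This is only a sketch: the function name `measure_of_success` and the sample figures are hypothetical, and real outputs and inputs would need to be expressed in comparable units.

```python
def measure_of_success(outputs: float, inputs: float) -> float:
    """Return outputs divided by inputs (goal attainment per unit of resources)."""
    if inputs <= 0:
        raise ValueError("inputs must be positive")
    return outputs / inputs


# Hypothetical figures: 500 units of output from 200 units of resources.
ratio = measure_of_success(500.0, 200.0)
```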

Dimensions

-products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry

Inputs

-resources

a data warehouse

-with its source data

- 5 levels of maturity:

1)Ad hoc 2)Discovered 3)Managed 4)Optimized 5)Automated

Interpersonal

1. Figurehead 2. Leader 3. Liaison

•Three-tier architecture

1.Data acquisition software (back-end) 2.The data warehouse that contains the data & software 3.Client (front-end) software that allows users to access and analyze data from the warehouse

Ten factors that potentially affect the architecture selection decision

1.Information interdependence between organizational units 2.Upper management's information needs 3.Urgency of need for a data warehouse 4.Nature of end-user tasks 5.Constraints on resources 6.Strategic view of the data warehouse prior to implementation 7.Compatibility with existing systems 8.Perceived ability of the in-house IT staff 9.Technical issues 10.Social/political factors

Evolution of Data Management

1.Manual systems 2.File systems 3.Centralized database systems 4.Distributed database systems 5.Client/server databases -Review of Relational Model 6.Current Data Management Trends 7.Data warehouses/Data marts

Informational

4. Monitor 5. Disseminator 6. Spokesperson

Decisional

7. Entrepreneur 8. Disturbance handler 9. Resource allocator 10. Negotiator

Data Warehouses/Data Marts

A data warehouse is an integrated, subject-oriented, time-variant, nonvolatile database that provides support for decision making. - William H. Inmon

Data Mart

A departmental small-scale "DW" that stores only limited/relevant data -Subset of a data warehouse typically consisting of a single subject area

-Independent data mart

A small data warehouse designed for a strategic business unit or a department

-Dependent data mart

A subset that is created directly from a data warehouse
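
One way to picture a dependent data mart is as a view defined over a warehouse table. The sketch below uses Python's built-in `sqlite3` module; the names `warehouse_sales` and `marketing_mart` and the sample rows are hypothetical, and a real mart might be a separately loaded database rather than a view.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
CREATE TABLE warehouse_sales (dept TEXT, product TEXT, amount REAL);
INSERT INTO warehouse_sales VALUES
    ('Marketing', 'Widget', 100.0),
    ('Finance',   'Widget',  40.0),
    ('Marketing', 'Gadget',  60.0);

-- The mart is derived directly from the warehouse table,
-- limited to a single subject area (one department).
CREATE VIEW marketing_mart AS
    SELECT product, amount
    FROM warehouse_sales
    WHERE dept = 'Marketing';
""")

mart_rows = warehouse.execute(
    "SELECT product, amount FROM marketing_mart ORDER BY product"
).fetchall()
```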

•Enterprise application integration (EAI)

A technology that provides a vehicle for pushing data from source systems into a data warehouse

•Enterprise information integration (EII)

An evolving tool space that promises real-time data integration from a variety of sources, such as relational or multidimensional databases, Web services, etc.

A concise definition of BI (& Analytics)

Business Intelligence & Analytics are the techniques, technologies, systems, practices, methodologies and applications that analyze critical business data to help an enterprise better understand its business and market and make timely business decisions.

Metadata

Data about data. - In a data warehouse, this describes the contents of a data warehouse and the manner of its acquisition and use

Now:

Everybody's Information System (BI)

Then

Executive Information System

•Two-tier architecture

The first two tiers of the three-tier architecture are combined into one

•Data integration

Integration that comprises three major processes: data access, data federation, and change capture.

DSS-BI Connections

Similarities and differences? Similar architectures, data focus, ... Direct vs. indirect support Different target audiences Commercially available systems versus in-house development of solutions Origination - Industry vs. Academia So, is DSS = BI ?

Characteristics of DWs

Subject oriented Integrated Time-variant (time series) Nonvolatile

Multidimensionality

The ability to organize, present, and analyze data by several dimensions, such as sales by region, by product, by salesperson, and by time (four dimensions)

Hoffer et al. (2007) divide the data warehouse into 3 parts

The data warehouse itself Data acquisition (back-end) software Client (front-end) software

Alternative DW architectures

·Independent data marts architecture ·Data mart bus architecture with linked dimensional data marts ·Hub-and-spoke architecture (corporate information factory) ·Centralized data warehouse architecture ·Federated architecture

Importance of BI

•#1 Priority for CIOs •Businesses that are evidence-based generally outperform their competition, in terms of growth, by a factor of two or more •Data is an organizational resource like capital, labor, and land -Basis for several existing businesses "You can't be analytical without data, and you can't be really good at analytics without really good data" - Tom Davenport, PhD

Random Access

•1956 IBM 305 RAMAC (Random Access Method of Accounting and Control)

The Architecture of BI

•A BI system has four major components

Technical

•A physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format

Analytics Overview

•Analytics? -Something new or just a new name for ... •A Simple Taxonomy of Analytics (proposed by INFORMS) -Descriptive Analytics -Predictive Analytics -Prescriptive Analytics •Analytics or Data Science?

A Framework for Business Intelligence (BI)

•BI is an evolution of decision support concepts over time •BI systems are enhanced with additional visualizations, alerts, and performance measurement capabilities •The term BI emerged from industry

What is Business Intelligence (BI)?

•BI is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies •BI is a content-free expression, so it means different things to different people •BI's major objective is to enable easy access to data (and models) to provide business managers with the ability to conduct analysis •BI helps transform data, to information (and knowledge), to decisions, and finally to action

Organizational Responses

•Be Reactive, Anticipative, Adaptive, and Proactive •Managers may take actions, such as -Employ strategic planning -Use new and innovative business models -Restructure business processes -Participate in business alliances -Improve corporate information systems -Improve partnership relationships

Additional DW Considerations Hosted Data Warehouses

•Benefits: -Requires minimal investment in infrastructure -Frees up capacity on in-house systems -Frees up cash flow -Makes powerful solutions affordable -Enables solutions that provide for growth -Offers better quality equipment and software -Provides faster connections -Enables users to access data from remote locations -Allows company to focus on core business -Meets storage needs for large volume of data

Distributed Databases: Goals

•Create a single-image, transparent environment where the database users and application programs can work at the local level and still be able to access and share data with other sites in a network •Hide performance complexities of the distributed databases from users and application programs

Why Data Warehouses?

•Data integration through the subject areas with a company-wide view of high-quality information (from disparate databases) •Data integration through time to provide a time-perspective view of organizational data •Data integration of internal and external data to provide a complete view of the business performance

MAIN Types of data warehouses

•DataMart •ODS •EDW

Distributed Systems

•Database is not just in one physical location; network connections

The Concept of DSS

•Decision support systems couple the intellectual resources of individuals with the capabilities of the computer to improve the quality of decisions. •DS as an Umbrella Term •Evolution of DS into Business Intelligence

File Systems

•Electronic storage medium invented in 1950s •High level programming languages (Cobol) •Batch processing •Many applications -Accounting •General-ledger •Payroll -Finance •Banking -Operations •Inventory •Shipping Invoices -HR •Contract Management

ETL

•Extract Transform Load
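
The three steps can be sketched with the Python standard library. Everything specific here is an assumption for illustration: the `RAW_CSV` sample, the `orders` table, and the cleaning rules (drop rows with a missing amount, trim whitespace, normalize region names). Real ETL tools handle many more sources and rules.

```python
import csv
import io
import sqlite3

# Hypothetical source data with messy spacing, a missing amount, and mixed case.
RAW_CSV = "order_id,amount,region\n1, 10.50 ,east\n2,,west\n3, 7.25 ,EAST\n"


def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean values and drop rows that cannot be used."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        region = row["region"].strip().title()
        clean.append((int(row["order_id"]), float(amount), region))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


target = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), target)
loaded = target.execute(
    "SELECT order_id, amount, region FROM orders ORDER BY order_id"
).fetchall()
```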

Information Systems Support For Decision Making

•Group communication and collaboration •Improved data management •Managing data warehouses and Big Data •Analytical support •Overcoming cognitive limits in processing and storing information •Knowledge management •Anywhere, anytime support

Managerial Decision Making

•Management is a process by which organizational goals are achieved by using resources

Decision-making Process

•Managers usually make decisions by following a four-step process (a.k.a. the scientific approach) 1.Define the problem (or opportunity) 2.Construct a model that describes the real-world problem. 3.Identify possible solutions to the modeled problem and evaluate the solutions. 4.Compare, choose, and recommend a potential solution to the problem.

Centralized System

•Need for current data (1970s) •Centralized System

Another alternative?

•One alternative is the hosted warehouse •A hosted warehouse has much of the same functionality •It uses DW functionalities without consuming in-house computer resources •Alternative?

Closing the Strategy Gap

•One of the major objectives of computerized decision support is to facilitate closing the gap between the current performance of an organization and its desired performance, as expressed in its mission, objectives, and goals, and the strategy to achieve them.

What are some factors influencing the choice of architecture?

•Organizations with high interdependence are more likely to select an EDW than an IDM architecture. •Organizations that view the implementation of a data warehouse as a short-term point solution rather than a strategic infrastructure project are more likely to select an IDM than an EDW architecture. •Organizations with low resource availability are more likely to select an IDM than an EDW architecture. •Organizations that have an IT staff with low perceived ability are more likely to select an IDM than an EDW architecture. •Organizations with upper management sponsorship are more likely to select an EDW than an IDM architecture.

Client-Server Database Architecture

•Servers handle the majority of the processing work -Software installed on a computer that typically has more RAM, CPU, storage -Independent of clients -Common types: Application layer and Database layer -Accept SQL commands through standards such as Open Database Connectivity (ODBC), Java Database Connectivity (JDBC) •Clients send requests to server(s) -Presentation layer

Current Data Management Trends

•Specialized needs/applications •Relax the ACID assumptions -BigData -NoSQL •No need for the full overhead of the relational model •There are different data tools available based on your needs

1. Manual Systems

•The need for data has a long history

A Brief History of BI

•The term BI was coined by the Gartner Group in the mid-1990s •However, the concept is much older -1970s - MIS reporting - static/periodic reports -1980s - Executive Information Systems (EIS) -1990s - OLAP, dynamic, multidimensional, ad-hoc reporting -> coining of the term "BI" -2010s - Inclusion of AI and Data/Text Mining capabilities; Web-based Portals/Dashboards, Big Data, Social Media, Analytics -2020s - yet to be seen

How to differentiate a data warehouse from a database?

•A data warehouse is a database, albeit with certain characteristics to facilitate its role in decision support •Remember definition, 4 main characteristics: "integrated, time-variant, nonvolatile, subject-oriented repository"

Simple

•a pool of data produced to support decision making

Slice

•a subset of a multidimensional array
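
A small sketch of slicing, assuming a hypothetical three-dimensional cube indexed as `cube[region][product][quarter]`. Fixing one dimension (the quarter) leaves a two-dimensional subset of the array.

```python
# Hypothetical dimension labels, in the order used to index the cube.
REGIONS = ["East", "West"]
PRODUCTS = ["Widget", "Gadget"]
QUARTERS = ["Q1", "Q2"]

# cube[region][product][quarter] holds a sales amount.
cube = [
    [[100, 120], [75, 80]],   # East: Widget, Gadget
    [[150, 160], [60, 65]],   # West: Widget, Gadget
]


def slice_by_quarter(cube, quarter):
    """Fix the quarter dimension and return the region x product subset."""
    q = QUARTERS.index(quarter)
    return [
        [product_cells[q] for product_cells in region_cells]
        for region_cells in cube
    ]


q1_slice = slice_by_quarter(cube, "Q1")
```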

Oper marts

•an operational data mart.

Integrated

•data warehouse must place data from different sources into consistent form.
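
For example, different source systems might encode the same attribute in different ways. The sketch below assumes three hypothetical sources that record gender as `m`/`f`, `Male`/`Female`, and `1`/`0`; the mapping table is made up for illustration.

```python
# Hypothetical mapping from each source's encoding to one consistent code.
GENDER_CODES = {
    "m": "M", "male": "M", "1": "M",
    "f": "F", "female": "F", "0": "F",
}


def standardize_gender(value):
    """Convert any known source encoding to the warehouse's single format."""
    return GENDER_CODES.get(str(value).strip().lower(), "UNKNOWN")


source_a = ["m", "f"]
source_b = ["Male", "Female"]
source_c = [1, 0]

integrated = [standardize_gender(v) for v in source_a + source_b + source_c]
```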

Distribution Transparency

•execute global queries and transactions as though the database were a single centralized database; the user can ignore: -Data location -Data fragmentation -Data replication
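
A toy sketch of the idea, not how a real distributed DBMS is built. The `DistributedTable` class and the site names are hypothetical; the point is that the caller issues one query and never says which site or fragment holds the rows.

```python
class DistributedTable:
    """Presents fragments stored at several sites as one logical table."""

    def __init__(self, fragments):
        # fragments: site name -> list of row dicts (hidden from the caller)
        self._fragments = fragments

    def select(self, predicate):
        """Run one global query; location and fragmentation stay hidden."""
        results = []
        for site_rows in self._fragments.values():
            results.extend(row for row in site_rows if predicate(row))
        return results


customers = DistributedTable({
    "site_ny": [{"id": 1, "city": "New York"}, {"id": 2, "city": "Boston"}],
    "site_la": [{"id": 3, "city": "Los Angeles"}],
})

# The caller queries "customers" as if it were a single centralized table.
all_ids = sorted(row["id"] for row in customers.select(lambda row: True))
boston = customers.select(lambda row: row["city"] == "Boston")
```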

Time-variant (time series):

•have historical data, provide trends, deviations (may include current status, but not necessarily)

DSS

•interactive computer-based systems, which help decision makers utilize data and models to solve unstructured problems

Subject oriented

•organized according to subjects, such as business unit(s)

Decision making

•selecting the best solution from two or more alternatives

Nonvolatile

•users cannot change or update the data

