MIS 420 - Exam 1 Study Set
*Plan (Closed Loop BPM Cycle)*
Budgets, plans, forecasts, models, initiatives, targets.
*Business Pressures-Responses-Support Model:* The Business Environment
Businesses operate in a very complex environment due to globalization, which brings about fierce competition. Business environment factors can be divided into four major categories: Markets, Consumer Demands, Technology, and Societal.
Sources of Big Data
Click streams from websites, postings on social media sites, and data from traffic, sensors, or weather.
Decision Support System (DSS)
Conceptual framework for a process of supporting managerial decision making, usually by modeling problems and employing quantitative models for solution analysis.
Oper Mart
Created when operational data needs to be analyzed multidimensionally. The data for an oper mart come from an ODS.
Traditional Reporting
Creating a report from scratch, sometimes using several different data sources. This is a time-consuming endeavor.
Major Components of BI
Data Warehouse, Business Analytics, Business Performance Management (BPM), User Interface (Dashboard)
Integration Technologies That Enable Data and Metadata Integration
Enterprise Application Integration (EAI); Service-Oriented Architecture (SOA); Enterprise Information Integration (EII); Extraction, Transformation, and Load (ETL)
Multidimensional Analysis
Informational Analysis on data which takes into account many different relationships, each of which represents a dimension.
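A minimal sketch of the idea, assuming a made-up sales data set with three dimensions (region, product, quarter). The `sales` rows and the `roll_up` helper are hypothetical, not from the course material; they only show how the same facts can be summarized along different dimensions.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, revenue).
sales = [
    ("East", "Widget", "Q1", 100),
    ("East", "Gadget", "Q1", 150),
    ("West", "Widget", "Q1", 200),
    ("East", "Widget", "Q2", 120),
    ("West", "Gadget", "Q2", 80),
]

DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, *dims):
    """Sum revenue grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # revenue is the last field
    return dict(totals)


# The same facts viewed along one dimension, then along two.
by_region = roll_up(sales, "region")
by_region_quarter = roll_up(sales, "region", "quarter")
print(by_region)
print(by_region_quarter)
```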
Data Integration
Integration that comprises three major processes: *data access* (the ability to access and extract data from any data store), *data federation* (the integration of business views across multiple data stores), and *change capture* (based on the identification, capture, and delivery of the changes made to enterprise data sources). When these three processes are correctly implemented, data can be accessed and made accessible to an array of ETL and analysis tools and data warehousing environments.
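To make *change capture* concrete, here is a rough sketch that compares two snapshots of a source table. Snapshot comparison is only one simple way to identify changes and is an assumption here; the `capture_changes` function and the sample records are hypothetical.

```python
def capture_changes(old, new):
    """Compare two snapshots (dicts keyed by primary key) and list the changes."""
    changes = []
    for key in sorted(new.keys() - old.keys()):
        changes.append(("insert", key, new[key]))
    for key in sorted(old.keys() - new.keys()):
        changes.append(("delete", key, old[key]))
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            changes.append(("update", key, new[key]))
    return changes


# Hypothetical customer records before and after a day of activity.
before = {1: "Acme, Tucson", 2: "Globex, Phoenix"}
after = {1: "Acme, Mesa", 3: "Initech, Tempe"}

changes = capture_changes(before, after)
for change in changes:
    print(change)
```

Only the captured changes, not the whole table, would then need to be delivered to downstream tools.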
*Act/Adjust (Closed Loop BPM Cycle)*
Interpret, collaborate, assess, decide, act, adjust, track.
How Can Data Warehouse Tech Help Enable Analytics?
The data warehouse is separated from the day-to-day OLTP applications that drive the core business, thereby reducing contention for both operational transactions and analytical queries.
Web-Based Data Warehousing
Three-tiered and includes the PC client, Web server, and application server. On the client side, the user needs an Internet connection and a web browser with the familiar GUI.
Features of KPIs
* Strategy * Targets * Ranges * Encodings * Time frames * Benchmarks
10 Factors That Affect Architecture Selection
1. Information interdependence between organizational units. 2. Upper management's information needs. 3. Urgency of need for a data warehouse. 4. Nature of end-user tasks. 5. Constraints on resources. 6. Strategic view of the data warehouse prior to implementation. 7. Compatibility with existing systems. 8. Perceived ability of the in-house IT staff. 9. Technical issues. 10. Social/political factors.
Extraction, Transformation and Load (ETL)
A data warehousing process that consists of *extraction* (reading data from a database), *transformation* (converting the extracted data from its previous form into the form in which it needs to be so that it can be placed into a data warehouse or simply another database), and *load* (putting the data into the data warehouse). ETL tools also transport data between sources and targets, document how data elements (metadata) change as they move between source and target, exchange metadata with other applications as needed, and administer all runtime processes and operations.
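The three ETL steps can be sketched with Python's built-in `sqlite3` module. This is a toy illustration under stated assumptions, not a real ETL tool: the `orders` and `fact_orders` tables, their columns, and the cleanup rules are all made up for the example.

```python
import sqlite3

# Source (operational) database with hypothetical order rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " alice ", 1250), (2, "BOB", 990), (3, "alice", 400)],
)

# Target (data warehouse) database.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, customer TEXT, amount_usd REAL)"
)


def extract(conn):
    """Extraction: read the raw rows from the source database."""
    return conn.execute(
        "SELECT id, customer, amount_cents FROM orders ORDER BY id"
    ).fetchall()


def transform(rows):
    """Transformation: clean up names and convert cents to dollars."""
    return [(oid, name.strip().title(), cents / 100) for oid, name, cents in rows]


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


load(warehouse, transform(extract(source)))
loaded = warehouse.execute(
    "SELECT order_id, customer, amount_usd FROM fact_orders ORDER BY order_id"
).fetchall()
print(loaded)
```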
Data Model
A description of how data should be used to meet the requirements given by the end user.
Balanced Scorecard
A management system that maps an organization's strategic objectives into performance metrics in four perspectives: * Financial * Internal Processes * Customers * Learning & Growth These perspectives provide relevant feedback as to how well the strategic plan is executing so that adjustments can be made as necessary.
Lean Manufacturing
Production methodology focused on the elimination of waste or non-value-added features in a process.
Major Components of Data Warehousing Process
*Data Sources*: Data are sourced from multiple independent operational "legacy" systems and possibly from external data providers. *Data Extraction and Transformation*: Data are extracted and properly transformed using custom written or commercial software called ETL. *Data Loading*: Data are loaded into a staging area, where they are transformed and cleansed. The data are then ready to load into the data warehouse and/or data marts. *Comprehensive Database*: The EDW to support all decision analysis by providing relevant summarized and detailed information originating from many different sources. *Metadata*: Maintained so they can be assessed by IT personnel and users. Metadata include software programs about data and rules for organizing data summaries that are easy to index and search, especially with web tools. *Middleware Tools*: Enable access to the data warehouse.
DMAIC
*Define* the goals, objectives, and boundaries of the improvement activity. *Measure* the existing system. *Analyze* the system to identify ways to eliminate the gap between the current performance of the system or process and the desired goal. *Improve*: initiate actions to eliminate the gap by finding ways to do things better, cheaper, or faster. *Control*: institutionalize the improved system by modifying compensation and incentive systems, policies, procedures, manufacturing resource planning, budgets, operation instructions, or other management systems.
Characteristics of Data Warehousing
*Subject Oriented*: Data are organized by detailed subject, such as sales, products, or customer, containing only information relevant for decision support. *Integrated*: Place data from different sources into a consistent format. *Time Variant*: Maintains historical data. Detects trends, deviations, and long-term relationships for forecasting and comparisons, leading to decision making. *Nonvolatile*: Users cannot change or update the data. Obsolete data are discarded, and changes are recorded as new data. *Web Based* *Relational/multidimensional* *Client/Server*: Easy access for end users. *Real Time*: Newer DWs provide real time, or active, data access and analysis capabilities. *Include Metadata*
3 Tier Architecture
*Tier 1: Client Workstation* Client (front end) software, which allows users to access and analyze data from the warehouse. *Tier 2: Application Server* Data acquisition (back end) software, which extracts data from legacy systems and external sources, consolidates and summarizes them, and loads them into the data warehouse. *Tier 3: Database Server* The data warehouse itself, which contains the data and associated software.
Characteristics of Big Data
*Volume* - Big Data doesn't sample, it just observes and tracks what happens. *Velocity* - Often available in real time. *Variety* - Draws from text, images, and audio.
When Choosing an Architecture, Ask These Questions
*Which DBMS should be used?* *Will parallel processing and/or partitioning be used?* Parallel processing enables multiple CPUs to process data warehouse query requests simultaneously and provides scalability. *Will data migration tools be used to load the data warehouse?* *What tools will be used to support data retrieval and analysis?*
Six Sigma
A performance management methodology aimed at reducing the number of defects in a business process to as close to zero Defects Per Million Opportunities (DPMO) as possible.
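DPMO is commonly computed as defects divided by total opportunities, scaled to one million. A small worked sketch follows; the `dpmo` function name and the sample numbers are assumptions for illustration only.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects Per Million Opportunities."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity")
    return defects * 1_000_000 / total_opportunities


# Hypothetical process: 1,000 invoices, 5 ways each can go wrong, 17 defects found.
example = dpmo(defects=17, units=1_000, opportunities_per_unit=5)
print(example)  # 3400.0
```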
Data Warehouse
A physical repository where relational data are specially organized to provide enterprise wide, cleansed data in a standardized format.
Data Mining
A process that uses statistical, mathematical, artificial intelligence, and machine learning techniques to extract and identify useful information and subsequent knowledge from large databases.
Dependent Data Mart
A subset that is created from the data warehouse.
Operational Data Store (ODS)
A type of database often used as an interim area for a data warehouse, especially for customer information files (CIFs).
Business Performance Management (BPM)
An advanced performance measurement and analysis approach that embraces planning and strategy. It encompasses 3 components: 1) A set of integrated, closed-loop management and analytic processes that addresses financial as well as operational activities. 2) Tools for businesses to define strategic goals and then measure and manage performance against those goals. 3) A core set of processes, including financial and operational planning, consolidation and reporting, modeling, analysis, and monitoring of key performance indicators (KPIs), linked to organizational strategy. Also Known As - * Corporate Performance Management (CPM) * Enterprise Performance Management (EPM) * Strategic Enterprise Management (SEM).
Enterprise Information Integration (EII)
An evolving tool space that promises real time data integration from a variety of sources, such as relational databases, Web services, and multidimensional databases.
Business Intelligence (BI)
An umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies. *BI means different things to different people* BI's major objective is to enable interactive access to data, to enable manipulation of data, and to give business managers and analysts the ability to conduct appropriate analysis. The process of BI is based on the transformation of data to information (descriptive analytics), then to decisions (predictive analytics?), and finally to actions (prescriptive analytics?).
Performance Measurement Systems
Assist managers in tracking the implementations of business strategy by comparing actual results against strategic goals and objectives.
Enterprise Data Warehouse (EDW)
Large-scale data warehouse that is used across the enterprise for decision support.
Key Performance Indicator (KPI)
Measure of performance against a strategic objective and goal. There are "outcomes", AKA lagging indicators, and "drivers", AKA leading indicators or value drivers. Operational areas covered by these metrics: * Customer Performance * Service Performance * Sales Operations * Sales plan/forecast
*Strategize (Closed Loop BPM Cycle)*
Mission, values, goals, objectives, incentives, strategy maps.
*Business Pressures-Responses-Support Model:* Organizational Responses
Organizations need to be Reactive, Anticipative, Adaptive, and Proactive using different actions to counter the pressures of today's business environment.
*Monitor/Analyze (Closed Loop BPM Cycle)*
Performance dashboards, reports, analytical tools.
*Enterprise Application Integration (EAI)*
Provides a vehicle for pushing data from source systems into a data warehouse. TEST QUESTION: Can be used to facilitate data acquisition directly into a near-real-time data warehouse or to deliver decisions to the OLTP systems.
Interactive Reporting
Provides executives, business users, and analysts with intuitive user-directed relational query capabilities. From interactive dashboards in a readily available context, to complex data modeling for fully ad hoc access to source data, Interactive Reporting can pull together data from disparate sources to produce easy to use charts, pivot tables, highly formatted reports, and more.
Cold Chain
Set of safe handling practices that ensure vaccines and immunologicals requiring refrigeration are maintained at the required temperature from the time of manufacture until the time of administration to patients.
Centralized Data Warehouse Architecture
Similar to Hub-and-Spoke except that there are no dependent data marts; instead, there is a gigantic enterprise data warehouse that serves the needs of all organizational units.
Independent Data Mart
Small warehouse designed for a strategic business unit (SBU) or a department, but its source is not an EDW. Difficult to get the "one version of the truth". Simplest and the least costly architecture alternative. Poorest architectural solution (pg. 56)
Data Federation
Something that joins data from different sources distributed around the company without actually moving it from the original source.
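As a rough stand-in for federation, SQLite's `ATTACH DATABASE` lets one connection query several separate databases in place. The `crm` and `billing` databases and their tables below are hypothetical; real federation products work across different systems and servers, which this sketch does not attempt.

```python
import sqlite3

# One connection, two separate in-memory databases standing in for two source systems.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 75.0)]
)

# The join reads both sources where they live; nothing is copied into a new store.
federated = conn.execute(
    """
    SELECT c.name, SUM(i.total)
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(federated)
```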
*Closed Loop BPM Cycle*
Strategize, Plan, Monitor/Analyze, Act/Adjust
2 Tier Architecture
The DSS engine physically runs on the same hardware platform as the data warehouse. More economical than 3 Tier. May have performance problems for large data warehouses that work with data intensive applications for decision support.
Prescriptive Analytics *KNOW THIS*
The area of business analytics dedicated to finding the best course of action for a given situation. Provides a decision or recommendation for a specific action.
Chain of Custody
The chronological documentation or paper trail, showing the seizure, custody, control, transfer, analysis, and disposition of physical or electronic evidence.
Descriptive Analytics *KNOW THIS*
The examination of data or content, usually manually performed, to answer the question "what happened?" It is characterized by traditional Business Intelligence and visualizations such as pie charts, bar charts, line graphs, tables, or generated narratives.
Hub-and-Spoke Architecture
The most famous data warehousing architecture today. Includes a centralized data warehouse and several dependent data marts. Allows for easy customization of user interfaces and reports. May lead to data redundancy and data latency.
Denormalization
The process of attempting to optimize the read performance of a database by adding redundant data or by grouping data. In some cases, denormalization helps cover up the inefficiencies inherent in relational database software. A relational normalized database imposes a heavy access load over physical storage of data even if it is well tuned for high performance.
Normalization
The process of efficiently organizing data in a database. Two goals of the process: 1. Eliminating redundant data 2. Ensuring data dependencies make sense (only storing related data in a table). This reduces the amount of space a database consumes and ensures that data is logically stored.
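The Normalization and Denormalization cards can be contrasted with a small `sqlite3` sketch. The table and column names are made up for illustration: the normalized design stores each customer once and joins at query time, while the denormalized `orders_wide` table repeats customer fields on every order so reads need no join.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    -- Normalized: customer details are stored once and referenced by id.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Acme', 'Tucson'), (2, 'Globex', 'Phoenix');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);

    -- Denormalized: customer fields are copied onto every order row (redundant data).
    CREATE TABLE orders_wide AS
        SELECT o.id AS order_id, c.name AS customer_name,
               c.city AS customer_city, o.amount
        FROM orders AS o JOIN customers AS c ON c.id = o.customer_id;
    """
)

# Same question, two designs: total order amount per city.
normalized = db.execute(
    """
    SELECT c.city, SUM(o.amount)
    FROM orders AS o JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.city ORDER BY c.city
    """
).fetchall()
denormalized = db.execute(
    "SELECT customer_city, SUM(amount) FROM orders_wide "
    "GROUP BY customer_city ORDER BY customer_city"
).fetchall()
print(normalized)
print(denormalized)
```

Both queries return the same totals; the trade-off is that the wide table avoids the join on reads but must be kept in sync if a customer's city changes.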
Data Warehouse vs Database
To effectively perform analytics, you need a data warehouse. A data warehouse is a database of a different kind: an OLAP (online analytical processing) database. A data warehouse exists as a layer on top of another database or databases (usually OLTP databases). The data warehouse takes the data from all these databases and creates a layer optimized for and dedicated to analytics.
Operational Plan
Translates an organization's strategic objectives and goals into a set of well defined tactics and initiatives, resource requirements, and expected results for some future time period, usually, but not always, a year.
Predictive Analytics *KNOW THIS*
Used to make predictions about unknown future events. Using techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current data to make predictions about the future.
(Con)Federated Data Warehouse Architecture
Uses all possible means to integrate analytical resources from multiple sources to meet changing needs or business conditions. Involves integrating disparate systems. Supported by middleware vendors that propose distributed query and join capabilities.
Data Mart
Usually smaller than a DW and focuses on a particular subject or department. It is a subset of a DW, typically consisting of a single subject area (e.g., Marketing or Operations). Data marts are usually linked to each other via some kind of middleware, maintaining the consistency of data across the enterprise.
*OLAP* (Online Analytical Processing)
__________ __________ __________ is an information system that enables the user, while at a PC, to query the system, conduct an analysis, and so on. The result is generated in seconds.
Big Data Analytics
__________ __________ __________ is the process of examining large data sets to uncover hidden patterns, unknown correlations, market trends, customer preferences, and other useful business info.
*OLTP* (Online Transaction Processing)
____________ __________ __________ is a system that is primarily responsible for capturing and storing data related to day to day business functions. Most operational data in ERP, SCM, and CRM systems are stored in an __________ __________ __________.