Chapter 15 - OLAP in the Data Warehouse
What are the reasons for why you can't build an OLAP system on top of the operational source system adn extract data direcctly?
1. an OLAP system needs transformed and integrated data 2. the operatioanl systems keep historical data only to a limited extent 3. an OLAP system requires data in multidimensional representations 4. there may be multiple OLAP systems needed in the environment
What are the significant characteristics of data in the OLAP system?
1. an OLAP system stores and uses much less data compared to a data warehouse 2. Data in the OLAP system is summarized. You will rarely find data at the lowest level of details as in the data warehouse 3. OLAP data is more flexible for processing and analysis partly because there is much less data tow ork with 4. every instance of the OLAP system in your environment is customized for the purpose that instance of the system serves
Describe the underlying technologies in MOLAP.
1. creation of pre-fabricated data cubes by MOLAP engine. Propriety technology to store multidimensional views in arrays, not tables. high speed matrix data retrieval 2. sparse matrix technology to manage data sparsity in summaries
Describe data storage in ROLAP systems.
1. data stored as relational tables in the data warehouse 2. detailed and light summary data available 3. very large data volumes 4. all data access from the warehouse storage
Describe data storagae in MOLAP.
1. data stored as relational tables in the warehouse 2. various summary data kept in proprietary databases (MDDBs) 3. moderate data volumes 4. summary data access from MDDB, detailed data access from warehouse
What are the steps for preparing OLAP data for a specific group of users?
1. define subset 2. summarize 3. denormalize 4. calculate and derive 5. index
What are the major implementation steps for an OLAP system?
1. dimensional modeling 2. design and building of the MDDB 3. selection of the data to be moved into the OLAP system 4. data acquisition or extraction for the OLAP system 5. data loading into the OLAP server 6. computation of data aggregation and derived data 7. implementation of application on the desktop 8. provision of user training
What are the extra requirements for OLAP not specified by Dr. Codd.
1. drill through to detail level 2. OLAP analysis models 3. treatment of nonnormalized data 4. storing OLAP results 5. missing values 6. SQL interface
What are the basic virtues of OLAP?
1. enables analysts, executives, and managers to gain useful insights from the presentation of data 2. can reorganize metrics along several dimensions and allow data to be viewed from different perspectives 3. supports multidimensional analysis 4. is able to drill down or roll up within each dimesnsion 5. is capable of applying mathematical formulas and calculations to measures 6. provides fast response, faciltiating speed-of-thought analysis 7. complements the use of other information delivery techniques such as data mining 8. improves the comprehension of result sets through visual presentations using graphs and charts 9. can be implemented on the web 10. designed for highly interactive analysis
What are the key administration considerations
1. expectations on what data will be accessed and how 2. selection of the right business dimensions 3. selection of the right filters for loading the data from the data warehouse 4. methods and techniques for moving data into the OLAP system (MOLAP model) 5. choosing the aggregation, summarization, and recalculation 6. developing application programs using the proprietary software of the OLAP vendor 7. size of the multidimensional database 8. handling of the sparse-matrix feature of multidimensional structures 9. drill down to the lowest level of detail 10. drill through to the data warehouse or to the source systems 11. drill across among OLAP system instance 12. access and security privileges 13. backup and restore facilities
Describe functions and features of MOLAP.
1. faster access 2. large library of functions for complex calculations 3. easy analysis irrespective of the number of dimensions 4. extensive drilld-own and slice and dice capabilities
What are the benefits of OLAP systems?
1. increased productivity of business managers, executives, and analysts 2. inherent flexibility of OLAP systems means that users may be self-sufficient in running their own analysis without IT assistance 3. benefit for IT developers bc using software specifically designed for the system development results in faster delivery of applications 4. self-sufficiency for users, resulting in reduction in backlog 5. more efficient operations through reducing times on query executions and in network traffic 6. ability to model real-world challenges with business metrics and dimensions 7. faster delivery of applications following from the previous benefits
Describe the functions and features of ROLAP.
1. known environment and availability of many tools 2. limitations on complex analysis functions 3. drill-through to lowest level easier. Drill-across not always easy
What are the two key issues when considering MOLAP model running under MDDBMS?
1. lack of standardization 2. scalability (good for summary, but not detailed)
What are hte most fundamental characteristics of OLAP systems?
1. let business users have a multidimensional and logical view of the data in the data warehouse 2. facilitate interactive query and complex analysis for the users 3. allow users to drill down for greater details or roll up for aggregation of metrics along a single business dimension or across multiple dimensions 4. provide the ability to perform intricate calculations and comparisons 5. present results in a number of meaningful ways, including charts and graphs
what are a few broad guidelines for the evaluation fo OLAP products?
1. let your apps and users drive selection 2. determine scalability 3. consider how easy to administer it is 4. performance and flexibility are key 5. don't worry about difference between ROLAP and MOLAP
What are the basic features of OLAP?
1. multidimensional analysis 2. drill down or roll up 3. multiple view modes 4. consistent performance 5. navigation in and out of details 6. easy scalability 7. fast response times for interactive queries 8. slice and dice or rotation 9. time intelligence
What are the initial twelve guidelines for an OLAP system?
1. multidimensional conceptual view 2. transparency 3. accessibility 4. consistent reporting performance 5. client/server architecture 6. generic dimensionality 7. dynamic sparse matrix handling 8. multiuser support 9. unrestricted cross-dimensional operations 10. intuitive data manipulation 11. flexible reporting 12. unlimited dimension and aggregation levels
What is the checklist for rating OLAP tools against eachother?
1. mutlidimensioanl representation of data 2. aggregation, summarization, precalculation, and derivations 3. formulas and complex calculations in an extensive library 4. cross-dimensional calculations 5. time intelligence such as year-to-date, current and past fiscal periods, moving averages, and moving totals 6. pivoting, cross-tabs, drill-down, and roll-up along single or multiple dimensions 7. interface of OLAP with applications and software such as spreadsheets, proprietary client tools, third-party tools, and 4GL environments
What are the advanced features of OLAP?
1. powerful calculations 2. drill through across dimensions or details 3. derived data values through formulas 4. cross-dimensional calculations 5. sophisticated presentation & displays 6. application of alert technology 7. pre-calculation or pre-consolidation 8. collaborative decision making 9. report generation with agent technology
What is a list of the typical calculations that get included in query requests?
1. roll ups to provide summaries and aggregations along the hierarchies of the dimensions 2. drill downs from the top level to the lowest along the hierarchies of the dimensions, in combinations among the dimensions 3. simple calculations, such as computation of margins (sales minus costs) 4. share calculations to compute the percentage of parts to the whole 5. algebraic equations involving key performance indicators 6. moving averages and growth percentages 7. trend analysis using statistical methods
What are the three distinct characteristics fo ROLAP?
1. supports all of the basic OLAP features and functions 2. stores data in a relational form 3. supports some form of aggregation
Report writers provide two key functions:
1. the ability to point and click for generating and issuing SQL calls 2. the capability to format the output repots
how does local hypercubing work?
1. the user issues a query 2. the results of the query get stored in a small, local, multidimensional database 3. the user performs analysis against this local database 4. if additional data is required to continue the analysis, the user issues another query and the analysis continues
Describe the underlying technologies of ROLAP.
1. use of complex SQL to fetch data from warehouse. 2. ROLAP engine in analytical server creates data cubes on the fly. 3. multidimensional views by presentation layer
What is the guidance on when to know if you should separate platforms for data warehouses and OLAP systems?
1. when the size and usage of the main data warehouse escalates and requires all of the resources on the platform 2. if too many departments need the OLAP system 3. if the daily transactions for the data warehouse disrupt the stability and performance of the OLAP system 4. decentralized enterprises with OLAP users spread geographically 5. if users of one instance of OLAP want to stay away from users of another 6. if the chosen OLAP tools need a different configuration from the platform of the main data warehouse
static summary data
Most of the OLAP summary data is static. This is the data summarized from the data retrieved from the data warehouse.
Describe the ROLAP model.
OLAP engine resides on the desktop. Prefabricated multidimensional cubes are not created beforehand and stored in special databases. the relational data is presented as virtual multidimensional data cubes
Local hypercubing is a variation of _______________ provided by vendors.
ROLAP
dynamic summary data
This type of summary data is very rare in the OLAP environment although this happens because of new business rules.
On-line analytical processing
a category of software technology that enables analysts, managers, and executives to gain insight into data through fast, consistent, interactive access in a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of hthe enterprise as understood by the user
As growth in a data warehouse happens you may want to think about moving yoru OLAP system to ____
a separate platform
unlimited dimensions and aggregation levels
accommodate at least 15, preferable 20, data dimensions within a common analytical model. each of these generic dimensions must allow a practically unlimited number of user-defined aggregation levels within any given consolidation path
dynamic sparse matrix handling
adapt the physical schema to the specific analytical model being created and loaded that optimizes sparse matrix. when encountering a sprase matrix, the system must be able to dynamically deduce the distribution of the data and adjust the storage and access to achieve and maintain consistent level of performance
drill-through to detail level
allow a smooth transition from the multidimensional, preaggregated database to the detail record level of the source data warehouse repository
Multidimensional views are inherently representative of
any business model
In multidimensional analysis, decision makers must be able to analyze data along
any number of dimensions, at any level of aggregation, with the capability of viewing results in a variety of ways
index
choose those attributes that are appropriate for marketing to build indexes
denormalize
combine relational tables in exactly the same way the marketing department needs denormalized data
An effective analytical processing environment must not only be fast and flexible, but it must also support ____________
complex and powerful calculations
client/server architecture
conform the system to the principles of clinet/server architecture for optimum performance, flexibility, adaptability, and interoperability. make the server component sufficiently intelligent to enable various clients to be attached with a minimum of effort and integration progamming
Whether a user's request is for monthly sales of all products along all geographical regions or for year-to-date sales in a region for a single product, the query and analysis system must have____________
consistent response times.
OLAP data tends to be more __________-, whereas data in teh data warehouse serves corporate-wide needs
departmentailzed
DOLAP
desktop online analytical processing. DOLAP is meant to provide portability to users of online analytical processing. multidimensional datasets are created and transferred to the desktop machine, requiring only the DOLAP software to exist on that machine (variation of ROLAP)
transient detailed data
detailed data brought in from the data warehouse repository on temporary, one-time basis for special purpsoes
permanent detailed data
detailed data retrieved from the data warehouse repository and stored in the OLAP system
Storing OLAP results
do not deploy write-capable OLAP tools on top of transactional systems
what is drill-across?
drill across to another OLAP instance
What is drill-through
drill through to detail
intuitive data manipulation
enable consolidation path reorientation (pivoting), drill down and roll up, and other manipulations to be accomplished intuitively and directly via point-and-click and drag-and-drop actions on the cells of the analytical model. avoid the use of a menu or multiple trips to a user interface
generic dimensionality
ensure that every data dimension is equivalent in both structure and operational capabilities. have one logical structure for all dimensions. the basic data structure or the access techniques must not be biased toward any single data dimension
consistent reporting performance
ensure that the users do not experience any significant degradation in reporting performance as the number of dimensions or the size of the database increases
The OLAP system must map its own logical schema to the ________________ and perform any necessary transformations.
heterogeneous physical data stores
HOLAP
hybrid online analytical processing. this model attempts to combine the strengths and featurs of ROLAP and MOLAP
How can you do multidimensional analysis when there are more than three dimensions?
hypercubes
calculate and derive
if some calculations and derivations of the metrics are department-specific use those
missing values
ignore missing values, irrespective of their source
If you have three business dimensions then how can these be displayed?
in a three-dimensional cube or on a spreadsheet with two dimensions as rows and the third as columns
OLAP is an __________________ for the data warehouse
information delivery system
Spreadsheets have enhanced functionality using add-ins, but are still cumbersome. Why?
it'll take a lot of effort and has clear limitiations in terms of multidimensional analysis and complex calculations
In MOLAP, the array values indicate hte ____
location of the cells
transparency
make the technology, underlying data repository, computing architecture, and the diverse nature of source data totally transparent to users (enhance efficiency and productivity)
MOLAP
multidimensioanl online analytical processing. OLAP system is implemented through a specialized multidimensional database
In MOLAP, large ________________form the storage structures.
multidimensional arrays
What method is used when a user is asking for further comparisons to similar products, comparisons among territories, or views of the results by rotating the presentation between columns and rows?
multidimensional view
teh MOLAP engine in the application layer pushes a _____________ of the data frmo the MDDBs to the users
multidimensional view
____________ is the core of OLAP systems.
multidimensionality
OLAP is a fancy term for
multidimesional analysis
Do report writers provide for aggregate navigation?
no
Does report writing support multidimensionality?
no
With basic report writers, can you drill down to lower levels in the dimensions?
no
__________ and _________ multidimensional data cubes are stored in multidimensional databases.
precalculated and prefabricated
treatment of nonnomralized data
prohibit calculations made within an OLAP system from affecting the external data serving as the source
multidimensional conceptual view
provide a multidimensional data model that is intuitively analytical and easy to use (conforms to how users perceive business problems)
unrestricted cross-dimensional operations
provide ability for the system to recognize dimensional hierarchies and automatically perform roll up and drill down operations within a dimension or across dimensions. have the interface language allow calculations and data manipulations across any number of data dimensions, without restricting any relations between data cells, regardless of the number of common data attributes each cell contains
accessibility
provide access only to the data that is needed to perform the specific analysis, presenting a single, coherent, consistent view to the users
flexible reporting
provide capabilities to the business user to arrange columns, rows, and cells in a manner that facilitates easy manipulation, analysis, and synthesis of information. every dimension, including any subsets, must be able to be displayed with equal ease
incremental database refresh
provide for incremental refreshes of the extracted and aggregated OLAP data
multiuser support
provide support for end users to work concurrently with either the same analytical model or to create different models from the same data. in short, provide concurrent data access, data integrity, and access security
when such queries get directed to the OLAP system, the workload on the main data warehouse becomes substantially ____
reduced
Web OLAP.
refers to online analytical processing where OLAP data is accessible from a Web browser.
ROLAP
refers to relational online analytical processing. OLAP system is built on top of a relational database
database OOLAP
relational database management system (RDBMS) designated to support OLAP structures and to perform OLAP calculations
the metadata layer supports hte mapping of dimensions to the ______________ in ROLAP.
relational tables
In the ROLAP model, data is stored as ________________ in a relational database
rows adn columns
Usually, the data warehouse and the OLAP system start out on the_____ platform. When both are small, it is cost-justifiable to do so.
same
SQL interface
seamlessly integrate the OLAP system into the existing enterprise environment
define subset
select the subset of detailed data the marketing department is interested in
In order to hide the storage structure to the user and present data multidimensionally in ROLAP, a ______________-- is created
semantic layer of metadata
Time is a unique dimension because of its _________ nature
sequential
What are the principle elements of OLAP?
speed, consistency, interactive access, multiple dimensional views
summarize
summarize and prepare data structures in the way the marketing department needs for summarizing
OLAP analysis models
support Dr. Codd's four analysis models: exegetical (or descriptive), categorical (or explanatory), contemplative, and formulaic
Decribe the MOLAP model.
the OLAP engine resides on a special server. proprietary multidimensional databases stored data in the form of multidimensional hypercubes. you have to run special extraction adn aggregation jobs from the relational database of the data warehouse to create these multidimensional data cubes in the MDDBs. the special server presents teh data as OLAP cubes for processing by the users
multidimensional domain structure (mds)
the data is displayed showing metrics as columns, time as rows, and products as pages. the three straight lines, two of which represent business dimensions and the third is metrics. you can independently move up or down along the straight lines
What is the relationship between OLAP and data warehouses?
the data warehouse provides the best opportunity for analysis and OLAP is the vehicle for carrying out involved analysis
Why is SQL ill-suited for analyzing data?
the language is too abstruse, the queries will be too complex
How do ROLAP, MOLAP, etc differ from each other?
the storage methodology is different
In an analytical system _______ is a critical dimension
time
slice
two-dimensional plane of the cube