Data and Info. Management Chapter 7 continued (Marist College Eitel Lauria)
Operational queries typically process much smaller or larger amounts of data than analytical queries?
smaller
Data mart
A data store based on the same principles as a data warehouse, but with a more limited scope
Metadata: The Yellow Pages
A directory with data about the institutions
Top-Down Approach advantages
A truly corporate effort, an enterprise view of data; inherently architected, not a union of disparate data marts; single, central storage of data about the content; centralized rules and control; may see quick results if implemented with iterations
A DW architecture always exists, but there are two broad types. What are they? Which approach should you always choose, and what kind of DW does it help you achieve?
Ad hoc or planned; planned; a well-functioning, extendable, robust, and maintainable DW
Data Transformation: Perform individual tasks:
Clean; standardization; combine; purging and separating out; sorting and merging; assignment of surrogate keys
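The transformation tasks listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the sample rows, the `STATE_CODES` lookup, the rule that inactive customers are purged, and all function names are assumptions made for the example, not part of the course material.

```python
from itertools import count

# Hypothetical raw rows pulled from two source systems (illustrative data only).
crm_rows = [
    {"cust": " Alice Smith ", "state": "ny", "active": "Y"},
    {"cust": "Bob Jones", "state": "N.Y.", "active": "N"},
]
billing_rows = [
    {"cust": "alice smith", "state": "NY", "active": "Y"},
    {"cust": "Carol Diaz", "state": "nj", "active": "Y"},
]

STATE_CODES = {"ny": "NY", "n.y.": "NY", "nj": "NJ"}  # assumed standard codes


def clean(row):
    # Clean: trim stray whitespace and normalize name casing.
    return {**row, "cust": " ".join(row["cust"].split()).title()}


def standardize(row):
    # Standardization: map state spellings onto one code set.
    return {**row, "state": STATE_CODES[row["state"].lower()]}


def transform(*sources):
    # Combine: bring rows from all sources together.
    combined = [standardize(clean(r)) for src in sources for r in src]
    # Purging and separating out: drop inactive customers (assumed rule).
    kept = [r for r in combined if r["active"] == "Y"]
    # Sorting and merging: sort by name, then merge duplicate customers.
    merged = {}
    for r in sorted(kept, key=lambda r: r["cust"]):
        merged.setdefault((r["cust"], r["state"]), r)
    # Assignment of surrogate keys: warehouse-generated integer keys.
    keys = count(1)
    return [{"customer_key": next(keys), **r} for r in merged.values()]


customers = transform(crm_rows, billing_rows)
```

Here the two spellings of Alice Smith collapse into one row, the inactive customer is purged, and each surviving row receives a surrogate key that is independent of any source-system identifier.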
Bill Inmon is a co-creator of the what?
Corporate Information Factory (hub-and-spoke architecture)
Dependent data mart: Does or does not have its own source systems?
Does not
3 parts of Refresh cycles
Extract the changes to the source data; transform the data revisions; feed the incremental data revisions on an ongoing basis
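The three parts of a refresh cycle can be illustrated with a small Python sketch. It assumes each source row carries a `modified` counter and that the previous cycle saved a high-water mark (`last_refresh`); both are hypothetical conventions chosen for the example, and real systems may detect changes differently (for example, from logs or triggers).

```python
# Hypothetical source table with a last-modified counter per row.
source = [
    {"id": 1, "amount": "10.50", "modified": 1},
    {"id": 2, "amount": "7.25", "modified": 2},
    {"id": 3, "amount": "3.00", "modified": 5},
]
warehouse = {1: {"id": 1, "amount_cents": 1050}}  # state after the initial load
last_refresh = 1  # assumed high-water mark saved by the previous cycle


def extract_changes(rows, since):
    # Part 1: extract only the rows changed since the last refresh.
    return [r for r in rows if r["modified"] > since]


def transform(row):
    # Part 2: transform the revisions into the warehouse format.
    return {"id": row["id"], "amount_cents": round(float(row["amount"]) * 100)}


def refresh(rows, target, since):
    # Part 3: feed the incremental revisions into the warehouse.
    changes = extract_changes(rows, since)
    for row in changes:
        target[row["id"]] = transform(row)
    # Return the new high-water mark for the next cycle.
    return max((r["modified"] for r in changes), default=since)


last_refresh = refresh(source, warehouse, last_refresh)
```

Running `refresh` again with the updated `last_refresh` extracts nothing, which is the point of an incremental cycle: only revisions move, not the full data set.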
How to change database owner?
Run the system stored procedure (listed in the System Stored Procedures folder): EXEC sp_changedbowner 'sa'
Tools for data extraction
Purchasing outside tools; developing in-house programs
How many subjects does a data mart have? Does it have more data sources than a data warehouse?
Single; no
Independent data mart
Stand-alone data mart, created in the same fashion as the data warehouse
ETL infrastructure
The infrastructure that facilitates the retrieval of data from operational databases into the data warehouses
Data Loading: Two distinct groups of tasks
The initial loading of the data into the data warehouse; refresh cycles
Information Delivery Component: Who are the users?
The novices, the casual users, the business analysts, and the power users
Is the data in a data warehouse stable?
Yes
Information Delivery Component: Are there different methods of information delivery? If so, what are they?
Yes; ad hoc reports, complex queries, multidimensional analysis, statistical analysis, EIS feed, data-mining applications
Data Warehouse (The Data Storage Component): Is there a separate repository? If so, why?
Yes; to keep a large volume of historical data for analysis, and to keep the data in structures suitable for analysis
Data Transformation Results:
a collection of integrated data that is cleaned, standardized, and summarized
Extract the source data into what or what or what?
a group of flat files, or a data-staging relational database, or a combination of both
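Both extraction targets can be sketched with the Python standard library. In this hedged example an in-memory CSV buffer stands in for a flat file and an in-memory SQLite database stands in for the data-staging relational database; the table name `stg_orders` and the sample rows are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical rows read from an operational source (illustrative data only).
source_rows = [(101, "widget", 3), (102, "gadget", 1)]
header = ("order_id", "product", "qty")

# Option 1: extract into a flat file (an in-memory CSV stands in for a file).
flat_file = io.StringIO()
writer = csv.writer(flat_file)
writer.writerow(header)
writer.writerows(source_rows)

# Option 2: extract into a data-staging relational database.
staging = sqlite3.connect(":memory:")
staging.execute(
    "CREATE TABLE stg_orders (order_id INTEGER, product TEXT, qty INTEGER)"
)
staging.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", source_rows)
staging.commit()

staged_count = staging.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
flat_lines = flat_file.getvalue().splitlines()
```

A combination of both simply means some sources land in files and others in staging tables before transformation begins.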
Operational databases are referred to as what oriented?
application-oriented
Analytical databases can contain both detailed and summarized data or just one of the two? That data is physically or digitally stored?
both detailed and summarized data; physically
Operational data typically represents the current or past state of affairs in the real world, while analytical data can represent both the current situation and snapshots from the past or just one of the two?
current; both the current situation and snapshots from the past
Dependent data mart: The data comes from the what?
data warehouse
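As a rough illustration of a dependent data mart, the sketch below fills a single-subject mart table directly from a warehouse table using SQLite. The table names `dw_facts` and `mart_sales`, the columns, and the sample rows are all hypothetical; the only point being shown is that the mart reads from the warehouse rather than from source systems of its own.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical warehouse fact table covering several subjects.
db.execute("CREATE TABLE dw_facts (subject TEXT, region TEXT, amount REAL)")
db.executemany(
    "INSERT INTO dw_facts VALUES (?, ?, ?)",
    [
        ("sales", "east", 100.0),
        ("sales", "west", 50.0),
        ("inventory", "east", 20.0),
    ],
)

# The dependent mart has no source systems of its own: it is filled
# entirely from the warehouse, restricted to one subject.
db.execute(
    "CREATE TABLE mart_sales AS "
    "SELECT region, amount FROM dw_facts WHERE subject = 'sales'"
)

mart_rows = db.execute(
    "SELECT region, amount FROM mart_sales ORDER BY region"
).fetchall()
```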
Operational data typically reflects detailed or not detailed data?
detailed
Data makeup differences are what?
differences in the characteristics of the data that comprises the operational and analytical information.
Functional differences
differences in the rationale for the use of operational and analytical data
Centralized Data Warehouse: Seeks to overcome the what? Highly or not highly variable with many or not many individual approaches?
limitations of previous architectures; highly; many
The metadata is the source of information for the what?
management module
Information Delivery Component: Information delivery what? Examples?
mechanism; online, internet, intranet, e-mail, mobile
Management and Control: Monitor the what into the what and from there into the what?
movements of data; staging area; data warehouse storage
In contrast to the constituency of operational data, analytical data is used by a narrower or larger set of users for decision-making purposes?
narrower
Eliminating redundancy is more critical or not as critical in analytical databases as it is in operational databases?
not as critical
Data Warehouse (The Data Storage Component): A typical data warehouse periodically retrieves what from the what?
selected analytically useful data; operational data sources
Analytical databases are referred to as what oriented?
subject-oriented
Data Warehouse (The Data Storage Component): The data warehouse is sometimes referred to as the what? To indicate the fact that it is a destination for the data from the what?
target system; source system
Technical differences
the differences in the way that operational and analytical data is handled and accessed by the DBMS and applications
analytical information
the information collected and used to support decisions involving tasks that require data analysis.
Data in operational systems is regularly updated by who?
the users
ETL includes the following task: Loading the what into the what?
transformed and quality-assured data; the target data warehouse
The different types of DW architecture: The basic types are what?
Independent Data Marts; Dependent Data Marts / Hub and Spoke; Bus Architecture; Centralized Data Warehouse (no dependent data marts); Federated; with big data, Data Lakes
Independent Data Marts: What were developing solutions into what? Is there a program / enterprise perspective?
Individual projects; functional silos; no
The database in a data warehouse must be open to what like what?
Different tools; RDBMSs or MDDBs
Focus of a data mart: Is often narrower or wider than organization-wide?
Narrower
Independent Data Marts: Are there conformed dimensions?
No
DW Architecture: Before deciding to build a data warehouse, you need to ask:
Top-down or bottom-up approach? Enterprise-wide or department? Which first - data warehouse or data mart? Build pilot or go with a full-fledged implementation? Dependent or independent data marts?
3 Types of Metadata
Operational metadata; extraction and transformation metadata; end-user metadata
Bill Inmon
The "Father of Data Warehousing"; a known expert, speaker, and author on data warehousing
Management and Control: Coordinate the what and what? Control the what and what into the data warehouse storage? Moderate the what to the users?
services and activities; data transformation and the data transfer; information delivery
Data Staging: Provide a place and an area with a what to what for what and what?
set of functions; clean, change, combine, convert, deduplicate, and prepare source data; storage and use in the data warehouse
Operational systems have a shorter or longer time horizon of data than analytical systems?
shorter
Independent data mart has its own what and what?
source systems and ETL infrastructure
Is a data mart bigger than a data warehouse? Is its implementation time not as long as a data warehouse's?
No; yes
Top-Down Approach Disadvantages
Takes longer to build even with an iterative method; high exposure/risk to failure; needs high level of cross-functional skills; high outlay without proof of concept
DW front-end (BI) applications (information delivery component): Used to provide what to the data warehouse for users who are engaging in what use?
access; indirect use
ETL includes the following tasks: Extracting what from the operational data sources? Doing what to such data so that it what?
analytically useful data; transforming; conforms to the structure of the subject-oriented target data warehouse model (while ensuring the quality of the transformed data)
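The ETL tasks of extracting, transforming with a quality check, and loading can be sketched end to end with two in-memory SQLite databases. Everything named here is an assumption for illustration: the `orders` and `customer_sales` tables, the rule that a non-numeric total fails the quality check, and the choice to aggregate per customer as the "subject-oriented" target shape.

```python
import sqlite3

ops = sqlite3.connect(":memory:")  # stands in for an operational database
dw = sqlite3.connect(":memory:")   # stands in for the target data warehouse

ops.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total TEXT, note TEXT)")
ops.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "acme", "19.99", "gift wrap"), (2, "acme", "bad", ""), (3, "zenith", "5.01", "")],
)
dw.execute("CREATE TABLE customer_sales (customer TEXT PRIMARY KEY, total_cents INTEGER)")


def extract(conn):
    # Extract only the analytically useful columns (the note is left behind).
    return conn.execute("SELECT customer, total FROM orders").fetchall()


def transform(rows):
    # Reshape to the subject-oriented target model: one row per customer.
    totals, rejected = {}, []
    for customer, total in rows:
        try:
            cents = round(float(total) * 100)
        except ValueError:
            rejected.append((customer, total))  # quality check failed
            continue
        totals[customer] = totals.get(customer, 0) + cents
    return sorted(totals.items()), rejected


def load(conn, rows):
    # Load the transformed, quality-assured rows into the warehouse.
    conn.executemany("INSERT INTO customer_sales VALUES (?, ?)", rows)
    conn.commit()


clean_rows, rejected_rows = transform(extract(ops))
load(dw, clean_rows)
loaded = dw.execute("SELECT * FROM customer_sales ORDER BY customer").fetchall()
```

Rows that fail the quality check are set aside in `rejected_rows` rather than loaded, so the warehouse only ever receives data that conforms to the target model.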
Data Extraction: Deal with one or numerous data sources?
numerous
Management and Control: Sit where compared to all the other components?
on top of all the other components
The database in a data warehouse must be what?
open
First, what does a DW architecture basically do?
outlines how the DW components fit together