Quiz 3: Business Info
Data Warehouse Components: Front-end (Bl) applications
oftern refered to as BI (business intellegence) applications used to provide acces to the data warehouse for users who are engaging in indirect use
Data Mining Techniques
one widely used data mining technique is Market basket analysis
operational database
collects and presents operational info in support of daily operational procedures and processes
data mining
data mining is an intersection of database management, artificial intelligence (machine learning), and statistics data warehouse is one of the primary sources of data mining
Data Warehouse
data warehouses are modeled are structured differently than operational databases data warehouses use different types of technologies for storage and retrival than operational databases datawarehouses are used different (much smaller) set of users than operational databases operational database can be used by thousands of employees and customers vs datawarehouses are used by data analysts and decision makers
dependent data mart
does not have its own source systems the data comes from the data warehouse provide users with subset of data from the datawarehouse, in cases when users or applications do not want, need, or are allowed to have access to all the data in the entire data warehouse
accessing data in data warehouses and data marts: executive dashboards
for a quick glance at data intended for use by higher level decision makers within an organization contains an organized easy to read display of a number of critically impoprtant queries describing the performance of the organization in general the usage of exectutive dashboards should require little or no effort or training can web-based
accessing data in data warehouses and data marts: OLAP/BI tools
for direct access
Drill down example
for each individual store, show separately for male and female shoppers the number of product units sold for each individual product in each category
accessing data in data warehouses and data marts: front-end/BI applications
for indirect access front end applications accesing analytical data are also known as BI applications most users who need to access analytical data in data warehouses and data marts can not engage in direct access (as they do not know how to use OLAP/BI tools) instead they are given access to fron end/BI applicaitons
Slice and Dice Examples
for stores 1 and 2, show separately for male and female shoppers the number of product units sold for each product category
Data Warehouse Components: Source Systems (operational databases)
in the context of data warehousing, data source systems are operational databases that provide anaylitical useful info for the data warehouses subject of analysis every operational database that is used as a source system for the data warehouse has two purposes: the original operational purpose as a source system for the datawarehouse
analytical information examples
include info reflecting trends sales product statistics future growth projects
operational information (transactional information) examples
include information reflection withdrawing cash for ATM making an airline reservation, purchasing stocks
reasons for the creation of a data warehouse as a separate analytical database: Difference in Structuring of Data
it is often impossible to structure a database which can be used in an efficient manner for both operational and analytical purposes
OLAP/BI tools: drill up
makes the granularity of the data in the query result closer
OLAP/BI tools: Drill down
makes the granularity of the data in the query result finer
data mining
mining transactional (operational) databases, containing data related to current day-to-day organizational activities can be of use in certain situations however, the most appropriate and fertile source of data for meaningful and effective data mining is the enterprise wide corporate data warehouse (if it exists), which contains all the information from the operational data sources that has analytical value
accessing data in data warehouses and data marts (3)
OLAP/BI tools front-end/BI applications executive dashboards
Data Warehouse Components (4)
Source systems (operational databases Extraction-transformation-load (ETL) infrastructure Data Warehouse Front-end (Bi) applications
reasons for the creation of a data warehouse as a separate analytical database (2)
Time Horizon Difference Performance Reasons Difference in Structuring of Data
Data marts
a data store based on the same principles as a data warehouse, but with a more limited scope
Data Mining
discovering novel and interesting patterns in large amount of data
data mining uses
predictive analysis- using past data to predict future events
OLAP/BI tools (3)
slice and dice pivot (rotate) Drill down/drill up
analytical database
collects and presents analytical info in support of analytical tasks
OLAP/BI tools: Slice and Dice
adds, replace, or eliminates specified attributes (or particular values) from the already displayed result
data warehouse
a typical organization maintains and utilizes a number of operational data sources (I.e. operational databases) a data warehouse is created within an organization as a separate data store whose primary purpose is data analysis
data mining patterns
accurate meaningful understandable actionable
independent data mart
standalone data mart created in the same fashion as data ware house has its own source systmes and ETL infrastrucutre
Referential integrity
states that every non-null foreign key value must match an existing primary key value When one table has a foreign key to another table, the concept of referential integrity states that a record to the table that contains the foreign key can not be added unless there is a corresponding record in the linked table
Data Warehouse Components: Data warehouse
target system- to indicate that it is a destination for the data from the source systems a typical data warehouse periodically retrieves selected analytical data from the operational data sources
reasons for the creation of a data warehouse as a separate analytical database: Time horizon difference
the difference between how long the data is useful for transactional vs analyitical needs sc pic
analytical information
the information collected and used in support of analytical tasks based on operational (transactional) info
operational information (transactional information)
the information collected and used in support of day to day operational needs in businesses and other organizations
Data Warehouse Components: Extraction-transformation-load (ETL) infrastructure
the infrastructure thatt facilitates the retrieval of data from operational databases into the data warehouses includes th following tasks: extraxting analytical useful data from the operational data source tranforming such data so it conforms to the structure of the target data warehouse model, while ensuring the quality of transformed data loading the transformed and quality assured data into the targert data warehouse
reasons for the creation of a data warehouse as a separate analytical database: Performance Reasons
the performance of operational day to day tasks involving data use can be severely diminished if such tasks have to compete for computing resources with analytical tasks
operational vs analytical data
the sames fact can have both operational and analytical purpose for example data describing that customer x bought product y in z store can be stored as operational data in an operational database for day to day operational purposes such as inventory monitoring or financial transaction record keeping the same fact can also be stored as analytical data in a data warehouse where, combined with a avast number of similar facts accumulated over a period of time, it serves to reveal important trends, such as sale patterns or customer behavior