INFS 346 Chapter 8
type 1
-changes the value in the demission's record -the new value replaces the old one -no history preserved -the simplest approach, used most often when a change in a dimension is the result of an error
transaction-level detailed fact table
each row represents a particular transaction
timestamps
columns that indicates the time interval for which values in the records are applicable
detailed fact tables
each record refers to a single fact
most common approaches to dealing with slowly changing dimensions
type1 type2 type3
row indicator
-column that provides a quick indicator of whether the record is currently valid
dimension tables (dimensions)
-contain descriptions of th business, organization, or enterprise to which the subject of analysis belongs -colums in dimension tables contain descriptive information that is often textual (product brand, product color, customer gende, customer education level ) -this information provides a basis for analysis of the subject
fact tables
-contain measures related to the subject of analysis and foreign keys (associating fact tables with dimension tables) -the measures in the fact tables are typically numeric and are intended for mathematical computation
Type 2
-creates a new additional demension record using a new value for the surrogate key every time a value in a dimension record changes -used in cases where history should be preserved -can be combined with the use of timestamps and rowindicators
in aggregated fact table
-each record summarizes multiple facts
line-item detailed fact table
-each row represents a line item of a particular transaction
multiple facts in a dimensional model
-when multiple subjects of anlysis can share the same dimensions, a dimensional model contains more than one fact table -a dimensional model with multiple fact tables is referred to as a constellation or a galaxy -this approach enables: Quicker development of analytical databases for multiple subjects of analysis, because dimensions are re-used instead of duplicated -straightforward cross-fact analysis
for every dimension under consideration 2 questions must be answered
1. can the dimension table be useful for the analysis of the chosen subject? 2. can the dimension table be created based on the existing data sources ?
Type 3
involves creating a "previous" and "current" column in the demension table for each column where changes are anticipated -applicable in cases in which there is a fixed number of changes possible per column of a dimension, or in cases when only a limited history is recorded -can be combined with the use of timestamps.
relational modeling
standard method for logical modeling of operational databases
dimensional modeling continued
-data design methodology used for designing subject-oriented analytical databases, such as data warehouses or data marts -commonly, dimensional modeling is employed as a relational modeling technique -in addition to using the regular relational concepts (primary keys, foreign keys, integrity constraints, etc.) -distingusihes 2 types of tables: -dimension -facts
granularity of the fact tables
-describes what is depicted by one row in the fact table -detailed fact tables have fine level of granularity because each record represents a single fat -aggreagated face tables have a coarser level of granularity than detailed fact tables as records in aggregated fact tables always represent summarizations of multiple facts
surrogate key
-dimension tables typically given a simple, non-composite system-generated key -values or the keys are typically simple auto-increment integer values -surrogate key values have no meaning or purpose except to give each dimension a new column that serves as a primary key within the dimensional table instead of the operational key
granularity of fact tables cont..
-due to their compactness, coarser granularity aggregated fact tables are quicker to query than detailed fact tables -coarser granularity tables are limited in terms of what information can be retrieved from hem -one way to take advantage of the query performance imporvement provided by aggregated fact tables, is to have both types within the same dimensional model - constellation
detailed vs. aggregated fact tables
-fact tables in a dimensional model can contain either detailed or aggregated data
slowly changing dimension
-typical dimension in a star schema contains attributes whose values do not change or change extremely rarely such as store size and customer gender -attributes whose values change occasionally and sporadically over time, such as customer zip and employee salary
ER modeling
-a predominant technique for visualizing database requirements, used extensively for conceptual modeling of operational databases
snowflake model
-a star schema that contains the dimensions that are normalized -snowflaking is usually not used in dimensional modeling -not-normalized (not snowflakes) dimensions provide simpler analysis -normalization is usually not necessary for analytical databases
characteristics of dimensions and facts
-a typical dimension contains relatively static data, while in a typical fact table, records are added continually, and the table rapidly grows in size -in a typical dimensionally modeled analytical database, dimension tables have orders of magnitude fewer records than fact tables
dimensional modeling
-modeling technique tailored specifically for analytical database design purposes -regularly used in practice for modeling data warehouses and data marts
Star schema cont...
-the chosen subject of analysis is represented by a fact table -designing the star schema involves considering which dimensions to use wit the fact table representing the chosen subject -
star schema
the result of dimensional modeling is a dimensional schema containing facts and dimensions -the dimensional schema is often referred to as the star schema
additional possible fact attributes
a fact table contains .. - foreign keys connecting the fact table to the dimension tables -the measures related to the subject of analysis -In addition to subject of analysis, in certain cases fact tables contain other attributes that can appear in the fact table such as ... -transaction identifier and -transaction time