Dimensional Modeling
Overview and Fundamentals
What Is Dimensional Modeling?
Dimensional Modeling
An Overview
Structure
Terminology
Benefits
The Fundamentals
Facts and Dimensions
The 4 Step Design Process
What does Data Warehouse look like
Kimball Data Warehouse
Back Room Staging Area (The kitchen)
Data Presentation Area (The Dinning Room)
Star Schema
Corporate Information Factory (CIF)
Normalized Dimensions
Dual ETL Loading, Warehouse and Data Marts
Operational Data Store (ODS)
Hybrid Data Warehouse
Kimball Data Warehouse
Corporate Information Factory
Terminology
Dimensions
The time independent,
textual and descriptive
attributes by which users
describe objects.
Combining all the
attributes including
hierarchies, rollups and
sub-references into a
single dimension is
denormalization.
Often the by word in a
query or report
Not time dependent
Facts
Business Measurements
Most Facts are Numeric
Additive, Semi-Additive,
Non-Additive
Built from the lowest level
of detail (grain)
Very Efficient
Time dependent
Star Schema
Singe data (fact)
table surrounded by
multiple descriptive
(dimension) tables
Dim
Dim Dim
Dim
Dim
Fact
Benefits
Performance (Integer relationships, natural
partitioning, Single joins benefit SQL optimizer)
Source system independence and multiple
integration
Supports Change management
Usability/Simplicity (easy to read, interpret, join,
calculate)
Presentation (Consistency, Taxonomy, Labeling)
Reuse (Conformed dimensions reduce redundancy,
Role-plays)
Dimension Change Strategy
Type 1: Is used when the old value of the attribute
has no significance or can be discarded.
Easy and Fast
Type2: Partitions history so that fact tables properly
reflect original values.
Requires use of Surrogate Keys
Causes table growth due to additional history rows
Users must be aware of the added complexity
Effective Dates used secondary to cleaner fact joins
Dimension Change Strategy
Type 3: Additional attribute used to capture
changes.
Used less frequently then Type 1 or 2.
Simultaneously supports two views of the world.
Does not trend changes over time.
Current and Prior or Current and Original Attributes
Hybrid Type: Combination 1, 2 & 3 changes
New attribute for predictable series (such as yearly
changes)
Type 2 changes with prior or original attributes included
Expanded dimension table for durable key inclusion in fact
Added complexity to users
Dimension Role Playing
A single table that plays multiple roles
(using views) to create synonym dimension
attributes.
Most common role playing dimension is the
Date Dimension. i.e. separate role playing
dimensions for order date and ship date.
Fact Table Types
Characteristic Transaction Periodic Snapshot Accumulating Snapshot
Time period Point in time Regular, predictable
intervals
Indeterminate time span,
typically short-lived.
Grain One row per transaction
event
One row per period One row per life
Fact table loads Insert Insert Insert and Update
Fact row updates Not revisited Not revisited Revisited whenever
activity
Date dimension Transaction date End-of-period date Multiple dates for multiple
milestones
Facts Transaction activity Performance for a
predefined time
interval.
Performance over finite
lifetime
Modeling Design Process
1. Identify the Business Process
Source of measurements
2. Identify the Grain
What does 1 row in the fact table represent or
mean?
3. Identify the Dimensions
Descriptive context, true to the grain
4. Identify the Facts
Numeric additive measurements, true to the grain
Step 1 - Identify the Business Process
This is a business activity typically tied to a
source system.
Not to be confused with a business
department or function. An Orders
dimensional model should support the
activities of both Sales and Marketing.
If we establish departmentally bound
dimensional models, well inevitably
duplicate data with different labels and
terminology.
Step 2 - Identify the Grain
The level of detail associated with the fact
table measurements.
A critical step necessary before steps 3 and
4.
Preferably it should be at the most atomic
level possible.
How do you describe a single row in the
fact table?
Step 3 - Identify the Dimensions
The list of all the discrete, text-like
attributes that emanate from the fact table.
They are the by words used to describe
the requirements.
Each dimension could be though of as an
analytical entry point to the facts.
How do business people describe the data
that results from the business process?
Step 4 - Identify the Facts
Must be true to the grain defined in step 2.
Typical facts are numeric additive figures.
Facts that belong to a different grain belong
in a separate fact table.
Facts are determined by answering the
question, What are we measuring?
Percentages and ratios, such as gross
margin, are non-additive. The numerator
and denominator should be stored in the
fact table.
For More Information
Articles, Design Tips and Newsletters
http://www.kimballgroup.com
Designing A Scalable DW/BI System
http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032297070&EventCategory=4&cult
ure=en-US&CountryCode=US
Microsoft BI Using the Kimball Method
http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032297084&EventC
ategory=4&culture=en-US&CountryCode=US
Using SSIS to Populate a Kimball Method
Data Warehouse
http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032297072&EventC
ategory=5&culture=en-US&CountryCode=US