0% found this document useful (0 votes)
10 views26 pages

DWDM Unit 2 - Part I

Data modeling in data warehousing organizes data to meet business needs, ensuring it is easily understood and can answer queries efficiently. Dimensional data modeling involves structuring data into facts and dimensions, with schemas like star and snowflake guiding the organization. Proper data modeling is crucial for effective analysis, akin to a library system that allows users to find information quickly.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views26 pages

DWDM Unit 2 - Part I

Data modeling in data warehousing organizes data to meet business needs, ensuring it is easily understood and can answer queries efficiently. Dimensional data modeling involves structuring data into facts and dimensions, with schemas like star and snowflake guiding the organization. Proper data modeling is crucial for effective analysis, akin to a library system that allows users to find information quickly.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 26

Data Modelling for Data

Warehousing Part I

1
Data Modelling
● Think of a data warehouse like a supermarket for information.
Just like products in a supermarket are neatly organized (fruits in one aisle, dairy
in another, snacks in another), in a data warehouse, data must also be organized
so that people can easily find and analyze what they need.

● This “organization” process is called Data Modelling.

2
Data Modelling in a Data Warehouse
It is the process of structuring and arranging data in the warehouse so that it:

1. Matches business needs.

2. Can be easily understood by users (business analysts, managers).

3. Can answer queries quickly and accurately.

3
Why is it important ?
● If data is not modelled properly, it will feel like a warehouse full of boxes
scattered randomly.

● Data modelling makes the warehouse like a library, where books (data) are
classified by subject, author, and genre so you can find them easily.

4
Dimensional Data Modelling
● Dimensional data modeling is a data modeling technique used in data
warehousing that allows businesses to structure data to optimize analysis and
reporting.

● Essential Components of Dimensional Modelling:


○ Facts & Fact Table
○ Dimensions & Dimension Table
○ Keys

5
Facts
● Facts = measurable, numeric data that the business wants to analyze.

● They are usually quantitative values stored in a Fact Table.

● Examples:

○ Sales Amount (₹500)


○ Quantity Sold (10 units)
○ Profit Margin (%)
○ Number of Calls Received

📌 Facts answer “How much? How many?”

6
Dimensions
● Dimensions = descriptive data that gives context to facts.

● They are usually textual, categorical values stored in Dimension Tables.

● Examples:

○ Time Dimension: Year, Month, Day


○ Customer Dimension: Name, Gender, Age, City
○ Product Dimension: Product ID, Category, Brand
○ Location Dimension: Store, City, Region

📌 Dimensions answer “Who? What? Where? When? Why?”

7
How Facts & Dimensions Work Together ?
Imagine a sales transaction:

● Fact: 50 units sold for ₹25,000

● Dimensions:

○ Time → January 2025


○ Product → iPhone 15
○ Customer → Rahul, Age 30, Male
○ Store → Delhi

👉 Together, they give a complete story:


"Rahul bought 50 units of iPhone 15 at the Delhi store in January 2025, generating ₹25,000
sales."
8
9
10
Schema in Data Warehouse
● A schema is like a blueprint or plan of how the data in a data warehouse is organized.

● It tells us:
1. What tables exist (Fact Tables + Dimension Tables).
2. How those tables are connected (relationships).
3. How data flows for analysis and reporting.

Think of it like the map of a shopping mall:

● The fact table is the main center (where transactions happen = sales).
● The dimension tables are like the stores around it (product details, customers, time,
stores).
● The schema shows how everything is connected together.

11
Types of Schemas
There are mainly 2 common types used in data warehousing:

1. Star Schema
○ Fact table in the center.
○ Dimension tables around it.
○ Simple, fast for queries.

2. Snowflake Schema
○ Extension of Star.
○ Dimension tables broken into smaller tables (normalized).
○ Less redundancy, but more complex.

12
Star Schema
Definition:

● A schema where the Fact Table is in


the center.
● Surrounding it are Dimension Tables
(not normalized).
● It looks like a ⭐ star.

Characteristics:

● Simple, intuitive, fast queries.


● Dimension tables can have redundancy
(denormalized).

13
Snowflake Schema
Definition:

● An extension of Star Schema where


Dimension Tables are normalized.
● Dimensions are broken into
sub-dimensions.
● Looks like ❄ a snowflake.

Characteristics:

● Reduces redundancy (normalized).


● More complex joins, slightly slower
queries.

14
Example: Sales Retail Store (Star Schema)

15
Example: Sales Retail Store (Snowflake Schema)

16
Example: Automobile Data Warehouse
Star Schema Snowflake Schema

17
Normalized vs Denormalized Data Model

18
Star vs Snowflake Schema

Feature Star Schema ⭐ Snowflake Schema ❄

Structure Fact + Denormalized Fact + Normalized Dimensions


Dimensions

Simplicity Easy, intuitive More complex

Query Faster (fewer joins) Slower (more joins)


Speed

Storage More redundancy Less redundancy

Use Case Best for quick reporting Best when storage efficiency & data
& OLAP consistency are critical

19
Hierarchies & Levels in Data Modelling
🔹 Hierarchies

● A hierarchy represents the relationship of data


from higher-level summaries to lower-level
details.
● It’s a drill-down path for analysis.
● For example: Year → Quarter → Month → Day

🔹 Levels

● Levels are the individual stages within a


hierarchy.
● Example in a Time hierarchy:
○ Year (highest level)
○ Quarter
○ Month
○ Day (lowest level of detail) 20
Facts & Measures in Data Modelling
1. Facts

● Facts are business events or transactions you want to analyze.


● They represent the quantitative data stored in a fact table.
● Example: A sale, an online order, a payment.

👉 Think of facts as the numbers your business cares about.

2. Measures

● Measures are the numerical values stored in the fact table, usually tied to facts.
● They are the metrics you want to analyze or aggregate (sum, average, max, min).

👉 Measures = "What you want to measure about a fact."

21
Example: Retail Sales Data Warehouse

22
23
24
Case Study: Retail Company Data Warehouse
Background
A retail chain, ShopSmart, operates across multiple cities. It wants to improve decision-making by
analyzing sales trends, customer preferences, and inventory movements. To do this, it decides to
build a data warehouse.

Steps

25
26

You might also like