0% found this document useful (0 votes)

10 views11 pages

Cs 614

Data warehousing involves collecting and managing large data sets for efficient analysis and decision-making. It highlights the importance of structured data, the relationship between data, information, knowledge, and intelligence, and discusses the evolution of data processing systems. The document also covers the types of data warehouses, normalization, denormalization, and the differences between OLTP and OLAP systems.

Uploaded by

bc210402586 HAFIZA ANIQA IRFAN

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

10 views11 pages

Cs 614

Uploaded by

bc210402586 HAFIZA ANIQA IRFAN

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 11

Lecture No:1

Data warehousing:
Data Warehousing is the process of collecting, storing, and managing large amounts of data from different
sources in one place. This data is organized in a way that makes it easy to analyse and make better decisions.
Need for data warehousing:
A data warehouse is important because it helps businesses and organizations store and manage large
amounts of data efficiently. Here are some key reasons why we need it:
 Better decision making
 Fast access to data
 Improved data quality
 Historical data analysis
 Combing data from multiple resources
 Competitive Advantage
Relationship between Data, Information, Knowledge and Intelligence:
Data:
Data refers to raw facts and figures that are recorded or stored in a way that computers can process. It often
comes from transactions or operations, such as sales records or customer details, but by itself, data has no
meaning until processed.
Information:
Information is processed data that is structured and meaningful. Information helps answer basic questions
like "who," "what," "where," and "when."
Knowledge:
Knowledge is the application of information to understand patterns, make predictions, or solve problems. It
helps answer "how" questions by providing deeper insights based on experience and learning. For example,
understanding customer buying trends from sales data is knowledge.
Intelligence:
Intelligence is the ability to interpret knowledge and understand the reasons behind patterns and behaviors. It
answers the "why" questions and allows individuals or organizations to make strategic decisions.
Intelligence helps in problem-solving and innovation.
Note:
Data is raw facts, information is meaningful data, knowledge is applying information for insights, and
intelligence is understanding why things happen.
Historical overview:
Early days of data processing (1960s):
Computers used master files stored on magnetic tapes. Data access was slow and unreliable, requiring
sequential scanning. Redundant data became a major issue as master files grew.
Introduction of Direct Access Storage Device (DASD) & Databases (1970s):
DASD (Disk Storage) allowed faster, direct access to data, unlike magnetic tapes. DBMS (Database
Management Systems) were introduced, making data storage and retrieval more efficient. A database
became a "single source of truth", solving redundancy problems.
Rise of PCs & 4th Generation Languages (1980s):
PCs and 4GLs (4th Generation Programming Languages) gave users direct control over data. Management
Information Systems (MIS) were developed to help managers make decisions. This marked the early stages
of Decision Support Systems (DSS).
Data Ware House:
A complete repository of historical corporation data extracted from transaction systems that is available for
ad-hoc access by knowledge workers.
Types of data ware house:
Following are some industry-specific data warehouses:
 Financial - A financial data warehouse stores and organizes all the financial data of a business —
like income, expenses, budgets, and profits.
 Telecommunication - A telecom data warehouse stores call records, customer usage data, billing
info, and network performance details.
 Insurance - An insurance data warehouse keeps track of policyholder details, claim history,
premium payments, and risk analysis.
 Human resources - An HR data warehouse stores employee-related data like salaries, attendance,
performance, and hiring trends.
Ad-hoc:
It doesn’t have a certain repeatable pattern and it’s not in advance.
Ad-Hoc Query Example:
Task:
Your manager wants to quickly know:
“What were the total sales on April 1, 2025, broken down by store?”
This is a one-time, unplanned request — no scheduled report exists for it.
OLTP:
OLTP stands for online transection processing, which refers to the type of database system designed to
handle a large volume of short interaction in real time. These transactions often involve inserting, deleting,
updating etc.
DSS:
A Decision Support System (DSS) is a computer-based tool or system used to help business users make
informed decisions by analysing large volumes of data — often pulled from a data warehouse.

OLTP (online transaction A system used to handle day-to- Captures and processes real-time
processing) day operations like sales, data.
bookings, payments
DSS (decision support system) A system used to help users Uses historical data (often from a
analyse data and make decisions data warehouse) for analysis &
reporting.
Data warehouse A centralized repository of Supports reporting, analysis,
historical, cleaned, and structured DSS, and BI tools.
data
Typical Queries:
OLTP = Select tx_data, balance from tx_table where account_id=23785;
DSS =select balance, age, sal, gender from customer_table and tx_table where age between (30 and 40) and
eduction = ‘graduate ‘and CusID.customer_table and Customer_ID.tx_table;
Comparison of OLTP and DSS:
OLTP DSS
Primary key need Primary key not used
No concept of primary key index Primary index used
May use single table Uses multiple tables
Few rows returned Many rows returned
High selectivity of query Low selectivity of query
Indexing on primary key (unique) Indexing on primary index (non-unique)

Primary indexing:
Primary indexing is a way to quickly find data in a database table using the primary key.
Example:
A table of contents in a book, based on page numbers.
Or
An Index is like a starting point for reading data in a database.
Instead of the database scanning every row one by one (which is slow), the index tells it exactly where to
start — just like a bookmark or table of contents in a book.
OLAP:
OLAP (online analytical processing) is a tool that helps you analyse large amounts of data quickly so you
can make smart business decisions. In OLAP we are analysing data step-by -step.
Example:
You check total sales → then zoom into sales by region → then by product → then by month.
Anomalies:
Anomalies are problems or errors that happen when a database is not properly organized (not normalized).
Types of anomalies:
 Insert anomalies – you can’t add new data because other data is missing.
 Update anomalies – you have to update the same data in many places.
 Deletion anomalies – deleting one piece of data removes other important data.
Solution:
These anomalies can be avoided using normalization.
Normalization:
Normalization is the process of efficiently organizing the data in a database by decomposing (splitting) a
relational table into smaller table by projection.
Goals of normalization:
There are basically two goals for normalization:
 Eliminate redundant data (for example, storing the same data in more than one table).
 Ensuring data dependencies make sense (only storing related data in a table).
Need for Normalization:
 To remove duplicate data
 To save storage space
 To make data easier to update and manage
Example:
Imagine a student table like this:
StudentID Name Course1 Course2
1 Ali Math Science
2 JK Math English

Instead of storing courses like this, we can normalize into two separate tables:
Student Table:
StudentID Name
1 Ali
2 JK

Course Table:
StudentID Course
1 Math
1 Science
2 Math
2 English

Type of Normalization:
Ther are five main types of normalization:
 1NF (First Normal form) – Removes repeating groups or multiple values in a single column
 2NF (Second Normal form) – Removes partial dependencies (depends on part of the key)
 3NF (Third Normal form) – Removes transitive dependencies (depends on non-key column)
 BNNF (Boyce- Codd Normal form) – A stronger version of 3NF — handles some special cases
 4NF (Fourth Normal form) – Removes multi-valued dependencies (no column should repeat rows)
 5NF (Fifth Normal form) – Breaks down complex relationships into simpler ones
1NF (First Normal form):
A table is in 1NF if each column has only one value (no lists or multiple items in a cell).
Example (Before 1NF):
Student Courses
Raj Math, Science
Priya Math , English
After 1NF:
Student Courses
Raj Math
Raj Science
Priya Math
Priya English

2NF (Second Normal form):

A table is in 2NF if it's already in 1NF and all non-key columns depend fully on the primary key (no partial
dependency).
Example:
Before 2NF:
StudentID Courses StudentName
1 Math Raj
1 Science Raj
2 Math Priya
2 English Priya

Here, StudentName depends only on StudentID, not on both StudentID + Course.

After 2NF:
Student Table:
StudentID StudentName
1 Raj
1 Raj
2 Priya
2 Priya

Course Table:
StudentID Courses
1 Math
1 Science
2 Math
2 English

3NF (Third Normal Form):

A table is in 3NF if it's in 2NF and no column depends on another non-key column (no transitive
dependency).
Example:
Before 3NF
StudentID StudentName Department DeptLocation
1 Raj CS Block A
2 Piya IT Block B

DeptLocation depends on Department, not directly on StudentID.

After 3NF:
Student Table:
StudentID StudentName Department
1 Raj CS
2 Piya IT

Department Table:
Department DeptLocation
CS Block A
IT Block B

Denormalization:
Denormalization means combining tables back together to make data easier and faster to read, even if it
causes some repetition (redundancy). In denormalization, we join tables again to make data access faster.
Denormalization in DSS:
In DSS, users need fast reports and quick answers. If the data is too split (too many small tables), it will take
more time to join them when running a report. Denormalized tables allow users to get results faster because:
 Less joining of tables
 Faster query performance
 Easier to run big analytical queries
Denormalization technique:
 Collapsing
 Pre-joining tables
 Splitting tables
 Adding redundant columns (Reference data)
 Derived attributes
Collapsing:
Collapsing Tables is the process of merging two or more related tables into a single table. This is done when
there is a clear relationship between the tables, and it makes sense to combine them, typically for
performance optimization purposes, especially when those tables are frequently queried together.
Types of collapsing:
 Two entities with a one-to-one relationship
 Two entities with many-to-many relationship
Two entities with a one-to-one relationship:
In this type of relationship, each row in Table A is linked to exactly one row in Table B. When the
relationship is one-to-one, collapsing the tables simplifies access to data since there is no need to join them
repeatedly.
Example:
Customer table:
CustomerID Name Email
C001 Ali ali@gmail.com
C002 Sara sara@gmail.com

Customer Address table:

CustomerID Address
C001 123 main St
C002 456 main St

After collapsing (Combining the tables):

CustomerID Name Email Address
C001 Ali ali@gmail.com 123 main St
C002 Sara sara@gmail.com 456 main St

Two entities with many-to-many relationship

In a many-to-many relationship, a row in Table A can be related to many rows in Table B and vice versa.
Typically, in a relational database, a junction table is used to manage the many-to-many relationship.
However, if the data is queried frequently, it may be beneficial to pre-join and collapse the data into a single
table to improve query performance.
Example:
Student table:
StudentID Name
S001 Ali
S002 Sara

Course table:
CourseID CourseName
C001 Math
C002 History

Student course table:

StudentID CourseID
S001 C001
S001 C002
S002 C001

After collapsing (Combining the Tables into One Table)

StudentID Name CourseID CourseName

S001 Ali C001 Math

S001 Ali C002 History

S002 Sara C001 Math

Pre-joining tables:
Joining tables in advance (before queries) and saving the joined result as a new table.
Example:
Customer Table:
CutomerID Name
C001 Ali

Order Table:
OrderID CustomerID ProductName
O001 C001 Laptop

After pre joining:

OrderID CustomerID Name ProductName

O001 C001 Ali Laptop

Splitting Tables:
Splitting tables means breaking a large table into smaller tables to make data easier to manage, faster to
search, or more secure.
Types:
There are two main types:
 Horizontal Splitting
 Vertical Splitting
Horizontal Splitting:
We split a table into two or more tables where each new table has the same columns but contains different
rows based on some condition (like date, region, etc.).
Example:

OrderID CustomerName OrderDate Amount

1 Ali 01-01-2025 100

2 Sara 05-01-2025 200

3 Ahmed 10-02-2025 150

4 Hina 20-02-2025 250

After horizontal splitting:

Table 1: Order January
OrderID CustomerName OrderDate Amount

1 Ali 01-01-2025 100

2 Sara 05-01-2025 200

Table 2: Order _February

OrderID CustomerName OrderDate Amount

3 Ahmed 10-02-2025 150

4 Hina 20-02-2025 250

Vertical Splitting:
We split a table into two or more tables where each new table has fewer columns. Rows are connected by a
common key.
Example:

CustomerID Name Email CardNumber BillingAddress

C001 Ali ali@email.com 1234-5678 123 Main St

C002 Sara sara@email.com 9876-5432 456 Elm St

After vertical split:

Table 1: Customer Personal

CustomerID Name Email

C001 Ali ali@email.com

C002 Sara sara@email.com

Table 2: Customer Payment

CustomerID Name BillingAddress

C001 Ali 123 Main St

456 m
C002 Sara
St

Adding redundant columns (Reference data):

We add extra columns inside a table that repeat important information instead of joining with another table
every time. This makes the system faster because it avoids extra joins when you run queries. In other words,
moving all the column form the master table to the detail table, a sub set off column form the master table is
kept in the detail table.
OLAP:
OLAP is used for analysing large amounts of historical data to support decision-making. It helps businesses
answer complex questions like: What were the sales trends last year? or Which product performed best in
the last quarter? OLAP systems are designed for reading data, not writing or updating it frequently. They
support features like multi-dimensional analysis, charts, reports, and dashboards. These systems are slower
than OLTP for day-to-day operations but provide deeper insights.
Example: A company’s sales manager uses an OLAP system to compare sales across different regions and
time periods.
OLTP:
OLTP is used for managing daily transactions like order placement, inventory updates, and customer data
management. These systems are designed to handle many short, fast transactions such as insert, update, and
delete operations. OLTP systems must be fast, reliable, and always available since they support everyday
business functions.
Example: When a customer buys something online and the system processes their payment and updates the
inventory, that’s OLTP in action.
OLAP vs OLTP:

Feature OLTP (Online Transaction Processing) OLAP (Online Analytical Processing)

Purpose Handles day-to-day transactions Performs analysis and reporting

Data Type Current, real-time transactional data Historical, aggregated data

Operations Insert, Update, delete (write-heavy) Read, Analyse, Drill-down (read-heavy)

Speed Very fast for small transactions Slower, optimized for complex queries

Users Front-line staff, clerks, customers Managers, analysts, decision-makers

Data Size Smaller data per transaction Large volumes of data for analysis

Examples Online banking, order processing Sales trends, market analysis

System Design Highly normalized database Denormalized or star/snowflake schema

Relationship between OLAP and DWH:

A Data Warehouse (DWH) is a central storage system that holds large volumes of historical data from
different sources (like sales, customer data, etc.).
OLAP (Online Analytical Processing) is a tool or technique used on top of a data warehouse to analyse this
stored data.
OLAP implementation:
MOLAP: OLAP implementation with a multi dimensional data structure.
ROLAP: OLAP implementation with a relation database.
HOLAP: OLAP implementation as a hybrid of MOLAP and ROLAP.
DOLAP: OLAP implementation for desktop decision support environment.
MOLAP stands for Multidimensional Online Analytical Processing. It is a type of OLAP (Online Analytical
Processing) that stores data in a multidimensional cube format, rather than in relational databases like in
ROLAP.

Cs 614
No ratings yet
Cs 614
10 pages
Cs 614
No ratings yet
Cs 614
12 pages
OBIEE - Quick Guide
No ratings yet
OBIEE - Quick Guide
78 pages
Very Important Waht Is Data Warehouse and Why Required
No ratings yet
Very Important Waht Is Data Warehouse and Why Required
26 pages
Data Warehouse Interview Guide
No ratings yet
Data Warehouse Interview Guide
7 pages
Big Data Analytics Notes
No ratings yet
Big Data Analytics Notes
9 pages
PDF Document 2
No ratings yet
PDF Document 2
72 pages
Business Intelligence and Databases - Kopie
No ratings yet
Business Intelligence and Databases - Kopie
14 pages
Data Warehousing & DATA MINING (SE-409) : Lecture-2
No ratings yet
Data Warehousing & DATA MINING (SE-409) : Lecture-2
36 pages
Module 1
No ratings yet
Module 1
78 pages
DATA WARE HOUSE in Brief
No ratings yet
DATA WARE HOUSE in Brief
27 pages
Final Term Paper
No ratings yet
Final Term Paper
24 pages
Practicalno: 1 Introduction To Database: Data
No ratings yet
Practicalno: 1 Introduction To Database: Data
33 pages
DMDW - Preprocessing L-6,7
No ratings yet
DMDW - Preprocessing L-6,7
16 pages
Data Warehouse Administration
No ratings yet
Data Warehouse Administration
14 pages
Data Mining and Data Warehouse - Mukesh Prasad Chaudhary
No ratings yet
Data Mining and Data Warehouse - Mukesh Prasad Chaudhary
651 pages
Bipm Module 3 - Business Intelligence
No ratings yet
Bipm Module 3 - Business Intelligence
68 pages
Designing The Data Warehouse Aima Second Lecture
No ratings yet
Designing The Data Warehouse Aima Second Lecture
34 pages
Introduction To Business Intelligence 2
No ratings yet
Introduction To Business Intelligence 2
23 pages
Dmbi Question Bank
No ratings yet
Dmbi Question Bank
21 pages
What's Data at A Warehouse
No ratings yet
What's Data at A Warehouse
37 pages
What Is A Data Warehouse? and Why Are So Many Schools Setting Them Up?
No ratings yet
What Is A Data Warehouse? and Why Are So Many Schools Setting Them Up?
37 pages
DW&DM Material
No ratings yet
DW&DM Material
107 pages
DWH Session1
No ratings yet
DWH Session1
36 pages
Slide 1
No ratings yet
Slide 1
10 pages
Business Computing: 1. Computing:-Computing Applies A Set of
No ratings yet
Business Computing: 1. Computing:-Computing Applies A Set of
35 pages
Crush Your Data Warehousing Interviews
No ratings yet
Crush Your Data Warehousing Interviews
91 pages
Tue, Jan 20, 2009 - 1800: 2100 FAST - NU, Karachi
No ratings yet
Tue, Jan 20, 2009 - 1800: 2100 FAST - NU, Karachi
21 pages
DM HarshQuesAns
No ratings yet
DM HarshQuesAns
183 pages
Data Warehousing & BI Insights
No ratings yet
Data Warehousing & BI Insights
6 pages
Bi 2025
No ratings yet
Bi 2025
106 pages
Introduction to Data Warehousing
No ratings yet
Introduction to Data Warehousing
113 pages
Unit 4
No ratings yet
Unit 4
27 pages
DWDM Questions
No ratings yet
DWDM Questions
8 pages
Session1 2 PPT
No ratings yet
Session1 2 PPT
28 pages
Is Characterized by A Large
No ratings yet
Is Characterized by A Large
3 pages
SE-6C Data Warehousing Assignment #1 Hamza Imtiaz Malik FA13-BSE-039
No ratings yet
SE-6C Data Warehousing Assignment #1 Hamza Imtiaz Malik FA13-BSE-039
4 pages
Advanced Database Management System
50% (2)
Advanced Database Management System
2 pages
Week 02 Part 01
No ratings yet
Week 02 Part 01
15 pages
Database & ETL Testing Essentials
No ratings yet
Database & ETL Testing Essentials
17 pages
Database Testing vs Data Warehouse Testing
100% (2)
Database Testing vs Data Warehouse Testing
17 pages
Big Data Architecture Group 1 PROJECT
No ratings yet
Big Data Architecture Group 1 PROJECT
55 pages
ETL Testing - PPT
No ratings yet
ETL Testing - PPT
77 pages
01-Introduction To BI
No ratings yet
01-Introduction To BI
41 pages
Project
No ratings yet
Project
17 pages
BI Interview Questions 2020
No ratings yet
BI Interview Questions 2020
12 pages
3 BI and Data Warehouse For BI Solutions
No ratings yet
3 BI and Data Warehouse For BI Solutions
43 pages
Data Warehousing & Modeling Guide
No ratings yet
Data Warehousing & Modeling Guide
10 pages
Ch-4 Foundations of Business Intelligence Databases and Information Management
No ratings yet
Ch-4 Foundations of Business Intelligence Databases and Information Management
8 pages
SQL Interview Questions 1725044566
No ratings yet
SQL Interview Questions 1725044566
4 pages
DMDW
No ratings yet
DMDW
28 pages
DWDM
No ratings yet
DWDM
11 pages
SQL Commands
No ratings yet
SQL Commands
23 pages
Unit 1 - Lecture 1.2,3 - Data Science & Big Data
No ratings yet
Unit 1 - Lecture 1.2,3 - Data Science & Big Data
34 pages
Unit 1 - Introduction To Data Mining and Data Warehousing
No ratings yet
Unit 1 - Introduction To Data Mining and Data Warehousing
84 pages
Unit 2 Part1
No ratings yet
Unit 2 Part1
142 pages
001 - OpenEdge Getting Started Database Essentials Gsdbe
No ratings yet
001 - OpenEdge Getting Started Database Essentials Gsdbe
142 pages
Data Warehousing & Dimensional Modeling Concepts !!
No ratings yet
Data Warehousing & Dimensional Modeling Concepts !!
33 pages
Business Intelligence & Data Analysis
No ratings yet
Business Intelligence & Data Analysis
40 pages
Cs614 (Final)
No ratings yet
Cs614 (Final)
29 pages
Front End Development
No ratings yet
Front End Development
7 pages
CS501
No ratings yet
CS501
6 pages
Complier Construction (Final)
No ratings yet
Complier Construction (Final)
8 pages
Operating System
No ratings yet
Operating System
21 pages
Cs 621
No ratings yet
Cs 621
7 pages
Cloud Computing
No ratings yet
Cloud Computing
6 pages
Compiler Basics for CS Students
No ratings yet
Compiler Basics for CS Students
11 pages
Hyderabad IT Companies List
No ratings yet
Hyderabad IT Companies List
6 pages
Obiee Admin Interview Questions
100% (3)
Obiee Admin Interview Questions
13 pages
Engineering Design & CAD/CAM Guide
No ratings yet
Engineering Design & CAD/CAM Guide
25 pages
Online Shopping System Thesis Documentation
100% (1)
Online Shopping System Thesis Documentation
5 pages
Geomatica 2018 Full Install
No ratings yet
Geomatica 2018 Full Install
2 pages
Prompt Engineering 2
No ratings yet
Prompt Engineering 2
6 pages
R300 Series Ref Manual
100% (1)
R300 Series Ref Manual
58 pages
Draftig Competition Brochure
No ratings yet
Draftig Competition Brochure
8 pages
CSEETInstructions Examinees May 202319042023
No ratings yet
CSEETInstructions Examinees May 202319042023
9 pages
Firewall and SELinux Configuration Guide
No ratings yet
Firewall and SELinux Configuration Guide
14 pages
Security and Infant Control Otis
100% (3)
Security and Infant Control Otis
48 pages
Pinakafinal Na Talaga.
No ratings yet
Pinakafinal Na Talaga.
26 pages
Mood Based Music Player Project Expanded
No ratings yet
Mood Based Music Player Project Expanded
12 pages
EB Anrs Android Framework
No ratings yet
EB Anrs Android Framework
19 pages
Srinu Resume PDF 95
No ratings yet
Srinu Resume PDF 95
2 pages
3 - Introduction To Computers
No ratings yet
3 - Introduction To Computers
3 pages
CPP Output Prediction
No ratings yet
CPP Output Prediction
6 pages
whittle优化教程
No ratings yet
whittle优化教程
25 pages
Supplemental Guide On The Procedures For Online Surveys Using Askallo
No ratings yet
Supplemental Guide On The Procedures For Online Surveys Using Askallo
33 pages
Virtual Machine
90% (10)
Virtual Machine
868 pages
ECOSYS M3660idn Series Brochure
No ratings yet
ECOSYS M3660idn Series Brochure
4 pages
CH2 - SOLID Principles in Software Design V1.0
No ratings yet
CH2 - SOLID Principles in Software Design V1.0
27 pages
3MC Bible
No ratings yet
3MC Bible
38 pages
Operating Instructions PC Diagnostics V2.08
No ratings yet
Operating Instructions PC Diagnostics V2.08
36 pages
Popular Now First FRCR Anatomy Questions and Answers PDF
No ratings yet
Popular Now First FRCR Anatomy Questions and Answers PDF
16 pages
Rcfesw 1018
No ratings yet
Rcfesw 1018
1 page
Intrusion Detection Systems
No ratings yet
Intrusion Detection Systems
29 pages
Watershed Delineation Using DEM
No ratings yet
Watershed Delineation Using DEM
7 pages
CCBT Unit 2
No ratings yet
CCBT Unit 2
31 pages
Chapter 6 Interface Python With MYSQL
No ratings yet
Chapter 6 Interface Python With MYSQL
80 pages

Cs 614

Uploaded by

Cs 614

Uploaded by

Lecture No:1

2NF (Second Normal form):

Here, StudentName depends only on StudentID, not on both StudentID + Course.

3NF (Third Normal Form):

DeptLocation depends on Department, not directly on StudentID.

Customer Address table:

After collapsing (Combining the tables):

Two entities with many-to-many relationship

Student course table:

After collapsing (Combining the Tables into One Table)

StudentID Name CourseID CourseName

S001 Ali C001 Math

S001 Ali C002 History

S002 Sara C001 Math

After pre joining:

OrderID CustomerID Name ProductName

OrderID CustomerName OrderDate Amount

1 Ali 01-01-2025 100

2 Sara 05-01-2025 200

3 Ahmed 10-02-2025 150

4 Hina 20-02-2025 250

After horizontal splitting:

1 Ali 01-01-2025 100

2 Sara 05-01-2025 200

Table 2: Order _February

OrderID CustomerName OrderDate Amount

3 Ahmed 10-02-2025 150

4 Hina 20-02-2025 250

CustomerID Name Email CardNumber BillingAddress

C001 Ali ali@email.com 1234-5678 123 Main St

C002 Sara sara@email.com 9876-5432 456 Elm St

After vertical split:

CustomerID Name Email

C001 Ali ali@email.com

C002 Sara sara@email.com

Table 2: Customer Payment

CustomerID Name BillingAddress

C001 Ali 123 Main St

Adding redundant columns (Reference data):

Feature OLTP (Online Transaction Processing) OLAP (Online Analytical Processing)

Purpose Handles day-to-day transactions Performs analysis and reporting

Data Type Current, real-time transactional data Historical, aggregated data

Operations Insert, Update, delete (write-heavy) Read, Analyse, Drill-down (read-heavy)

Users Front-line staff, clerks, customers Managers, analysts, decision-makers

Examples Online banking, order processing Sales trends, market analysis

System Design Highly normalized database Denormalized or star/snowflake schema

Relationship between OLAP and DWH:

You might also like