Data Warehouse & Decision Making
Dr. Mona Mosa
Lecture 4
Agenda
oLearning Objectives.
oThe Core Principles of Data Warehouse.
oNavigating the Architecture of Modern Data Warehouses.
oOLTP Vs. OLAP.
oThe Strategic Role of Data Warehousing in Decision Making.
oCharacteristics of Data Warehousing.
oThe Comprehensive Benefits of Implementing a Data Warehouse.
oUnderstanding Concepts & Navigating Types.
oEssentials of Extract, Transform, Load Processes.
oCommon Issues in Data Warehouse Implementation and Management.
oSummary.
oLecture Exercise.
Learning Objectives
1. Understand the core principles and purpose of Data Warehousing in supporting decision-
making processes.
2. Identify and differentiate between the components of a Data Warehouse architecture,
including Data Sources, ETL Process, Storage, and Data Presentation Layer.
3. Distinguish between OLTP (Online Transaction Processing) and OLAP (Online Analytical
Processing) in terms of data processing and analysis.
4. Recognize the key characteristics of Data Warehousing, such as being subject-oriented,
integrated, time-variant, and nonvolatile.
5. Learn the essentials of the ETL (Extract, Transform, Load) process and common
challenges in Data Warehouse implementation and management.
The Core Principles of Data Warehouse
A Data Warehouse (DWH) is a centralized repository that stores large amounts of data collected from various
sources within an organization.
This data is aggregated, organized, and structured specifically for query and analysis, rather than for
transaction processing.
The primary purpose of a DWH is to enable businesses to perform complex queries and analysis, such as
data mining, on large datasets to extract valuable insights and support decision-making processes.
It is essential component of BI systems, providing the foundational data infrastructure necessary for
analytical reporting, data analysis, and decision support.
Navigating the Architecture of Modern Data
Warehouses
Data Sources: The various databases and external sources from which data is collected.
ETL Process (Extract, Transform, Load): This is the process through which data is extracted from
various sources, transformed into a consistent format, and loaded into the data warehouse.
Storage: The database systems or platforms where the integrated data is stored.
Data Presentation Layer: The interfaces and tools used to query, analyze, and present the data to
the end-users.
OLTP Vs. OLAP
Unlocking Insights: The Strategic Role of Data
Warehousing in Decision Making
Characteristics of Data Warehousing
Subject oriented: Data are organized by detailed subject, such as sales, products, or customers,
containing only information relevant for decision support.
Integrated: DWH must place data from different sources into a consistent format.
Time variant (time series): A DWH maintains historical data. Data do not necessarily provide
current status (except in real-time approach).
Nonvolatile: After data are entered into a data warehouse, users cannot change or update the
data.
Maximizing Value: The Comprehensive Benefits of
Implementing a Data Warehouse
Exploring Data Marts: Understanding Concepts
& Navigating Types
A Data Mart is usually smaller and focuses on a particular subject or department.
It is a subset of a DWH, typically consisting of a single subject area (e.g., marketing and
operations).
A Data Mart can be either dependent or independent.
A dependent Data Mart is a subset that is created directly from the DWA. It has the
advantages of using a consistent data model and providing quality data.
An independent Data Mart is a small warehouse designed for a strategic business unit (SBU)
or a department, but its source is not an DWH.
Mastering ETL: Essentials of Extract, Transform, Load
Processes
Extract, Transfer & Load (ETL) process consists of:
Extraction (i.e., reading data from one or more data source).
Transformation (i.e., converting the extracted data from its previous form into the form
in which it needs to be so that it can be placed into a DWH).
Load (i.e., putting the data into the DWH).
Navigating Challenges: Common Issues in Data Warehouse
Implementation and Management
Data Quality and Consistency: Inconsistent, incomplete, or inaccurate data can significantly
impact the reliability of insights derived from a DWH.
Integration Complexity: Integrating data from disparate sources, each with its own format,
structure, and quality, can be complex and time-consuming.
Performance and Scalability: As data volume grows, maintaining performance for data
processing and query execution becomes challenging.
Data Security and Compliance: Ensuring data security, privacy, and compliance with regulations
is increasingly challenging as DWH aggregate sensitive and personal information from multiple
sources.
Data Governance and Management: Establishing effective data governance and management
practices is challenging but essential for ensuring that data within the warehouse is accurate,
accessible, and secure.
Change Management: Adapting to changes in business processes, data sources, and technology
can require ongoing adjustments to the DWH architecture and processes.
Summary
A Data Warehouse (DWH) serves as a centralized repository for data
aggregated from various organizational sources, specifically for complex
analysis and decision support.
The architecture of a DWH includes multiple components, such as data
sources, ETL processes, storage solutions, and a presentation layer for end-
user interaction.
Data Warehouses differ from OLTP systems, as they are optimized for large-
scale analytical queries rather than transactional processing.
Summary (Cont’d)
Key characteristics of a DWH include subject orientation, integration of
multiple data sources, historical time-variant data storage, and non-volatility.
Common challenges in DWH management include ensuring data quality,
handling integration complexity, scaling performance, maintaining security
compliance, and managing data governance effectively.
Lecture Exercise: Designing a Data Warehouse Scenario
This exercise aims to provide with hands-on experience in conceptualizing and
designing a basic Data Warehouse solution tailored to a specific business scenario. You
will apply theoretical knowledge to practical use, understanding the key components,
architecture, and benefits of a data warehouse.
Proposed Scenario: Each group is assigned a unique business scenario from different
sectors such as retail, healthcare, banking, education, or manufacturing. The scenario
will outline the business's nature, its data sources, reporting needs, and challenges with
current data management practices.