0% found this document useful (0 votes)

24 views6 pages

The Design of A Data Warehouse.: Lesson Introduction

This lesson covers the design and architecture of data warehouses, emphasizing the importance of understanding business needs and the various design approaches. It outlines the three-tier architecture, the significance of a metadata repository, and different OLAP server models. Students will learn about the data warehouse design process, including steps for modeling business processes and selecting appropriate data structures.

Uploaded by

gihanisa.singersl

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

24 views6 pages

The Design of A Data Warehouse.: Lesson Introduction

Uploaded by

gihanisa.singersl

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

7. The Design of a Data Warehouse.

Lesson Introduction

This lesson will start regarding data warehouse architecture and give an insight into how to design and
construct a data warehouse. To create an active data warehouse, a clear understanding and analysis of
business needs are essential. We will discuss the different views in data warehouse design and what are
the different approaches for data warehouse design. Then the description of the main components in the
three-tier data warehouse architecture and the importance of the metadata repository. The last section
of this lesson presents various types of warehouse servers for OLAP processing.

Learning Outcomes:
After successful completion of this lesson, the students will be able to understand data warehouse
architecture, data warehouse design process and the different types of warehouse server models.

Lesson Outline:
 The design of a data warehouse.
 Data warehouse design process.
 Importance of metadata repository
 OLAP server models.

7.1 The Design of a Data Warehouse:

We can achieve the following advantages in the data warehouse approach.

 A data warehouse may provide a competitive advantage by presenting relevant information from which to measure
performance and make critical adjustments to help win over competitors.
 A data warehouse can enhance business productivity because it can quickly and efficiently gather information that
accurately describes the organization.
 A data warehouse facilitates customer relationship management because it provides a consistent view of customers
and items across all lines of business, all departments, and all markets.
 A data warehouse may bring about cost reduction by tracking trends, patterns, and exceptions over long periods
consistently and reliably.

The construction of a large and complex information system can be viewed as the construction of a large and complex building,
for which the owner, architect, and builder have different views. Four different views regarding the design of a data warehouse
must be considered: the top-down view, the data source view, the data warehouse view, and the business query view.

 The top-down view allows the selection of the relevant information necessary for the data warehouse. This
information matches the current and future business needs.
 The data source view exposes the information being captured, stored, and managed by operational systems. This
information may be documented at various levels of detail and accuracy, from individual data source tables to
integrated data source tables. Data sources are often modeled by traditional data modeling techniques, such as the
entity-relationship model or CASE (computer-aided software engineering) tools.
 The data warehouse view includes fact tables and dimension tables. It represents the information that is stored inside
the data warehouse, including precalculated totals and counts, as well as information regarding the source, date, and
time of origin, added to provide historical context.
 The business query view is the perspective of data in the data warehouse from the viewpoint of the end user.

7.2 The Data Warehouse Design Process

A data warehouse can be built using a top-down approach, a bottom-up approach, or a combination of
both. The top-down approach starts with the overall design and planning. It is useful in cases where the
technology is mature and well known, and where the business problems that must be solved are clear
and well understood. The bottom-up approach starts with experiments and prototypes. This is useful in
the early stage of business modeling and technology development. It allows an organization to move
forward at considerably less expense and to evaluate the benefits of the technology before making
significant commitments. In the combined approach, an organization can exploit the planned and strategic
nature of the top-down approach while retaining the rapid implementation and opportunistic application
of the bottom-up approach.

From the software engineering point of view, the design and construction of a data warehouse may
consist of the following steps: planning, requirements study, problem analysis, warehouse design, data
integration, and testing, and finally deployment of the data warehouse. Large software systems can be
developed using two methodologies: the waterfall method or the spiral method. The waterfall method
performs a structured and systematic analysis at each step before proceeding to the next, which is like a
waterfall, falling from one stage to the next. The spiral method involves the rapid generation of
increasingly functional systems, with short intervals between successive releases.

In general, the warehouse design process consists of the following steps:

1. Choose a business process to model, for example, orders, invoices, shipments, inventory, account administration,
sales, or the general ledger. If the business process is organizational and involves multiple complex object collections,
a data warehouse model should be followed. However, if the process is departmental and focuses on the analysis of
one kind of business process, a data mart model should be chosen.
2. Choose the grain of the business process. The grain is the fundamental, atomic level of data to be represented in the
fact table for this process, for example, individual transactions, individual daily snapshots, and so on.
3. Choose the dimensions that will apply to each fact table record. Typical dimensions are time, item, customer, supplier,
warehouse, transaction type, and status.
4. Choose the measures that will populate each fact table record. Typical measures are numeric additive quantities like
dollars sold and units sold.
7.3 The Three-Tier DataWarehouse Architecture

Data warehouses often adopt a three-tier architecture, as presented in Figure 7-1.

The bottom tier is a warehouse database server that is almost always a relational database system. Back-
end tools and utilities are used to feed data into the bottom tier from operational databases or other
external sources. These tools and utilities perform data extraction, cleaning, and transformation, as well
as load and refresh functions to update the data warehouse. This tier also contains a metadata repository,
which stores information about the data warehouse and its contents.

Figure 7-1: The Three-Tier Data Warehouse Architecture

The middle tier is an OLAP server that is typically implemented using either a relational OLAP (ROLAP)
model, that is, an extended relational DBMS that maps operations on multidimensional data to standard
relational operations; or a multidimensional OLAP (MOLAP) model, that is, a special-purpose server
that directly implements multidimensional data and services.

The top tier is a front-end client layer, which contains query and reporting tools, analysis tools, and data
mining tools. From the architecture point of view, there are three data warehouse models: the enterprise
warehouse, the data mart, and the virtual warehouse.

7.4 Importance of the Metadata Repository

Metadata are data about data. When used in a data warehouse, metadata are the data that define
warehouse objects. Figure 7-1 showed a metadata repository within the bottom tier of the data
warehousing architecture. Metadata are created for the data names and definitions of the given
warehouse. Additional metadata are generated and captured for timestamping any extracted data, the
source of the retrieved information, and missing fields that have been added by data cleaning or
integration processes.

A metadata repository should contain the following:

 A description of the structure of the data warehouse, which includes the warehouse schema, view,
dimensions, hierarchies, and derived data definitions, as well as data mart locations and contents
 Operational metadata, which provides for data lineage (history of migrated data and the
sequence of transformations applied to it), the currency of data (active, archived, or purged), and
monitoring information (warehouse usage statistics, error reports, and audit trails)
 The algorithms used for summarization, which include measure and dimension definition
algorithms, data on granularity, partitions, subject areas, aggregation, summary, and predefined
queries and reports
 The mapping from the operational environment to the data warehouse, which includes source
databases and their contents, gateway descriptions, data partitions, data extraction, cleaning,
transformation rules and defaults, data refresh and purging rules, and security (user authorization
and access control)
 Data related to system performance, which include indices and profiles that improve data access
and retrieval performance, in addition to rules for the timing and scheduling of refresh, update,
and replication cycles
 Business metadata, which include business terms and definitions, data ownership information,
and charging policies
7.5 OLAP Server Models

Mainly there are three different types of OLAP server models they are Relational OLAP (ROLAP) servers,
Multidimensional OLAP (MOLAP) servers, and Hybrid OLAP (HOLAP) servers.

1. Relational OLAP (ROLAP) servers: These are the intermediate servers that stand in between a
relational back-end server and client front-end tools. They use a relational or extended-relational
DBMS to store and manage warehouse data, and OLAP middleware to support missing pieces.
ROLAP servers include optimization for each DBMS back end, implementation of aggregation
navigation logic, and additional tools and services. ROLAP technology tends to have greater
scalability than MOLAP technology. The DSS server of Micro strategy, for example, adopts the
ROLAP approach.

2. Multidimensional OLAP (MOLAP) servers: These servers support multidimensional views of data
through array-based multidimensional storage engines. They map multidimensional views
directly to data cube array structures. The advantage of using a data cube is that it allows fast
indexing to precomputed summarized data. Many MOLAP servers adopt a two-level storage
representation to handle dense and sparse data sets: denser subcubes are identified and stored
as array structures, whereas sparse subcubes employ compression technology for efficient
storage utilization. A comparison of the two models is given in figure 7-2.

Figure 7-2: MOLAP vs ROLAP

3. Hybrid OLAP (HOLAP) servers: The hybrid OLAP approach combines ROLAP and MOLAP
technology, benefiting from the greater scalability of ROLAP and the faster computation of
MOLAP. For example, a HOLAP server may allow large volumes of detail data to be stored in a
relational database, while aggregations are kept in a separate MOLAP store. The Microsoft SQL
Server 2000 supports a hybrid OLAP server.

Data Warehouse and Its Architecture
No ratings yet
Data Warehouse and Its Architecture
5 pages
Module 1 Notes
No ratings yet
Module 1 Notes
29 pages
The Study On Data Warehouse Design and Usage: Mr. Dishek Mankad, Mr. Preyash Dholakia
No ratings yet
The Study On Data Warehouse Design and Usage: Mr. Dishek Mankad, Mr. Preyash Dholakia
5 pages
DWDM Unit1
No ratings yet
DWDM Unit1
8 pages
DWDM Lecture Notes III-II
No ratings yet
DWDM Lecture Notes III-II
81 pages
C Lecture
No ratings yet
C Lecture
8 pages
Unit 2 Ques
No ratings yet
Unit 2 Ques
80 pages
L4. Datawarehouse Architecture PDF
No ratings yet
L4. Datawarehouse Architecture PDF
13 pages
Data Warehouse 9 Oct
No ratings yet
Data Warehouse 9 Oct
15 pages
Data Warehousing-Notes (Module - I & II)
No ratings yet
Data Warehousing-Notes (Module - I & II)
32 pages
Chapter 1
No ratings yet
Chapter 1
9 pages
Lec 01 - Intro To Data Warehouse
No ratings yet
Lec 01 - Intro To Data Warehouse
54 pages
Dmi Unit 6
No ratings yet
Dmi Unit 6
6 pages
Fundamentals of Data Science Notes (Module - 2)
No ratings yet
Fundamentals of Data Science Notes (Module - 2)
11 pages
DWDM Lecture Notes U-1
No ratings yet
DWDM Lecture Notes U-1
11 pages
Knowledge Discovery in Databases (KDD) Lect 4
No ratings yet
Knowledge Discovery in Databases (KDD) Lect 4
28 pages
CS2202 DataWarehouse OLAP
No ratings yet
CS2202 DataWarehouse OLAP
49 pages
Data Warehouse Architecture
100% (2)
Data Warehouse Architecture
5 pages
Data Science & Warehousing Basics
No ratings yet
Data Science & Warehousing Basics
13 pages
DWDM
No ratings yet
DWDM
15 pages
DWDM Lecture Notes III-ii - For Nlcad-6-86
No ratings yet
DWDM Lecture Notes III-ii - For Nlcad-6-86
81 pages
DWH Chapter2
No ratings yet
DWH Chapter2
42 pages
Chapter 2
No ratings yet
Chapter 2
44 pages
Data Warehousing
No ratings yet
Data Warehousing
16 pages
Data Warehousing and Data Mining
No ratings yet
Data Warehousing and Data Mining
135 pages
What Is A Data Warehouse?
No ratings yet
What Is A Data Warehouse?
39 pages
DW Part A Part B Notes
No ratings yet
DW Part A Part B Notes
69 pages
DM Module 1
No ratings yet
DM Module 1
16 pages
Data Warehouse Design & Architecture
No ratings yet
Data Warehouse Design & Architecture
12 pages
Olap and Oltap
No ratings yet
Olap and Oltap
14 pages
Data Warehouse Concepts
No ratings yet
Data Warehouse Concepts
53 pages
Data Warehouse Overview & Insights
No ratings yet
Data Warehouse Overview & Insights
18 pages
ALL YOU NEED Data - Mining - and - Warehousing
No ratings yet
ALL YOU NEED Data - Mining - and - Warehousing
42 pages
OE BI Module 2
No ratings yet
OE BI Module 2
87 pages
DWM Unit 1
No ratings yet
DWM Unit 1
34 pages
Data Warehousing - Architecture - Tutorialspoint
No ratings yet
Data Warehousing - Architecture - Tutorialspoint
7 pages
DWM Exp1
No ratings yet
DWM Exp1
12 pages
Data Warehousing Is The Coordinated, Architected, and Periodic Copying of Data From
No ratings yet
Data Warehousing Is The Coordinated, Architected, and Periodic Copying of Data From
33 pages
Overview of Data Warehousing and OLAP
No ratings yet
Overview of Data Warehousing and OLAP
12 pages
Data Warehouse-Design Process
No ratings yet
Data Warehouse-Design Process
16 pages
CH 2 Introduction To Data Warehousing
No ratings yet
CH 2 Introduction To Data Warehousing
31 pages
FDS Unit 2
No ratings yet
FDS Unit 2
21 pages
Data Warehousing & OLAP Basics
No ratings yet
Data Warehousing & OLAP Basics
31 pages
Data Warehousing Research Paper
50% (2)
Data Warehousing Research Paper
7 pages
Data Warehousing & Data Mining
No ratings yet
Data Warehousing & Data Mining
16 pages
Data Ware House
No ratings yet
Data Ware House
203 pages
Data Warehouse Final Report
No ratings yet
Data Warehouse Final Report
19 pages
Introduction On Data Warehouse With OLTP and OLAP: Arpit Parekh
No ratings yet
Introduction On Data Warehouse With OLTP and OLAP: Arpit Parekh
5 pages
Data Warehousing and Mining Guide
No ratings yet
Data Warehousing and Mining Guide
46 pages
Paper Presentation: Data Ware Housing AND Data Mining
No ratings yet
Paper Presentation: Data Ware Housing AND Data Mining
10 pages
DWDM Unit-1
No ratings yet
DWDM Unit-1
10 pages
Data Wareousing and Mining-Notes
No ratings yet
Data Wareousing and Mining-Notes
37 pages
DW Assigment
No ratings yet
DW Assigment
20 pages
Overview of Data Ware Housing
No ratings yet
Overview of Data Ware Housing
17 pages
Data Warehousing and Mining Complete Notes
No ratings yet
Data Warehousing and Mining Complete Notes
495 pages
Data Warehouse Architecture
No ratings yet
Data Warehouse Architecture
5 pages
DWDM 5 Unit Notes
No ratings yet
DWDM 5 Unit Notes
86 pages
f1s WF 2024 Competition Regulations
No ratings yet
f1s WF 2024 Competition Regulations
60 pages
CMOS Battery CSS 9
No ratings yet
CMOS Battery CSS 9
28 pages
Model QP - (21ECE701) - 2
No ratings yet
Model QP - (21ECE701) - 2
4 pages
H81M-HG4 R4.0
No ratings yet
H81M-HG4 R4.0
70 pages
Modern App Development On Salesforce - Data Model Design Contd
No ratings yet
Modern App Development On Salesforce - Data Model Design Contd
27 pages
Deep Foundation-Design Methods
No ratings yet
Deep Foundation-Design Methods
30 pages
Lock Picking - The Complete Guide For Beginners To Master The Art of Lock Picking Skills and Avoid Beginner Mistakes
100% (3)
Lock Picking - The Complete Guide For Beginners To Master The Art of Lock Picking Skills and Avoid Beginner Mistakes
50 pages
SBC401 CR 241224-Fa
No ratings yet
SBC401 CR 241224-Fa
30 pages
Audio Component Grounding and Interconnection - DiyAudio
No ratings yet
Audio Component Grounding and Interconnection - DiyAudio
47 pages
Features of The New GL
No ratings yet
Features of The New GL
3 pages
Thesis Writing Help for Students
100% (3)
Thesis Writing Help for Students
7 pages
Seismic Frame Analysis Guide
No ratings yet
Seismic Frame Analysis Guide
4 pages
FIL Alumni June 2024 - PPT - PankajGupta
No ratings yet
FIL Alumni June 2024 - PPT - PankajGupta
55 pages
Software Development First Edition Marvin Zelkowitz Ph.D. Ms Bs. Digital Download
100% (12)
Software Development First Edition Marvin Zelkowitz Ph.D. Ms Bs. Digital Download
127 pages
Description: Print
No ratings yet
Description: Print
8 pages
Control System Performance Measure-Pcee503-Unit-5
No ratings yet
Control System Performance Measure-Pcee503-Unit-5
32 pages
Codes, Standards, and Regulations: BICSI Telecommunications Dictionary, 2nd Edition
No ratings yet
Codes, Standards, and Regulations: BICSI Telecommunications Dictionary, 2nd Edition
23 pages
Sherwood Maintenance and Repair Manual - Jamestown Distributors
No ratings yet
Sherwood Maintenance and Repair Manual - Jamestown Distributors
48 pages
Docvaccination New 24
No ratings yet
Docvaccination New 24
43 pages
My City Bhopal
No ratings yet
My City Bhopal
8 pages
Te1sa254-003k Turkey
No ratings yet
Te1sa254-003k Turkey
202 pages
Introduction To Business Logic
No ratings yet
Introduction To Business Logic
10 pages
Safexpert Easy User Quick Guide
No ratings yet
Safexpert Easy User Quick Guide
10 pages
An Unconstrained Future How Generative Ai Could Reshape b2b Sales
No ratings yet
An Unconstrained Future How Generative Ai Could Reshape b2b Sales
7 pages
Chapter 3
No ratings yet
Chapter 3
63 pages
Forklift Truck Checklist
100% (1)
Forklift Truck Checklist
1 page
Sony-Wfc500 Invoice
No ratings yet
Sony-Wfc500 Invoice
1 page
Bolt Media Kit Brand Guidelines May2022
No ratings yet
Bolt Media Kit Brand Guidelines May2022
147 pages
Resume Bekhruz Makhmudov
No ratings yet
Resume Bekhruz Makhmudov
2 pages
Tech Utsav 2010: Seminar Topic:-Sixth Sense Technology
No ratings yet
Tech Utsav 2010: Seminar Topic:-Sixth Sense Technology
18 pages

The Design of A Data Warehouse.: Lesson Introduction

Uploaded by

The Design of A Data Warehouse.: Lesson Introduction

Uploaded by

7. The Design of a Data Warehouse.

7.1 The Design of a Data Warehouse:

We can achieve the following advantages in the data warehouse approach.

7.2 The Data Warehouse Design Process

In general, the warehouse design process consists of the following steps:

Data warehouses often adopt a three-tier architecture, as presented in Figure 7-1.

Figure 7-1: The Three-Tier Data Warehouse Architecture

7.4 Importance of the Metadata Repository

A metadata repository should contain the following:

Figure 7-2: MOLAP vs ROLAP

You might also like