Business Intelligence and Analytics:
Systems for Decision Support
Global Edition
(10th Edition)
Chapter 3:
Data Warehousing
Learning Objectives
Understand the basic definitions and
concepts of data warehouses
Learn different types of data warehousing
architectures; their comparative
advantages and disadvantages
Describe the processes used in developing
and managing data warehouses
Explain data warehousing operations
… (Continued…)
3-2 © Pearson Education Limited 2014
Learning Objectives
Explain the role of data warehouses in
decision support
Explain data integration and the
extraction, transformation, and load (ETL)
processes
Describe real-time (a.k.a. right-time
and/or active) data warehousing
Understand data warehouse
administration and security issues
3-3 © Pearson Education Limited 2014
What is a Data Warehouse?
is a pool of data produced to support decision
making; it is also a repository of current and
historical data of potential interest to managers
throughout the organization. Data are usually
structured to be available in a form ready for
analytical processing activities
3-4 © Pearson Education Limited 2014
Characteristics of DWs
Subject oriented
Integrated
Time-variant (time series)
Nonvolatile
Metadata
Web based, relational/multi-dimensional
Client/server, real-time/right-time/active...
3-5 © Pearson Education Limited 2014
DW Components
Data Mart
A departmental small-scale “DW” that
stores only limited/relevant data
Dependent data mart
A subset that is created directly from a data
warehouse
Independent data mart
A small data warehouse designed for a
strategic business unit or a department
3-6 © Pearson Education Limited 2014
Other DW Components
Operational data stores (ODS)
A type of database often used as an interim
area for a data warehouse
Oper marts - an operational data mart.
Enterprise data warehouse (EDW)
A data warehouse for the enterprise.
Metadata: Data about data.
In a data warehouse, metadata describe the
contents of a data warehouse and the manner
of its acquisition and use
3-7 © Pearson Education Limited 2014
DW Architecture
Three-tier architecture
1. Data acquisition software (App server)
2. The data warehouse that contains the data &
software(Database server)
3. Client software that allows users to access and
analyze data from the warehouse(Clint
Worksatation)
Two-tier architecture
First two tiers in three-tier architecture is combined
into one
3-8 © Pearson Education Limited 2014
DW Architectures
Tier 1: Tier 2: Tier 3:
Client workstation Application server Database server
Tier 1: Tier 2:
Client workstation Application & database server
3-9 © Pearson Education Limited 2014
A Web-Based DW Architecture
Web pages
Application
Server
Client Web
(Web browser) Internet/ Server
Intranet/
Extranet
Data
warehouse
3-10 © Pearson Education Limited 2014
Alternative DW Architectures
(a) Independent Data Marts Architecture
ETL
End user
Source Staging Independent data marts
access and
Systems Area (atomic/summarized data)
applications
(b) Data Mart Bus Architecture with Linked Dimensional Datamarts
ETL
Dimensionalized data marts End user
Source Staging
linked by conformed dimensions access and
Systems Area
(atomic/summarized data) applications
(c) Hub and Spoke Architecture (Corporate Information Factory)
ETL
End user
Source Staging Normalized relational
access and
Systems Area warehouse (atomic data)
applications
Dependent data marts
(summarized/some atomic data)
Ten factors that potentially affect the
architecture selection decision
1. Information 6. Strategic view of the data
interdependence between warehouse prior to
organizational units implementation
2. Upper management’s 7. Compatibility with existing
information needs systems
3. Urgency of need for a data 8. Perceived ability of the in-
warehouse house IT staff
4. Nature of end-user tasks 9. Technical issues
5. Constraints on resources 10. Social/political factors
3-12 © Pearson Education Limited 2014
Data Integration and the Extraction,
Transformation, and Load Process
Data integration
Integration that comprises three major processes: data
access, data federation, and change capture.
ETL = Extract Transform Load
Enterprise application integration (EAI)
A technology that provides a vehicle for pushing data
from source systems into a data warehouse
Enterprise information integration (EII)
An evolving tool space that promises real-time data
integration from a variety of sources, such as relational
or multidimensional databases, Web services, etc.
3-13 © Pearson Education Limited 2014
Data Integration and the Extraction,
Transformation, and Load Process
Packaged Transient
application data source
Data
warehouse
Legacy
Extract Transform Cleanse Load
system
Data mart
Other internal
applications
3-14 © Pearson Education Limited 2014
Analysis of Data in DW
OLTP vs. OLAP…
OLTP (online transaction processing)
Capturing and storing data from ERP, CRM, POS, …
The main focus is on efficiency of routine tasks
OLAP (Online analytical processing)
Converting data into information for decision support
Data cubes, drill-down / rollup, slice & dice, …
Requesting ad hoc reports
Conducting statistical and other analyses
Developing multimedia-based applications
3-15 © Pearson Education Limited 2014
OLAP vs. OLTP
3-16 © Pearson Education Limited 2014
Variations of OLAP
Multidimensional OLAP (MOLAP)
OLAP implemented via a specialized
multidimensional database (or data store) that
summarizes transactions into multidimensional
views ahead of time
Relational OLAP (ROLAP)
The implementation of an OLAP database on
top of an existing relational database
Database OLAP and Web OLAP (DOLAP and
WOLAP); Desktop OLAP,…
3-17 © Pearson Education Limited 2014
DW Implementation Issues
Identification of data sources and governance
Data quality planning, data model design
ETL tool selection
Establishment of service-level agreements
Data transport, data conversion
Reconciliation process
End-user support
Political issues
3-18 © Pearson Education Limited 2014
Successful DW Implementation
Things to Avoid
Starting with the wrong sponsorship chain
Setting expectations that you cannot meet
Engaging in politically naive behavior
Loading the data warehouse with information
just because it is available
Believing that data warehousing database design
is the same as transactional database design
Choosing a data warehouse manager who is
technology oriented rather than user oriented
3-19 © Pearson Education Limited 2014
Failure Factors in DW Projects
Lack of executive sponsorship
Unclear business objectives
Cultural issues being ignored
Change management
Unrealistic expectations
Inappropriate architecture
Low data quality / missing information
Loading data just because it is available
3-20 © Pearson Education Limited 2014
Massive DW and Scalability
Scalability
The main issues pertaining to scalability:
The amount of data in the warehouse
How quickly the warehouse is expected to grow
The number of concurrent users
The complexity of user queries
Good scalability means that queries and other
data-access functions will grow linearly with
the size of the warehouse
3-21 © Pearson Education Limited 2014
Real-Time/Active DW/BI
Enabling real-time data updates for real-
time analysis and real-time decision
making is growing rapidly
Push vs. Pull (of data)
Concerns about real-time BI
Not all data should be updated continuously
Mismatch of reports generated minutes apart
May be cost prohibitive
May also be infeasible
3-22 © Pearson Education Limited 2014
Traditional versus Active DW
3-23 © Pearson Education Limited 2014
DW Administration and Security
Data warehouse administrator (DWA)
DWA should…
have the knowledge of high-performance software,
hardware and networking technologies
possess solid business knowledge and insight
be familiar with the decision-making processes so as to
suitably design/maintain the data warehouse structure
possess excellent communications skills
Security and privacy is a pressing issue in DW
Safeguarding the most valuable assets
Government regulations (HIPAA, etc.)
Must be explicitly planned and executed
3-24 © Pearson Education Limited 2014
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior written permission of the publisher. Printed in the
United States of America.
3-25 © Pearson Education Limited 2014