
Chapter 3:

Data Warehousing

Learning Objectives for Chapter 3

1. Understand the basic definitions and concepts of data warehouses


2. Understand data warehousing architectures
3. Describe the processes used in developing and managing data warehouses
4. Explain data warehousing operations
5. Explain the role of data warehouses in decision support
6. Explain data integration and the extraction, transformation, and load (ETL) processes
7. Describe real-time (active) data warehousing
8. Understand data warehouse administration and security issues

Section 3.2

Data warehouse

A data warehouse is defined in this section as “a pool of data produced to support decision
making.” It is a physical repository where relational data are specially organized to provide
enterprise-wide, cleansed data in a standardized format.

The data warehouse is “a collection of integrated, subject-oriented databases designed to support
DSS functions, where each unit of data is nonvolatile and relevant to some moment in time.”

Data warehouse versus database

Technically a data warehouse is a database, albeit with certain characteristics to facilitate its role in
decision support. Specifically, however, it is an “integrated, time-variant, nonvolatile, subject-
oriented repository of detail and summary data used for decision support and business
analytics within an organization.” These characteristics are not necessarily true of databases in
general—though each could apply individually to a given one.

As a practical matter, most databases are highly normalized, in part to avoid update anomalies.
Data warehouses are highly denormalized for performance reasons. This is acceptable because
their content is never updated, just added to. Historical data are static.
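
To make the normalization contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module and a hypothetical sales subject area (not an example from the text): the operational design splits customers and orders into separate normalized tables, while the warehouse fact table repeats customer attributes in every row so analytical queries avoid joins.

import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized operational tables: each fact is stored once, which avoids
# update anomalies when customer details change.
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, amount REAL,
                     FOREIGN KEY (customer_id) REFERENCES customer (customer_id));
""")

# Denormalized warehouse table: customer attributes are repeated in every row,
# so analytical queries need no joins. This is acceptable because rows are
# only ever appended, never updated.
conn.execute("""
CREATE TABLE sales_fact (order_id INTEGER, order_date TEXT, amount REAL,
                         customer_name TEXT, customer_region TEXT)
""")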

Operational Data Store (ODS) is the database from which a business operates on an on-going
basis.
An EDW (Enterprise Data Warehouse) is an all-encompassing DW that covers all subject areas
of interest to the entire organization.

Data mart is a smaller DW designed around one problem, organizational function, topic, or other
suitable focus area.

Metadata are data about data.

Metadata are beneficial because they are the means through which applications and users access the
content of a data warehouse, through which its security is managed, and through which
organizational management manages, in the true sense of the word, its information assets. Most
database management systems would be unable to function without at least some metadata. Indeed,
the use of metadata, which enables data access through names and logical relationships rather than
physical locations, is fundamental to the very concept of a DBMS. Metadata are essential to any
database, not just a data warehouse.
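
As a rough illustration (a hypothetical catalog, not part of the chapter), the fragment below shows how metadata can map a logical name to a physical table and carry access rules, so applications query by name rather than by physical location.

# Minimal metadata-catalog sketch; all names here are hypothetical.
metadata_catalog = {
    "SalesByRegion": {                      # logical name used by applications
        "physical_table": "dw.sales_fact",  # where the data actually lives
        "columns": {"region": "TEXT", "total_sales": "REAL"},
        "refresh_frequency": "daily",
        "allowed_roles": ["analyst", "manager"],  # security managed via metadata
    }
}

def resolve(logical_name, role):
    """Return the physical table for a logical name if the role is allowed."""
    entry = metadata_catalog[logical_name]
    if role not in entry["allowed_roles"]:
        raise PermissionError(f"{role} may not access {logical_name}")
    return entry["physical_table"]

print(resolve("SalesByRegion", "analyst"))   # -> dw.sales_fact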

Section 3.3

Description of the data warehousing process

The data warehousing process consists of the following steps:


1. Data are imported from various internal and external sources
2. Data are cleansed and organized consistently with the organization’s needs
3. Data are loaded into the enterprise data warehouse or into data marts
4. If desired, data marts are created as subsets of the EDW, or existing data marts are consolidated
into the EDW
5. Analyses are performed as needed

The major components of a data warehouse

 Data sources. Data are sourced from operational systems and possibly from external data
sources.
 Data extraction and transformation. Data are extracted and properly transformed using
custom-written or commercial software called ETL.
 Data loading. Data are loaded into a staging area, where they are transformed and cleansed.
The data are then ready to load into the data warehouse.
 Comprehensive database. This is the EDW that supports decision analysis by providing
relevant summarized and detailed information.
 Metadata. Metadata are maintained for access by IT personnel and users. Metadata include
rules for organizing data summaries that are easy to index and search.
 Middleware tools. Middleware tools enable access to the data warehouse from a variety of
front-end applications.

Middleware Tools and their Role

Middleware tools enable access to the data warehouse. Power users such as analysts may write
their own SQL queries. Others may access data through a managed query environment. There are
many front-end applications that business users can use to interact with data stored in the data
repositories, including data mining, OLAP, reporting tools, and data visualization tools. All of these
have their own data access requirements, which may not match how a given data warehouse must be
accessed. Middleware translates between the two.

Section 3.4 DW Architecture

Three-tier architecture
1. Data acquisition software (back-end)
2. The data warehouse that contains the data & software
3. Client (front-end) software that allows users to access and analyze data

Tier 1: Client workstation
Tier 2: Application server
Tier 3: Database server

Two-tier architecture

Tier 1: Client workstation
Tier 2: Application & database server

Key Similarities and Differences between a Two-Tiered Architecture and a Three-Tiered


Architecture

Both provide the same user visibility through a client system that accesses a DSS/BI application
remotely. The difference is behind the scenes and is invisible to the user: in a two-tiered
architecture, the application and data warehouse reside on the same machine; in a three-tiered
architecture, they are on separate machines.

The Web has also influenced data warehouse design by making Web-based data warehousing
possible.

Web-based data warehousing: the client (Web browser) connects over the Internet, an intranet, or an
extranet to a Web server; the Web server passes requests to an application server, which queries the
data warehouse, and the results are returned to the client as Web pages.

Alternative data warehousing architectures

 Independent data marts architecture


 Data mart bus architecture with linked dimensional data marts
 Hub-and-spoke architecture (corporate information factory)
 Centralized data warehouse architecture
 Federated architecture

(a) Independent Data Marts Architecture: source systems → ETL/staging area → independent data
marts (atomic/summarized data) → end-user access and applications

(b) Data Mart Bus Architecture with Linked Dimensional Data Marts: source systems → ETL/staging
area → dimensionalized data marts linked by conformed dimensions (atomic/summarized data) →
end-user access and applications

(c) Hub-and-Spoke Architecture (Corporate Information Factory): source systems → ETL/staging
area → normalized relational warehouse (atomic data) → dependent data marts (summarized/some
atomic data) → end-user access and applications

ETL (Extract, Transform, and Load) is the process in data warehousing responsible for pulling data
out of the source systems and placing it into a data warehouse.
ETL involves:
- extracting the data from source systems (SAP (System Analysis and Program Development), ERP,
and other operational systems); during extraction, data from different source systems are converted
into one consolidated data warehouse format that is ready for transformation processing;
- transforming the data, which may involve the following tasks (illustrated in the sketch after this list):
applying business rules (so-called derivations, e.g., calculating new measures and dimensions),
cleaning (e.g., mapping NULL to 0 or "Male" to "M" and "Female" to "F"),
filtering (e.g., selecting only certain columns to load),
splitting a column into multiple columns and vice versa,
joining together data from multiple sources (e.g., lookup, merge);
- loading the data into a data warehouse, data repository, or other reporting applications.
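
The following is a minimal sketch of these steps in Python; the source rows, column names, and tax derivation are hypothetical, and sqlite3 stands in for the warehouse.

import sqlite3

def extract():
    # Extract: in practice, read from SAP/ERP exports or operational databases;
    # here, two hard-coded rows stand in for the source system.
    return [
        {"customer_id": "1", "amount": "100.0", "gender": "Male"},
        {"customer_id": "2", "amount": "",      "gender": "Female"},
    ]

def transform(rows):
    out = []
    for r in rows:
        amount = float(r["amount"] or 0)          # cleaning: NULL/empty -> 0
        out.append({
            "customer_id": int(r["customer_id"]),
            "amount": amount,
            "gender": {"Male": "M", "Female": "F"}.get(r["gender"], r["gender"]),  # cleaning
            "amount_with_tax": round(amount * 1.15, 2),  # derivation: a new measure
        })  # filtering: only these columns are kept for loading
    return out

def load(rows, conn):
    # Load: append the transformed rows into the warehouse fact table.
    conn.execute("CREATE TABLE IF NOT EXISTS sales_fact "
                 "(customer_id INT, amount REAL, gender TEXT, amount_with_tax REAL)")
    conn.executemany("INSERT INTO sales_fact VALUES "
                     "(:customer_id, :amount, :gender, :amount_with_tax)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)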


The factors that should be considered when deciding which architecture to use in developing a
data warehouse:

1. Information interdependence between organizational units


2. Upper management’s information needs
3. Urgency of need for a data warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data warehouse prior to implementation
7. Compatibility with existing systems
8. Perceived ability of the in-house IT staff
9. Technical issues
10. Social/political factors

Section 3.5
Data Integration

Data integration is an umbrella term that covers three processes that combine to move data from
multiple sources into a data warehouse: accessing the data, combining different views of the
data, and capturing changes to the data.

Data integration flow: data from packaged applications, legacy systems, and other internal
applications pass through extract, transform, cleanse, and load steps (possibly via a transient data
source) into the data warehouse and data marts.

ETL process:

ETL is the process through which data are loaded into a data warehouse; a DW could not exist
without it. The ETL process also contributes to the quality of the data in a DW. It includes the
following steps:

Extraction: selecting data from one or more sources and reading the selected data.

Transformation: converting data from their original form to whatever form the DW needs. This
step often also includes cleansing of the data to remove as many errors as possible.

Load: putting the converted (transformed) data into the DW.

Section 3.6
Benefits of data warehouses
Direct benefits include:
 Allowing end users to perform extensive analysis in numerous ways.
 A consolidated view of corporate data (i.e., a single version of the truth).
 Better and more timely information. A data warehouse permits information processing to be
offloaded from costly operational systems onto low-cost servers; therefore, end-user
information requests can be processed more quickly.
 Enhanced system performance. A data warehouse frees production processing because some
operational system reporting requirements are moved to DSS.
 Simplification of data access.

Indirect benefits arise when end users take advantage of these direct benefits.

Criteria for selecting a data warehouse vendor

Six important criteria are: financial strength, ERP linkages, qualified consultants, market share,
industry experience, and established partnerships. These are important to indicate that a
vendor is likely to be in business for the long term, to have the support capabilities its customers
need, and to provide products that interoperate with other products the potential user has or may
obtain.

Additional criteria include product functionality (does it do what we need?), vendor strategic vision
(does their direction make sense for our future plans, and is it consistent with industry trends?), and
quality of customer references (what do their existing customers think of them?).

OLAP (online analytical processing) is one of the most commonly used data analysis techniques in
data warehouses. OLAP is an approach to quickly answering ad hoc questions that require data
analysis. Data stored in a data warehouse can be analyzed using techniques referred to as OLAP.

The main operational structure in OLAP is based on a concept called the cube. A cube in OLAP is a
multidimensional data structure (actual or virtual) that allows fast analysis of data.

Using OLAP, an analyst can navigate through the database and screen for a particular subset of the
data (and its progression over time) by changing the data’s orientations and defining analytical
calculations. These types of user-initiated navigation of data through the specification of slices (via
rotations) and drill down/up (via aggregation and disaggregation) are sometimes called “slice and
dice.” Commonly used OLAP operations include slice and dice, drill down, roll up, and pivot; a
short sketch of these operations follows the list below.

 Slice: A slice is a subset of a multidimensional array (usually a two-dimensional


representation) corresponding to a single value set for one (or more) of the dimensions not
in the subset.

 Dice: The dice operation is a slice on more than two dimensions of a data cube.

 Drill Down/Up: Drilling down or up is a specific OLAP technique whereby the user
navigates among levels of data ranging from the most summarized (up) to the most detailed
(down).
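
As a rough illustration of these operations, the sketch below uses pandas on a small hypothetical sales table; the dimension and measure names are made up for the example.

import pandas as pd

sales = pd.DataFrame({
    "year":    [2023, 2023, 2024, 2024, 2024],
    "region":  ["East", "West", "East", "West", "East"],
    "product": ["A", "A", "B", "B", "A"],
    "amount":  [100, 150, 120, 200, 90],
})

# Roll up: aggregate to the year level (most summarized).
by_year = sales.groupby("year")["amount"].sum()

# Drill down: from year to year + region detail.
by_year_region = sales.groupby(["year", "region"])["amount"].sum()

# Slice: fix a single value of one dimension (year = 2024).
slice_2024 = sales[sales["year"] == 2024]

# Dice: restrict several dimensions at once (year = 2024 and region = East).
dice = sales[(sales["year"] == 2024) & (sales["region"] == "East")]

# Pivot: rotate the cube so regions become columns.
pivot = sales.pivot_table(values="amount", index="year",
                          columns="region", aggfunc="sum")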

OLTP (online transaction processing) is concerned with the capture and storage of data.

ROLAP stands for Relational Online Analytical Processing. ROLAP is an alternative to the
MOLAP (Multidimensional OLAP) technology. While both ROLAP and MOLAP analytic tools
are designed to allow analysis of data through the use of a multidimensional data model, ROLAP
differs in that it does not require the pre-computation and storage of information. Instead, ROLAP
tools access the data in a relational database and generate SQL queries to calculate information at
the appropriate level when an end user requests it. MOLAP is an alternative to the ROLAP
technology. MOLAP differs from ROLAP significantly in that it requires the pre-computation and
storage of information in the cube—the operation known as preprocessing. MOLAP stores this data
in an optimized multidimensional array storage, rather than in a relational database.
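
A minimal sketch of the ROLAP approach (hypothetical table and column names): rather than reading a precomputed cube, a SQL aggregation query is generated against the relational warehouse at the moment the user asks for it.

def rolap_query(measures, dimensions, table="sales_fact", filters=None):
    # Build a GROUP BY query at request time instead of precomputing a cube.
    select = ", ".join(dimensions + [f"SUM({m}) AS {m}" for m in measures])
    sql = f"SELECT {select} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    sql += " GROUP BY " + ", ".join(dimensions)
    return sql

# An end user requesting sales by year and region would trigger:
print(rolap_query(["amount"], ["year", "region"], filters=["year >= 2023"]))
# SELECT year, region, SUM(amount) AS amount FROM sales_fact
# WHERE year >= 2023 GROUP BY year, region

A MOLAP tool, by contrast, would answer the same request by looking up cells that were precomputed and stored in the cube during preprocessing.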

HOLAP (Hybrid Online Analytical Processing) is a combination of ROLAP and MOLAP.


HOLAP allows storing part of the data in a MOLAP store and another part of the data in a ROLAP
store. The degree of control that the cube designer has over this partitioning varies from product to
product.

All of these are variations of OLAP.

Section 3.7
Major DW implementation tasks that can be performed in parallel

Reeves (2009) and Solomon (2005) provided some guidelines regarding the critical questions that
must be asked, some risks that should be weighed, and some processes that can be followed to help
ensure a successful data warehouse implementation. They compiled a list of 11 major tasks that
could be performed in parallel:

Establishment of service-level agreements and data-refresh requirements


Identification of data sources and their governance policies
Data quality planning
Data model design
ETL tool selection
Relational database software and platform selection

Data transport
Data conversion
Reconciliation process
Purge and archive planning
End-user support

DW implementation guidelines

 Senior management must support development of the data warehouse. The DW needs a
project champion at a high position in the organization chart. Benefits of a DW project may
be difficult to measure, so management support makes it more likely the project will receive
funding.
 Web-based data warehouses may need special security requirements. These ensure that only
authorized users have access to the data.
 Users should participate in the development process. Their participation is essential for data
modeling and access modeling. User participation ensures that the DW includes the needed
data and that decision makers can retrieve the data they need.
 DW implementation requires certain skills from members of the development team: in-depth
knowledge of database technology and the development tools used.

Risks and issues to consider and potentially avoid when developing a successful data warehouse

 Starting with the wrong sponsorship chain


 Setting expectations that you cannot meet
 Engaging in politically naive behavior
 Loading the data warehouse with information just because it is available
 Believing that data warehousing database design is the same as transactional database design
 Choosing a data warehouse manager who is technology oriented rather than user oriented
 Focusing on traditional internal record-oriented data and ignoring the value of external data
and of text, images, and, perhaps, sound and video
 Delivering data with overlapping and confusing definitions
 Believing promises of performance, capacity, and scalability
 Believing that your problems are over when the data warehouse is up and running
 Focusing on ad hoc data mining and periodic reporting instead of alerts

Scalability refers to the degree to which a system can adjust to changes in demand without major
additional changes or investments. DW scalability issues are the amount of data in the
warehouse, how quickly the warehouse is expected to grow, the number of concurrent users, and
the complexity of user queries. A data warehouse must scale both horizontally and vertically.
The warehouse will grow as a function of data growth and the need to expand the warehouse to
support new business functionality. Data growth may be a result of the addition of current cycle
data (e.g., this month’s results) and/or historical data.

Section 3.8

An RDW (real-time data warehouse) is one in which decision-making data are updated on an
ongoing basis as business transactions occur; it is the same as an active data warehouse (ADW).

The benefits of an RDW

The RDW extends the benefits of data warehousing, in general, down into tactical, and perhaps
operational, decision making. It empowers people who interact directly with customers and
suppliers by providing them with information to make decisions. It can then be extended to
customers and suppliers themselves, thus affecting almost all aspects of customer service, SCM,
logistics, and beyond. It can also facilitate e-business activities, as when sales outlets such as
overstock.com (cited in the text) use historical data to price new close-outs.

Differences between a traditional data warehouse and an RDW

1. A traditional data warehouse (TDW) is used for strategic decisions (and sometimes
tactical); an RDW for strategic and tactical (sometimes operational) ones.
2. The results of using a TDW can be hard to measure; results of using an RDW are
measured by operational data.
3. Acceptable TDW refresh rates range from daily to monthly; RDW data must be up to the
minute.
4. TDW summaries are often appropriate; RDWs must supply detailed data.
5. Small user community at upper organizational levels means a TDW supports few
concurrent users; an RDW must support many, perhaps over a thousand.
6. TDWs typically use restrictive reporting to confirm or check patterns, often predefined
summary tables; RDWs need flexible, ad hoc reporting.
7. TDW user community generally consists of power users, knowledge workers, managers,
other internal users; RDWs are used by operational staff, call centers, perhaps external
users.

The drivers for RDW

 A business often cannot afford to wait a whole day for its operational data to load into the
data warehouse for analysis.
 Traditional data warehouses have captured snapshots of an organization’s fixed states
instead of incremental real-time data showing every state change and almost analogous
patterns over time.
 With a traditional hub-and-spoke architecture, keeping the metadata in sync is difficult. It
is also costly to develop, maintain, and secure many systems as opposed to one large data
warehouse in which data are centralized for BI/BA tools.
 In cases of huge nightly batch loads, the necessary ETL setup and processing power for
large nightly data warehouse loading might be very high, and the processes might take too
long. EAI (enterprise application integration) with real-time data collection can reduce or
eliminate the nightly batch processes (see the sketch after this list).
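
The sketch below illustrates the trickle-load idea behind this last point; the queue, table, and timing are hypothetical, and a real system would use an EAI/CDC feed rather than an in-process queue.

import sqlite3, queue

dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales_fact (order_id INT, amount REAL, load_ts TEXT)")

change_stream = queue.Queue()   # stands in for a real-time feed of transactions

def on_transaction(order_id, amount):
    # Each business transaction is captured as it occurs...
    change_stream.put((order_id, amount))

def micro_batch_load():
    # ...and trickled into the warehouse in small, frequent loads
    # instead of one large nightly batch.
    rows = []
    while not change_stream.empty():
        rows.append(change_stream.get())
    dw.executemany("INSERT INTO sales_fact VALUES (?, ?, datetime('now'))", rows)
    dw.commit()

on_transaction(1001, 250.0)
on_transaction(1002, 75.5)
micro_batch_load()   # run every few seconds or minutes rather than once a night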
Section 3.9
Effective security in a data warehouse should focus on four main areas:

Step 1. Establishing effective corporate and security policies and procedures. An effective security
policy should start at the top and be communicated to everyone in the organization.

Step 2. Implementing logical security procedures and techniques to restrict access. This includes
user authentication, access controls, and encryption (a sketch of a simple access-control check
follows these steps).
Step 3. Limiting physical access to the data center environment.
Step 4. Establishing an effective internal control review process for security and privacy.
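
As a rough sketch of the logical-security step, the fragment below shows a role-based access check placed in front of warehouse queries; the roles and table names are hypothetical.

# Map each role to the warehouse tables it may read.
ROLE_PERMISSIONS = {
    "analyst":     {"sales_fact", "customer_dim"},
    "call_center": {"customer_dim"},
}

def authorize(user_role, table):
    """Allow a query only if the user's role may read the table."""
    if table not in ROLE_PERMISSIONS.get(user_role, set()):
        raise PermissionError(f"role '{user_role}' may not read '{table}'")
    return True

authorize("analyst", "sales_fact")        # permitted
# authorize("call_center", "sales_fact")  # would raise PermissionError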

Skills of a data warehouse administrator (DWA)

 Familiarity with high-performance hardware, software, and networking technologies, since


the data warehouse is based on those
 Solid business insight, to understand the purpose of the DW and its business justification
 Familiarity with business decision-making processes to understand how the DW will be
used
 Excellent communication skills, to communicate with the rest of the organization

Following are some of the recently popularized concepts and technologies that will play a
significant role in defining the future of data warehousing.

Sourcing: acquisition of data from diverse and dispersed sources
 Web, social media, and Big Data
 Open source software
 SaaS (software as a service), “the extended ASP model”
 Cloud computing

Infrastructure: architectural (hardware and software) enhancements
 Columnar databases (a new way to store and access data in the database; see the sketch after
this list)
 Real-time data warehousing
 Data warehouse appliances (all-in-one solutions to DW)
 Data management technologies and practices
 In-database processing technology (putting the algorithms where the data is)
 In-memory storage technology (moving the data into memory for faster processing)
 New database management systems
 Advanced analytics
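
To illustrate the columnar idea, the sketch below stores the same hypothetical table row-wise and column-wise; an analytic query that sums one measure touches a single column in the columnar layout instead of reading every full row.

rows = [  # row-oriented: each record stored together (suits OLTP updates)
    {"order_id": 1, "region": "East", "amount": 100.0},
    {"order_id": 2, "region": "West", "amount": 150.0},
]

columns = {  # column-oriented: each column stored contiguously (suits scans)
    "order_id": [1, 2],
    "region":   ["East", "West"],
    "amount":   [100.0, 150.0],
}

# Summing one measure reads a single column rather than every full row.
total = sum(columns["amount"])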
As the world of business becomes more global and complex, the need for business intelligence and
data warehousing tools becomes more prominent. Fast-improving information technology tools and
techniques seem to be moving in the right direction to address the needs of future business
intelligence systems.
