0% found this document useful (0 votes)

60 views16 pages

Data Warehouse

Uploaded by

naackrmu2023

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

60 views16 pages

Data Warehouse

Uploaded by

naackrmu2023

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 16

What is a Data Warehouse?

A Data Warehouse (DW) is a relational database that is designed for query and analysis rather than
transaction processing. It includes historical data derived from transaction data from single and
multiple sources.

A Data Warehouse provides integrated, enterprise-wide, historical data and focuses on providing
support for decision-makers for data modeling and analysis.

Data Warehouse is a relational database management system (RDBMS) construct to meet the
requirement of transaction processing systems. It can be loosely described as any centralized data
repository which can be queried for business benefits. It is a database that stores information
oriented to satisfy decision-making requests. It is a group of decision support technologies, targets to
enabling the knowledge worker (executive, manager, and analyst) to make superior and higher
decisions. So, Data Warehousing support architectures and tool for business executives to
systematically organize, understand and use their information to make strategic decisions.

Data Warehouse environment contains an extraction, transportation, and loading (ETL) solution, an
online analytical processing (OLAP) engine, customer analysis tools, and other applications that
handle the process of gathering information and delivering it to business users.

● A data warehouse is a database, which is kept separate from the organization's operational
database.
● There is no frequent updating done in a data warehouse.
● It possesses consolidated historical data, which helps the organization to analyze its business.
● A data warehouse helps executives to organize, understand, and use their data to take
strategic decisions.
● Data warehouse systems help in the integration of diversity of application systems.
● A data warehouse system helps in consolidated historical data analysis.

A Data Warehouse can be viewed as a data system with the following attributes:

● It is a database designed for investigative tasks, using data from various applications.

● It supports a relatively small number of clients with relatively long interactions.

● It includes current and historical data to provide a historical perspective of information.

● Its usage is read-intensive.

● It contains a few large tables.

"Data Warehouse is a subject-oriented, integrated, and time-variant store of information in support of

management's decisions."
Data Warehouse Features
The key features of a data warehouse are discussed below −
● Subject Oriented − A data warehouse is subject oriented because it provides information
around a subject rather than the organization's ongoing operations. These subjects can be
product, customers, suppliers, sales, revenue, etc. A data warehouse does not focus on the
ongoing operations, rather it focuses on modelling and analysis of data for decision making.
● Integrated − A data warehouse is constructed by integrating data from heterogeneous sources
such as relational databases, flat files, etc. This integration enhances the effective analysis of
data.
● Time Variant − The data collected in a data warehouse is identified with a particular time
period. The data in a data warehouse provides information from the historical point of view.
● Non-volatile − Non-volatile means the previous data is not erased when new data is added to
it. A data warehouse is kept separate from the operational database and therefore frequent
changes in operational database is not reflected in the data warehouse.

Data Warehouse Applications

As discussed before, a data warehouse helps business executives to organize, analyze, and use
their data for decision making. A data warehouse serves as a sole part of a plan-execute-assess
"closed-loop" feedback system for the enterprise management. Data warehouses are widely used in
the following fields −
● Financial services
● Banking services
● Consumer goods
● Retail sectors
● Controlled manufacturing

Types of Data Warehouse

Information processing, analytical processing, and data mining are the three types of data
warehouse applications that are discussed below −
● Information Processing − A data warehouse allows to process the data stored in it. The data
can be processed by means of querying, basic statistical analysis, reporting using crosstabs,
tables, charts, or graphs.
● Analytical Processing − A data warehouse supports analytical processing of the information
stored in it. The data can be analyzed by means of basic OLAP operations, including
slice-and-dice, drill down, drill up, and pivoting.
● Data Mining − Data mining supports knowledge discovery by finding hidden patterns and
associations, constructing analytical models, performing classification and prediction. These
mining results can be presented using the visualization tools.
Sr.No. Data Warehouse (OLAP) Operational Database(OLTP)

1 It involves historical processing of It involves day-to-day processing.

information.

2 OLAP systems are used by OLTP systems are used by clerks, DBAs, or
knowledge workers such as database professionals.
executives, managers, and
analysts.

3 It is used to analyze the business. It is used to run the business.

4 It focuses on Information out. It focuses on Data in.

5 It is based on Star Schema, It is based on Entity Relationship Model.

Snowflake Schema, and Fact
Constellation Schema.

6 It focuses on Information out. It is application oriented.

7 It contains historical data. It contains current data.

8 It provides summarized and It provides primitive and highly detailed data.

consolidated data.

9 It provides summarized and It provides detailed and flat relational view of data.
multidimensional view of data.

10 The number of users is in hundreds. The number of users is in thousands.

11 The number of records accessed is in The number of records accessed is in tens.
millions.

12 The database size is from 100GB to The database size is from 100 MB to 100 GB.
100 TB.

13 These are highly flexible. It provides high performance.

Need for Data Warehouse

1. 1) Business User: Business users require a data warehouse to view summarized data from the
past. Since these people are non-technical, the data may be presented to them in an
elementary form.

2. 2) Store historical data: Data Warehouse is required to store the time variable data from the
past. This input is made to be used for various purposes.

3. 3) Make strategic decisions: Some strategies may be depending upon the data in the data
warehouse. So, data warehouse contributes to making strategic decisions.

4. 4) For data consistency and quality: Bringing the data from different sources at a
commonplace, the user can effectively undertake to bring the uniformity and consistency in
data.

5. 5) High response time: Data warehouse has to be ready for somewhat unexpected loads and
types of queries, which demands a significant degree of flexibility and quick response time.

Benefits of Data Warehouse

1. Understand business trends and make better forecasting decisions.

2. Data Warehouses are designed to perform well enormous amounts of data.

3. The structure of data warehouses is more accessible for end-users to navigate, understand,
and query.
4. Queries that would be complex in many normalized databases could be easier to build and
maintain in data warehouses.

5. Data warehousing is an efficient method to manage demand for lots of information from lots
of users.

6. Data warehousing provide the capabilities to analyze a large amount of historical data.

Difference between Operational Database and Data

Warehouse

Operational Database Data Warehouse

Operational systems are designed to support Data warehousing systems are typically

high-volume transaction processing. designed to support high-volume analytical

processing (i.e., OLAP).

Operational systems are usually concerned with Data warehousing systems are usually

current data. concerned with historical data.

Data within operational systems are mainly Non-volatile, new data may be added regularly.

updated regularly according to need. Once Added rarely changed.

It is designed for real-time business dealing and It is designed for analysis of business

processes. measures by subject area, categories, and

attributes.
It is optimized for a simple set of transactions, It is optimized for extent loads and high,

generally adding or retrieving a single row at a complex, unpredictable queries that access

time per table. many rows per table.

It is optimized for validation of incoming Loaded with consistent, valid information,

information during transactions, uses validation requires no real-time validation.

data tables.

It supports thousands of concurrent clients. It supports a few concurrent clients relative to

OLTP.

Operational systems are widely process-oriented. Data warehousing systems are widely

subject-oriented

Operational systems are usually optimized to Data warehousing systems are usually

perform fast inserts and updates of associatively optimized to perform fast retrievals of

small volumes of data. relatively high volumes of data.

Data In Data Out

Less Number of data accessed. Large Number of data accessed.

Relational databases are created for on-line Data Warehouse designed for on-line

transactional Processing (OLTP) Analytical Processing (OLAP)

Data Warehouse Architecture

A data warehouse architecture is a method of defining the overall architecture of data communication
processing and presentation that exist for end-clients computing within the enterprise. Each data
warehouse is different, but all are characterized by standard vital components.

Production applications such as payroll accounts payable product purchasing and inventory control
are designed for online transaction processing (OLTP). Such applications gather detailed data from
day to day operations.

Data Warehouse Staging Area is a temporary location where a record from source systems is copied.

Architecture is the proper arrangement of the elements. We build a data warehouse with software and
hardware components. To suit the requirements of our organizations, we arrange these building we
may want to boost up another part with extra tools and services. All of these depends on our
circumstances.
A data warehouses uses a staging area (A place where data is processed before entering the
warehouse).

Types of Data Warehouse Architectures

Single-Tier Architecture

Single-Tier architecture is not periodically used in practice. Its purpose is to minimize the amount of
data stored to reach this goal; it removes data redundancies.

The figure shows the only layer physically available is the source layer. In this method, data
warehouses are virtual. This means that the data warehouse is implemented as a multidimensional
view of operational data created by specific middleware, or an intermediate processing layer.
The vulnerability of this architecture lies in its failure to meet the requirement for separation between
analytical and transactional processing. Analysis queries are agreed to operational data after the
middleware interprets them. In this way, queries affect transactional workloads.

Two-Tier Architecture

The requirement for separation plays an essential role in defining the two-tier architecture for a data
warehouse system, as shown in fig:

Although it is typically called two-layer architecture to highlight a separation between physically

available sources and data warehouses, in fact, consists of four subsequent data flow stages:

1. Source layer: A data warehouse system uses a heterogeneous source of data. That data is
stored initially to corporate relational databases or legacy databases, or it may come from an
information system outside the corporate walls.
1. Data Staging: The data stored to the source should be extracted, cleansed to remove
inconsistencies and fill gaps, and integrated to merge heterogeneous sources into one
standard schema. The so-named Extraction, Transformation, and Loading Tools (ETL) can
combine heterogeneous schemata, extract, transform, cleanse, validate, filter, and load source
data into a data warehouse.

2. Data Warehouse layer: Information is saved to one logically centralized individual repository: a
data warehouse. The data warehouses can be directly accessed, but it can also be used as a
source for creating data marts, which partially replicate data warehouse contents and are
designed for specific enterprise departments. Meta-data repositories store information on
sources, access procedures, data staging, users, data mart schema, and so on.

3. Analysis: In this layer, integrated data is efficiently, and flexible accessed to issue reports,
dynamically analyze information, and simulate hypothetical business scenarios. It should
feature aggregate information navigators, complex query optimizers, and customer-friendly
GUIs.

Three-Tier Architecture

The three-tier architecture consists of the source layer (containing multiple source system), the
reconciled layer and the data warehouse layer (containing both data warehouses and data marts).
The reconciled layer sits between the source data and data warehouse.

The main advantage of the reconciled layer is that it creates a standard reference data model for a
whole enterprise. At the same time, it separates the problems of source data extraction and
integration from those of data warehouse population. In some cases, the reconciled layer is also
directly used to accomplish better some operational tasks, such as producing daily reports that
cannot be satisfactorily prepared using the corporate applications or generating data flows to feed
external processes periodically to benefit from cleaning and integration.

This architecture is especially useful for the extensive, enterprise-wide systems. A disadvantage of
this structure is the extra file storage space used through the extra redundant reconciled layer. It also
makes the analytical tools a little further away from being real-time.
Metadata
Metadata is simply defined as data about data. The data that are used to represent other data is
known as metadata. For example, the index of a book serves as a metadata for the contents in the
book. In other words, we can say that metadata is the summarized data that leads us to the detailed
data.
In terms of data warehouse, we can define metadata as following −
● Metadata is a road-map to data warehouse.
● Metadata in data warehouse defines the warehouse objects.
● Metadata acts as a directory. This directory helps the decision support system to locate the
contents of a data warehouse.

Metadata Repository
Metadata repository is an integral part of a data warehouse system. It contains the following
metadata −
● Business metadata − It contains the data ownership information, business definition, and
changing policies.
● Operational metadata − It includes currency of data and data lineage. Currency of data refers
to the data being active, archived, or purged. Lineage of data means history of data migrated
and transformation applied on it.
● Data for mapping from operational environment to data warehouse − It metadata includes
source databases and their contents, data extraction, data partition, cleaning, transformation
rules, data refresh and purging rules.
● The algorithms for summarization − It includes dimension algorithms, data on granularity,
aggregation, summarizing, etc.
Data Cube
A data cube helps us represent data in multiple dimensions. It is defined by dimensions and facts.
The dimensions are the entities with respect to which an enterprise preserves the records.

Illustration of Data Cube

Suppose a company wants to keep track of sales records with the help of sales data warehouse with
respect to time, item, branch, and location. These dimensions allow to keep track of monthly sales
and at which branch the items were sold. There is a table associated with each dimension. This table
is known as dimension table. For example, "item" dimension table may have attributes such as
item_name, item_type, and item_brand.
The following table represents the 2-D view of Sales Data for a company with respect to time, item,
and location dimensions.

But here in this 2-D table, we have records with respect to time and item only. The sales for New
Delhi are shown with respect to time, and item dimensions according to type of items sold. If we
want to view the sales data with one more dimension, say, the location dimension, then the 3-D view
would be useful. The 3-D view of the sales data with respect to time, item, and location is shown in
the table below −

The above 3-D table can be represented as 3-D data cube as shown in the following figure −
Data Mart
Data marts contain a subset of organization-wide data that is valuable to specific groups of people in
an organization. In other words, a data mart contains only those data that is specific to a particular
group. For example, the marketing data mart may contain only data related to items, customers, and
sales. Data marts are confined to subjects.

Points to Remember About Data Marts

● Windows-based or Unix/Linux-based servers are used to implement data marts. They are
implemented on low-cost servers.
● The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks
rather than months or years.
● The life cycle of data marts may be complex in the long run, if their planning and design are
not organization-wide.
● Data marts are small in size.
● Data marts are customized by department.
● The source of a data mart is departmentally structured data warehouse.
● Data marts are flexible.
The following figure shows a graphical representation of data marts.
OLTP System

OLTP System handle with operational data. Operational data are those data contained in the operation
of a particular system. Example, ATM transactions and Bank transactions, etc.

OLAP System

OLAP handle with Historical Data or Archival Data. Historical data are those data that are achieved
over a long period. For example, if we collect the last 10 years information about flight reservation, the
data can give us much meaningful data such as the trends in the reservation. This may provide useful
information like peak time of travel, what kind of people are traveling in various classes
(Economy/Business) etc.

The major difference between an OLTP and OLAP system is the amount of data analyzed in a single
transaction. Whereas an OLTP manage many concurrent customers and queries touching only an
individual record or limited groups of files at a time. An OLAP system must have the capability to
operate on millions of files to answer a single query.

Feature OLTP OLAP

Characteristic It is a system which is used to It is a system which is used to manage

manage operational Data. informational Data.

Users Clerks, clients, and information Knowledge workers, including managers,

technology professionals. executives, and analysts.

System OLTP system is a OLAP system is market-oriented, knowledge

orientation customer-oriented, transaction, workers including managers, do data analysts

and query processing are done executive and analysts.

by clerks, clients, and

information technology

professionals.

Data contents OLTP system manages current OLAP system manages a large amount of

data that too detailed and are historical data, provides facilitates for

used for decision making. summarization and aggregation, and stores

and manages data at different levels of

granularity. This information makes the data

more comfortable to use in informed decision

making.

Database Size 100 MB-GB 100 GB-TB

Database OLTP system usually uses an OLAP system typically uses either a star or

design entity-relationship (ER) data snowflake model and subject-oriented

model and application-oriented database design.

database design.

View OLTP system focuses primarily OLAP system often spans multiple versions of

on the current data within an a database schema, due to the evolutionary

enterprise or department, process of an organization. OLAP systems

without referring to historical also deal with data that originates from

information or data in different various organizations, integrating information

organizations. from many data stores.

Volume of data Not very large Because of their large volume, OLAP data are

stored on multiple storage media.

Access The access patterns of an OLTP Accesses to OLAP systems are mostly

patterns system subsist mainly of short, read-only methods because of these data

atomic transactions. Such a warehouses stores historical data.

system requires concurrency

control and recovery techniques.

Access mode Read/write Mostly write

Insert and Short and fast inserts and Periodic long-running batch jobs refresh the

Updates updates proposed by end-users. data.

Number of Tens Millions

records

accessed

Normalization Fully Normalized Partially Normalized

Processing Very Fast It depends on the amount of files contained,

Speed batch data refresh, and complex query may

take many hours, and query speed can be

upgraded by creating indexes.

Data Warehouse Unit 1
No ratings yet
Data Warehouse Unit 1
7 pages
Data Mining Final New
No ratings yet
Data Mining Final New
109 pages
DWDM Unit 1
No ratings yet
DWDM Unit 1
24 pages
Data Warehousing Fundamentals
No ratings yet
Data Warehousing Fundamentals
108 pages
Data Warehousing Module 1
No ratings yet
Data Warehousing Module 1
58 pages
Data Warehousing for Decision Makers
No ratings yet
Data Warehousing for Decision Makers
31 pages
Introduction To Data Warehouse Edited
No ratings yet
Introduction To Data Warehouse Edited
34 pages
Introduction To Data Warehouse
No ratings yet
Introduction To Data Warehouse
34 pages
Advance Concept in Data Bases Unit-5 by Arun Pratap Singh
100% (1)
Advance Concept in Data Bases Unit-5 by Arun Pratap Singh
82 pages
Data Mining Complete
No ratings yet
Data Mining Complete
95 pages
What Is Data Warehouse
No ratings yet
What Is Data Warehouse
19 pages
Unit One
No ratings yet
Unit One
41 pages
Data 20warehouse 20week 201 281 29
No ratings yet
Data 20warehouse 20week 201 281 29
27 pages
Data and AI - Data Warehousing
No ratings yet
Data and AI - Data Warehousing
58 pages
Data Warehousing: Understanding A Data Warehouse
No ratings yet
Data Warehousing: Understanding A Data Warehouse
4 pages
Data Warehousing
No ratings yet
Data Warehousing
77 pages
Data Mining 1
No ratings yet
Data Mining 1
41 pages
Data Warehousing
No ratings yet
Data Warehousing
77 pages
Data Warehousing & Mining BCA V SEM
No ratings yet
Data Warehousing & Mining BCA V SEM
107 pages
Data Warehouse
No ratings yet
Data Warehouse
39 pages
Notes DWDM
No ratings yet
Notes DWDM
12 pages
Module 3
No ratings yet
Module 3
17 pages
Introduction To Data Warehousing
No ratings yet
Introduction To Data Warehousing
5 pages
Etl Testing Documentation PDF
No ratings yet
Etl Testing Documentation PDF
22 pages
Data Warehousing Quick Guide
No ratings yet
Data Warehousing Quick Guide
92 pages
Data Warehousing - Quick Guide
No ratings yet
Data Warehousing - Quick Guide
70 pages
Chapter 1
No ratings yet
Chapter 1
22 pages
What Is A Data Warehouse
No ratings yet
What Is A Data Warehouse
34 pages
Data Warehousing - Quick Guide - Tutorialspoint
No ratings yet
Data Warehousing - Quick Guide - Tutorialspoint
67 pages
Data Warehousing Quick Guide
No ratings yet
Data Warehousing Quick Guide
43 pages
Data Warehousing&Data Mining
No ratings yet
Data Warehousing&Data Mining
170 pages
CCS341-Data Warehousing Notes-Unit I
100% (2)
CCS341-Data Warehousing Notes-Unit I
30 pages
DW Unit1
No ratings yet
DW Unit1
26 pages
Data Warehousing for Managers
No ratings yet
Data Warehousing for Managers
26 pages
Eval of Business Performance - Module 1
No ratings yet
Eval of Business Performance - Module 1
8 pages
Unit 1 DWDM Complete
No ratings yet
Unit 1 DWDM Complete
104 pages
Unit 1
No ratings yet
Unit 1
26 pages
Data Warehousing and Data Mining
No ratings yet
Data Warehousing and Data Mining
156 pages
Data Warehouse-Ccs341 Material
No ratings yet
Data Warehouse-Ccs341 Material
58 pages
Data Warehousing Fundamentals
No ratings yet
Data Warehousing Fundamentals
47 pages
Unit 2
No ratings yet
Unit 2
31 pages
Data War Eh Puse
No ratings yet
Data War Eh Puse
51 pages
Data Warehousing
No ratings yet
Data Warehousing
71 pages
DH&DM Unit-1
No ratings yet
DH&DM Unit-1
16 pages
UNITyssu 1 LT
No ratings yet
UNITyssu 1 LT
12 pages
DWDM Unit 1
No ratings yet
DWDM Unit 1
122 pages
Data Warehousing Essentials Guide
No ratings yet
Data Warehousing Essentials Guide
24 pages
Dataware House
No ratings yet
Dataware House
31 pages
Introduction To DW
No ratings yet
Introduction To DW
28 pages
DWM 1
No ratings yet
DWM 1
15 pages
Data Warehouse Tutorial
No ratings yet
Data Warehouse Tutorial
88 pages
Week #2 Data WH (Introduction)
No ratings yet
Week #2 Data WH (Introduction)
15 pages
ARCON PAM ITSM Integration
No ratings yet
ARCON PAM ITSM Integration
9 pages
SQL Code Challenges
No ratings yet
SQL Code Challenges
14 pages
Resume DS
No ratings yet
Resume DS
2 pages
Pexpect Tutorial
No ratings yet
Pexpect Tutorial
143 pages
Chapter 11 - Dynamic-Object-Modeling
No ratings yet
Chapter 11 - Dynamic-Object-Modeling
32 pages
1.2 Informix Database Administration
No ratings yet
1.2 Informix Database Administration
108 pages
Wave Creation in EWM - SAPSPOT
100% (1)
Wave Creation in EWM - SAPSPOT
23 pages
Sandeep Mishra Resume
No ratings yet
Sandeep Mishra Resume
3 pages
Attacking and Defending Active Directory
No ratings yet
Attacking and Defending Active Directory
59 pages
Brian Vanderburg Resume
No ratings yet
Brian Vanderburg Resume
3 pages
Active Directory Federation Services Training
No ratings yet
Active Directory Federation Services Training
3 pages
Using The Pentaho BI Suite Online Demonstration
No ratings yet
Using The Pentaho BI Suite Online Demonstration
14 pages
SAP - BPC - 70 - SP01 - NW - Sec PDF
No ratings yet
SAP - BPC - 70 - SP01 - NW - Sec PDF
40 pages
AspNet Core Documentation PDF
No ratings yet
AspNet Core Documentation PDF
3,087 pages
HR Analytics Dashboard
No ratings yet
HR Analytics Dashboard
1 page
Data Dictionary
100% (1)
Data Dictionary
6 pages
Auditing Quiz for Accounting Students
No ratings yet
Auditing Quiz for Accounting Students
3 pages
OOPS Long
No ratings yet
OOPS Long
10 pages
C# SQL BLOB Handling Guide
No ratings yet
C# SQL BLOB Handling Guide
6 pages
Db2 Faq: Db2 Questions Amit Sethi Page 1 of 49
No ratings yet
Db2 Faq: Db2 Questions Amit Sethi Page 1 of 49
49 pages
QUESTION
No ratings yet
QUESTION
13 pages
Microsoft Access - Exercise: Houston Public Library
No ratings yet
Microsoft Access - Exercise: Houston Public Library
3 pages
Platform Dependency
No ratings yet
Platform Dependency
6 pages
Laboratory Exercise: SAP Business One As An ERP System and Its Basic Navigation
No ratings yet
Laboratory Exercise: SAP Business One As An ERP System and Its Basic Navigation
9 pages
VMware PowerCLI User's Guide
No ratings yet
VMware PowerCLI User's Guide
150 pages
Accessing LR Portal
No ratings yet
Accessing LR Portal
51 pages
Lab 05 Analyzing Types of Attacks and Mitigation Techniques
No ratings yet
Lab 05 Analyzing Types of Attacks and Mitigation Techniques
8 pages
Granularity in The Data Warehouse - Building The Data Warehouse
No ratings yet
Granularity in The Data Warehouse - Building The Data Warehouse
20 pages
Geometry Dash Hackermode Guide
0% (1)
Geometry Dash Hackermode Guide
8 pages
DATABASE of Online Examination System
85% (13)
DATABASE of Online Examination System
3 pages

Data Warehouse

Uploaded by

Data Warehouse

Uploaded by

What is a Data Warehouse?

● It supports a relatively small number of clients with relatively long interactions.

● It includes current and historical data to provide a historical perspective of information.

● Its usage is read-intensive.

● It contains a few large tables.

"Data Warehouse is a subject-oriented, integrated, and time-variant store of information in support of

Data Warehouse Applications

Types of Data Warehouse

1 It involves historical processing of It involves day-to-day processing.

3 It is used to analyze the business. It is used to run the business.

4 It focuses on Information out. It focuses on Data in.

5 It is based on Star Schema, It is based on Entity Relationship Model.

6 It focuses on Information out. It is application oriented.

7 It contains historical data. It contains current data.

8 It provides summarized and It provides primitive and highly detailed data.

10 The number of users is in hundreds. The number of users is in thousands.

13 These are highly flexible. It provides high performance.

Need for Data Warehouse

Benefits of Data Warehouse

1. Understand business trends and make better forecasting decisions.

2. Data Warehouses are designed to perform well enormous amounts of data.

Difference between Operational Database and Data

Operational Database Data Warehouse

high-volume transaction processing. designed to support high-volume analytical

processing (i.e., OLAP).

current data. concerned with historical data.

updated regularly according to need. Once Added rarely changed.

processes. measures by subject area, categories, and

time per table. many rows per table.

It is optimized for validation of incoming Loaded with consistent, valid information,

information during transactions, uses validation requires no real-time validation.

It supports thousands of concurrent clients. It supports a few concurrent clients relative to

small volumes of data. relatively high volumes of data.

Data In Data Out

Less Number of data accessed. Large Number of data accessed.

transactional Processing (OLTP) Analytical Processing (OLAP)

Data Warehouse Architecture

Types of Data Warehouse Architectures

Although it is typically called two-layer architecture to highlight a separation between physically

Illustration of Data Cube

Points to Remember About Data Marts

Feature OLTP OLAP

Characteristic It is a system which is used to It is a system which is used to manage

manage operational Data. informational Data.

Users Clerks, clients, and information Knowledge workers, including managers,

technology professionals. executives, and analysts.

System OLTP system is a OLAP system is market-oriented, knowledge

orientation customer-oriented, transaction, workers including managers, do data analysts

and query processing are done executive and analysts.

by clerks, clients, and

used for decision making. summarization and aggregation, and stores

and manages data at different levels of

granularity. This information makes the data

more comfortable to use in informed decision

Database Size 100 MB-GB 100 GB-TB

design entity-relationship (ER) data snowflake model and subject-oriented

model and application-oriented database design.

on the current data within an a database schema, due to the evolutionary

enterprise or department, process of an organization. OLAP systems

information or data in different various organizations, integrating information

organizations. from many data stores.

stored on multiple storage media.

atomic transactions. Such a warehouses stores historical data.

system requires concurrency

control and recovery techniques.

Access mode Read/write Mostly write

Updates updates proposed by end-users. data.

Number of Tens Millions

Normalization Fully Normalized Partially Normalized

Processing Very Fast It depends on the amount of files contained,

Speed batch data refresh, and complex query may

take many hours, and query speed can be

upgraded by creating indexes.

You might also like