0% found this document useful (0 votes)

59 views16 pages

Dwbi Unit-1

The document provides an overview of data warehousing and business intelligence, defining key concepts such as data, data warehouses, and their benefits. It distinguishes between operational databases and data warehouses, highlighting their different functions, structures, and user requirements. Additionally, it covers OLAP operations, data cube classifications, and modeling techniques essential for effective data analysis and decision-making.

Uploaded by

yashusai567

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

59 views16 pages

Dwbi Unit-1

Uploaded by

yashusai567

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 16

R-22

DATA WAREHOUSING AND BUSINESS INTELLIGENCE

UNIT-I

UNIT - I : Data Warehouse, Data Warehouse Modelling, OLAP operations, Data Cube Computation
methods.

What is Data?

Data is a collection of information gathered by observations, measurements, research or analysis. They

may consist of facts, numbers, names, figures or even description of things. Data is organized in the form
of graphs, charts or tables. There exist data scientist who does data mining and with the help of that data
analyse our world.

You must have watched the weather forecast reports on news channels. They list the minimum
temperature, the maximum temperature, rainfall predictions and measurements. The tabular
representation is indicated below.

Classification of Data

Data is classified into

 Qualitative: It describes the quality of something or someone. It is descriptive information. For

example, the skin colour, eye colour, hair texture, etc. gives us the qualitative information about a
person.

 Quantitative: It provides numerical information. Example, the height and weight of a person.

Data Warehouse Definition :

A Data Warehouse (DW) is a relational database that is designed for query and analysis rather
than transaction processing. It includes historical data derived from transaction data from single
1
R-22

and multiple sources.

A Data Warehouse provides integrated, enterprise-wide, historical data and focuses on providing
support for decision-makers for data modeling and analysis.

A Data Warehouse is a group of data specific to the entire organization, not only to a particular group
of users.

It is not used for daily operations and transaction processing but used for making decisions

Subject-Oriented :

A data warehouse target on the modeling and analysis of data for decision-makers. Therefore,
data warehouses typically provide a concise and straightforward view around a particular
subject, such as customer, product, or sales, instead of the global organization's ongoing
operations. This is done by excluding data that are not useful concerning the subject and
including all data needed by the users to understand the subject.

Integrated :

A data warehouse integrates various heterogeneous data sources like RDBMS, flat files, and
online transaction records. It requires performing data cleaning and integration during data
warehousing to ensure consistency in naming conventions, attributes types, etc., among
different data sources

Time-Variant :

Historical information is kept in a data warehouse. For example, one can retrieve files from 3
2
R-22

months, 6 months, 12 months, or even previous data from a data warehouse. These
variations with a transactions system, where often only the most current file is kept

Non-Volatile:

The data warehouse is a physically separate data storage, which is transformed from the
source operational RDBMS. The operational updates of data do not occur in the data
warehouse, i.e., update, insert, and delete operations are not performed. It usually requires
only two procedures in data accessing: Initial loading of data and access to data.

Benefits of Data Warehouse :

1. Understand business trends and make better forecasting decisions.

2. Data Warehouses are designed to perform well enormous amounts of data.

3. The structure of data warehouses is more accessible for end-users to navigate,

understand, and query.

4. Queries that would be complex in many normalized databases could be easier to build
and maintain in data warehouses.

5. Data warehousing is an efficient method to manage demand for lots of information from
lots of users.

6. Data warehousing provide the capabilities to analyze a large amount of historical data.

Need for Data Warehouse :

3
R-22

1. Business User: Business users require a data warehouse to view summarized data from the
past. Since these people are non-technical, the data may be presented to them in an
elementary form.

2. Store historical data: Data Warehouse is required to store the time variable data from the
past. This input is made to be used for various purposes.

3. Make strategic decisions: Some strategies may be depending upon the data in the
data warehouse. So, data warehouse contributes to making strategic decisions.

4. For data consistency and quality: Bringing the data from different sources at a
commonplace, the user can effectively undertake to bring the uniformity and consistency in
data.

5. High response time: Data warehouse has to be ready for somewhat unexpected loads and
types of queries, which demands a significant degree of flexibility and quick response time.

Database Vs Datawarehouse

Feature Database Data Warehouse

Store and manage data for day-to- Analyze historical data for
Primary Function
day operations trends and insights

Optimized for fast retrieval and Optimized for complex

Data
modification (CRUD – Create, Read, queries and analysis (OLAP –
Structure
Update, Delete) Online Analytical Processing)

Primarily historical and

Data
Primarily current data integrated data from various
Currency
sources

4
R-22

Often denormalized to
Highly normalized to minimize
Schema improve query performance
redundancy
for analysis

Frequent updates as transactions Periodic updates (batch

Updates
occur processing)

Operational applications, individual Business analysts, data

Users
users scientists, executives

Focuses on data governance

Focuses on data integrity and access
Security and access control for
control for specific users
analytical purposes

More complex to design,

implement, and maintain
Complexity Simpler to design and manage
due to data integration and
transformation

Higher cost due to larger

Lower cost due to smaller size and
Cost storage requirements and
simpler infrastructure
processing power

Data Warehouse Modeling

Data warehouse modeling is the process of designing the schemas of the detailed and
summarized information of the data warehouse. The goal of data warehouse modeling is to
develop a schema describing the reality, or at least a part of the fact, which the data warehouse
is needed to support.

Data warehouse modeling is an essential stage of building a data warehouse for two main
5
R-22

reasons. Firstly, through the schema, data warehouse clients can visualize the relationships
among the warehouse data, to use them with greater ease. Secondly, a well-designed schema
allows an effective data warehouse structure to emerge, to help decrease the cost of
implementing the warehouse and improve the efficiency of using it.

Data modeling in data warehouses is different from data modeling in operational database
systems. The primary function of data warehouses is to support DSS processes. Thus, the
objective of data warehouse modeling is to make the data warehouse efficiently support
complex queries on long term information.

In contrast, data modeling in operational database systems targets efficiently supporting simple
transactions in the database such as retrieving, inserting, deleting, and changing data. Moreover,
data warehouses are designed for the customer with general information knowledge about the
enterprise, whereas operational database systems are more oriented toward use by software
specialists for creating distinct applications.

6
R-22

o Reflects the most current happenings, which are commonly the most stimulating.

o It is numerous as it is saved at the lowest method of the Granularity.

o It is always (almost) saved on disk storage, which is fast to access but expensive

and difficult to manage.

Older detail data is stored in some form of mass storage, and it is infrequently accessed
and kept at a level detail consistent with current detailed data.

Lightly summarized data is data extract from the low level of detail found at the
current, detailed level and usually is stored on disk storage. When building the data
warehouse have

to remember what unit of time is summarization done over and also the components or
what attributes the summarized data will contain.

Highly summarized data is compact and directly available and can even be found outside
the warehouse.

Difference between OLTP and OLAP

Below is the difference between OLAP and OLTP in Data Warehouse:

Para
meter OLTP OLAP
s

It is an online transactional
Proce OLAP is an online analysis
system. It manages database
ss and data retrieving process.
modification.

Chara It is characterized by large

It is characterized by a large
cterist numbers of short online
volume of data.
ic transactions.

7
R-22

Para
meter OLTP OLAP
s

Functi
OLTP is an online database OLAP is an online database query
onalit
modifying system. management system.
y

OLTP uses
Meth OLAP uses the data
traditional
od warehouse.
DBMS.

Insert, Update, and Delete Mostly select

Query
information from the database. operations

Tables in OLTP database Tables in OLAP database are

Table
are normalized. not normalized.

Sourc OLTP and its transactions are the Different OLTP databases become
e sources of data. the source of data for OLAP.

Data OLAP database does not get

OLTP database must maintain data
Integr frequently modified. Hence,
integrity constraint.
ity data integrity is not an issue.

Respons It’s response time is in Response time in

e time millisecond. seconds to minutes.

Data The data in the OLTP database

The data in OLAP process might
qualit is always detailed and
not be organized.
y organized.

8
R-22

Para
meter OLTP OLAP
s

Useful It helps to control and run It helps with planning, problem-

ness fundamental business tasks. solving, and decision support.

Opera Only read and rarely

Allow read/write operations.
tion write.

OLAP operations :

OLAP stands for Online Analytical Processing (OLAP) could be a innovation that’s utilized to
organize expansive business databases and back business intelligence. OLAP databases are
separated into one or more cubes, and each cube is organized and designed by a cube
administrator to fit the way simply recover and analyze data so that it is less demanding to
form and utilize the PivotTable reports and PivotChart reports that and just require.

OLAP operations:

There are five basic analytical operations that can be performed on an OLAP cube:

Drill down: In drill-down operation, the less detailed data is converted into highly detailed
data. It can be done by:
 Moving down in the concept hierarchy
 Adding a new dimension
In the cube given in overview section, the drill down operation is performed by moving down in
the concept hierarchy of Time dimension (Quarter -> Month).

9
R-22

Roll up: It is just opposite of the drill-down operation. It performs aggregation on the OLAP cube. It
can be done by:
 Climbing up in the concept hierarchy
 Reducing the dimensions
In the cube given in the overview section, the roll-up operation is performed by climbing up in the
concept hierarchy of Location dimension (City -> Country)

Dice: It selects a sub-cube from the OLAP cube by selecting two or more dimensions. In the cube
given in the overview section, a sub-cube is selected by selecting following dimensions with criteria:
 Location = “Delhi” or “Kolkata”
 Time = “Q1” or “Q2”
 Item = “Car” or “Bus”

10
R-22

Slice: It selects a single dimension from the OLAP cube which results in a new sub-cube creation. In
the cube given in the overview section, Slice is performed on the dimension Time = “Q1”

Pivot: It is also known as rotation operation as it rotates the current view to get a new view of the
representation. In the sub-cube obtained after the slice operation, performing pivot operation
gives a new view of it.

What is a data cube?

A data cube is a multidimensional representation of data intended to facilitate effortless retrieval and
11
R-22

structured analysis of the underlying data. When organized in a cube rather than a network of relational tables,
it becomes easier for the user to establish relationships between data that could otherwise be challenging to
figure out. This directly results in enhanced in-depth analysis and advanced drill down. Every face of the cube
can be programmed to represent a particular category and users can pivot the cube to look at the same data
from a unique perspective.

While data cubes can exist as a simple representation of data, without any extensive capabilities to analyze
large volumes, OLAP data cubes are particularly valuable for complex data analysis, including business
intelligence as they provide a comprehensive view of information across different dimensions, such as time,
products, locations, or customer segments. For example, if you are looking at a sales data cube, different
dimensions can show you data by year, product category, locations, customers, etc. So, whenever we mention a
data cube from here on, we will be referring to an OLAP model.
What are data cube classifications?
OLAP emerged as a response to the limitations of relational databases for analytical and multidimensional data
processing. OLAP databases are optimized for complex queries, multidimensional analysis, and fast retrieval of
processed data. They allow users to interact with data from different angles and hierarchies. OLAP has
developed into three major classifications:
 ROLAP: Relational OLAP stores and processes data in a relational database. ROLAP servers
typically query the relational database to generate reports and analyze data. Although scalable to large
volumes of data, ROLAP can encounter scalability issues after a certain point due to its dependency on
the underlying database. It integrates well with existing relational databases which makes it easier to
implement and maintain.
 MOLAP: Multidimensional OLAP stores and processes data in a multidimensional format
conceptualized like a cube. This structure makes it easier to establish relationships between data points
and optimize the data for complex analytical queries. MOLAP cubes are pre-calculated and stored in a
separate database from the source data. With many operations done beforehand, MOLAP does not
require as much processing power from the relational database server.
 HOLAP: Hybrid OLAP combines the best features of ROLAP and MOLAP. HOLAP stores some data
in a relational database and some data in the multidimension format. Therefore, HOLAP provides both
the scalability of ROLAP and the performance of MOLAP for organizations with large and complex data
sets.
12
R-22

What are the data cube operations?

Data cubes support various operations that allow users to examine and analyze data from different
perspectives. Here is an overview of some key data cube operations:
 Roll-up: This operation adds up all the data from a category and presents it as a singular record.
It is like zooming out of the cube and looking at the data from a broader perspective.
 Drill-down: While trying to access a transaction on the point-of-access, users need to descend
into a dimension hierarchy. For instance, drilling down on the product dimension in the sales data
cube would provide detailed sales figures for each product within each region.
 Slicing: When users want to focus on a specific set of facts from a particular dimension, they can
filter the data to focus on that subset. Slicing a sales data cube to focus on “Electronics” would restrict
the data to sales of electronic products only.
 Dicing: Breaking the data into multiple slices from a data cube can isolate a particular
combination of factors for analysis. By selecting a subset of values from each dimension, the user can
focus on the point where the two dimensions intersect each other. For example, dicing the product
dimension to “Electronics” and the region dimension to “Asia” would restrict the data to sales of
electronic products in the Asian region.
 Pivoting: Pivoting means rotating the cube to view the data from a unique perspective or
reorienting analysis to focus on a different aspect. Pivoting the sales data cube to swap the product and
region dimensions would shift the focus from sales by product to sales by region.
Data Cube Computation methods :

1) Materialization of Cube: Full, Iceberg, Closed and Shell Cubes

A three-dimensional data cube with dimensions A, B, and C and an aggregate measure M. And can
think of a data cube as a lattice of cuboids. Each cube is meant to symbolize a group. The fundamental
cuboid, encompassing all three dimensions, is the ABC.

The aggregate measure (M) is calculated for each permutation of the three dimensions. There are six
different cuboids that make up a data cube, with the base cuboid being the most specific. The apex
cuboid is the most generalized cuboid. It stores a single number - the sum of all the tuples' measures in
the base cuboid's measure M. From the topmost cuboid of the data cube example, we can descend
into the lattice to access deeper levels of information.

13
R-22

Base Cell In Data Cube

The term "base cell" refers to a cell in the base cuboid. Aggregate cells are cells that are not
based on a cube. Each dimension that is aggregated in an aggregate cell is represented by a ""
in the cell notation. Let's pretend we're working with an n-dimensional data cube.

Let each cell of the cuboids that make up the data cube be denoted by a = (a1, a2,..., an,
measurements). If there are m (m n) values of a, b, c, d, e, f, g, h, I j, k, l, m, n, and o that are
not "," then we say that an is an m-dimensional cell (that is, from an m- A is a base cell if and
only if m = n; otherwise, it is an aggregate cell (where m n).

Cuboids In Data Cube

As an intriguing compromise between storage requirements and response times for OLAP, data
cubes that are partially materialized are a viable option. We cannot compute the entire data cube,
but rather only parts of it, called cuboids, each of which is composed of a subset of the cells in the full
cube.

Minimum Support And Iceberg In Data Cube

Not only does this result in a more effective utilization of resources (namely, CPU time and disc
space), but it also makes it possible to conduct more accurate analysis. There is a good chance that
the non-passing cells are not important enough to warrant further investigation.

Cubes that only partially materialize are referred to as iceberg cubes, and this phrase is used to
characterize such cubes. The term "minimum support," also abbreviated as "min sup" for short,
describes the criteria that are the absolute minimum acceptable. It is common practise to refer to the
effect of materializing only a fraction of the cells in a data cube as the "tip of the iceberg." In this
14
R-22

context, "iceberg" refers to the entire cube including all cells.

2) Roll-up/Drill-down- This method involves aggregating data along one or more dimensions to create a
summary of the dataset. It can be used to drill-down into specific areas of interest within the data.
Roll-up/Drill-down is useful for quickly summarizing large datasets into manageable chunks while still
maintaining important information about each dimension.

For example, if and have sales data for multiple products across several regions, and could use roll-
up/drill-down to see total sales across all regions or drill-down into sales numbers for one particular
product in one region.

3) Slice-and-Dice - This method involves selecting subsets of data based on certain criteria and then
analyzing it using different dimensions. It is useful for identifying patterns that may not be
immediately apparent when looking at the entire dataset.

Slice-and-Dice allows users to select subsets of data based on specific criteria such as time period
or customer demographics which can then be analyzed using different dimensions like product
categories or geographic locations. This helps identify patterns that may not be immediately apparent
when looking at the entire dataset.

4) Grouping Sets - This method involves grouping data by multiple dimensions at once, allowing for
more complex analysis of the dataset. Grouping Sets are useful when analyzing large datasets with
multiple dimensions where users want to group by two or more dimensions at once. For example,
grouping sets could show total revenue broken down by both product category and region
simultaneously.

5) Online Analytical Processing (OLAP) - This method uses a multidimensional database to store and
analyze large amounts of data. It allows for quick querying and analysis of the data in different
ways.OLAP databases are specifically designed for analyzing large amounts of multi-dimensional data
quickly through pre-aggregated values stored in memory making it ideal for real-time decision-
making scenarios like stock market analysis.

6) SQL Queries - SQL queries can be used to compute data cubes by selecting specific columns and
aggregating them based on certain criteria. This is a flexible method that can be customized based on
the needs of the user.
15
R-22

***

Unit 2
No ratings yet
Unit 2
31 pages
DWM Unit I
No ratings yet
DWM Unit I
114 pages
DBMS II Seven 7
No ratings yet
DBMS II Seven 7
13 pages
Unit-1.1 Data Warehouse
No ratings yet
Unit-1.1 Data Warehouse
29 pages
Data Warehousing for Business Analysts
100% (1)
Data Warehousing for Business Analysts
28 pages
Data Mining Final New
No ratings yet
Data Mining Final New
109 pages
Data Warehousing for Decision Makers
No ratings yet
Data Warehousing for Decision Makers
31 pages
Data Mining Complete
No ratings yet
Data Mining Complete
95 pages
DATA WAREHOUSE Basic Concepts
No ratings yet
DATA WAREHOUSE Basic Concepts
26 pages
Lesson 2. Data Warehouse Basic Concepts
No ratings yet
Lesson 2. Data Warehouse Basic Concepts
18 pages
Unit 1 DWDM
No ratings yet
Unit 1 DWDM
122 pages
Lect 14 DM
No ratings yet
Lect 14 DM
33 pages
Unit - 1 Introduction To Data Warehousing
No ratings yet
Unit - 1 Introduction To Data Warehousing
57 pages
Introduction To Data Warehouse
No ratings yet
Introduction To Data Warehouse
34 pages
Data Warehouse
No ratings yet
Data Warehouse
16 pages
Introduction To Data Warehouse Edited
No ratings yet
Introduction To Data Warehouse Edited
34 pages
Data Mining Unit-2 Notes
No ratings yet
Data Mining Unit-2 Notes
8 pages
Data Warehousing Fundamentals
No ratings yet
Data Warehousing Fundamentals
108 pages
Advance Concept in Data Bases Unit-5 by Arun Pratap Singh
100% (1)
Advance Concept in Data Bases Unit-5 by Arun Pratap Singh
82 pages
Dataware Housing Notes
No ratings yet
Dataware Housing Notes
134 pages
Final Interview Questions (Etl - Informatica) : Subject Oriented, Integrated, Time Variant, Non Volatile
100% (1)
Final Interview Questions (Etl - Informatica) : Subject Oriented, Integrated, Time Variant, Non Volatile
77 pages
Data Warehouse and Mining-1
No ratings yet
Data Warehouse and Mining-1
40 pages
What Is Data Warehouse
No ratings yet
What Is Data Warehouse
19 pages
Data Warehousing & Mining BCA V SEM
No ratings yet
Data Warehousing & Mining BCA V SEM
107 pages
DATA Ware House Mining NOTES
No ratings yet
DATA Ware House Mining NOTES
31 pages
DATA Ware House & Mining NOTES
100% (2)
DATA Ware House & Mining NOTES
31 pages
Dataware House
No ratings yet
Dataware House
31 pages
Data Warehousing for Managers
No ratings yet
Data Warehousing for Managers
26 pages
Term 1
No ratings yet
Term 1
12 pages
Data Warehousing Module 1
No ratings yet
Data Warehousing Module 1
58 pages
Data Warehousing
No ratings yet
Data Warehousing
77 pages
Etl 1
No ratings yet
Etl 1
8 pages
Data Warehousing&Data Mining
No ratings yet
Data Warehousing&Data Mining
170 pages
Module 1-1basic Concepts
No ratings yet
Module 1-1basic Concepts
40 pages
Warehousing
No ratings yet
Warehousing
15 pages
Unit I
No ratings yet
Unit I
33 pages
Data Mining UNIT 2 LECTURE NOTES
No ratings yet
Data Mining UNIT 2 LECTURE NOTES
32 pages
Data and AI - Data Warehousing
No ratings yet
Data and AI - Data Warehousing
58 pages
Data Warehouse
No ratings yet
Data Warehouse
143 pages
DW Unit1
No ratings yet
DW Unit1
26 pages
DWDM Unit 1
No ratings yet
DWDM Unit 1
24 pages
DWDM Lecturenotes PDF
No ratings yet
DWDM Lecturenotes PDF
133 pages
Data Warehouse-Ccs341 Material
No ratings yet
Data Warehouse-Ccs341 Material
58 pages
Data Warehousing: Understanding A Data Warehouse
No ratings yet
Data Warehousing: Understanding A Data Warehouse
4 pages
DWDM Unit 1
No ratings yet
DWDM Unit 1
122 pages
Module 3
No ratings yet
Module 3
17 pages
Data Warehouse
No ratings yet
Data Warehouse
39 pages
Data Warehouse and Data Sources
No ratings yet
Data Warehouse and Data Sources
18 pages
Eval of Business Performance - Module 1
No ratings yet
Eval of Business Performance - Module 1
8 pages
Data Warehouse References
No ratings yet
Data Warehouse References
40 pages
Assignment (1) Muhammad Uzair Class No (260) Sec (E) Reg No (39626) Distributed Database System
No ratings yet
Assignment (1) Muhammad Uzair Class No (260) Sec (E) Reg No (39626) Distributed Database System
7 pages
Figma Shortcuts
No ratings yet
Figma Shortcuts
3 pages
Python Task Rivan Solutions 202204221557441
No ratings yet
Python Task Rivan Solutions 202204221557441
2 pages
CRUD en JDeveloper PDF
No ratings yet
CRUD en JDeveloper PDF
7 pages
DBMS Transaction Management Guide
No ratings yet
DBMS Transaction Management Guide
9 pages
Real-Time Data Visualization
No ratings yet
Real-Time Data Visualization
2 pages
Cryptography - Module 1
No ratings yet
Cryptography - Module 1
4 pages
Ankur Mittal: SAP & Business Analyst Profile
No ratings yet
Ankur Mittal: SAP & Business Analyst Profile
1 page
6188-Article Text-24244-1-10-20221203
No ratings yet
6188-Article Text-24244-1-10-20221203
10 pages
Disk Management MCQs & Scheduling
No ratings yet
Disk Management MCQs & Scheduling
5 pages
Info Systems Midterm Prep Guide
No ratings yet
Info Systems Midterm Prep Guide
9 pages
CV-EN-Antouan July 2024
No ratings yet
CV-EN-Antouan July 2024
5 pages
1 - FTTX Technology Overview
No ratings yet
1 - FTTX Technology Overview
82 pages
SQL Tutorial: SELECT Statement Basics
No ratings yet
SQL Tutorial: SELECT Statement Basics
10 pages
300+ TOP COMPUTER NETWORKS Multiple Choice Questions and Answers 2023
No ratings yet
300+ TOP COMPUTER NETWORKS Multiple Choice Questions and Answers 2023
23 pages
Digital Encoding
No ratings yet
Digital Encoding
11 pages
On Holy Wars and A Plea For Peace - by Danny Cohen
No ratings yet
On Holy Wars and A Plea For Peace - by Danny Cohen
17 pages
Master Project PTP 1588
No ratings yet
Master Project PTP 1588
56 pages
Syllabus For CSC 578 - Neural Networks and Deep Learning, Spring 2021
No ratings yet
Syllabus For CSC 578 - Neural Networks and Deep Learning, Spring 2021
3 pages
Structured Programming
No ratings yet
Structured Programming
4 pages
DSO201 manual英文说明书
No ratings yet
DSO201 manual英文说明书
6 pages
Office 365 User Guide
100% (1)
Office 365 User Guide
34 pages
MicroPython ESP32 Starter Guide
No ratings yet
MicroPython ESP32 Starter Guide
55 pages
Pre Interview File
No ratings yet
Pre Interview File
11 pages
Under Budget Tourism - PPT
No ratings yet
Under Budget Tourism - PPT
21 pages
Hcic 2021
No ratings yet
Hcic 2021
20 pages
Print - Udyam Registration Certificate3
No ratings yet
Print - Udyam Registration Certificate3
4 pages
Disability Issues While Teaching and Using Computer: JK Tushit Vijay
No ratings yet
Disability Issues While Teaching and Using Computer: JK Tushit Vijay
6 pages
Grade-11-Q1-W3.a-Access-Information For Teacher
No ratings yet
Grade-11-Q1-W3.a-Access-Information For Teacher
21 pages
Memory and I/O System Overview
No ratings yet
Memory and I/O System Overview
89 pages
Founder of Cdac
No ratings yet
Founder of Cdac
8 pages

Dwbi Unit-1

Uploaded by

Dwbi Unit-1

Uploaded by

R-22

DATA WAREHOUSING AND BUSINESS INTELLIGENCE

Data is a collection of information gathered by observations, measurements, research or analysis. They

Data is classified into

 Qualitative: It describes the quality of something or someone. It is descriptive information. For

Data Warehouse Definition :

and multiple sources.

Benefits of Data Warehouse :

1. Understand business trends and make better forecasting decisions.

2. Data Warehouses are designed to perform well enormous amounts of data.

3. The structure of data warehouses is more accessible for end-users to navigate,

Need for Data Warehouse :

Feature Database Data Warehouse

Optimized for fast retrieval and Optimized for complex

Primarily historical and

Frequent updates as transactions Periodic updates (batch

Operational applications, individual Business analysts, data

Focuses on data governance

More complex to design,

Higher cost due to larger

Data Warehouse Modeling

o It is numerous as it is saved at the lowest method of the Granularity.

and difficult to manage.

Difference between OLTP and OLAP

Below is the difference between OLAP and OLTP in Data Warehouse:

Chara It is characterized by large

Insert, Update, and Delete Mostly select

Tables in OLTP database Tables in OLAP database are

Data OLAP database does not get

Respons It’s response time is in Response time in

Data The data in the OLTP database

Useful It helps to control and run It helps with planning, problem-

Opera Only read and rarely

What is a data cube?

What are the data cube operations?

1) Materialization of Cube: Full, Iceberg, Closed and Shell Cubes

Base Cell In Data Cube

Cuboids In Data Cube

Minimum Support And Iceberg In Data Cube

context, "iceberg" refers to the entire cube including all cells.

You might also like