Q.
Explain Data Warehouse 3-tier Architecture in detail
Data warehousing: Data warehousing is the process of collecting, organizing, and managing large
volumes of data from various sources to provide meaningful business insights.
A data warehouse is a centralized repository that allows organizations to store integrated data from
multiple sources.
Data warehouses store historical data, allowing organizations to analyze trends and patterns over
time. The data warehouse stores data in a structured format, typically using tables and schemas.
• The Three-Tier Data Warehouse Architecture is the commonly used Data Warehouse design
in order to build a Data Warehouse.
• The 3-tier architecture of a data warehouse organizes the system into three layers. Each
layer serves a specific purpose.
• Following are the three tiers of the data warehouse architecture.
1. Bottom Tier
2. Middle Tier
3. Top Tier
1. Bottom Tier: Data Sources and ETL Process
This is the foundational layer where raw data is gathered, processed, and stored.
• Data Sources:
o Data comes from various sources, such as operational databases, transactional
systems, external sources , and flat files (CSV, Excel).
o Data from these sources is typically extracted using ETL (Extract, Transform, Load)
tools.
• ETL Process:
o Extract: Data is extracted from multiple heterogeneous sources.
o Transform: Data is cleaned, formatted, and transformed to ensure consistency and
quality.
o Load: Transformed data is stored in the data warehouse database.
• Storage:
o The data is stored in a data warehouse or a data mart.
o Metadata repository: Stores schema, data , and other technical details.
2. Middle Tier: OLAP Server
• The Middle Tier of a 3-tier data warehouse architecture acts as the data processing layer
• This layer processes and organizes the data to provide faster access for analysis.
• It Converts raw data into a format suitable for querying and analysis.
• The heart of this layer is the OLAP Server (Online Analytical Processing), which provides the
tools and mechanisms for multidimensional data analysis.
• OLAP Server (Online Analytical Processing):
o Supports operations like slicing, dicing, drilling up/down, and pivoting.
Slicing: Extracting a subset of data for a specific dimension.
Dicing: Creating a smaller subset of data by filtering across multiple dimensions.
Drilling Up/Down: Navigating between levels of data granularity (e.g., from year to
month or vice versa).
Pivoting: Rotating the data to view it from different perspectives.
• Data Cubes:
o Data is organized into cubes for fast querying and analytics.
o Example: A sales cube might have dimensions like time, location, and product.
• Query handling
o Processes queries and performs complex computations, aggregations to provide
faster results to users.
• Middleware:
o Acts as an intermediate between the bottom-tier data warehouse and top-tier
applications.
OLAP Server: Types
• ROLAP (Relational OLAP): Uses relational databases to store data and performs complex
queries on them.
• MOLAP (Multidimensional OLAP): Stores data in a multidimensional cube format. Offers
faster performance for querying
• HOLAP (Hybrid OLAP): Combines both ROLAP and MOLAP approaches. Uses both relational
database for and multidimensional storage.
3. Top Tier:
The Top Tier of a 3-tier data warehouse architecture is the Presentation Layer, which provides
user-friendly interfaces and tools to analyze, and visualize the processed data.
• This layer allows users to interact with the data warehouse for analysis and reporting.
Key Functions of the Top Tier
Visualization:
• Presents data in easily interpretable formats like graphs, charts, dashboards, and reports.
• Helps users identify patterns, trends, and insights at a glance.
Reporting:
• Generates standard reports (e.g., daily sales reports, monthly performance metrics) and
custom reports based on user needs.
Components of the Top Tier
1. Business Intelligence (BI) Tools:
o These tools provide powerful functionalities for data analysis and visualization.
o Examples:
▪ Tableau: For creating dashboards and interactive visualizations.
▪ Power BI: For generating reports and sharing them across teams.
2. Reporting Tools:
• Tools for creating structured reports for organizational needs.
• Examples:
o Crystal Reports
o Microsoft Excel
Advantages of 3-Tier Architecture
Security
Scalability
High Performance
Flexibility