0% found this document useful (0 votes)
23 views24 pages

Application and Adv.

Uploaded by

Barun Agarwal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views24 pages

Application and Adv.

Uploaded by

Barun Agarwal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 24

Outline

 Characteristics of Data Warehouse


 Functionality of Data Warehouse
 Advantages and Applications of Data Warehouse
 Data-warehouse development methodology
Characteristics of Data
Warehouse

Major characteristics of data warehouse:


Data Warehouse --- Subject-Oriented
• Organized around major subjects – customer, product,
sales

• Focusing on the modeling and analysis of data for


decision makers, not on daily operations or
transaction processing

• Provides a simple and concise view around particular


subject issues by excluding data that are not useful in the
decision support process
Data Warehouse --- Integrated
• Constructed by integrating multiple, heterogeneous data sources
such as relational databases, flat files, on-line transaction
records

• Datacleaning and dataintegrationtechniques are applied.


• Ensure consistency in naming conventions, encoding
structures, attribute measures, etc. among different data
sources
• When data is moved to the warehouse, it is converted.
Data Warehouse --- Time Variant
• The time horizon for the data warehouse is significantly longer
than that of operational systems
• Operational database: current value data
• Data warehouse data: provide information from a
historical perspective (e.g., past 5-10 years)

• Every key structure in the data warehouse


• Contains an element of time, explicitly or implicitly
• But the key of operational data may or may not
contain “time element”
Data Warehouse --- Nonvolatile
• A physically separate store of data transformed from the
operational environment
• Operational update of data does not occur in the data
warehouse environment
• Does not require transaction processing, recovery, and
concurrency control mechanisms

• Requires only two operations in data accessing:

•Initial loading of data


•Access ofdata
Functionality of Data Warehouse
• Data warehouses exist to facilitate complex, data-intensive, and frequent ad-hoc
queries.

• Accordingly, data warehouses must provide far greater and more efficient query support than is
demanded of transactional databases.

• The Functionalities are as below:


• Roll-up. Data is summarized with increasing generalization (for example, weekly
to quarterly to annually).
• Drill-down. Increasing levels of detail are revealed (the complement of roll-up).
• Slice and dice. Projection operations are performed on the dimensions.
• Sorting. Data is sorted by ordinal value.
• Selection. Data is available by value or range.
• Derived (computed) attributes. Attributes are computed by operations on stored and
derived values.
• Pivot. Cross tabulation (also referred to as rotation) is performed.
Data-warehouse development methodology

• A data-warehouse is a heterogeneous collection of different


data sources
• There are 2 approaches for constructing data-warehouse:
• Top-down approach and
• Bottom-up approach
Top-down approach
Cont…
• External Sources: External source is a source from where data is collected irrespective
of the type of data. Data can be structured, semi structured and unstructured as well.
• Stage Area: Since the data, extracted from the external sources does not follow a particular
format, so there is a need to validate this data to load into datawarehouse. For this purpose, it is
recommended to use ETL tool.

• E(Extracted): Data is extracted from External data source.


• T(Transform): Data is transformed into the standard format.
• L(Load): Data is loaded into datawarehouse after transforming it into the standard
format.
• Data-warehouse: After cleansing of data, it is stored in the datawarehouse as central
repository. Note that datawarehouse stores the data in its purest form in this top-down
approach.
• Data Marts: Data mart is also a part of storage component. It stores the information of a
particular function of an organization which is handled by single authority. There can be as
many number of data marts in an organization depending upon the functions. We can also say
that data mart contains subset of the data stored in datawarehouse.
• Data Mining: The practice of analyzing the big data present in datawarehouse is data
mining. It is used to find the hidden patterns that are present in the database or in
datawarehouse with the help of algorithm of data mining.
Advantages & Dis-advantages of Top-down approach
Bottom-up approach
Cont…
Advantages & Dis-advantages of Bottom-Up approach
The Kimball Lifecycle methodology was conceived during the mid-1980s by
members of the Kimball Group and other colleagues at Metaphor Computer
Systems, a pioneering decision support company.

Kimball’s Life Cycle approach


•Business Requirements Definition:
•This phase involves understanding the business needs and defining what the data
warehouse should accomplish. It's about gathering requirements from stakeholders and
ensuring that the project aligns with business objectives.
•Dimensional Modeling:
•This step involves designing the data warehouse schema using a dimensional model. This
model organizes data into facts (measurable quantities) and dimensions (contextual
information). The result is a star schema or snowflake schema that makes data easier to
analyze.
•Data Integration: In this phase, data from various sources is extracted, transformed, and
loaded (ETL) into the data warehouse. The goal is to ensure that data is accurate,
consistent, and ready for analysis.
•Data Warehouse Design: This involves designing the physical data warehouse structure,
including how data will be stored and managed. It also includes creating data marts, which
are subsets of the data warehouse tailored to specific business areas or functions.
•Deployment and Maintenance: Once the data warehouse is built, it is deployed to users.
This phase also includes ongoing maintenance to ensure the system continues to meet
business needs, such as updating the data warehouse with new data and making
improvements based on user feedback.
•End-User Access and Reporting: This final phase focuses on providing users with tools
and interfaces to access and analyze data. It involves creating reports, dashboards, and
other analytical tools that help users make informed decisions based on the data in the
warehouse.
Advantages of Data Warehousing
There are many advantages of data warehousing, some of them discussed
below.
• Cleans data: It mainly follows in data cleansing of removing errors that
are inconsistent to improve the data and its respective quality.

• Indexes multiple types: Indexing has created multiple database tables


and created to speed up the accessing of information.

• Increased system and query performance: It mainly constructs to enhance and


find the retrieval of data. It has the speed of performing different warehouses and the
corresponding storage on large volumes.

• Timely access to data: It helps the users to access different resources to


analyze the data for the retrieval process.
Business Advantages
Applications of Data Warehousing

• Data Warehousing can be applied anywhere where we have a huge amount


of data and we want to see statistical results that help in decision making.
• Social Media Websites: The social networking websites like Facebook, Twitter,
Linkedin, etc. are based on analyzing large data sets. These sites gather data
related to members, groups, locations, etc., and store it in a single central repository.
Being a large amount of data, Data Warehouse is needed for implementing the
same.

• Banking: Most of the banks these days use warehouses to see the spending
patterns of account/cardholders. They use this to provide them special offers,
deals, etc.

• Government: Government uses a data warehouse to store and analyze tax


payments which are used to detect tax thefts.
Cont…
Types of Data Warehouse
Thank you

You might also like