The architecture consists of various
interconnected elements:
◦ Operational and external database layer – this
  layer represents the source data for the DW. The
  goal is to free the information locked up in the
  operational databases.
◦ Information access layer – the tools that the end
  user normally uses day to day to extract and
  analyze the data. This layer consists of the
  hardware and software involved in displaying and
  printing reports, spreadsheets, graphs and charts
  for analysis.
◦ Data access layer – the interface between the operational
  and information access layers. A successful DW provides
  end users with universal data access so that, in theory at
  least, end users can access any or all of the enterprise’s
  data they need to do their jobs, regardless of location or
  information access tool.
◦ Metadata layer – the data directory or repository of
  metadata information. Metadata are data about the data
  stored within the DW. They record characteristics of the
  data, such as exactly what pieces of data exist, where they
  are located, where they came from and how they can be
  accessed.
Additional layers are:
 ◦ Process management layer – the scheduler or
    job controller that keeps the DW up to date.
    Tasks such as periodic downloads from identified
    operational data stores, scheduled
    summarization of operational data, access to and
    download of external data sources, and updates
    to the metadata are typically performed at this
    layer of the DW.
 ◦ Application messaging layer – the “middleware”
   that transports information around the firm.
   This layer can also be used to collect
   transactions or messages and deliver them to a
   certain location at a certain time. In this sense,
    the application messaging layer can be thought
    of as the transport system.
 ◦ Physical data warehouse layer – where the actual data
    used in the DSS are located. In some cases, one can think
    of the DW simply as a logical or virtual view of data
    because, as we will see in coming chapters, in some
    instances the data warehouse may not actually store the
    data accessed through it.
 ◦ Data staging layer – all of the processes necessary to
    select, edit, summarize and load warehouse data from the
    operational and external databases. Data staging may
    also require data quality analysis programs and other such
    filters.
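
The data staging steps (select, edit, summarize, load) can be sketched as a small pipeline. This is only an illustrative sketch: the sample rows, the function names and the dictionary standing in for the warehouse are all assumptions, not part of any particular DW product.

```python
from collections import defaultdict

# Hypothetical operational rows: (store, product, amount as text).
operational_rows = [
    ("1023", "K596", "111.21"),
    ("1023", "K596", "20.00"),
    ("1024", "K596", "bad-value"),  # will be rejected by the quality filter
    ("1024", "Z100", "5.50"),
]

def select(rows):
    """Select: keep only rows that have the expected three fields."""
    return [row for row in rows if len(row) == 3]

def edit(rows):
    """Edit: convert amounts to numbers and drop rows that cannot be parsed."""
    cleaned = []
    for store, product, amount in rows:
        try:
            cleaned.append((store, product, float(amount)))
        except ValueError:
            continue  # data quality filter: skip malformed amounts
    return cleaned

def summarize(rows):
    """Summarize: total the amounts per (store, product) pair."""
    totals = defaultdict(float)
    for store, product, amount in rows:
        totals[(store, product)] += amount
    return dict(totals)

def load(warehouse, summary):
    """Load: write the summarized rows into the warehouse store."""
    warehouse.update(summary)
    return warehouse

# A plain dict stands in for the physical data warehouse here.
warehouse = load({}, summarize(edit(select(operational_rows))))
```

In practice each step would read from and write to real databases; the point of the sketch is only the order of the steps and where a data quality filter fits.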
Although the DW may appear to be the
source of data for various organizational
analysis initiatives and decision making
activities, it may not physically be the location
of the data being accessed. Numerous hybrid
mechanisms exist to structure a DW, but
three basic configurations can be identified:
virtual (point to point), central and distributed
data warehouses.
 The virtual data warehouse – the end users have direct
 access to the data stores, using tools enabled at the data
 access layer. Virtual data warehouses often provide a
 relatively low-cost starting point for organizations to assess
 what types of data end users are really looking for.
 The central data warehouse – a single physical database
 contains all of the data for a specific functional area,
 department, division or enterprise. This approach is often
 selected when users demonstrate a common need for
 informational data and a large number of end users are
 already connected to a central computer or network. A
 central data warehouse usually contains data from multiple
 operational applications.
The distributed data warehouse – the components are
distributed across several physical databases. Some
organizations push decision making down to lower levels of
the organization, so the data needed for decision making are
also pushed down to the LAN or local computer serving the
local decision makers. Many older DW implementations use
the distributed approach, but the advent of modern DW
implementation and management applications has reduced the
need for multiple or distributed DWs.
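
The difference between a virtual and a central data warehouse can be illustrated with a rough sketch using Python's built-in sqlite3 module. The table and view names are hypothetical, and a database view is only a stand-in for "direct access to the data stores": the virtual form reads the operational data at query time, while the central form holds its own physical copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical operational data store; amounts are kept in cents.
conn.execute(
    "CREATE TABLE ops_sales (store TEXT, product TEXT, amount_cents INTEGER)"
)
conn.execute("INSERT INTO ops_sales VALUES ('1023', 'K596', 11121)")

# Virtual: a view that reads the operational table whenever it is queried.
conn.execute(
    "CREATE VIEW virtual_dw AS "
    "SELECT store, SUM(amount_cents) AS total_cents "
    "FROM ops_sales GROUP BY store"
)

# Central: a physical copy, loaded once from the operational table.
conn.execute(
    "CREATE TABLE central_dw AS "
    "SELECT store, SUM(amount_cents) AS total_cents "
    "FROM ops_sales GROUP BY store"
)

# A new operational transaction arrives after the load.
conn.execute("INSERT INTO ops_sales VALUES ('1023', 'K596', 1000)")

virtual_total = conn.execute(
    "SELECT total_cents FROM virtual_dw WHERE store = '1023'"
).fetchone()[0]
central_total = conn.execute(
    "SELECT total_cents FROM central_dw WHERE store = '1023'"
).fetchone()[0]
```

The view reflects the new transaction immediately, while the physical copy stays at its loaded value until the next scheduled reload.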
•   The name “metadata” suggests some high-level
    technological concept, but it really is fairly
    simple. Metadata is “data about data”.
•   With the emergence of the data warehouse
    as a decision support structure, the
    metadata are considered as much a
    resource as the business data they
    describe.
    The metadata are essential ingredients in
    the transformation of raw data into
    knowledge. They are the “keys” that allow
    us to handle the raw data.
    For example, a line in a sales database may
    contain:            1023 K596 111.21
    This is mostly meaningless until we
    consult the metadata (in the data directory)
    that tells us it was store number 1023,
    product K596 and sales of $111.21.
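
The sales line above can be interpreted in code once the metadata are available. The following is a minimal sketch: the field names and the list standing in for the data directory are assumptions made for illustration.

```python
from decimal import Decimal

# Hypothetical data-directory entry: one (field name, converter) pair
# per column position in the raw sales line.
sales_metadata = [
    ("store_number", int),
    ("product_code", str),
    ("sales_amount", Decimal),
]

def interpret(line, metadata):
    """Attach names and types from the metadata to a raw line of values."""
    values = line.split()
    if len(values) != len(metadata):
        raise ValueError("line does not match the metadata description")
    return {
        name: convert(value)
        for (name, convert), value in zip(metadata, values)
    }

record = interpret("1023 K596 111.21", sales_metadata)
```

Without `sales_metadata` the three values are just strings; with it, each value gets a name and a type.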
•   The data warehouse is set up for the
    benefit of business analysts and executives
    across all functional areas.
•   In their individual databases, the different
    areas may define and store data according
    to their own version of the “truth”.
•   When data are retrieved from these
    different areas and placed in the
    warehouse, the transformation and
    cleansing process ensures that there is a
    single, integrated “truth” at the
    organizational level.
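
As a small illustration of transformation and cleansing, suppose two areas store the same date in different formats. The area names and formats below are invented for the example; the sketch only shows how a per-source rule can produce one warehouse representation.

```python
from datetime import datetime

# Hypothetical per-source date formats (each area's own "truth").
SOURCE_DATE_FORMATS = {
    "sales": "%m/%d/%Y",    # e.g. 03/07/2024
    "finance": "%Y-%m-%d",  # e.g. 2024-03-07
}

def to_warehouse_date(source, raw_date):
    """Convert a source-specific date string to the single warehouse form."""
    parsed = datetime.strptime(raw_date, SOURCE_DATE_FORMATS[source])
    return parsed.date().isoformat()

sales_date = to_warehouse_date("sales", "03/07/2024")
finance_date = to_warehouse_date("finance", "2024-03-07")
```

Both calls yield the same ISO date string, so queries against the warehouse no longer depend on which area supplied the row.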
     Regardless of the nature of a query,
     certain aspects of the metadata are
     important to all decision-makers.
     Some of these are:
    ◦ What tables, attributes and keys
       does the DW contain?
    ◦ Where did each set of data come
       from?
    ◦ What transformations were applied
       during cleansing?
    ◦ How have the metadata changed
      over time?
    ◦ How often do the data get
      reloaded?
    ◦ Are there so many data elements
      that you need to be careful what
      you ask for?
•   Transformation maps – records that show
    what transformations were applied
•   Extraction history – records that show what
    data were analyzed
•   Algorithms for summarization – methods
    available for aggregating and summarizing
•   Data ownership – records that show origin
•   Access patterns – records that show what
    data are accessed and how often
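
Several of these metadata components could be kept together in one record per warehouse table. The class and field names below are hypothetical, and the sketch leaves out summarization algorithms; it mainly shows how access patterns might be counted.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class TableMetadata:
    """Hypothetical metadata record for one warehouse table."""
    name: str
    owner: str  # data ownership: where the data originate
    transformation_map: list = field(default_factory=list)
    extraction_history: list = field(default_factory=list)
    access_counts: Counter = field(default_factory=Counter)

    def record_access(self, column):
        """Access patterns: count how often each column is read."""
        self.access_counts[column] += 1

meta = TableMetadata(name="sales", owner="sales department (hypothetical)")
meta.transformation_map.append("sales_amount: text -> decimal")
meta.extraction_history.append("2024-03-07 full load")
for column in ["sales_amount", "store_number", "sales_amount"]:
    meta.record_access(column)

most_used = meta.access_counts.most_common(1)
```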
THANK YOU