0% found this document useful (0 votes)
83 views16 pages

Mapping The Data Warehouse

The document discusses the mapping of data warehouse architecture to multiprocessor architecture, highlighting the advantages of parallel relational database technology such as linear speed up and scale up. It details types of parallelism, data partitioning methods, and various database architectures for parallel processing, including shared memory, shared disk, and shared nothing architectures. Each architecture has its own advantages and disadvantages regarding performance, scalability, and administration.

Uploaded by

Vidhu Saraswat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
83 views16 pages

Mapping The Data Warehouse

The document discusses the mapping of data warehouse architecture to multiprocessor architecture, highlighting the advantages of parallel relational database technology such as linear speed up and scale up. It details types of parallelism, data partitioning methods, and various database architectures for parallel processing, including shared memory, shared disk, and shared nothing architectures. Each architecture has its own advantages and disadvantages regarding performance, scalability, and administration.

Uploaded by

Vidhu Saraswat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 16

Mapping the data warehouse

architecture to Multiprocessor
architecture
T.R.Lekhaa
AP – IT
SNSCE

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 1
INTO MULTIPROCESSOR ARCHITECTURE
• The functions of data warehouse are based on the relational
data base technology. The relational data base technology is
implemented in parallel manner. There are two advantages of
having parallel relational data base technology for data
warehouse:
• Linear Speed up: refers the ability to increase the number of
processor to reduce response time
• Linear Scale up: refers the ability to provide same
performance on the same requests as the database size
increases

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 2
INTO MULTIPROCESSOR ARCHITECTURE
Types of parallelism
• There are two types of parallelism:
• Inter query Parallelism: In which different server threads or processes
handle multiple requests at the same time.
• Intra query Parallelism: This form of parallelism decomposes the serial SQL
query into lower level operations such as scan, join, sort etc.

• Intra query parallelism can be done in either of two ways:


• Horizontal parallelism: which means that the data base is partitioned
across multiple disks and parallel processing occurs within a specific task
that is performed concurrently on different processors against different
set of data
• Vertical parallelism: This occurs among different tasks. All query
components such as scan, join, sort etc are executed in parallel in a
pipelined fashion.
UNIT - 1 MAPPING THE DW ARCHITECTURE
7/16/2019 3
INTO MULTIPROCESSOR ARCHITECTURE
UNIT - 1 MAPPING THE DW ARCHITECTURE
7/16/2019 4
INTO MULTIPROCESSOR ARCHITECTURE
Data partitioning
• Data partitioning is the key component for effective parallel
execution of data base operations. Partition can be done randomly
or intelligently.
• Random portioning includes random data striping across multiple
disks on a single server. Another option for random portioning is
round robin fashion partitioning in which each record is placed on
the next disk assigned to the data base.
• Intelligent partitioning assumes that DBMS knows where a specific
record is located and does not waste time searching for it across all
disks.
• The various intelligent partitioning include:
• Hash partitioning: A hash algorithm is used to calculate the
partition number based on the value of the partitioning key for
each row

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 5
INTO MULTIPROCESSOR ARCHITECTURE
• Key range partitioning: Rows are placed and located in the
partitions according to the value of the partitioning key.

• Schema portioning: an entire table is placed on one disk;


another table is placed on different disk etc.

• User defined portioning: It allows a table to be partitioned on


the basis of a user defined expression.

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 6
INTO MULTIPROCESSOR ARCHITECTURE
Data base architectures of parallel
processing
• There are three DBMS software architecture
styles for parallel processing:
• 1. Shared memory or shared everything
Architecture
• 2. Shared disk architecture
• 3. Shared nothing architecture

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 7
INTO MULTIPROCESSOR ARCHITECTURE
Shared Memory Architecture
• Tightly coupled shared memory systems, illustrated in following figure
have the following characteristics:
• Multiple PUs share memory.
• Each PU has full access to all shared memory through a common bus.
• Communication between nodes occurs via shared memory.
• Performance is limited by the bandwidth of the memory bus.
• Symmetric multiprocessor (SMP) machines are often nodes in a cluster.

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 8
INTO MULTIPROCESSOR ARCHITECTURE
UNIT - 1 MAPPING THE DW ARCHITECTURE
7/16/2019 9
INTO MULTIPROCESSOR ARCHITECTURE
• Parallel processing advantages of shared memory
systems are these:
• Memory access is cheaper than inter-node
communication.

• Shared memory systems are easier to administer than


a cluster.
• A disadvantage of shared memory systems for parallel
processing is as follows:
• Scalability is limited by bus bandwidth and latency,
and by available memory.
UNIT - 1 MAPPING THE DW ARCHITECTURE
7/16/2019 10
INTO MULTIPROCESSOR ARCHITECTURE
Shared Disk Architecture
• Shared disk systems are typically loosely coupled. Such
systems, illustrated in following figure, have the following
characteristics:
• Each node consists of one or more PUs and associated
memory.
• Memory is not shared between nodes.
• Communication occurs over a common high-speed bus.
• Each node has access to the same disks and other
resources.
• A node can be an SMP if the hardware supports it.
• Bandwidth of the high-speed bus limits the number of
nodes (scalability) of the system.

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 11
INTO MULTIPROCESSOR ARCHITECTURE
UNIT - 1 MAPPING THE DW ARCHITECTURE
7/16/2019 12
INTO MULTIPROCESSOR ARCHITECTURE
• Parallel processing advantages of shared disk systems are as follows:
• Shared disk systems permit high availability.

• 

• Shared disk systems provide for incremental growth.


• Parallel processing disadvantages of shared disk systems are these:
• Inter-node synchronization is required, involving DLM overhead and
greater dependency on high-speed interconnect.
• If the workload is not partitioned well there may be high synchronization
overhead.
• There is operating system overhead of running shared disk software.

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 14
INTO MULTIPROCESSOR ARCHITECTURE
Shared Nothing Architecture
• Shared nothing systems are typically loosely coupled. In
shared nothing systems only one CPU is connected to a given
disk. If a table or database is located on that disk, access
depends entirely on the PU which owns it. Shared nothing
systems can be represented as follows:

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 15
INTO MULTIPROCESSOR ARCHITECTURE
UNIT - 1 MAPPING THE DW ARCHITECTURE
7/16/2019 16
INTO MULTIPROCESSOR ARCHITECTURE
• Shared nothing systems have advantages and disadvantages for parallel
processing:
• Advantages
• Shared nothing systems provide for incremental growth.
• System growth is practically unlimited.
• MPPs are good for read-only databases and decision support
applications.
• Failure is local: if one node fails, the others stay up.
• Disadvantages
• More coordination is required.
• More overhead is required for a process working on a disk belonging to
another node.
• If there is a heavy workload of updates or inserts, as in an online
transaction processing system, it may be worthwhile to consider data-
dependent routing to alleviate contention.

UNIT - 1 MAPPING THE DW ARCHITECTURE


7/16/2019 18
INTO MULTIPROCESSOR ARCHITECTURE

You might also like