3 Join Optimization

The document provides an overview of Database Management Systems (DBMS) and various relational algebra operations, including union, cross-product, and joins. It discusses the requirements for these operations and outlines different join techniques such as nested-loops, single-loop, sort-merge, and hash-joins, along with their performance implications. Additionally, it emphasizes the importance of query processing strategies in optimizing database operations.

Uploaded by

Mx A

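Database Management System (DBMS)

[Figure: DBMS architecture. Front ends (web, forms, applications, embedded SQL, interactive SQL) send SQL commands to the DBMS. Inside the DBMS: query evaluation engine, files and access methods, buffer manager, and disk space manager, with the concurrency control and recovery managers alongside. The database holds data, indexes, and the catalog.]
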
Relational Algebra Operations

Union (∪) - Example

S1
SID  SName   Rating  Age
22   Dustin  7       45
31   Lubber  8       55
58   Rusty   10      35

S2
SID  SName   Rating  Age
28   Yuppy   9       35
31   Lubber  8       55
44   Guppy   5       35
58   Rusty   10      35

S1 ∪ S2
SID  SName   Rating  Age
22   Dustin  7       45
28   Yuppy   9       35
31   Lubber  8       55
44   Guppy   5       35
58   Rusty   10      35

Union (∪)
• Takes two input relations, which must be union-compatible:
  1. Same number of fields
  2. Corresponding fields have the same type
• Same requirements for:
  1. Intersection (∩)
  2. Set-difference (−)

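The union-compatibility check can be sketched in Python. This is a minimal illustration, not part of the original slides: the `Relation` type and the function names are assumptions, and the rows are the sailor rows from the union example.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Hypothetical relation type: a schema plus a set of rows."""
    schema: tuple      # (field name, type) pairs
    rows: frozenset    # set of tuples, so duplicates disappear automatically


def union_compatible(r, s):
    """Same number of fields, and corresponding fields have the same type."""
    if len(r.schema) != len(s.schema):
        return False
    return all(rt == st for (_, rt), (_, st) in zip(r.schema, s.schema))


def union(r, s):
    """Set union of two union-compatible relations; field names come from r."""
    if not union_compatible(r, s):
        raise ValueError("relations are not union-compatible")
    return Relation(r.schema, r.rows | s.rows)


SAILOR = (("SID", int), ("SName", str), ("Rating", int), ("Age", int))
S1 = Relation(SAILOR, frozenset({
    (22, "Dustin", 7, 45), (31, "Lubber", 8, 55), (58, "Rusty", 10, 35)}))
S2 = Relation(SAILOR, frozenset({
    (28, "Yuppy", 9, 35), (31, "Lubber", 8, 55),
    (44, "Guppy", 5, 35), (58, "Rusty", 10, 35)}))

result = union(S1, S2)
for row in sorted(result.rows):
    print(row)
```

Intersection and set-difference would reuse the same check and swap `|` for `&` or `-`.
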
Cross-product (×) - Example

S1
SID  SName   Rating  Age
22   Dustin  7       45
31   Lubber  8       55
58   Rusty   10      35

R1
SID  BID  Day
22   101  10/10/96
58   103  11/12/96

S1 × R1
SID1  SName   Rating  Age  SID2  BID  Day
22    Dustin  7       45   22    101  10/10/96
22    Dustin  7       45   58    103  11/12/96
31    Lubber  8       55   22    101  10/10/96
31    Lubber  8       55   58    103  11/12/96
58    Rusty   10      35   22    101  10/10/96
58    Rusty   10      35   58    103  11/12/96

Cross-product (×)
• Each row of S1 is paired with each row of R1
• Result schema has one field per field of S1 and R1
• Field names are inherited if possible
• Conflict: both S1 and R1 have a field called "SID"
  • Rename to SID1 and SID2

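A small Python sketch of the pairing and renaming rule, using the S1 and R1 rows from the example. The function name and the "append 1 or 2 to clashing names" convention are assumptions chosen to reproduce SID1 and SID2.

```python
from itertools import product


def cross_product(r_fields, r_rows, s_fields, s_rows):
    """Pair every row of r with every row of s; rename clashing field names."""
    clashes = set(r_fields) & set(s_fields)
    left = [f + "1" if f in clashes else f for f in r_fields]
    right = [f + "2" if f in clashes else f for f in s_fields]
    rows = [r + s for r, s in product(r_rows, s_rows)]
    return left + right, rows


S1_FIELDS = ["SID", "SName", "Rating", "Age"]
S1_ROWS = [(22, "Dustin", 7, 45), (31, "Lubber", 8, 55), (58, "Rusty", 10, 35)]
R1_FIELDS = ["SID", "BID", "Day"]
R1_ROWS = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

fields, rows = cross_product(S1_FIELDS, S1_ROWS, R1_FIELDS, R1_ROWS)
print(fields)
for row in rows:
    print(row)
```

The result has 3 × 2 = 6 rows, one per pair.
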
Joins
• Condition/theta join, defined as: R ⋈_condition S
• Equijoin: a special case of the condition join in which the condition contains only equalities
• Natural join: an equijoin on all common fields
• A join can be defined as a cross-product followed by a selection:
    R ⋈_condition S = σ_condition(R × S)
• But a join result usually has far fewer tuples than the cross-product!
  • so it might be possible to compute it more efficiently (see the join algorithms that follow)

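The definition "cross-product followed by selection" can be written down directly. The sketch below is illustrative only (the `theta_join` name is an assumption); it uses the S1 and R1 rows from the examples and materializes every pair, which is exactly the cost the later join algorithms try to avoid.

```python
from itertools import product


def theta_join(r_rows, s_rows, condition):
    """sigma_condition(R x S): build every pair, keep those passing the test."""
    return [(r, s) for r, s in product(r_rows, s_rows) if condition(r, s)]


S1 = [(22, "Dustin", 7, 45), (31, "Lubber", 8, 55), (58, "Rusty", 10, 35)]
R1 = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

# Condition join: S1.sid < R1.sid (sid is field 0 in both relations)
less_than = theta_join(S1, R1, lambda s, r: s[0] < r[0])

# Equijoin: S1.sid = R1.sid
equal = theta_join(S1, R1, lambda s, r: s[0] == r[0])

print(len(S1) * len(R1), "pairs in the cross-product")
print(len(less_than), "pairs satisfy S1.sid < R1.sid")
print(len(equal), "pairs satisfy S1.sid = R1.sid")
```
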
Condition Join - Example

S1
SID  SName   Rating  Age
22   Dustin  7       45
31   Lubber  8       55
58   Rusty   10      35

R1
SID  BID  Day
22   101  10/10/96
58   103  11/12/96

S1 ⋈_(S1.sid < R1.sid) R1
SID1  SName   Rating  Age  SID2  BID  Day
22    Dustin  7       45   58    103  11/12/96
31    Lubber  8       55   58    103  11/12/96

Equijoin - Example

S1
SID  SName   Rating  Age
22   Dustin  7       45
31   Lubber  8       55
58   Rusty   10      35

R1
SID  BID  Day
22   101  10/10/96
58   103  11/12/96

S1 ⋈_(S1.sid = R1.sid) R1
SID  SName   Rating  Age  BID  Day
22   Dustin  7       45   101  10/10/96
58   Rusty   10      35   103  11/12/96

Query Processing: Natural Joins

1. Nested-loops join (tuple-based and block-based)
2. Single-loop join (index nested-loops join)
3. Sort-merge join
4. Hash-join

Join

[Figure: joining the Student and Transcript relations]

Nested-loops join (tuple-based)
• Join relations T and S
• A is the common attribute in T, B is the common attribute in S

For each record t in T          /* outer loop */
{
    For each record s in S      /* inner loop */
    {
        if (t.A = s.B)
            add <t,s> to result
    }
}

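A runnable version of the pseudocode, as a minimal in-memory sketch. Records are modelled as Python dicts, and the sample rows and field names are made up for illustration.

```python
def nested_loops_join(T, S, a, b):
    """Tuple-based nested-loops join of T and S on T.a = S.b."""
    result = []
    for t in T:                  # outer loop
        for s in S:              # inner loop
            if t[a] == s[b]:
                result.append((t, s))
    return result


# Hypothetical sample data
students = [{"sid": 1, "name": "Ann"},
            {"sid": 2, "name": "Bob"},
            {"sid": 3, "name": "Cy"}]
enrolments = [{"student": 1, "course": "DB"},
              {"student": 3, "course": "OS"},
              {"student": 1, "course": "AI"}]

pairs = nested_loops_join(students, enrolments, a="sid", b="student")
for t, s in pairs:
    print(t["name"], s["course"])
```

The loop performs len(T) × len(S) comparisons regardless of how many pairs match.
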
Nested-loops join: I/O cost

[Figure: each record of T triggers one iteration, and each iteration reads every block of S again.]

Relation T: NT records in BT blocks
Relation S: NS records in BS blocks

Total number of block reads
    = BT + (number of iterations) × BS
    = BT + NT × BS

Nested-loops join
• Join relations T and S
• T: NT = 50 records stored in BT = 10 disk blocks
• S: NS = 6000 records stored in BS = 2000 disk blocks
• Relation T (outer loop) is scanned once
  • I/O cost = BT
• Relation S (inner loop) is scanned NT times
  • I/O cost = NT × BS
• Per-tuple implementation:
  • I/O cost = BT + NT × BS = 10 + 50 × 2000 = 100,010 block accesses
  • If one block access takes 10 ms, total cost ≈ 1000 s ≈ 16.7 minutes!

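The arithmetic can be checked with a few lines of Python. The function name is an assumption; the numbers are the ones from the slide.

```python
def tuple_nested_loops_cost(b_outer, n_outer, b_inner):
    """Block reads: one scan of the outer file, plus one full scan of the
    inner file for every record of the outer file."""
    return b_outer + n_outer * b_inner


BLOCK_ACCESS_MS = 10  # per-block access time from the slide

reads = tuple_nested_loops_cost(b_outer=10, n_outer=50, b_inner=2000)
seconds = reads * BLOCK_ACCESS_MS / 1000

print(reads, "block accesses")
print(round(seconds / 60, 1), "minutes")
```
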
Nested-loops join
• In the per-tuple implementation:

    For each record t in T
        For each record s in S
            Compare record s with record t   /* in-memory operation */

• However, disk access happens a block at a time:
  • an entire block of records from T is already in memory
• Idea: compare tuple s with the entire block of records from T
  → fewer iterations
• This is the page-at-a-time implementation, described next

Nested-loops join (page-based)

For each block Tblock in T {                /* outer loop */
    For each block Sblock in S {            /* inner loop */
        For each record t in Tblock {       /* in-memory matching */
            For each record s in Sblock {
                if (t.A = s.B)
                    add <t,s> to result
            }
        }
    }                                       /* end inner loop */
}                                           /* end outer loop */

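To see where the block reads come from, the page-based loop can be simulated in memory. This is only a sketch: `BlockFile` is a hypothetical stand-in for a disk file that counts block reads, and the records are synthetic, sized so that T has 10 blocks and S has 2000 blocks as in the running example.

```python
class BlockFile:
    """A relation stored as a list of blocks; counts every block read."""

    def __init__(self, records, records_per_block):
        self.blocks = [records[i:i + records_per_block]
                       for i in range(0, len(records), records_per_block)]
        self.reads = 0

    def scan(self):
        """Yield the blocks one by one, counting each as one block read."""
        for block in self.blocks:
            self.reads += 1
            yield block


def page_nested_loops_join(T, S, a, b):
    result = []
    for t_block in T.scan():              # outer loop: one block of T
        for s_block in S.scan():          # inner loop: full scan of S
            for t in t_block:             # in-memory matching
                for s in s_block:
                    if t[a] == s[b]:
                        result.append((t, s))
    return result


# Synthetic data: 50 records in 10 blocks, 6000 records in 2000 blocks
T = BlockFile([{"A": i} for i in range(50)], records_per_block=5)
S = BlockFile([{"B": i % 60} for i in range(6000)], records_per_block=3)

result = page_nested_loops_join(T, S, "A", "B")
print("block reads:", T.reads + S.reads)
```

The counter comes out at 10 reads of T plus 10 scans of 2000 blocks of S.
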
Nested-loops join: page-at-a-time I/O cost

[Figure: each block of T triggers one iteration, and each iteration reads every block of S again.]

Relation T: NT records in BT blocks
Relation S: NS records in BS blocks

Total number of block reads
    = BT + (number of iterations) × BS
    = BT + BT × BS

Nested-loops join: page-at-a-time
• Join relations T and S
• T: NT = 50 records stored in BT = 10 disk blocks
• S: NS = 6000 records stored in BS = 2000 disk blocks
• Relation T (outer loop) is scanned once
  • I/O cost = BT
• Relation S (inner loop) is scanned BT times
  • I/O cost = BT × BS
• Per-page implementation:
  • I/O cost = BT + BT × BS = 10 + 10 × 2000 = 20,010 block accesses
  • If one block access takes 10 ms, total cost ≈ 200 s ≈ 3.3 minutes!

Block nested-loops join
• Observation: having more tuples from T in memory reduces the number of iterations
• However, both previous implementations hold at most one block of T in memory at a time
• Idea: read a chunk of blocks from T instead of only one!
• But how many?

Block nested-loops join
• Let the buffer size be NB memory blocks. We need:
  • one block for buffering the result, and
  • one block for reading from the inner file (i.e., S)
• The remaining NB − 2 blocks are free for the outer file, so:
  • read NB − 2 blocks at a time from the outer file (i.e., T)

Block nested-loops join

For each chunk Tblocks of NB-2 blocks in T {    /* outer loop */
    For each block Sblock in S {                /* inner loop */
        For each record t in Tblocks {          /* in-memory matching */
            For each record s in Sblock {
                if (t.A = s.B)
                    add <t,s> to result
            }
        }
    }                                           /* end inner loop */
}                                               /* end outer loop */

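A Python sketch of the chunked loop, counting block reads as it goes. The function names, the synthetic records, and the buffer size NB = 7 are all assumptions for illustration; files are modelled as in-memory lists of blocks.

```python
def to_blocks(records, per_block):
    """Split a list of records into fixed-size blocks."""
    return [records[i:i + per_block] for i in range(0, len(records), per_block)]


def block_nested_loops_join(t_blocks, s_blocks, a, b, nb):
    """Join with nb buffer blocks: nb - 2 hold a chunk of T,
    one holds the current block of S, one buffers the result."""
    chunk_size = nb - 2
    if chunk_size < 1:
        raise ValueError("need at least 3 buffer blocks")
    result, reads = [], 0
    for start in range(0, len(t_blocks), chunk_size):   # outer loop
        chunk = t_blocks[start:start + chunk_size]
        reads += len(chunk)                             # read the chunk of T
        t_records = [t for block in chunk for t in block]
        for s_block in s_blocks:                        # inner loop
            reads += 1                                  # read one block of S
            for t in t_records:                         # in-memory matching
                for s in s_block:
                    if t[a] == s[b]:
                        result.append((t, s))
    return result, reads


# Synthetic data: T has 10 blocks, S has 2000 blocks
T_BLOCKS = to_blocks([{"A": i} for i in range(50)], per_block=5)
S_BLOCKS = to_blocks([{"B": i % 60} for i in range(6000)], per_block=3)

result, reads = block_nested_loops_join(T_BLOCKS, S_BLOCKS, "A", "B", nb=7)
print("block reads:", reads)
```

With NB = 7 the 10 blocks of T are read in two chunks of 5, so S is scanned only twice.
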
Block nested-loops join

[Figure: in each iteration, a chunk of blocks is read from T, then S is read one block at a time. Relation T: NT records in BT blocks. Relation S: NS records in BS blocks.]

Block nested-loops join: I/O cost

Relation T: NT records in BT blocks

Total number of block reads = BT + (number of iterations) × BS
Number of iterations = number of chunks = BT / (NB − 2), rounded up
Total = BT + (BT / (NB − 2)) × BS

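Order matters!
• When T is the outer file and S the inner file:
  • BT + BS × (BT / (NB − 2))
• When T is the inner file and S the outer file:
  • BS + BT × (BS / (NB − 2))
• In general:
    BOuter + BInner × (BOuter / (NB − 2))
• If BOuter < BInner, choosing the smaller file as the outer file:
  • has no impact on the term BInner × (BOuter / (NB − 2)), which is the same for both orders (up to rounding the number of chunks)
  • reduces the term BOuter
• Overall, the smaller file should be the outer file
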
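Plugging the running example into the general formula shows the effect. This is a sketch: the function name is an assumption, and NB = 12 is an assumed buffer size that the slides do not give. The number of chunks is rounded up, since a partial chunk still needs its own pass over the inner file.

```python
from math import ceil


def block_nested_loops_cost(b_outer, b_inner, nb):
    """BOuter + BInner x (number of chunks of the outer file)."""
    chunks = ceil(b_outer / (nb - 2))
    return b_outer + b_inner * chunks


BT, BS = 10, 2000   # block counts from the running example
NB = 12             # assumed buffer size

t_outer = block_nested_loops_cost(BT, BS, NB)   # T outer, S inner
s_outer = block_nested_loops_cost(BS, BT, NB)   # S outer, T inner

print("T outer:", t_outer)
print("S outer:", s_outer)
```

Both orders pay 2000 block reads for the product term here; the difference comes entirely from the single scan of the outer file.
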
Single-loop join (index loop join)
• Join relations T and S
• A is the common attribute in T, B is the common attribute in S
• Only works if an index exists on one of the two join attributes (A or B)
• Assume an index on attribute B in relation S, then:

For each record t in T          /* outer loop */
{
    Use the index to locate the records s in S that satisfy t.A = s.B
    add <t,s> to result for each such s
}

Single-loop join - Example

Student (outer relation, read block by block)
SID     Name   Age
546007  Peter  18
546100  Bob    19
546200  Ann    21
546207  Jane   20

Transcript (inner relation, with a B+ tree index on SID)
SID     CID   GPA
546007  INFS  18
546007  MMDS  19
546200  INFS  21
546100  ENGG  20
546007  ENGG  24
546200  MMDS  18
546007  ELEC  27
546200  ELEC  20

Each Student record probes the index with its SID: Probe: 546007, Probe: 546100, Probe: 546200, ...

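The probe loop can be sketched in Python with the Student and Transcript rows from this example. A plain dict stands in for the B+ tree index; that substitution, and the function names, are assumptions made to keep the sketch short.

```python
from collections import defaultdict


def build_index(S, b):
    """Map each value of attribute b to the records of S that carry it.
    (A dict stands in for the B+ tree index.)"""
    index = defaultdict(list)
    for s in S:
        index[s[b]].append(s)
    return index


def single_loop_join(T, a, index):
    """One loop over T; each record probes the index on the inner relation."""
    result = []
    for t in T:                               # outer loop
        for s in index.get(t[a], []):         # probe instead of scanning S
            result.append((t, s))
    return result


student = [{"SID": 546007, "Name": "Peter"}, {"SID": 546100, "Name": "Bob"},
           {"SID": 546200, "Name": "Ann"}, {"SID": 546207, "Name": "Jane"}]
transcript = [{"SID": 546007, "CID": "INFS"}, {"SID": 546007, "CID": "MMDS"},
              {"SID": 546200, "CID": "INFS"}, {"SID": 546100, "CID": "ENGG"},
              {"SID": 546007, "CID": "ENGG"}, {"SID": 546200, "CID": "MMDS"},
              {"SID": 546007, "CID": "ELEC"}, {"SID": 546200, "CID": "ELEC"}]

sid_index = build_index(transcript, "SID")
pairs = single_loop_join(student, "SID", sid_index)
for t, s in pairs:
    print(t["Name"], s["CID"])
```

Jane has no Transcript rows, so her probe finds nothing and she does not appear in the result.
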
Single-loop join performance
• For the outer relation:
  • number of blocks BOuter, and
  • number of records NOuter
• For the inner relation:
  • number of index levels LIndex
• Cost in number of block accesses:
    BOuter + (NOuter × (LIndex + 1))
  • each probe descends the LIndex levels of the index, then reads one more block to fetch the matching record
• Order matters for single-loop join as well

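A quick calculation with the formula, using the block and record counts of T and S from the nested-loops examples. The slides give no index depths, so the values of LIndex below are assumptions, as is the function name.

```python
def single_loop_cost(b_outer, n_outer, index_levels):
    """BOuter + NOuter x (LIndex + 1) block accesses."""
    return b_outer + n_outer * (index_levels + 1)


# T outer (BT = 10, NT = 50), probing an index on S with an assumed 3 levels
t_outer = single_loop_cost(b_outer=10, n_outer=50, index_levels=3)

# S outer (BS = 2000, NS = 6000), probing an index on T with an assumed 2 levels
s_outer = single_loop_cost(b_outer=2000, n_outer=6000, index_levels=2)

print("T outer:", t_outer, "block accesses")
print("S outer:", s_outer, "block accesses")
```

Under these assumptions, the relation with fewer records is the better outer relation, because every outer record pays for one index probe.
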
Putting it together

[Figure: query plan π_S.sname ( σ_(S.gpa > 4)(S) ⋈_(S.id = E.sid) σ_(E.cid = CSE454)(E) )]

• Join: nested-loop or single-loop? Which relation is inner and which is outer?
• Each selection (on S and on E): linear search, binary search, or index?