0% found this document useful (0 votes)
61 views31 pages

Introduction To DBMS Internals: Alvin Cheung Aditya Parameswaran

This document contains a summary of the key components and architecture of a Database Management System (DBMS). It discusses: 1) The layered architecture of a DBMS and how each layer abstracts the lower layers to manage complexity and performance. 2) How SQL queries are parsed, optimized, and executed using relational operators to operate on records and files. 3) The role of buffer management, files and index management, and disk space management in providing efficient data storage and retrieval. 4) Issues of concurrency control and recovery that are related to storage and memory management in a DBMS. 5) An overview of different storage media like disks and flash storage, their components, performance
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
61 views31 pages

Introduction To DBMS Internals: Alvin Cheung Aditya Parameswaran

This document contains a summary of the key components and architecture of a Database Management System (DBMS). It discusses: 1) The layered architecture of a DBMS and how each layer abstracts the lower layers to manage complexity and performance. 2) How SQL queries are parsed, optimized, and executed using relational operators to operate on records and files. 3) The role of buffer management, files and index management, and disk space management in providing efficient data storage and retrieval. 4) Issues of concurrency control and recovery that are related to storage and memory management in a DBMS. 5) An overview of different storage media like disks and flash storage, their components, performance
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 31

Introduction to DBMS Internals

DBMS Architecture
Data storage

Alvin Cheung
Aditya Parameswaran
Reading: R & G Chapter 9
Course Overview
• Unit 1: Relational model and SQL
• Unit 2: Storage and indexing
• Unit 3: Query execution
• Unit 4: Query optimization
• Unit 5: Transactions
• Unit 6: Recovery
• Unit 7: Conceptual design
• Unit 8: Advanced topics (time permitting)
Slide Deck Title
DBMS Architecture
Architecture of a DBMS: SQL Client
• How is a SQL query executed?

SQL Client

Database
Management
System

File System Slide Deck Title


DBMS: Parsing & Optimization
Purpose:
Parse, check, and verify the SQL
SQL Client
SELECT S.sid, S.sname, R.bid
FROM Sailors R, Reserves R
WHERE S.sid = R.sid and S.age > 30 Query Parsing & Optimization
GROUP BY age

And translate into an efficient Database


relational query plan Management
System

File System Slide Deck Title


DBMS: Relational Operators
Purpose: Execute query plan by
operating on records and files
SQL Client

Query Parsing & Optimization

Relational Operators
GroupBy (Age)
Database
Indexed Join Management
System
Heap Scan Indexed Scan
Reserves Sailors
File System Slide Deck Title
DBMS: Files and Index Management
Purpose: Organize tables and
Records as groups of pages in
a logical file SQL Client
SSN Last First Age Salary
Name Name

123 Adams Elmo 31 $400


Query Parsing & Optimization
443 Grouc Oscar 32 $300
h
Relational Operators
244 Oz Bert 55 $140

134 Sande
rs
Ernie 55 $400
Database
Files and Index Management

Management
System

File System Slide Deck Title


DBMS: Buffer Management
Purpose:
Provide the illusion of operating
in memory SQL Client

Query Parsing & Optimization

Page Page Page Relational Operators


1 2 3
Database
Files and Index Management

Management
Buffer Management
RAM System

File System Slide Deck Title


Disk Space Management
DBMS: Disk Space Management
Purpose: Translate page requests into
physical bytes on one or more device(s)
SQL Client

Query Parsing & Optimization

Block Block Block Relational Operators


1 2 3 Database
Files and Index Management

Management
Buffer Management
System
Disk Space Management

File System Slide Deck Title


Architecture of a DBMS
• Organized in layers
• Each layer abstracts the layer below
• Manage complexity SQL Client
• Performance assumptions
• Example of good systems design Query Parsing & Optimization

Relational Operators
• Many non-relational DBMSs are Database
Files and Index Management
structured similarly Management
Buffer Management
System
Disk Space Management

File System Slide Deck Title


DBMS: Concurrency & Recovery
Two cross-cutting issues related to
storage and memory management:
SQL Client

Query Parsing & Optimization

Relational Operators

Concurrency Control Database


Files and Index Management

Recovery
Management
Buffer Management
System
Disk Space Management

File System Slide Deck Title


STORAGE MEDIA
Disks
• Most database systems were originally designed for
magnetic “spinning” disks
• Disk are a mechanical anachronism!
• Instilled design ideas that apply to using solid state disks as well

• Major implications:
• Disk API:
• READ: transfer “page” of data from disk to RAM.
• WRITE: transfer “page” of data from RAM to disk.
• No random reads / writes!!
• Both API calls are very, very slow! Slide Deck Title
• Plan carefully!
Storage Hierarchy

.5 ns Small, Fast Registers


L1 cache reference

L1 Cache

L2 Cache

100.0 ns – Currently used data


main memory reference RAM

SSD Likewise if available


.5 ns – Backup and logs
20x106 ns to read 1MB Big, Slow Disk Secondary & tertiary storage
sequentially Slide Deck Title
Economics
• $1000 at NewEgg 2018:
• Mag Disk: ~40TB for $1000
• SSD: ~2.3TB for $1000 TB/$1000
• RAM: 80GB for $1000 50
40
30
20
10
0
RAM SSD Hard Drive

Slide Deck Title


Components of a Disk
• Platters spin (say 15000 rpm)
• Arm assembly moved in or out to position a head
on a desired track Disk head
Spindle

• Tracks under heads make


a “cylinder”
Platters

Tracks
Slide Deck Title
Arm assembly
Components of a Disk
• Platters spin (say 15000 rpm)
• Arm assembly moved in or out to position a head
on a desired track Disk head
Spindle

• Tracks under heads make


a “cylinder”
Arm movement Platters

Tracks
Slide Deck Title
Arm assembly
Components of a Disk
• Platters spin (say 15000 rpm)
• Arm assembly moved in or out to position a head
on a desired track Disk head
Spindle

• Tracks under heads make Sector


a “cylinder”
• Only one head reads/writes at any one time
Platters
• Block/page size is a multiple of (fixed) sector size Arm movement
Tracks
Slide Deck Title
Arm assembly
An Analogy

Slide Deck Title


Accessing a Disk page
• Time to access (read/write) a disk block:
• seek time (moving arms to position disk head on track)
• ~2-3 ms on average
• rotational delay (waiting for block to rotate under head)
• ~0-4 ms (15000 RPM)
• transfer time (actually moving data to/from disk surface)
• ~0.25 ms per 64KB page

• Key to lower I/O cost: reduce seek/rotational delays

Slide Deck Title


Flash (SSD)
• Organized into “cells”
• Current generation (NAND)
• Random reads and writes, but:
• Fine-grain reads (4-8K reads), coarse-grain writes
(1-2MB writes)

Slide Deck Title


Flash (SSD), Pt. 2
• So… read is fast and predictable
• 4KB random reads: ~500MB/sec

• But write is not!


• 4KB random writes: ~120 MB/sec

• Why? Only 2k-3k erasures before failure

• so keep moving write units around (“wear leveling”)

Slide Deck Title


DISK SPACE MANAGEMENT
Block Level Storage
• Read and Write large chunks of sequential bytes
• Sequentially: “Next” disk block is fastest
• Maximize usage of data per Read/Write
• “Amortize” seek delays (HDDs) and writes (SSDs):
if you’re going all the way to Pluto, pack the spaceship full!
• Predict future behavior
• Cache popular blocks
• Pre-fetch likely-to-be-accessed blocks
• Buffer writes to sequential blocks
• More on these as we go Slide Deck Title
A Note on Terminology
• Block = Unit of transfer for disk read/write
• 64KB – 128KB is a good number today
• Book says 4KB
• We’ll use this unit for all storage devices
• Page: a common synonym for “block”
• In some texts, “page” = a block-sized chunk of RAM

• We’ll treat “block” and “page” as synonyms


Slide Deck Title
Disk Space Management
• Lowest layer of DBMS, manages space
on disk SQL Client

• Purpose: Query Parsing & Optimization


• Map pages to locations on disk Relational Operators
• Load pages from disk to memory Database
Files and Index Management

• Save pages back to disk & ensuring writes Management


Buffer Management
System
Disk Space Management
• Higher levels call upon this layer to:
• Read/write a page File System
Slide Deck Title
• Allocate/de-allocate logical pages
Disk Space Management:
Requesting Pages
• page = getFirstPage(“Sailors”);
while (!done) {
process(page);
page = page.nextPage();
}
• Physical details hidden from higher levels of system
• Higher levels may “safely” assume nextPage is fast
• Hence sequential runs of pages are quick to scan
Slide Deck Title
Disk Space Management: Implementation

• Proposal 1: Talk to the storage device directly


• Could be very fast if you knew the device well
• Hard to program when each device has its own API
• What happens when devices change?

Slide Deck Title


Disk Space Management: Implementation
• Proposal 2: Run our own over filesystem (FS)
• Bypass the OS, allocate single large “contiguous” file on
an empty disk
• assume sequential/nearby byte access are fast
• Most FS optimize disk layout for sequential access
• Gives us more or less what we want if we start with an
empty disk
• DBMS “file” may span multiple FS files on multiple
disks/machines Slide Deck Title
Disks and Files: Summary
• Magnetic (hard) disks and SSDs
• Basic HDD and SSD mechanics
• Concept of “near” pages and how it relates to cost of
access

• Relative cost of
• Random vs. sequential disk access (10x)
• Disk (Pluto) vs RAM (Sacramento) vs. registers (your head)
• Big, big differences! Slide Deck Title
Files: Summary
• DB File storage
• Typically over FS file(s)

• Disk space manager loads and stores pages


• Block level reasoning
• Abstracts device and file system; provides fast
“next page”
Slide Deck Title

You might also like