03 Storage1
03 Storage1
Systems
            Database Storage:
            Files & Pages
15-445/645 FALL 2024                    PROF. ANDY PAVLO
                                             ADMINISTRIVIA
                                Homework #1 is due September 8th @ 11:59pm
                                                 LAST CLASS
                                We now understand what a database looks like at a
                                logical level and how to write queries to read/write
                                data (e.g., using SQL).
                                          TODAY'S AGENDA
                                Background
                                File Storage
                                Page Layout
                                Tuple Layout
                                DB Flash Talk: Neon
                                    DISK-BASED ARCHITECTURE
                                The DBMS assumes that the primary storage
                                location of the database is on non-volatile disk.
                                            STORAGE HIERARCHY
                                                                       Faster
                                                     CPU Registers
                                                                      Smaller
                                                                     Expensive
                                                     CPU Caches
                            Volatile
                         Random Access
                        Byte-Addressable             DRAM
                           Non-Volatile                SSD
                        Sequential Access
                        Block-Addressable
                                                       HDD
                                                                      Slower
                                                 Network Storage      Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                          Faster
                                            28
                                            CPU
                                                        CPU Registers
                                                                         Smaller
                                                  CPU                   Expensive
                                                        CPU Caches
Memory DRAM
SSD
                                Disk                      HDD
                                                                         Slower
                                                    Network Storage      Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
Memory DRAM
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
                                       Memory             DRAM
                                                    Persistent Memory
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
                                       Memory             DRAM
                                                    Persistent Memory
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
                                       Memory             DRAM
                                                    Persistent Memory
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
                                       Memory             DRAM
                                                    Persistent Memory
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
                                       Memory             DRAM
                                                        CXL Type 3
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                         STORAGE HIERARCHY
                                                                            Faster
                                            28
                                            CPU
                                                         CPU Registers
                                                                           Smaller
                                                  CPU                     Expensive
                                                         CPU Caches
                                       Memory             DRAM
                                                        CXL Type 3
                                                           SSD
                                                   Fast Network Storage
                                Disk                       HDD
                                                                           Slower
                                                    Network Storage        Larger
5-445/645 (Fall 2024)
                                            ACCESS TIMES
                                Latency Numbers Every Programmer Should Know
                                                    1 ns L1 Cache Ref
                                                   4 ns L2 Cache Ref
                                                 100 ns DRAM
                                               16,000 ns SSD
                                           2,000,000 ns HDD
                                         ~50,000,000 ns Network Storage
                                        1,000,000,000 ns Tape Archives
                                                                               Source: Colin Scott
                                                                                      Colin Scott
                                              ACCESS TIMES
                                Latency Numbers Every Programmer Should Know
                                            1 ns L1 Cache Ref     1 sec
                                           4 ns L2 Cache Ref      4 sec
                                         100 ns DRAM              100 sec
                                       16,000 ns SSD              4.4 hours
                                   2,000,000 ns HDD               3.3 weeks
                                 ~50,000,000 ns Network Storage   1.5 years
                                1,000,000,000 ns Tape Archives    31.7 years
                                                                               Source: Colin Scott
                                                                                      Colin Scott
                                                             DISK-ORIENTED DBMS
                                                                                                                       Lectures #13-14
                                                                                          Get Page #2
                                 Lecture #6                                                                               Execution
                                                                                                                          Engine
                                                                                            Pointer to Page #2
                                 Buffer Pool
                                                 Directory   Header
                                                                                                                     Interpret Page #2 layout
                                                                 2                        Update Page #2
                        Memory
                                                                               Lecture #6
                                 Database File
1 2 3 4 5 … Pages
                         Disk
5-445/645 (Fall 2024)
                                         DATABASE STORAGE
                                Problem #1: How the DBMS represents the   ← Today
                                database in files on disk.
                                                 FILE STORAGE
                                The DBMS stores a database as one or more files on
                                disk typically in a proprietary format.
                                → The OS does not know anything about the contents of
                                  these files.
                                → We will discuss portable file formats next week…
                                             STORAGE MANAGER
                                The storage manager is responsible for maintaining
                                a database's files.
                                → Some do their own scheduling for reads and writes to
                                  improve spatial and temporal locality of pages.
                                               DATABASE PAGES
                                A page is a fixed-size block of data.
                                → It can contain tuples, meta-data, indexes, log records…
                                → Most systems do not mix page types.
                                → Some systems require a page to be self-contained.
                                             DATABASE PAGES
                        There are three different notions of   Default DB Page Sizes
                        "pages" in a DBMS:
                        → Hardware Page (usually 4KB)           4KB
                        → OS Page (usually 4KB, x64 2MB/1GB)
                        → Database Page (512B-32KB)
                                                                            HEAP FILE
                                      A heap file is an unordered collection of pages with
                                      tuples that are stored in random order.
                                      → Create / Get / Write / Delete Page
                                      → Must also support iterating over all pages.
                        Get Page #2
                                                                                                             …
5-445/645 (Fall 2024)
                                                     HEAP FILE
                                A heap file is an unordered collection of pages with
                                tuples that are stored in random order.
                                → Create / Get / Write / Delete Page
                                → Must also support iterating over all pages.
                                       Page
              Get Page #23           Directory
                                           TODAY'S AGENDA
                                File Storage
                                Page Layout
                                Tuple Layout
                                                      PAGE HEADER
                        Every page contains a header of meta-       Page
                        data about the page's contents.
                        →       Page Size                           Header
                        →       Checksum
                        →       DBMS Version
                        →       Transaction Visibility              Data
                        →       Compression / Encoding Meta-data
                        →       Schema Information
                        →       Data Summary / Sketches
                                                     PAGE LAYOUT
                                 For any page storage architecture, we now need to
                                 decide how to organize the data inside of the page.
                                   → We are still assuming that we are only storing tuples in a
                          Lecture #5 row-oriented storage model.
                                                     PAGE LAYOUT
                                 For any page storage architecture, we now need to
                                 decide how to organize the data inside of the page.
                                   → We are still assuming that we are only storing tuples in a
                          Lecture #5 row-oriented storage model.
                                     TUPLE-ORIENTED STORAGE
                        How to store tuples in a page?            Page
                        Strawman Idea: Keep track of the      Num Tuples = 0
                        number of tuples in a page and then
                        just append a new tuple to the end.
                                      TUPLE-ORIENTED STORAGE
                        How to store tuples in a page?             Page
                        Strawman Idea: Keep track of the       Num Tuples = 30
                        number of tuples in a page and then       Tuple #1
                        just append a new tuple to the end.
                                                                  Tuple #2
                        → What happens if we delete a tuple?
                                                                  Tuple #3
                                      TUPLE-ORIENTED STORAGE
                        How to store tuples in a page?              Page
                        Strawman Idea: Keep track of the       Num Tuples = 230
                        number of tuples in a page and then       Tuple #1
                        just append a new tuple to the end.
                        → What happens if we delete a tuple?
                                                                  Tuple #3
                                      TUPLE-ORIENTED STORAGE
                        How to store tuples in a page?             Page
                        Strawman Idea: Keep track of the       Num Tuples = 30
                        number of tuples in a page and then       Tuple #1
                        just append a new tuple to the end.
                                                                  Tuple #4
                        → What happens if we delete a tuple?
                                                                  Tuple #3
                                      TUPLE-ORIENTED STORAGE
                        How to store tuples in a page?              Page
                        Strawman Idea: Keep track of the        Num Tuples = 30
                        number of tuples in a page and then        Tuple #1
                        just append a new tuple to the end.
                                                                   Tuple #4
                        → What happens if we delete a tuple?
                        → What happens if we have a variable-      Tuple #3
                          length attribute?
                                                  SLOTTED PAGES
                        The most common layout scheme is                           Slot Array
                        called slotted pages.                                   1 2 3 4 5 6 7
                                                                       Header
                        The slot array maps "slots" to the
                        tuples' starting position offsets.
                                                  SLOTTED PAGES
                        The most common layout scheme is                           Slot Array
                        called slotted pages.                                   1 2 3 4 5 6 7
                                                                       Header
                        The slot array maps "slots" to the
                        tuples' starting position offsets.
                                                  SLOTTED PAGES
                        The most common layout scheme is                           Slot Array
                        called slotted pages.                                   1 2 3 4 5 6 7
                                                                       Header
                        The slot array maps "slots" to the
                        tuples' starting position offsets.
                                                  SLOTTED PAGES
                        The most common layout scheme is                           Slot Array
                        called slotted pages.                                   1 2 3 4 5 6 7
                                                                       Header
                        The slot array maps "slots" to the
                        tuples' starting position offsets.
                                                  SLOTTED PAGES
                        The most common layout scheme is                           Slot Array
                        called slotted pages.                                   1 2 3 4 5 6 7
                                                                       Header
                        The slot array maps "slots" to the
                        tuples' starting position offsets.
                                                  SLOTTED PAGES
                        The most common layout scheme is                           Slot Array
                        called slotted pages.                                   1 2 3 4 5 6 7
                                                                       Header
                        The slot array maps "slots" to the
                        tuples' starting position offsets.
                                                    RECORD IDS
                        The DBMS assigns each logical tuple a
                        unique record identifier that
                                                                               CTID (6-bytes)
                        represents its physical location in the
                        database.
                        → File Id, Page Id, Slot #
                        → Most DBMSs do not store ids in tuple.
                        → SQLite uses ROWID as the true primary              ROWID (8-bytes)
                                       ROWID
                                           TODAY'S AGENDA
                                File Storage
                                Page Layout
                                Tuple Layout
                                                TUPLE LAYOUT
                                A tuple is essentially a sequence of bytes.
                                → These bytes do not have to be contiguous.
                                                  TUPLE HEADER
                        Each tuple is prefixed with a header               Tuple
                        that contains meta-data about it.         Header   Attribute Data
                        → Visibility info (concurrency control)
                        → Bit Map for NULL values.
                                                TUPLE DATA
                        Attributes are typically stored in the                Tuple
                        order that you specify them when you     Header   a   b   c   d   e
                        create the table.
                                                CONCLUSION
                                Database is organized in pages.
                                Different ways to track pages.
                                Different ways to store pages.
                                Different ways to store tuples.
                                               NEXT CLASS
                                Log-Structured Storage
                                Index-Organized Storage
                                Value Representation
                                Catalogs