DB STORAGE 3

LAST CLASS

We discussed alternatives to the tuple-oriented storage scheme:
→ Log-structured storage
→ Index-organized storage

These approaches are ideal for write-heavy (INSERT/UPDATE/DELETE) workloads.
But the most important operation for an application may be read (SELECT) performance…
DATABASE WORKLOADS

On-Line Transaction Processing (OLTP)
→ Fast operations that only read/update a small amount of data each time.

On-Line Analytical Processing (OLAP)
→ Complex queries that read a lot of data to compute aggregates.

Hybrid Transaction + Analytical Processing (HTAP)
→ OLTP + OLAP together on the same database instance.
DATABASE WORKLOADS

[Chart: Operation Complexity (Simple to Complex) vs. Workload Focus (Write-Heavy to Read-Heavy). OLTP sits in the simple/write-heavy corner, OLAP in the complex/read-heavy corner, and HTAP spans the region between them. Source: Mike Stonebraker]
WIKIPEDIA EXAMPLE

CREATE TABLE useracct (
  userID INT PRIMARY KEY,
  userName VARCHAR UNIQUE,
  ⋮
);

CREATE TABLE pages (
  pageID INT PRIMARY KEY,
  title VARCHAR UNIQUE,
  latest INT REFERENCES revisions (revID)
);

CREATE TABLE revisions (
  revID INT PRIMARY KEY,
  userID INT REFERENCES useracct (userID),
  pageID INT REFERENCES pages (pageID),
  content TEXT,
  updated DATETIME
);
OLTP

On-Line Transaction Processing:
→ Simple queries that read/update a small amount of data that is related to a single entity in the database.

This is usually the kind of application that people build first.

SELECT P.* FROM pages AS P
 WHERE P.pageID = ?

UPDATE useracct
   SET lastLogin = NOW(),
       hostname = ?
 WHERE userID = ?

INSERT INTO revisions
VALUES (?,?,…,?)
OLAP

On-Line Analytical Processing:
→ Complex queries that read large portions of the database spanning multiple entities.

You execute these workloads on the data you have collected from your OLTP application(s).

SELECT COUNT(U.lastLogin),
       EXTRACT(month FROM U.lastLogin) AS month
  FROM useracct AS U
 WHERE U.hostname LIKE '%.gov'
 GROUP BY EXTRACT(month FROM U.lastLogin)
STORAGE MODELS

A DBMS's storage model specifies how it physically organizes tuples on disk and in memory.
→ Can have different performance characteristics based on the target workload (OLTP vs. OLAP).
→ Influences the design choices of the rest of the DBMS.

Choice #1: N-ary Storage Model (NSM)
Choice #2: Decomposition Storage Model (DSM)
Choice #3: Hybrid Storage Model (PAX)
N-ARY STORAGE MODEL (NSM)

The DBMS stores (almost) all attributes for a single tuple contiguously in a single page.
→ Also known as a "row store".

Ideal for OLTP workloads where queries are more likely to access individual entities and execute write-heavy operations.

NSM database page sizes are typically some constant multiple of 4 KB hardware pages.
→ Examples: Oracle (4 KB), Postgres (8 KB), MySQL (16 KB)
NSM: PHYSICAL ORGANIZATION

A disk-oriented NSM system stores a tuple's fixed-length and variable-length attributes contiguously in a single slotted page.

The tuple's record id (page#, slot#) is how the DBMS uniquely identifies a physical tuple.

[Figure: a logical table with columns A/B/C and rows #0 through #5 (a0 b0 c0 … a5 b5 c5) maps into a slotted database page: a page header and slot array at the front, with each slot pointing to a full tuple (e.g., header | a0 b0 c0).]
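To make this concrete, here is a minimal Python sketch of a slotted page. The names and layout are illustrative, not any particular DBMS's on-disk format; real pages grow the slot array and the tuple data toward each other from opposite ends, which this sketch ignores.

class SlottedPage:
    def __init__(self, page_id):
        self.page_id = page_id
        self.slots = []              # slot array: slot# -> (offset, length)
        self.data = bytearray()      # tuple storage area

    def insert(self, tuple_bytes):
        # Append the tuple's bytes and record its location in the slot array.
        offset = len(self.data)
        self.data.extend(tuple_bytes)
        self.slots.append((offset, len(tuple_bytes)))
        slot_num = len(self.slots) - 1
        return (self.page_id, slot_num)   # the record id: (page#, slot#)

    def get(self, slot_num):
        offset, length = self.slots[slot_num]
        return bytes(self.data[offset:offset + length])

# A record id uniquely identifies a physical tuple:
page = SlottedPage(page_id=4)
rid = page.insert(b"a0b0c0")
assert page.get(rid[1]) == b"a0b0c0"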
NSM: OLTP EXAMPLE

SELECT * FROM useracct
 WHERE userName = ?
   AND userPass = ?

INSERT INTO useracct
VALUES (?,?,…,?)

[Figure: an index (see Lecture #8) maps the userName key to a record id, which points to a tuple in an NSM disk page in the database file. Each page stores complete rows (header | userID | userName | userPass | hostname | lastLogin), so the SELECT reads one row in place and the INSERT writes one complete new row into a single page.]
NSM: OLAP EXAMPLE

SELECT COUNT(U.lastLogin),
       EXTRACT(month FROM U.lastLogin) AS month
  FROM useracct AS U
 WHERE U.hostname LIKE '%.gov'
 GROUP BY EXTRACT(month FROM U.lastLogin)

[Figure: the query only needs hostname and lastLogin, but the DBMS must scan entire NSM disk pages of full rows (header | userID | userName | userPass | hostname | lastLogin) in the database file, dragging userID, userName, and userPass into memory as useless data.]

NSM: SUMMARY

Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple (OLTP).
→ Can use index-oriented physical storage for clustering.

Disadvantages
→ Not good for scanning large portions of the table and/or a subset of the attributes.
→ Terrible memory locality in access patterns.
→ Not ideal for compression because of multiple value domains within a single page.
DECOMPOSITION STORAGE MODEL (DSM)

The DBMS stores a single attribute for all tuples contiguously in a block of data.
→ Also known as a "column store".

Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table's attributes.

The DBMS is responsible for combining/splitting a tuple's attributes when reading/writing.
DSM: PHYSICAL ORGANIZATION

Store attributes and meta-data (e.g., nulls) in separate arrays of fixed-length values.
→ Most systems identify unique physical tuples using offsets into these arrays.
→ Need to handle variable-length values…

Maintain a separate file per attribute with a dedicated header area for meta-data about the entire column.

[Figure: the same table of columns A/B/C and rows #0 through #5 is split into three files. File #1: header (null bitmap) | a0 a1 a2 a3 a4 a5. File #2: header (null bitmap) | b0 … b5. File #3: header (null bitmap) | c0 … c5.]
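A minimal sketch of the per-attribute file idea (illustrative only): each column is a fixed-length value array plus a null bitmap in its header, and a tuple is identified by the same offset in every array.

import struct

class ColumnFile:
    """One attribute for all tuples: fixed-length values + null bitmap."""
    def __init__(self, fmt="i"):         # e.g., "i" = 32-bit int values
        self.fmt = fmt
        self.values = bytearray()        # fixed-length value array
        self.nulls = []                  # null bitmap (one flag per tuple)

    def append(self, value):
        self.nulls.append(value is None)
        self.values.extend(struct.pack(self.fmt, 0 if value is None else value))

    def get(self, offset):
        if self.nulls[offset]:
            return None
        size = struct.calcsize(self.fmt)
        return struct.unpack_from(self.fmt, self.values, offset * size)[0]

# Tuple #3's attributes live at offset 3 in every column file.
col_a, col_b = ColumnFile(), ColumnFile()
for a, b in [(1, 10), (2, None), (3, 30), (4, 40)]:
    col_a.append(a)
    col_b.append(b)
assert (col_a.get(3), col_b.get(3)) == (4, 40)
assert col_b.get(1) is None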
DSM: DATABASE EXAMPLE

The DBMS stores the values of a single attribute across multiple tuples contiguously in a page.
→ Also known as a "column store".

[Figure: instead of pages of full rows, each DSM disk page in the database file holds values from one column only, e.g., pages of nothing but hostname values, with userID, userName, userPass, and lastLogin kept in their own pages.]
DSM: OLAP EXAMPLE

SELECT COUNT(U.lastLogin),
       EXTRACT(month FROM U.lastLogin) AS month
  FROM useracct AS U
 WHERE U.hostname LIKE '%.gov'
 GROUP BY EXTRACT(month FROM U.lastLogin)

[Figure: the DBMS reads only the DSM disk pages that hold hostname values (to evaluate the WHERE clause) and the pages that hold lastLogin values (to compute the aggregate), skipping every other column in the database file.]
DSM: TUPLE IDENTIFICATION

Choice #1: Fixed-length Offsets
→ Each value is the same length for an attribute, so the value at offset i in every column array belongs to the ith tuple.

Choice #2: Embedded Tuple Ids
→ Each value is stored together with its tuple id in the column.

[Figure: with offsets, columns A through D are parallel arrays indexed 0 through 3; with embedded ids, every column entry is an (id, value) pair.]
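A quick sketch of the two choices (hypothetical structures): with fixed-length offsets the position is the identity, while embedded ids cost extra space but let each column be reordered independently.

# Choice #1: fixed-length offsets -- position i in every array is tuple i.
col_a = [13, 191, 56, 92]
col_b = ["x", "y", "z", "w"]
tuple_2 = (col_a[2], col_b[2])           # (56, "z")

# Choice #2: embedded tuple ids -- each entry stores (tuple_id, value),
# so a column can be sorted independently (e.g., for better RLE).
col_a_ids = sorted([(0, 13), (1, 191), (2, 56), (3, 92)], key=lambda e: e[1])
tuple_2_a = next(v for tid, v in col_a_ids if tid == 2)   # 56
assert tuple_2 == (56, "z") and tuple_2_a == 56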
DSM: VARIABLE-LENGTH DATA

Padding variable-length fields to ensure they are fixed-length is wasteful, especially for large attributes.

A better approach is to use dictionary compression to convert repetitive variable-length data into fixed-length values (typically 32-bit integers).
→ More on this next week.
DECOMPOSITION STORAGE MODEL (DSM)

Advantages
→ Reduces the amount of wasted I/O per query because the DBMS only reads the data that it needs.
→ Faster query processing because of increased locality and cached data reuse.
→ Better data compression (more on this later).

Disadvantages
→ Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching/reorganization.
OBSERVATION

OLAP queries almost never access a single column in a table by itself.
→ At some point during query execution, the DBMS must get other columns and stitch the original tuple back together.

But we still need to store data in a columnar format to get the storage + execution benefits.

We need a columnar scheme that still stores attributes separately but keeps the data for each tuple physically close together…
PAX STORAGE MODEL

Partition Attributes Across (PAX) is a hybrid storage model that vertically partitions attributes within a database page.
→ This is what Parquet and ORC use.

The goal is to get the benefit of faster processing on columnar storage while retaining the spatial locality benefits of row storage.
PAX: PHYSICAL ORGANIZATION

Horizontally partition rows into groups. Then vertically partition their attributes into columns within each group.

The global header contains a directory with the offsets to the file's row groups.
→ This is stored in the footer if the file is immutable (Parquet, ORC).

Each row group contains its own meta-data header about its contents.

[Figure: rows #0 through #5 of columns A/B/C map into a PAX file: a global header, then Row Group 1 (header | a0 a1 a2 | b0 b1 b2 | c0 c1 c2) and Row Group 2 (header | a3 a4 a5 | b3 b4 b5 | c3 c4 c5).]
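A minimal sketch of the PAX-style layout (illustrative, not the actual Parquet or ORC format): rows are chunked into fixed-size row groups, and within each group the values are laid out column by column.

def to_pax(rows, group_size):
    """Partition rows horizontally into groups, then transpose each group
    vertically into column chunks: [[colA_chunk, colB_chunk, ...], ...]."""
    groups = []
    for start in range(0, len(rows), group_size):
        group = rows[start:start + group_size]
        # zip(*group) transposes the row chunk into per-column value lists.
        groups.append([list(col) for col in zip(*group)])
    return groups

rows = [("a0", "b0", "c0"), ("a1", "b1", "c1"), ("a2", "b2", "c2"),
        ("a3", "b3", "c3"), ("a4", "b4", "c4"), ("a5", "b5", "c5")]
groups = to_pax(rows, group_size=3)
# Row group 0 stores a0 a1 a2 | b0 b1 b2 | c0 c1 c2 contiguously:
assert groups[0][0] == ["a0", "a1", "a2"]
# A scan of column A touches only groups[i][0], yet all of tuple #4
# (a4, b4, c4) still lives together inside row group 1:
assert [groups[1][c][1] for c in range(3)] == ["a4", "b4", "c4"]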
OBSERVATION

I/O is the main bottleneck if the DBMS fetches data from disk during query execution.

The DBMS can compress pages to increase the utility of the data moved per I/O operation.

The key trade-off is speed vs. compression ratio.
→ Compressing the database reduces DRAM requirements.
→ It may decrease CPU costs during query execution.
DATABASE COMPRESSION

Goal #1: Must produce fixed-length values.
→ The only exception is var-length data stored in a separate pool.

Goal #2: Postpone decompression for as long as possible during query execution.
→ Also known as late materialization.

Goal #3: Must be a lossless scheme.
LOSSLESS VS. LOSSY COMPRESSION

When a DBMS uses compression, it is always lossless because people don't like losing data.

Any kind of lossy compression must be performed at the application level.
COMPRESSION GRANULARITY

Choice #1: Block-level
→ Compress a block of tuples for the same table.
Choice #2: Tuple-level
→ Compress the contents of the entire tuple (NSM-only).
Choice #3: Attribute-level
→ Compress a single attribute within one tuple (overflow).
→ Can target multiple attributes for the same tuple.
Choice #4: Column-level
→ Compress multiple values for one or more attributes stored for multiple tuples (DSM-only).
NAÏVE COMPRESSION

Compress data using a general-purpose algorithm. The scope of compression is only based on the data provided as input.
→ LZO (1996), LZ4 (2011), Snappy (2011), Oracle OZIP (2014), Zstd (2015)

Considerations:
→ Computational overhead
→ Compress vs. decompress speed
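A minimal sketch of block-level naïve compression using Python's built-in zlib (standing in for LZ4/Snappy/Zstd): the algorithm sees only raw bytes, with no knowledge of tuple boundaries or value semantics.

import zlib

# A fake page: the compressor knows nothing about rows or columns.
page = b"header|1,Andy,99999|2,Matt,88888|" * 100
compressed = zlib.compress(page, 6)       # level 6: speed/ratio trade-off

# The DBMS must decompress the whole block before reading any tuple in it.
assert zlib.decompress(compressed) == page
print(len(page), "->", len(compressed), "bytes")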
MYSQL INNODB COMPRESSION

[Figure: the database file stores compressed pages ([1,2,4,8] KB), each carrying a "mod log". Writes append to a page's mod log without recompressing the page; reads bring the compressed page into the buffer pool and decompress it into an uncompressed 16 KB frame. Source: MySQL 5.7 Documentation]
NAÏVE COMPRESSION

The DBMS must decompress data first before it can be read and (potentially) modified.
→ This limits the "scope" of the compression scheme.

These schemes also do not consider the high-level meaning or semantics of the data.
OBSERVATION

Ideally, we want the DBMS to operate on compressed data without decompressing it first.

[Figure: "Database Magic!" The query SELECT * FROM users WHERE name = 'Andy' is rewritten as SELECT * FROM users WHERE name = XX and runs directly against the compressed table, where the rows (Andy, 99999) and (DJ 2PL, 88888) are stored as (XX, AA) and (YY, BB).]
COLUMNAR COMPRESSION
Run-length Encoding
Bit-Packing Encoding
Bitmap Encoding
Delta Encoding
Incremental Encoding
Dictionary Encoding
RUN-LENGTH ENCODING

Compress runs of the same value in a single column into triplets:
→ The value of the attribute.
→ The start position in the column segment.
→ The # of elements in the run.

Requires the columns to be sorted intelligently to maximize compression opportunities.
RUN-LENGTH ENCODING: EXAMPLE

Each RLE triplet is (Value, Offset, Length).

Original Data        Compressed Data
id | isDead          id | isDead
 1 | Y                1 | (Y,0,3)
 2 | Y                2 | (N,3,1)
 3 | Y                3 | (Y,4,1)
 4 | N                4 | (N,5,1)
 6 | Y                6 | (Y,6,2)
 7 | N                7 |
 8 | Y                8 |
 9 | Y                9 |

A query like the following can be answered directly from the triplets by summing run lengths per value:

SELECT isDead, COUNT(*)
  FROM users
 GROUP BY isDead
RUN-LENGTH ENCODING: SORTED

Sorting the column first (all Y rows before all N rows) collapses the same data into just two triplets:

Original Data (sorted)   Compressed Data
id | isDead              id | isDead
 1 | Y                    1 | (Y,0,6)
 2 | Y                    2 | (N,6,2)
 3 | Y                    3 |
 6 | Y                    6 |
 8 | Y                    8 |
 9 | Y                    9 |
 4 | N                    4 |
 7 | N                    7 |
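A minimal sketch of the triplet scheme from these slides: encode produces (value, offset, length) triplets, and an aggregate like the GROUP BY above is answered by summing run lengths without decompressing.

def rle_encode(column):
    """Compress a column into (value, start_offset, run_length) triplets."""
    triplets = []
    for offset, value in enumerate(column):
        if triplets and triplets[-1][0] == value:
            v, start, length = triplets[-1]
            triplets[-1] = (v, start, length + 1)    # extend the current run
        else:
            triplets.append((value, offset, 1))      # start a new run
    return triplets

def rle_count_group_by(triplets):
    """SELECT value, COUNT(*) ... GROUP BY value, straight off the triplets."""
    counts = {}
    for value, _, length in triplets:
        counts[value] = counts.get(value, 0) + length
    return counts

unsorted = ["Y", "Y", "Y", "N", "Y", "N", "Y", "Y"]
assert rle_encode(unsorted) == [("Y", 0, 3), ("N", 3, 1), ("Y", 4, 1),
                                ("N", 5, 1), ("Y", 6, 2)]
assert rle_count_group_by(rle_encode(unsorted)) == {"Y": 6, "N": 2}
# Sorting first collapses the column to just two triplets:
assert rle_encode(sorted(unsorted, reverse=True)) == [("Y", 0, 6), ("N", 6, 2)]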
BIT PACKING

If the values for an integer attribute are smaller than the range of the attribute's declared data type, reduce the number of bits used to represent each value.

Use bit-shifting tricks to operate on multiple values in a single word.

Example: the values 13, 191, 56, 92, 81, 120, 231, 172 are declared as int32, but every value fits in 8 bits.
→ Original: 8 × 32 bits = 256 bits (e.g., 13 = 00000000 00000000 00000000 00001101)
→ Compressed: 8 × 8 bits = 64 bits (e.g., 13 = 00001101)
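A minimal sketch of bit packing for the example above: int32 values that all fit in 8 bits are packed into one byte each, and explicit shifts show how several values share a single word.

values = [13, 191, 56, 92, 81, 120, 231, 172]   # declared int32, fit in 8 bits

width_bits = max(values).bit_length()            # 8 bits is enough here
assert width_bits <= 8

# Pack: 8 x 32 bits = 256 bits shrinks to 8 x 8 bits = 64 bits.
packed = bytes(values)                           # one byte per value
assert len(packed) * 8 == 64

# Offsets are still fixed-length, just narrower, so indexing still works:
assert packed[1] == 191

# The same idea with explicit shifts: four 8-bit values in one 32-bit word.
word = values[0] | (values[1] << 8) | (values[2] << 16) | (values[3] << 24)
assert (word >> 8) & 0xFF == 191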
MOSTLY ENCODING

A variation of bit packing for when an attribute's values are "mostly" less than the largest size: store them with a smaller data type.
→ The remaining values that cannot be compressed are stored in their raw form in a separate (offset, value) table.

Example: the int32 values 13, 191, 99999999, 92, 81, 120, 231, 172 are stored as mostly8, with the outlier kept out-of-line as (offset=3, value=99999999) and a placeholder left in its slot.
→ Original: 8 × 32 bits = 256 bits
→ Compressed: (8 × 8 bits) + 16 bits + 32 bits = 112 bits

Source: Redshift Documentation
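A sketch of the mostly8 idea following the description above (the sentinel-code design is an assumption of this sketch, not Redshift's actual implementation; offsets here are 0-based):

SENTINEL = 0xFF    # reserve one code to mean "look in the exception table"

def mostly8_encode(values):
    packed, exceptions = bytearray(), {}
    for offset, v in enumerate(values):
        if 0 <= v < SENTINEL:            # fits in the smaller data type
            packed.append(v)
        else:                            # store the raw value out-of-line
            packed.append(SENTINEL)
            exceptions[offset] = v
    return packed, exceptions

def mostly8_get(packed, exceptions, offset):
    return exceptions[offset] if packed[offset] == SENTINEL else packed[offset]

vals = [13, 191, 99999999, 92, 81, 120, 231, 172]
packed, exc = mostly8_encode(vals)
assert exc == {2: 99999999}              # only the outlier is stored raw
assert mostly8_get(packed, exc, 2) == 99999999
assert mostly8_get(packed, exc, 1) == 191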
BITMAP ENCODING

Store a separate bitmap for each unique value of an attribute, where an offset in the vector corresponds to a tuple.
→ The ith position in the bitmap corresponds to the ith tuple in the table.
→ Typically segmented into chunks to avoid allocating large blocks of contiguous memory.

Only practical if the value cardinality is low.

Some DBMSs provide bitmap indexes.
BITMAP ENCODING: EXAMPLE

Original Data        Compressed Data
id | isDead          id | Y  N
 1 | Y                1 | 1  0
 2 | Y                2 | 1  0
 3 | Y                3 | 1  0
 4 | N                4 | 0  1
 6 | Y                6 | 1  0
 7 | N                7 | 0  1
 8 | Y                8 | 1  0
 9 | Y                9 | 1  0

Original: 8 tuples × 8 bits = 64 bits
Compressed: (2 values × 8 bits) + (8 tuples × 2 bits) = 32 bits
BITMAP ENCODING: WHEN IT FAILS

Assume we have 10 million tuples, and there are 43,000 zip codes in the US.
→ Raw column: 10,000,000 × 32 bits = 40 MB
→ Bitmaps: 10,000,000 × 43,000 bits ≈ 53.75 GB

Every time the application inserts a new tuple, the DBMS must extend 43,000 different bitmaps.

CREATE TABLE customer (
  id INT PRIMARY KEY,
  name VARCHAR(32),
  email VARCHAR(64),
  address VARCHAR(64),
  zip_code INT
);
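A minimal sketch of bitmap encoding: one bitmap per distinct value, where bit i corresponds to tuple i (Python integers stand in for bit vectors).

def bitmap_encode(column):
    """Build {value: bitmap} where bit i is set iff tuple i has that value."""
    bitmaps = {}
    for i, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << i)
    return bitmaps

col = ["Y", "Y", "Y", "N", "Y", "N", "Y", "Y"]
bm = bitmap_encode(col)
assert bm["N"] == 0b00101000               # tuples 3 and 5 (bit 0 = tuple 0)
# COUNT(*) ... WHERE isDead = 'Y' is just a population count:
assert bin(bm["Y"]).count("1") == 6
# Note that every INSERT must touch the bitmap of every distinct value,
# which is why 43,000 zip codes make this scheme impractical.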
DELTA ENCODING

Record the difference between values that follow each other in the same column.
→ Store the base value in-line or in a separate look-up table.
→ Combine with RLE to get even better compression ratios.

Original Data     Delta Encoded      Delta + RLE
time64 | temp     time64 | temp      time64 | temp
12:00  | 99.5     12:00  | 99.5      12:00  | 99.5
12:01  | 99.4     +1     | -0.1      (+1,4) | -0.1
12:02  | 99.5     +1     | +0.1             | +0.1
12:03  | 99.6     +1     | +0.1             | +0.1
12:04  | 99.4     +1     | -0.2             | -0.2

For the time64 column:
5 × 64 bits = 320 bits  →  64 bits + (4 × 16 bits) = 128 bits  →  64 bits + (2 × 16 bits) = 96 bits
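A minimal sketch of delta encoding for the time64 column above, with the RLE pass over the deltas noted at the end:

def delta_encode(column):
    """Keep the base value in-line, then store successive differences."""
    base, deltas = column[0], []
    for prev, cur in zip(column, column[1:]):
        deltas.append(cur - prev)
    return base, deltas

def delta_decode(base, deltas):
    out = [base]
    for d in deltas:
        out.append(out[-1] + d)
    return out

times = [720, 721, 722, 723, 724]   # 12:00..12:04 as minutes since midnight
base, deltas = delta_encode(times)
assert (base, deltas) == (720, [1, 1, 1, 1])
assert delta_decode(base, deltas) == times
# The deltas form one constant run, so RLE collapses them to a single
# (+1, 4) pair, matching the 320 -> 96 bit reduction on the slide.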
DICTIONARY COMPRESSION

Replace frequent values with smaller fixed-length codes and then maintain a mapping (dictionary) from the codes to the original values.
→ Typically, one code per attribute value.
→ The most widely used native compression scheme in DBMSs.

The ideal dictionary scheme supports fast encoding and decoding for both point and range queries.
DICTIONARY: EXAMPLE

SELECT * FROM users          SELECT * FROM users
 WHERE name = 'Andy'          WHERE name = 30

Original Data   Compressed Data   Dictionary
name            name              value     | code
Andrea          10                Andrea    | 10
Prashanth       20                Prashanth | 20
Andy            30                Andy      | 30
Matt            40                Matt      | 40
Prashanth       20
DICTIONARY: ENCODING / DECODING
A dictionary needs to support two operations:
→ Encode/Locate: For a given uncompressed value,
convert it into its compressed form.
→ Decode/Extract: For a given compressed value, convert
it back into its original form.
DICTIONARY: ORDER-PRESERVING

The encoded values need to support the same collation (sort order) as the original values.

SELECT * FROM users             SELECT * FROM users
 WHERE name LIKE 'And%'          WHERE name BETWEEN 10 AND 20

Original Data   Compressed Data   Dictionary (sorted)
name            name              value     | code
Andrea          10                Andrea    | 10
Prashanth       40                Andy      | 20
Andy            20                Matt      | 30
Matt            30                Prashanth | 40
Prashanth       40
ORDER-PRESERVING ENCODING

SELECT name FROM users
 WHERE name LIKE 'And%'
→ Still must perform a scan on the column to find the matching tuples.

SELECT DISTINCT name
  FROM users
 WHERE name LIKE 'And%'
→ Only needs to access the dictionary: with a sorted dictionary, the distinct values matching the prefix are exactly the entries in a contiguous code range.

(Uses the same sorted dictionary as the previous slide.)
DICTIONARY: DATA STRUCTURES

Choice #1: Array
→ One array of variable-length strings and another array with pointers that map to string offsets.
→ Expensive to update, so only usable in immutable files.

Choice #2: Hash Table
→ Fast and compact.
→ Unable to support range and prefix queries.

Choice #3: B+Tree
→ Slower than a hash table and takes more memory.
→ Can support range and prefix queries.
DICTIONARY: ARRAY

First sort the values and then store them sequentially in a byte array.
→ Also store the size of each value if they are variable-length.

Replace the original data with dictionary codes that are the (byte) offsets into this array.

Original Data   Compressed Data   Dictionary byte array (len|val)
name            name              6|Andrea 4|Andy 4|Matt 3|Wan
Andrea          0                 (offsets: Andrea=0, Andy=7, Matt=12, Wan=17)
Wan             17
Andy            7
Matt            12
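A minimal sketch of the sorted-array dictionary above: codes are byte offsets into a length-prefixed byte array, and because the values are sorted, code order matches string order, so range and prefix predicates can be rewritten on the codes.

class ArrayDictionary:
    """Order-preserving dictionary: code = byte offset into a sorted array."""
    def __init__(self, values):
        self.buf, self.codes = bytearray(), {}
        for v in sorted(set(values)):       # sorted => codes preserve order
            self.codes[v] = len(self.buf)   # the code is the byte offset
            data = v.encode()
            self.buf.append(len(data))      # 1-byte length prefix
            self.buf.extend(data)

    def encode(self, value):                # Encode/Locate: value -> code
        return self.codes[value]

    def decode(self, code):                 # Decode/Extract: code -> value
        length = self.buf[code]
        return self.buf[code + 1:code + 1 + length].decode()

d = ArrayDictionary(["Andrea", "Wan", "Andy", "Matt"])
assert d.encode("Andrea") == 0 and d.encode("Andy") == 7
assert d.encode("Matt") == 12 and d.encode("Wan") == 17
assert d.decode(7) == "Andy"
# name LIKE 'And%' becomes a code range scan: because the dictionary is
# sorted, all values with the 'And' prefix occupy a contiguous code range
# (here, codes 0 through 7).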
CONCLUSION

It is important to choose the right storage model for the target workload:
→ OLTP = Row Store
→ OLAP = Column Store

DBMSs can combine different compression approaches for even better compression.

Dictionary encoding is probably the most useful scheme because it does not require pre-sorting.
