DB STORAGE 3

LAST CLASS

We discussed alternatives to the tuple-oriented storage scheme:
→ Log-structured storage
→ Index-organized storage

These approaches are ideal for write-heavy (INSERT/UPDATE/DELETE) workloads.
But the most important operation for an application may be read (SELECT) performance…
DATABASE WORKLOADS

On-Line Transaction Processing (OLTP)
→ Fast operations that only read/update a small amount of data each time.

On-Line Analytical Processing (OLAP)
→ Complex queries that read a lot of data to compute aggregates.

Hybrid Transaction + Analytical Processing (HTAP)
→ OLTP + OLAP together on the same database instance.
DATABASE WORKLOADS

[Chart: Operation Complexity (Simple to Complex) vs. Workload Focus (Write-Heavy to Read-Heavy). OLTP sits in the simple/write-heavy corner, OLAP in the complex/read-heavy corner, and HTAP spans the region between them. Source: Mike Stonebraker]
WIKIPEDIA EXAMPLE

CREATE TABLE useracct (
  userID INT PRIMARY KEY,
  userName VARCHAR UNIQUE,
  ⋮
);

CREATE TABLE pages (
  pageID INT PRIMARY KEY,
  title VARCHAR UNIQUE,
  latest INT REFERENCES revisions (revID)
);

CREATE TABLE revisions (
  revID INT PRIMARY KEY,
  userID INT REFERENCES useracct (userID),
  pageID INT REFERENCES pages (pageID),
  content TEXT,
  updated DATETIME
);
OLTP

On-Line Transaction Processing:
→ Simple queries that read/update a small amount of data that is related to a single entity in the database.

This is usually the kind of application that people build first.

SELECT P.* FROM pages AS P
 WHERE P.pageID = ?

UPDATE useracct
   SET lastLogin = NOW(),
       hostname = ?
 WHERE userID = ?

INSERT INTO revisions
VALUES (?,?,…,?)
OLAP

On-Line Analytical Processing:
→ Complex queries that read large portions of the database spanning multiple entities.

You execute these workloads on the data you have collected from your OLTP application(s).

SELECT COUNT(U.lastLogin),
       EXTRACT(month FROM U.lastLogin) AS month
  FROM useracct AS U
 WHERE U.hostname LIKE '%.gov'
 GROUP BY EXTRACT(month FROM U.lastLogin)
STORAGE MODELS

A DBMS's storage model specifies how it physically organizes tuples on disk and in memory.
→ Can have different performance characteristics based on the target workload (OLTP vs. OLAP).
→ Influences the design choices of the rest of the DBMS.

Choice #1: N-ary Storage Model (NSM)
Choice #2: Decomposition Storage Model (DSM)
Choice #3: Hybrid Storage Model (PAX)
N-ARY STORAGE MODEL (NSM)

The DBMS stores (almost) all attributes for a single tuple contiguously in a single page.
→ Also known as a "row store".

Ideal for OLTP workloads where queries are more likely to access individual entities and execute write-heavy operations.

NSM database page sizes are typically some constant multiple of 4 KB hardware pages.
→ Examples: Oracle (4 KB), Postgres (8 KB), MySQL (16 KB)
NSM: PHYSICAL ORGANIZATION

A disk-oriented NSM system stores a tuple's fixed-length and variable-length attributes contiguously in a single slotted page.

The tuple's record id (page#, slot#) is how the DBMS uniquely identifies a physical tuple.

[Figure: a logical table with columns A/B/C and rows #0 through #5 (a0 b0 c0 … a5 b5 c5) maps into a slotted database page: a page header and slot array at the front, with each slot pointing to a full tuple (e.g., header | a0 b0 c0).]
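To make this concrete, here is a minimal Python sketch of a slotted page. The names and layout are illustrative, not any particular DBMS's on-disk format; real pages grow the slot array and the tuple data toward each other from opposite ends, which this sketch ignores.

class SlottedPage:
    def __init__(self, page_id):
        self.page_id = page_id
        self.slots = []              # slot array: slot# -> (offset, length)
        self.data = bytearray()      # tuple storage area

    def insert(self, tuple_bytes):
        # Append the tuple's bytes and record its location in the slot array.
        offset = len(self.data)
        self.data.extend(tuple_bytes)
        self.slots.append((offset, len(tuple_bytes)))
        slot_num = len(self.slots) - 1
        return (self.page_id, slot_num)   # the record id: (page#, slot#)

    def get(self, slot_num):
        offset, length = self.slots[slot_num]
        return bytes(self.data[offset:offset + length])

# A record id uniquely identifies a physical tuple:
page = SlottedPage(page_id=4)
rid = page.insert(b"a0b0c0")
assert page.get(rid[1]) == b"a0b0c0"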
NSM: OLTP EXAMPLE

SELECT * FROM useracct
 WHERE userName = ?
   AND userPass = ?

INSERT INTO useracct
VALUES (?,?,…,?)

[Figure: an index (see Lecture #8) maps the userName key to a record id, which points to a tuple in an NSM disk page in the database file. Each page stores complete rows (header | userID | userName | userPass | hostname | lastLogin), so the SELECT reads one row in place and the INSERT writes one complete new row into a single page.]
NSM: OLAP EXAMPLE

SELECT COUNT(U.lastLogin),
       EXTRACT(month FROM U.lastLogin) AS month
  FROM useracct AS U
 WHERE U.hostname LIKE '%.gov'
 GROUP BY EXTRACT(month FROM U.lastLogin)

[Figure: the query only needs hostname and lastLogin, but the DBMS must scan entire NSM disk pages of full rows (header | userID | userName | userPass | hostname | lastLogin) in the database file, dragging userID, userName, and userPass into memory as useless data.]

NSM: SUMMARY

Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple (OLTP).
→ Can use index-oriented physical storage for clustering.

Disadvantages
→ Not good for scanning large portions of the table and/or a subset of the attributes.
→ Terrible memory locality in access patterns.
→ Not ideal for compression because of multiple value domains within a single page.
DECOMPOSITION STORAGE MODEL (DSM)

The DBMS stores a single attribute for all tuples contiguously in a block of data.
→ Also known as a "column store".

Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table's attributes.

The DBMS is responsible for combining/splitting a tuple's attributes when reading/writing.
DSM: PHYSICAL ORGANIZATION

Store attributes and meta-data (e.g., nulls) in separate arrays of fixed-length values.
→ Most systems identify unique physical tuples using offsets into these arrays.
→ Need to handle variable-length values…

Maintain a separate file per attribute with a dedicated header area for meta-data about the entire column.

[Figure: the same table of columns A/B/C and rows #0 through #5 is split into three files. File #1: header (null bitmap) | a0 a1 a2 a3 a4 a5. File #2: header (null bitmap) | b0 … b5. File #3: header (null bitmap) | c0 … c5.]
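A minimal sketch of the per-attribute file idea (illustrative only): each column is a fixed-length value array plus a null bitmap in its header, and a tuple is identified by the same offset in every array.

import struct

class ColumnFile:
    """One attribute for all tuples: fixed-length values + null bitmap."""
    def __init__(self, fmt="i"):         # e.g., "i" = 32-bit int values
        self.fmt = fmt
        self.values = bytearray()        # fixed-length value array
        self.nulls = []                  # null bitmap (one flag per tuple)

    def append(self, value):
        self.nulls.append(value is None)
        self.values.extend(struct.pack(self.fmt, 0 if value is None else value))

    def get(self, offset):
        if self.nulls[offset]:
            return None
        size = struct.calcsize(self.fmt)
        return struct.unpack_from(self.fmt, self.values, offset * size)[0]

# Tuple #3's attributes live at offset 3 in every column file.
col_a, col_b = ColumnFile(), ColumnFile()
for a, b in [(1, 10), (2, None), (3, 30), (4, 40)]:
    col_a.append(a)
    col_b.append(b)
assert (col_a.get(3), col_b.get(3)) == (4, 40)
assert col_b.get(1) is None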
DSM: DATABASE EXAMPLE

The DBMS stores the values of a single attribute across multiple tuples contiguously in a page.
→ Also known as a "column store".

[Figure: instead of pages of full rows, each DSM disk page in the database file holds values from one column only, e.g., pages of nothing but hostname values, with userID, userName, userPass, and lastLogin kept in their own pages.]
DSM: OLAP EXAMPLE

SELECT COUNT(U.lastLogin),
       EXTRACT(month FROM U.lastLogin) AS month
  FROM useracct AS U
 WHERE U.hostname LIKE '%.gov'
 GROUP BY EXTRACT(month FROM U.lastLogin)

[Figure: the DBMS reads only the DSM disk pages that hold hostname values (to evaluate the WHERE clause) and the pages that hold lastLogin values (to compute the aggregate), skipping every other column in the database file.]
DSM: TUPLE IDENTIFICATION

Choice #1: Fixed-length Offsets
→ Each value is the same length for an attribute, so the value at offset i in every column array belongs to the ith tuple.

Choice #2: Embedded Tuple Ids
→ Each value is stored together with its tuple id in the column.

[Figure: with offsets, columns A through D are parallel arrays indexed 0 through 3; with embedded ids, every column entry is an (id, value) pair.]
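A quick sketch of the two choices (hypothetical structures): with fixed-length offsets the position is the identity, while embedded ids cost extra space but let each column be reordered independently.

# Choice #1: fixed-length offsets -- position i in every array is tuple i.
col_a = [13, 191, 56, 92]
col_b = ["x", "y", "z", "w"]
tuple_2 = (col_a[2], col_b[2])           # (56, "z")

# Choice #2: embedded tuple ids -- each entry stores (tuple_id, value),
# so a column can be sorted independently (e.g., for better RLE).
col_a_ids = sorted([(0, 13), (1, 191), (2, 56), (3, 92)], key=lambda e: e[1])
tuple_2_a = next(v for tid, v in col_a_ids if tid == 2)   # 56
assert tuple_2 == (56, "z") and tuple_2_a == 56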
DSM: VARIABLE-LENGTH DATA

Padding variable-length fields to ensure they are fixed-length is wasteful, especially for large attributes.

A better approach is to use dictionary compression to convert repetitive variable-length data into fixed-length values (typically 32-bit integers).
→ More on this next week.
DECOMPOSITION STORAGE MODEL (DSM)

Advantages
→ Reduces the amount of wasted I/O per query because the DBMS only reads the data that it needs.
→ Faster query processing because of increased locality and cached data reuse.
→ Better data compression (more on this later).

Disadvantages
→ Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching/reorganization.
OBSERVATION

OLAP queries almost never access a single column in a table by itself.
→ At some point during query execution, the DBMS must get other columns and stitch the original tuple back together.

But we still need to store data in a columnar format to get the storage + execution benefits.

We need a columnar scheme that still stores attributes separately but keeps the data for each tuple physically close together…
PAX STORAGE MODEL

Partition Attributes Across (PAX) is a hybrid storage model that vertically partitions attributes within a database page.
→ This is what Parquet and ORC use.

The goal is to get the benefit of faster processing on columnar storage while retaining the spatial locality benefits of row storage.
PAX: PHYSICAL ORGANIZATION

Horizontally partition rows into groups. Then vertically partition their attributes into columns within each group.

The global header contains a directory with the offsets to the file's row groups.
→ This is stored in the footer if the file is immutable (Parquet, ORC).

Each row group contains its own meta-data header about its contents.

[Figure: rows #0 through #5 of columns A/B/C map into a PAX file: a global header, then Row Group 1 (header | a0 a1 a2 | b0 b1 b2 | c0 c1 c2) and Row Group 2 (header | a3 a4 a5 | b3 b4 b5 | c3 c4 c5).]
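A minimal sketch of the PAX-style layout (illustrative, not the actual Parquet or ORC format): rows are chunked into fixed-size row groups, and within each group the values are laid out column by column.

def to_pax(rows, group_size):
    """Partition rows horizontally into groups, then transpose each group
    vertically into column chunks: [[colA_chunk, colB_chunk, ...], ...]."""
    groups = []
    for start in range(0, len(rows), group_size):
        group = rows[start:start + group_size]
        # zip(*group) transposes the row chunk into per-column value lists.
        groups.append([list(col) for col in zip(*group)])
    return groups

rows = [("a0", "b0", "c0"), ("a1", "b1", "c1"), ("a2", "b2", "c2"),
        ("a3", "b3", "c3"), ("a4", "b4", "c4"), ("a5", "b5", "c5")]
groups = to_pax(rows, group_size=3)
# Row group 0 stores a0 a1 a2 | b0 b1 b2 | c0 c1 c2 contiguously:
assert groups[0][0] == ["a0", "a1", "a2"]
# A scan of column A touches only groups[i][0], yet all of tuple #4
# (a4, b4, c4) still lives together inside row group 1:
assert [groups[1][c][1] for c in range(3)] == ["a4", "b4", "c4"]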
OBSERVATION

I/O is the main bottleneck if the DBMS fetches data from disk during query execution.

The DBMS can compress pages to increase the utility of the data moved per I/O operation.

The key trade-off is speed vs. compression ratio.
→ Compressing the database reduces DRAM requirements.
→ It may decrease CPU costs during query execution.
DATABASE COMPRESSION

Goal #1: Must produce fixed-length values.
→ The only exception is var-length data stored in a separate pool.

Goal #2: Postpone decompression for as long as possible during query execution.
→ Also known as late materialization.

Goal #3: Must be a lossless scheme.
LOSSLESS VS. LOSSY COMPRESSION

When a DBMS uses compression, it is always lossless because people don't like losing data.

Any kind of lossy compression must be performed at the application level.
COMPRESSION GRANULARITY

Choice #1: Block-level
→ Compress a block of tuples for the same table.
Choice #2: Tuple-level
→ Compress the contents of the entire tuple (NSM-only).
Choice #3: Attribute-level
→ Compress a single attribute within one tuple (overflow).
→ Can target multiple attributes for the same tuple.
Choice #4: Column-level
→ Compress multiple values for one or more attributes stored for multiple tuples (DSM-only).
NAÏVE COMPRESSION

Compress data using a general-purpose algorithm. The scope of compression is only based on the data provided as input.
→ LZO (1996), LZ4 (2011), Snappy (2011), Oracle OZIP (2014), Zstd (2015)

Considerations:
→ Computational overhead
→ Compress vs. decompress speed
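A minimal sketch of block-level naïve compression using Python's built-in zlib (standing in for LZ4/Snappy/Zstd): the algorithm sees only raw bytes, with no knowledge of tuple boundaries or value semantics.

import zlib

# A fake page: the compressor knows nothing about rows or columns.
page = b"header|1,Andy,99999|2,Matt,88888|" * 100
compressed = zlib.compress(page, 6)       # level 6: speed/ratio trade-off

# The DBMS must decompress the whole block before reading any tuple in it.
assert zlib.decompress(compressed) == page
print(len(page), "->", len(compressed), "bytes")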
MYSQL INNODB COMPRESSION

[Figure: the database file stores compressed pages ([1,2,4,8] KB), each carrying a "mod log". Writes append to a page's mod log without recompressing the page; reads bring the compressed page into the buffer pool and decompress it into an uncompressed 16 KB frame. Source: MySQL 5.7 Documentation]
NAÏVE COMPRESSION

The DBMS must decompress data first before it can be read and (potentially) modified.
→ This limits the "scope" of the compression scheme.

These schemes also do not consider the high-level meaning or semantics of the data.
OBSERVATION

Ideally, we want the DBMS to operate on compressed data without decompressing it first.

[Figure: "Database Magic!" The query SELECT * FROM users WHERE name = 'Andy' is rewritten as SELECT * FROM users WHERE name = XX and runs directly against the compressed table, where the rows (Andy, 99999) and (DJ 2PL, 88888) are stored as (XX, AA) and (YY, BB).]
COLUMNAR COMPRESSION
Run-length Encoding
Bit-Packing Encoding
Bitmap Encoding
Delta Encoding
Incremental Encoding
Dictionary Encoding
RUN-LENGTH ENCODING

Compress runs of the same value in a single column into triplets:
→ The value of the attribute.
→ The start position in the column segment.
→ The # of elements in the run.

Requires the columns to be sorted intelligently to maximize compression opportunities.
RUN-LENGTH ENCODING: EXAMPLE

Each RLE triplet is (Value, Offset, Length).

Original Data        Compressed Data
id | isDead          id | isDead
 1 | Y                1 | (Y,0,3)
 2 | Y                2 | (N,3,1)
 3 | Y                3 | (Y,4,1)
 4 | N                4 | (N,5,1)
 6 | Y                6 | (Y,6,2)
 7 | N                7 |
 8 | Y                8 |
 9 | Y                9 |

A query like the following can be answered directly from the triplets by summing run lengths per value:

SELECT isDead, COUNT(*)
  FROM users
 GROUP BY isDead
RUN-LENGTH ENCODING: SORTED

Sorting the column first (all Y rows before all N rows) collapses the same data into just two triplets:

Original Data (sorted)   Compressed Data
id | isDead              id | isDead
 1 | Y                    1 | (Y,0,6)
 2 | Y                    2 | (N,6,2)
 3 | Y                    3 |
 6 | Y                    6 |
 8 | Y                    8 |
 9 | Y                    9 |
 4 | N                    4 |
 7 | N                    7 |
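A minimal sketch of the triplet scheme from these slides: encode produces (value, offset, length) triplets, and an aggregate like the GROUP BY above is answered by summing run lengths without decompressing.

def rle_encode(column):
    """Compress a column into (value, start_offset, run_length) triplets."""
    triplets = []
    for offset, value in enumerate(column):
        if triplets and triplets[-1][0] == value:
            v, start, length = triplets[-1]
            triplets[-1] = (v, start, length + 1)    # extend the current run
        else:
            triplets.append((value, offset, 1))      # start a new run
    return triplets

def rle_count_group_by(triplets):
    """SELECT value, COUNT(*) ... GROUP BY value, straight off the triplets."""
    counts = {}
    for value, _, length in triplets:
        counts[value] = counts.get(value, 0) + length
    return counts

unsorted = ["Y", "Y", "Y", "N", "Y", "N", "Y", "Y"]
assert rle_encode(unsorted) == [("Y", 0, 3), ("N", 3, 1), ("Y", 4, 1),
                                ("N", 5, 1), ("Y", 6, 2)]
assert rle_count_group_by(rle_encode(unsorted)) == {"Y": 6, "N": 2}
# Sorting first collapses the column to just two triplets:
assert rle_encode(sorted(unsorted, reverse=True)) == [("Y", 0, 6), ("N", 6, 2)]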
BIT PACKING

If the values for an integer attribute are smaller than the range of the attribute's declared data type, reduce the number of bits used to represent each value.

Use bit-shifting tricks to operate on multiple values in a single word.

Example: the values 13, 191, 56, 92, 81, 120, 231, 172 are declared as int32, but every value fits in 8 bits.
→ Original: 8 × 32 bits = 256 bits (e.g., 13 = 00000000 00000000 00000000 00001101)
→ Compressed: 8 × 8 bits = 64 bits (e.g., 13 = 00001101)
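A minimal sketch of bit packing for the example above: int32 values that all fit in 8 bits are packed into one byte each, and explicit shifts show how several values share a single word.

values = [13, 191, 56, 92, 81, 120, 231, 172]   # declared int32, fit in 8 bits

width_bits = max(values).bit_length()            # 8 bits is enough here
assert width_bits <= 8

# Pack: 8 x 32 bits = 256 bits shrinks to 8 x 8 bits = 64 bits.
packed = bytes(values)                           # one byte per value
assert len(packed) * 8 == 64

# Offsets are still fixed-length, just narrower, so indexing still works:
assert packed[1] == 191

# The same idea with explicit shifts: four 8-bit values in one 32-bit word.
word = values[0] | (values[1] << 8) | (values[2] << 16) | (values[3] << 24)
assert (word >> 8) & 0xFF == 191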
MOSTLY ENCODING

A variation of bit packing for when an attribute's values are "mostly" less than the largest size: store them with a smaller data type.
→ The remaining values that cannot be compressed are stored in their raw form in a separate (offset, value) table.

Example: the int32 values 13, 191, 99999999, 92, 81, 120, 231, 172 are stored as mostly8, with the outlier kept out-of-line as (offset=3, value=99999999) and a placeholder left in its slot.
→ Original: 8 × 32 bits = 256 bits
→ Compressed: (8 × 8 bits) + 16 bits + 32 bits = 112 bits

Source: Redshift Documentation
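A sketch of the mostly8 idea following the description above (the sentinel-code design is an assumption of this sketch, not Redshift's actual implementation; offsets here are 0-based):

SENTINEL = 0xFF    # reserve one code to mean "look in the exception table"

def mostly8_encode(values):
    packed, exceptions = bytearray(), {}
    for offset, v in enumerate(values):
        if 0 <= v < SENTINEL:            # fits in the smaller data type
            packed.append(v)
        else:                            # store the raw value out-of-line
            packed.append(SENTINEL)
            exceptions[offset] = v
    return packed, exceptions

def mostly8_get(packed, exceptions, offset):
    return exceptions[offset] if packed[offset] == SENTINEL else packed[offset]

vals = [13, 191, 99999999, 92, 81, 120, 231, 172]
packed, exc = mostly8_encode(vals)
assert exc == {2: 99999999}              # only the outlier is stored raw
assert mostly8_get(packed, exc, 2) == 99999999
assert mostly8_get(packed, exc, 1) == 191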
BITMAP ENCODING

Store a separate bitmap for each unique value of an attribute, where an offset in the vector corresponds to a tuple.
→ The ith position in the bitmap corresponds to the ith tuple in the table.
→ Typically segmented into chunks to avoid allocating large blocks of contiguous memory.

Only practical if the value cardinality is low.

Some DBMSs provide bitmap indexes.
BITMAP ENCODING: EXAMPLE

Original Data        Compressed Data
id | isDead          id | Y  N
 1 | Y                1 | 1  0
 2 | Y                2 | 1  0
 3 | Y                3 | 1  0
 4 | N                4 | 0  1
 6 | Y                6 | 1  0
 7 | N                7 | 0  1
 8 | Y                8 | 1  0
 9 | Y                9 | 1  0

Original: 8 tuples × 8 bits = 64 bits
Compressed: (2 values × 8 bits) + (8 tuples × 2 bits) = 32 bits
BITMAP ENCODING: WHEN IT FAILS

Assume we have 10 million tuples, and there are 43,000 zip codes in the US.
→ Raw column: 10,000,000 × 32 bits = 40 MB
→ Bitmaps: 10,000,000 × 43,000 bits ≈ 53.75 GB

Every time the application inserts a new tuple, the DBMS must extend 43,000 different bitmaps.

CREATE TABLE customer (
  id INT PRIMARY KEY,
  name VARCHAR(32),
  email VARCHAR(64),
  address VARCHAR(64),
  zip_code INT
);
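A minimal sketch of bitmap encoding: one bitmap per distinct value, where bit i corresponds to tuple i (Python integers stand in for bit vectors).

def bitmap_encode(column):
    """Build {value: bitmap} where bit i is set iff tuple i has that value."""
    bitmaps = {}
    for i, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << i)
    return bitmaps

col = ["Y", "Y", "Y", "N", "Y", "N", "Y", "Y"]
bm = bitmap_encode(col)
assert bm["N"] == 0b00101000               # tuples 3 and 5 (bit 0 = tuple 0)
# COUNT(*) ... WHERE isDead = 'Y' is just a population count:
assert bin(bm["Y"]).count("1") == 6
# Note that every INSERT must touch the bitmap of every distinct value,
# which is why 43,000 zip codes make this scheme impractical.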
DELTA ENCODING

Record the difference between values that follow each other in the same column.
→ Store the base value in-line or in a separate look-up table.
→ Combine with RLE to get even better compression ratios.

Original Data     Delta Encoded      Delta + RLE
time64 | temp     time64 | temp      time64 | temp
12:00  | 99.5     12:00  | 99.5      12:00  | 99.5
12:01  | 99.4     +1     | -0.1      (+1,4) | -0.1
12:02  | 99.5     +1     | +0.1             | +0.1
12:03  | 99.6     +1     | +0.1             | +0.1
12:04  | 99.4     +1     | -0.2             | -0.2

For the time64 column:
5 × 64 bits = 320 bits  →  64 bits + (4 × 16 bits) = 128 bits  →  64 bits + (2 × 16 bits) = 96 bits
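A minimal sketch of delta encoding for the time64 column above, with the RLE pass over the deltas noted at the end:

def delta_encode(column):
    """Keep the base value in-line, then store successive differences."""
    base, deltas = column[0], []
    for prev, cur in zip(column, column[1:]):
        deltas.append(cur - prev)
    return base, deltas

def delta_decode(base, deltas):
    out = [base]
    for d in deltas:
        out.append(out[-1] + d)
    return out

times = [720, 721, 722, 723, 724]   # 12:00..12:04 as minutes since midnight
base, deltas = delta_encode(times)
assert (base, deltas) == (720, [1, 1, 1, 1])
assert delta_decode(base, deltas) == times
# The deltas form one constant run, so RLE collapses them to a single
# (+1, 4) pair, matching the 320 -> 96 bit reduction on the slide.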
DICTIONARY COMPRESSION

Replace frequent values with smaller fixed-length codes and then maintain a mapping (dictionary) from the codes to the original values.
→ Typically, one code per attribute value.
→ The most widely used native compression scheme in DBMSs.

The ideal dictionary scheme supports fast encoding and decoding for both point and range queries.
DICTIONARY: EXAMPLE

SELECT * FROM users          SELECT * FROM users
 WHERE name = 'Andy'          WHERE name = 30

Original Data   Compressed Data   Dictionary
name            name              value     | code
Andrea          10                Andrea    | 10
Prashanth       20                Prashanth | 20
Andy            30                Andy      | 30
Matt            40                Matt      | 40
Prashanth       20
DICTIONARY: ENCODING / DECODING
A dictionary needs to support two operations:
→ Encode/Locate: For a given uncompressed value,
convert it into its compressed form.
→ Decode/Extract: For a given compressed value, convert
it back into its original form.
DICTIONARY: ORDER-PRESERVING

The encoded values need to support the same collation (sort order) as the original values.

SELECT * FROM users             SELECT * FROM users
 WHERE name LIKE 'And%'          WHERE name BETWEEN 10 AND 20

Original Data   Compressed Data   Dictionary (sorted)
name            name              value     | code
Andrea          10                Andrea    | 10
Prashanth       40                Andy      | 20
Andy            20                Matt      | 30
Matt            30                Prashanth | 40
Prashanth       40
ORDER-PRESERVING ENCODING

SELECT name FROM users
 WHERE name LIKE 'And%'
→ Still must perform a scan on the column to find the matching tuples.

SELECT DISTINCT name
  FROM users
 WHERE name LIKE 'And%'
→ Only needs to access the dictionary: with a sorted dictionary, the distinct values matching the prefix are exactly the entries in a contiguous code range.

(Uses the same sorted dictionary as the previous slide.)
DICTIONARY: DATA STRUCTURES

Choice #1: Array
→ One array of variable-length strings and another array with pointers that map to string offsets.
→ Expensive to update, so only usable in immutable files.

Choice #2: Hash Table
→ Fast and compact.
→ Unable to support range and prefix queries.

Choice #3: B+Tree
→ Slower than a hash table and takes more memory.
→ Can support range and prefix queries.
DICTIONARY: ARRAY

First sort the values and then store them sequentially in a byte array.
→ Also store the size of each value if they are variable-length.

Replace the original data with dictionary codes that are the (byte) offsets into this array.

Original Data   Compressed Data   Dictionary byte array (len|val)
name            name              6|Andrea 4|Andy 4|Matt 3|Wan
Andrea          0                 (offsets: Andrea=0, Andy=7, Matt=12, Wan=17)
Wan             17
Andy            7
Matt            12
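A minimal sketch of the sorted-array dictionary above: codes are byte offsets into a length-prefixed byte array, and because the values are sorted, code order matches string order, so range and prefix predicates can be rewritten on the codes.

class ArrayDictionary:
    """Order-preserving dictionary: code = byte offset into a sorted array."""
    def __init__(self, values):
        self.buf, self.codes = bytearray(), {}
        for v in sorted(set(values)):       # sorted => codes preserve order
            self.codes[v] = len(self.buf)   # the code is the byte offset
            data = v.encode()
            self.buf.append(len(data))      # 1-byte length prefix
            self.buf.extend(data)

    def encode(self, value):                # Encode/Locate: value -> code
        return self.codes[value]

    def decode(self, code):                 # Decode/Extract: code -> value
        length = self.buf[code]
        return self.buf[code + 1:code + 1 + length].decode()

d = ArrayDictionary(["Andrea", "Wan", "Andy", "Matt"])
assert d.encode("Andrea") == 0 and d.encode("Andy") == 7
assert d.encode("Matt") == 12 and d.encode("Wan") == 17
assert d.decode(7) == "Andy"
# name LIKE 'And%' becomes a code range scan: because the dictionary is
# sorted, all values with the 'And' prefix occupy a contiguous code range
# (here, codes 0 through 7).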
CONCLUSION

It is important to choose the right storage model for the target workload:
→ OLTP = Row Store
→ OLAP = Column Store

DBMSs can combine different compression approaches for even better compression.

Dictionary encoding is probably the most useful scheme because it does not require pre-sorting.
