Cassandra Introduction

This document provides an introduction and overview of Apache Cassandra. It discusses Cassandra's history starting at Facebook, key design principles around simplicity and handling failures, data modeling differences compared to relational databases, the CQL query language, accessing Cassandra using various drivers and tools, cluster setup and sizes, and the future of Cassandra including new features in version 3.0. The presentation encourages participants to get involved in the open source Cassandra community.

Uploaded by

Nikhil Erande


Introduction to Apache Cassandra

1
Me

Robert Stupp 
Freelancer, Coder, Architect 
@snazy snazy@snazy.de

Contributor to Apache Cassandra, 

3.0 UDFs (CASSANDRA-7395 + related)

Databases, Network, Backend

2
Agenda
Apache Cassandra History

Design Principles

Outstanding differences

CQL Intro

Access C*

Clusters

Cassandra Future

3
Apache Cassandra
History

4
Apache Cassandra 
started at Facebook

inspired by Amazon Dynamo and Google Bigtable

Note: Facebook initially had 

two data centers.

5
2.1 released in Sep 2014

6
Apache Cassandra
Design Principles

7
Hardware failures 
can and will occur!

Cassandra handles failures.

From single node to whole data center.
From client to server.
8
The complicated part 
when learning Cassandra,

is to understand
Cassandra’s simplicity

9
Keep it simple
all nodes are equal

master-less architecture

no name nodes

no SPOF (single point of failure)

no read before modify 

(prevent race conditions)

10
Keep it running

No need to take cluster down … e.g.

during maintenance

during software update

Rolling restart is your friend

11
Outstanding
Differences

12
Cassandra

Highly scalable 
runs with a few nodes 
up to clusters of 1000+ nodes!

Linear scalability (proven!)

Multi datacenter aware (world-wide!)

No SPOF

13
Cassandra @ Apple

14
Linear Scalability

15
Scaling Cassandra

More data? 
-> add more nodes

Faster access? 
-> add more nodes

16
Read / Write
performance

Reads are fast

Writes are even faster

17
Durability

Writes are durable - period.

18
Availability @
Netflix
Chaos 
Monkey

kills nodes randomly

19
Availability @
Netflix

Chaos 
Gorilla

kills regions randomly

20
Availability @
Netflix

Chaos 
Kong

kills whole data centers

21
Availability @
Netflix

http://de.slideshare.net/planetcassandra/active-active-c-behind-the-scenes-at-netflix
22
32 node cluster (Raspberry Pis)
@DataStax

23
Most outstanding
Great documentation

Many blog posts

Many presentations

Many videos

Regular webinars

Huge, active and healthy community

24
Data Distribution

25
DHT

Data is organized in a

 
"Distributed Hash Table"

(hash over row key)

26
DHT

[Figure: ring of numbered nodes]
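
The idea of hashing the row key onto a ring of nodes can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the eight-node ring, the evenly spaced tokens, the 32-bit token space, the use of MD5 and the names `token_for` and `owner_of` are all assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical ring: 8 nodes with evenly spaced tokens in a 32-bit token space.
TOKEN_SPACE = 2 ** 32
NODE_COUNT = 8
RING_TOKENS = [i * TOKEN_SPACE // NODE_COUNT for i in range(NODE_COUNT)]


def token_for(row_key: str) -> int:
    """Hash the row key to a token (MD5 is only a stand-in hash here)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # value in [0, 2**32)


def owner_of(row_key: str) -> int:
    """Return the index of the node that owns the key.

    Each node owns the token range that ends at its own token, so the owner
    is the first node whose token is >= the key's token, wrapping around to
    node 0 past the end of the ring.
    """
    position = bisect.bisect_left(RING_TOKENS, token_for(row_key))
    return position % NODE_COUNT


if __name__ == "__main__":
    for key in ("alice", "bob", "carol"):
        print(key, "-> node", owner_of(key))
```

Because the owner depends only on the hash of the key, any node can compute where a row lives without asking a master.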

27
Replication

28
Replication Factor 2
[Figure: ring of numbered nodes showing the replicas of Row A and Row B]

29
Replication Factor 3
[Figure: ring of numbered nodes showing the replicas of Row A and Row B]
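
A rough sketch of how a replication factor turns one owning node into several replicas. It assumes the simplest possible placement (the owner plus the next nodes clockwise on an eight-node ring); the function name `replicas_for` is hypothetical, and real placement strategies, especially data-center-aware ones, are more involved.

```python
from typing import List


def replicas_for(primary: int, replication_factor: int, node_count: int = 8) -> List[int]:
    """Return the primary node plus the next nodes clockwise on the ring."""
    if not 1 <= replication_factor <= node_count:
        raise ValueError("replication factor must be between 1 and node_count")
    # Walk clockwise from the primary, wrapping around at the end of the ring.
    return [(primary + i) % node_count for i in range(replication_factor)]


if __name__ == "__main__":
    print(replicas_for(0, 2))  # [0, 1]
    print(replicas_for(7, 3))  # [7, 0, 1]
```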

30
Consistency

Consistency defined per request

Several consistency levels (CLs) 

for different needs

31
Eventual consistency

is not
hopefully consistent

EC means there’s a time gap until updates

are consistently readable

32
Consistency Levels
ANY (only for writes)

ONE, LOCAL_ONE,

TWO, THREE (not recommended)

ALL (not recommended)

QUORUM, LOCAL_QUORUM, EACH_QUORUM

SERIAL, LOCAL_SERIAL

33
Consistency

Data is always replicated

CL defines how many replicas must

fulfill the request
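
As a hedged illustration of "how many replicas must fulfill the request", the sketch below maps a few consistency levels to a replica count for a single data center. The function name `required_replicas` is made up for this example; ANY, the SERIAL levels and the data-center-aware levels are deliberately left out.

```python
def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Number of replicas that must answer for the request to succeed."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    if consistency_level not in levels:
        raise ValueError("unsupported consistency level: " + consistency_level)
    required = levels[consistency_level]
    if required > replication_factor:
        raise ValueError("consistency level needs more replicas than exist")
    return required


if __name__ == "__main__":
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, required_replicas(level, replication_factor=3))
```

With a replication factor of 3, QUORUM needs 2 replicas, so one replica can be down without failing the request.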

34
Write
[Figure: a write arriving at the ring of numbered nodes]

35
Write
[Figure: a write arriving at the ring of numbered nodes]

36
Multi DC setup
[Figure: two data centers, DC 1 and DC 2]

37
Multi DC replication
[Figure: a write in a cluster spanning DC 1 and DC 2]

38
Multi DC replication
[Figure: a write in a cluster spanning DC 1 and DC 2]

39
Multi DC replication
[Figure: a write in a cluster spanning DC 1 and DC 2]
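
For a multi-DC cluster, the quorum-style consistency levels differ in which replicas they count. The sketch below is an illustration under assumptions: the data center names, the per-DC replication factors and the helper names are invented for the example.

```python
from typing import Dict


def quorum(replicas: int) -> int:
    """Strict majority of the given number of replicas."""
    return replicas // 2 + 1


def quorum_total(rf_per_dc: Dict[str, int]) -> int:
    """QUORUM: a majority of all replicas, counted across every data center."""
    return quorum(sum(rf_per_dc.values()))


def local_quorum(rf_per_dc: Dict[str, int], local_dc: str) -> int:
    """LOCAL_QUORUM: a majority of the replicas in the coordinator's data center."""
    return quorum(rf_per_dc[local_dc])


def each_quorum(rf_per_dc: Dict[str, int]) -> Dict[str, int]:
    """EACH_QUORUM: a majority of the replicas in every data center."""
    return {dc: quorum(rf) for dc, rf in rf_per_dc.items()}


if __name__ == "__main__":
    rf = {"DC1": 3, "DC2": 3}  # hypothetical: 3 replicas per data center
    print(quorum_total(rf))         # 4
    print(local_quorum(rf, "DC1"))  # 2
    print(each_quorum(rf))          # {'DC1': 2, 'DC2': 2}
```

LOCAL_QUORUM keeps the request inside one data center, which is why it is a common choice when the data centers are far apart.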

40
Replication & 
Consistency

Define # of replicas 
using replication factor

Define required consistency 

per request
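
How replication factor and per-request consistency interact can be checked with a common rule of thumb: a read is guaranteed to touch at least one replica that acknowledged the latest write when read replicas plus write replicas exceed the replication factor. The function name below is hypothetical, and the sketch ignores failures during the write.

```python
def read_overlaps_write(replication_factor: int, write_replicas: int, read_replicas: int) -> bool:
    """True if every read set must share at least one replica with every write set."""
    for count in (write_replicas, read_replicas):
        if not 1 <= count <= replication_factor:
            raise ValueError("replica counts must be between 1 and the replication factor")
    return write_replicas + read_replicas > replication_factor


if __name__ == "__main__":
    # Replication factor 3, QUORUM writes (2) and QUORUM reads (2): overlap guaranteed.
    print(read_overlaps_write(3, 2, 2))  # True
    # Replication factor 3, ONE writes and ONE reads: reads may lag behind writes.
    print(read_overlaps_write(3, 1, 1))  # False
```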

41
CQL Introduction

CQL = Cassandra query language

42
“CQL is SQL 
minus joins, 
minus subqueries, 
plus collections” 
 
(plus user types, 
plus tuple types)

43
Why CQL?

Introduces a schema to Cassandra

Familiar syntax

Easy to understand

DML operations are atomic

44
Data model 
(hierarchical view)
Keyspace (schema)

Table (column family)

Row

partition key (part of primary key)

static columns

clustering key (part of primary key)

columns
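
The hierarchy above can be pictured as nested maps. The toy model below is an assumption-laden sketch, not how Cassandra stores data: the class name `Table` and its methods are invented. It shows a partition key selecting a partition, static columns shared by the whole partition, and rows kept sorted by clustering key.

```python
import bisect
from typing import Any, Dict, List, Tuple


class Table:
    """Toy table: partition key -> (static columns, rows sorted by clustering key)."""

    def __init__(self) -> None:
        self._partitions: Dict[Any, Dict[str, Any]] = {}

    def _partition(self, partition_key: Any) -> Dict[str, Any]:
        # Create the partition on first use.
        return self._partitions.setdefault(
            partition_key, {"static": {}, "keys": [], "rows": {}}
        )

    def set_static(self, partition_key: Any, column: str, value: Any) -> None:
        """Static columns hold one value per partition, shared by all its rows."""
        self._partition(partition_key)["static"][column] = value

    def upsert(self, partition_key: Any, clustering_key: Any, columns: Dict[str, Any]) -> None:
        """Insert or update a row without reading it first."""
        partition = self._partition(partition_key)
        if clustering_key not in partition["rows"]:
            bisect.insort(partition["keys"], clustering_key)  # keep rows ordered
            partition["rows"][clustering_key] = {}
        partition["rows"][clustering_key].update(columns)

    def read_partition(self, partition_key: Any) -> List[Tuple[Any, Dict[str, Any]]]:
        """Return the rows of one partition in clustering-key order."""
        partition = self._partitions.get(partition_key)
        if partition is None:
            return []
        return [
            (key, {**partition["static"], **partition["rows"][key]})
            for key in partition["keys"]
        ]


if __name__ == "__main__":
    posts = Table()
    posts.set_static("alice", "email", "alice@example.com")
    posts.upsert("alice", 2, {"text": "second"})
    posts.upsert("alice", 1, {"text": "first"})
    print(posts.read_partition("alice"))
```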

45
CQL / DDL

Similar to SQL

CREATE TABLE …

ALTER TABLE …

DROP TABLE …

46
CQL / DML

Similar to SQL

INSERT …

UPDATE …

DELETE …

SELECT …

47
CQL / BATCH

Group related modifications 

(INSERT, UPDATE, DELETE)

Atomic operation

48
CQL types
boolean, int (32bit), bigint (64bit),

float, double,

decimal ("BigDecimal"), 
varint ("BigInteger"),

ascii, text (= varchar), blob,

inet, timestamp, uuid, timeuuid

49
CQL collection
types
list < foo >

set < foo >

map < foo , bar >

Since C* 2.1 collections can contain

any type - even other collections.

50
CQL composite
types

user types (C* 2.1) 

are composite types with named fields

tuple types (C* 2.1) 

are unstructured lists of values

51
CQL / user types

CREATE TYPE address (
    street text,
    zip int,
    city text
);

CREATE TABLE users (
    username text,
    addresses map<text, frozen<address>>,
    ...

52
Cassandra 
Data Modeling
Access by key 
no access by arbitrary WHERE clause

Duplicate data (it’s ok!)

Aggregate data

Build application maintained indexes
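
The "duplicate data" and "application maintained index" points can be sketched with plain dictionaries standing in for tables. Everything here is hypothetical (the `UserStore` class, the user fields, the two lookups); the point is only that each access pattern gets its own copy, keyed by what the query will ask for.

```python
from typing import Dict, Optional


class UserStore:
    """Toy store that keeps one copy of each user per access pattern."""

    def __init__(self) -> None:
        self._by_username: Dict[str, Dict[str, str]] = {}
        self._by_email: Dict[str, Dict[str, str]] = {}

    def save(self, username: str, email: str, city: str) -> None:
        user = {"username": username, "email": email, "city": city}
        # Duplicate the data on purpose: one copy per lookup key.
        self._by_username[username] = dict(user)
        self._by_email[email] = dict(user)

    def find_by_username(self, username: str) -> Optional[Dict[str, str]]:
        """Access by key; there is no scan over arbitrary columns."""
        return self._by_username.get(username)

    def find_by_email(self, email: str) -> Optional[Dict[str, str]]:
        """Second access pattern, served by the application-maintained copy."""
        return self._by_email.get(email)


if __name__ == "__main__":
    store = UserStore()
    store.save("alice", "alice@example.com", "Cologne")
    print(store.find_by_username("alice"))
    print(store.find_by_email("alice@example.com"))
```

The cost of this design is that the application has to keep the copies in sync, for example by removing the old entry when a user changes their email address.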

53
RDBMS modeling

54
C* modeling

55
Data Modeling 
with RDBMS
Driven by

"How can I store

something right?"
"What answers 
do I have?"
56
Data Modeling 
with NoSQL
Driven by

"How can I access

something right?" 
 

"What questions 
do I have?"
57
Data Modeling
Basics

Work top-down. Think about:

What does the application do?

What are the access patterns?

Now design data model

58
Data Modeling

http://de.slideshare.net/planetcassandra/cassandra-day-sv-2014-fundamentals-of-apache-cassandra-data-modeling

http://de.slideshare.net/planetcassandra/data-modeling-with-travis-price

59
Accessing
Cassandra

60
Command Line

cqlsh 
CQL shell

nodetool 
node/cluster administration

61
GUI: DevCenter

Visual query tool

62
Stress test?

Cassandra 2.1 comes with improved

stress tool

Simulate read+write workload

Uses configurable data

Works against older C* versions, too

63
DataStax APLv2 
Open Source Drivers
for Java

for Python

for C#

for Scala / Spark

https://github.com/datastax/
or http://www.datastax.com/download
64
Native protocol

C*’s own net protocol for clients

Request multiplexing

Schema change notifications

Cluster change notifications

65
Third Party Drivers

for huge number of languages

66
Mappers

High level mappers exist at least for

Java

Special case: Scala 

due to its strong+complex type
model (DataStax OSS Spark driver)

67
Spark + Hadoop

Yes - works really well

Note: Spark is often quoted as up to 100x faster than Hadoop MapReduce (for in-memory workloads)

68
Clusters

69
Cluster sizes

C* works with a few nodes

C* works with several hundred /

thousand nodes

70
Cluster setup

Configure for multiple data centers

Plan for multi-DC setup :)

71
Cluster experience

Remember: A single Cassandra

cluster works over multiple data
centers all over the world

"Disaster proven"

Hurricanes

Amazon DC outages

72
Apache Cassandra 
Future

73
Cassandra 3.0 
(in development)
User Defined Functions
Aggregate functions
Functional indexes
(Subject to change!!!)

Workload recording + playback

Better SSTables, Fully off-heap row cache, Better

serial consistency

Indexes w/ high cardinality

74
Get active!

75
Cassandra Community

http://cassandra.apache.org/

http://planetcassandra.org/ - Blog

http://www.slideshare.net/planetcassandra/presentations

http://de.slideshare.net/DataStax/presentations

76
Cassandra Community
https://www.youtube.com/user/PlanetCassandra

https://www.youtube.com/user/DataStax

http://www.datastax.com/dev/blog/

http://www.datastax.com/docs/

Users Mailing List 

user@cassandra.apache.org

77
Free C* Training!

http://planetcassandra.org/cassandra-training/
78
Get involved!

Ask questions, 
submit RFEs or experiences to

user mailing list

user@cassandra.apache.org

Answers arrive quickly!

79
Live Demo
User Defined Functions

80
C* 3.0 UDFs

Users create functions using 

CREATE FUNCTION … 
LANGUAGE …  
AS …

Java, JavaScript, Scala, Groovy,

JRuby, Jython

Functions work on all nodes

81
C* 3.0 UDFs

Example

CREATE FUNCTION sin(input double) 

RETURNS double 
LANGUAGE javascript 
AS 'Math.sin(input)';

This is JavaScript!

82
UDFs for what?
Targeted for C* 3.0

Own aggregation code - e.g. 

SELECT sum(value) FROM table 
WHERE …;

Functional indexes - e.g. 

CREATE INDEX idx 
ON table ( myFunction(colname) );

83
Thanks 
for your attention

Download Apache Cassandra at

http://cassandra.apache.org/

Robert Stupp 
@snazy 
snazy@snazy.de 
de.slideshare.net/RobertStupp
84
Q & A

85
86
BACKUP SLIDES
User-Defined-Functions 
Demo

87
