Valentina.Crisan@academyofdata.com
Throughout the course we will work with 2 data sets:
MovieLens data set (real data): 6k users, 4k movies, 1 million ratings - we copy only 10k
entries
Weather data (randomly generated): 500k entries
Movielens data details:
Movies.csv extract
mid title genres year
1 Toy Story (1995) {Animation,Children's,Comedy} 1995
2 Jumanji (1995) {Adventure,Children's,Fantasy} 1995
3 Grumpier Old Men (1995) {Comedy,Romance} 1995
4 Waiting to Exhale (1995) {Comedy,Drama} 1995
5 Father of the Bride Part II (1995) {Comedy} 1995
Users.csv extract
uid gender age occupation zip
1 F 1 10 48067
2 M 56 16 70072
3 M 25 15 55117
4 M 45 7 0246
Ratings2.csv
ruid rmid rating timestamp
1 1193 5 978300760
1 661 3 978302109
1 914 3 978301968
1 3408 4 978300275
1 2355 5 978824291
Denormalized data - Ratings.csv data
uid age gender occupation zip rating datetime mid title year genres
5234 25 F 0 60657 1 2000-06-19 20:40:40 31 Dangerous Minds (1995) 1995 {Drama}
1835 25 M 19 11501 3 2000-11-22 08:11:31 31 Dangerous Minds (1995) 1995 {Drama}
6035 25 F 1 78734 2 2000-04-26 01:25:35 31 Dangerous Minds (1995) 1995 {Drama}
4637 35 F 9 48302 3 2000-07-19 18:48:41 31 Dangerous Minds (1995) 1995 {Drama}
3841 45 M 18 26101 3 2000-08-11 13:14:31 31 Dangerous Minds (1995) 1995 {Drama}
2242 18 M 4 60115 5 2000-11-19 02:40:43 31 Dangerous Minds (1995) 1995 {Drama}
weather-data.csv
w_id | event_time | temperature
------------+---------------------------------+-------------
PNOSCPQCNX | 2015-01-01 00:02:54.000000+0000 | 20
PNOSCPQCNX | 2015-01-01 00:10:00.000000+0000 | 23
PNOSCPQCNX | 2015-01-01 00:29:53.000000+0000 | 20
PNOSCPQCNX | 2015-01-01 01:04:22.000000+0000 | 12
PNOSCPQCNX | 2015-01-01 01:17:15.000000+0000 | 18
PNOSCPQCNX | 2015-01-01 02:52:25.000000+0000 | 23
PNOSCPQCNX | 2015-01-01 03:09:17.000000+0000 | 11
PNOSCPQCNX | 2015-01-01 03:14:57.000000+0000 | 13
PNOSCPQCNX | 2015-01-01 04:13:33.000000+0000 | 14
PNOSCPQCNX | 2015-01-01 04:47:56.000000+0000 | 13
All the above files can be found in the data directory
(both cloud and local installations).
cd /data
ls -l
Note: in the intro part we will play a bit with different
data types and insert data manually, before we start
working with the data sets above.
CQL exercises
Reference for syntax questions:
http://docs.datastax.com/en/cql/3.3/cql/cql_using/useAboutCQL.html
Run cqlsh
1. Create a Keyspace - for local installations use
metro_systems; for cloud installations use your name
(e.g. my keyspace is valentina_c), since all cloud users need
to create a separate working space. Use that name
instead of metro_systems in the commands below.
CREATE KEYSPACE metro_systems WITH REPLICATION = { 'class' :
'SimpleStrategy' , 'replication_factor' : 1 };
Desc keyspaces // will list all keyspaces on the cluster
describe metro_systems; // you can see the Replication factor & strategy
set at keyspace level
Exit cqlsh and run the nodetool status command - this should give you an image of your cluster
(even if it's a 1-node cluster). Note the datacenter name, you will need it for NetworkTopologyStrategy.
Re-enter cqlsh
Let's create another keyspace, this time with the NetworkTopologyStrategy replication strategy
and a replication factor:
CREATE KEYSPACE metro_systems1 WITH REPLICATION = { 'class' :
'NetworkTopologyStrategy', 'datacenter1' : 1 };
describe metro_systems1;
Let’s alter a created keyspace and change the property durable writes:
ALTER KEYSPACE metro_systems1 WITH REPLICATION = { 'class' :
'NetworkTopologyStrategy' , 'datacenter1' : 1} and DURABLE_WRITES =
false;
OR if we want to modify the replication factor from 1 to 3:
ALTER KEYSPACE metro_systems1 WITH REPLICATION = { 'class' :
'NetworkTopologyStrategy' , 'datacenter1' : 3};
Valentina.Crisan@academyofdata.com
For cloud users (use your own keyspace name):
ALTER KEYSPACE valentina_c_1 WITH REPLICATION = { 'class' :
'NetworkTopologyStrategy' , 'datacenter1' : 3};
Desc metro_systems1
Let’s drop keyspace metro_systems1:
DROP KEYSPACE metro_systems1;
Stop: let's discuss what the durable_writes option in the keyspace description means.
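One way to check this option yourself (a minimal sketch, assuming Cassandra 3.x, where the schema is exposed through the system_schema keyspace):

```cql
-- list every keyspace with its durable_writes flag and replication settings
SELECT keyspace_name, durable_writes, replication
FROM system_schema.keyspaces;
```

With durable_writes = false, writes to that keyspace skip the commit log, so data that was only in the memtable can be lost if the node crashes.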
2. Let’s create and modify a column family
First, connect to the created keyspace: i.e. all further CQL commands will be executed in the
context of the chosen keyspace.
USE metro_systems;
2.1. Create table users with the following structure: userid (uuid), name (text), gender (text),
age (int), occupation (text), zipcode (int) and a compound primary key: partition key userid,
clustering key name.
CREATE TABLE users (
userid uuid,
name text,
gender text,
age int,
occupation text,
zipcode int,
PRIMARY KEY ((userid),name));
DESCRIBE users; // you can see all properties of the table
2.2. ALTER: Add/Delete/Rename columns, Alter Column type;
ALTER TABLE users ADD relatives int; // add a new column to the table
SELECT * FROM users;
ALTER TABLE users RENAME userid TO uid; //rename column
ALTER TABLE users ALTER relatives type blob; // alter column type from int to blob
Describe users; // to see the types of the columns
Please note: not all column type changes are possible (for example, varchar to ascii), and on
recent Cassandra versions (3.0.11 / 3.10 and later) altering a column's type is no longer
supported at all, so the command above may fail depending on your version.
ALTER TABLE users DROP relatives; //delete the new added column to the table
SELECT * FROM users;
Describe users;
DROP TABLE users; // deletes all the data of the table including the table structure &
indexes
Desc tables; // no tables
2.3. Insert values in the created table
Assuming you dropped the table users previously, let’s re-create it.
CREATE TABLE users (
uid uuid,
name text,
gender text,
age int,
occupation text,
zipcode int,
PRIMARY KEY ((uid),name));
We have a table, now let’s insert data manually. We will learn later also how to insert data
from a file, but for the moment let’s do some manual inserts/updates.
Observation: the UUID can be generated automatically with uuid ();
INSERT INTO users (uid, name, age, gender, occupation, zipcode) VALUES
(5132b130-ae79-11e4-ab27-0800200c9a66, 'Andrew', 25, 'M', 'plummer',
11501);
INSERT INTO users (uid, name, age, gender, occupation, zipcode) VALUES
(uuid(),'Dana', 24, 'F','data scientist', 11501);
SELECT * FROM users;
TRACING ON; // allows us to see the events Cassandra performs in the background.
INSERT INTO users (uid, name, age, gender, occupation, zipcode) VALUES
(5132b130-ae79-11e4-ab27-0800200c9a66, 'Lili', 24, 'F', 'actress', 11503);
// note that I am inserting data in the same partition as the first
insert. Given that the Clustering Key is different the data will be added
to the existing partition.
TRACING OFF;
SELECT * FROM users; // remember the way Cassandra stores data is
different than the CQL view.
Remember for the next section: Cassandra doesn’t read (by default) before
performing writes.
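A quick sketch of what "no read before write" means in practice, reusing the users table above: inserting twice with the same primary key does not raise a duplicate-key error as it would in a relational database; the second write simply overwrites the first.

```cql
-- two inserts with the same primary key (uid, name)
INSERT INTO users (uid, name, age) VALUES
(5132b130-ae79-11e4-ab27-0800200c9a66, 'Andrew', 25);
INSERT INTO users (uid, name, age) VALUES
(5132b130-ae79-11e4-ab27-0800200c9a66, 'Andrew', 99);

-- a single row comes back, with age = 99 (the later write wins)
SELECT age FROM users
WHERE uid = 5132b130-ae79-11e4-ab27-0800200c9a66 AND name = 'Andrew';
```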
2.4. Let's also do some upserts
UPDATE users SET age = 35 WHERE uid = 5132b130-ae79-11e4-ab27-0800200c9a66 and
name = 'Andrew'; // we update the age of an existing user
Select * from users; // to see the change
UPDATE users SET age = 35 WHERE uid = 5132b130-ae79-11e4-ab27-0800200c9a77 and
name = 'Andrew'; // this UUID - the partition key - does not exist, so the
update will perform an insert
SELECT * from users; // let’s check that everything went well
UPDATE users SET age = 44 WHERE uid = 5132b130-ae79-11e4-ab27-0800200c9a66 and
name = 'Elaine'; // the partition key exists but not with this clustering
key, so a new entry inside the existing partition will be generated
Select * from users;
Select * from users where uid = 5132b130-ae79-11e4-ab27-0800200c9a66; // to
see all entries in this partition
Let's do some more inserts. The first row inserted will have a new partition key and clustering
key. The second row will have the same partition key but a different clustering key.
insert into users (uid, name, age, gender, occupation, zipcode) values
(uuid(),'John',35,'M','data analyst',11502); // see the uuid generated here
and use it for the next insert
insert into users (uid, name, age, gender, occupation, zipcode) values
(90278d0f-6053-4c87-a1f3-69a05853d2a8,'Elaine',34,'F','engineer',11502);
Select * from users;
Insert into users (uid,name, age) values(
90278d0f-6053-4c87-a1f3-69a05853d2a8, 'Elaine',43); // see how insert can
be used as an update as well
Before we move on, type exit in cqlsh and let's see some stats for our table.
nodetool cfstats keyspace.table // generic command, see below for the metro_systems
keyspace
nodetool cfstats metro_systems.users // for cloud users use yourname.users
See that the data is still in the memtable (no SSTables generated yet); see also the estimated
number of partitions (Number of keys (estimate)).
2.5. Lightweight transactions (LWT)
Re-enter cqlsh and your respective keyspace
Use metro_systems;
Let's try an update on another non-existing entry, but this time with LWT:
UPDATE users SET age = 35 WHERE uid = 5132b130-ae79-11e4-ab27-0800200c9a88
and name = 'Andrew' IF EXISTS;
Now let’s try an update on an existing entry with LWT:
UPDATE users SET age = 44 WHERE uid = 5132b130-ae79-11e4-ab27-0800200c9a77
and name = 'Andrew' IF EXISTS;
Exit cqlsh.
Let's force a flush to write the data on disk in an SSTable. Run:
nodetool flush metro_systems users // for cloud users please make sure you use
your own keyspace
Go to /var/lib/data/cassandra/ (cd /var/lib/data/cassandra/). Choose your keyspace directory
and the directory of the table. Use ls -l to list the SSTable files generated after the flush.
If you would like to see what an SSTable looks like, you can use: sstabledump
nameofthedatafile // e.g. sstabledump mc-1-big-Data.db
sstabledump -e nameofthedatafile // will show us the partitions in the specific SSTable
STOP here for theory update. Queries section.
3. Let’s do some queries on the data - we will come back to
Queries, this is just to get us going with the syntax
cd /
Re-enter cqlsh and use your keyspace;
SELECT * FROM users where uid = 5132b130-ae79-11e4-ab27-0800200c9a66;
// You can also pick the columns to display instead of choosing all data.
SELECT age,gender FROM users where uid =
5132b130-ae79-11e4-ab27-0800200c9a66;
SELECT * FROM users LIMIT 3; // you can limit the number of returned
values
//You can fine-tune the display order using the ORDER BY clause, either descending
or ascending. The partition key must be defined in the WHERE clause and the ORDER
BY clause defines the clustering column to use for ordering. Order By works only on
clustering keys.
SELECT * FROM users where uid = 5132b130-ae79-11e4-ab27-0800200c9a66 ORDER
BY name DESC;
// You can query for a partition key that is part of a list of partition keys
SELECT * FROM users where uid IN (5132b130-ae79-11e4-ab27-0800200c9a66,
edb032e8-5c4e-403f-8dfc-43c2dcd69dba);
You can return the number of records that match a query:
SELECT COUNT(*) FROM users where uid =
5132b130-ae79-11e4-ab27-0800200c9a66;
You can ask for the distinct partitions in a table (DISTINCT works only on partition
keys):
select distinct uid from users;
And also count how many distinct partitions exist:
select distinct uid, count(*) from users;
3.1. CREATE AN INDEX
SELECT * FROM users WHERE gender = 'M'; // WILL NOT WORK
//let's create an index
CREATE INDEX users_by_gender ON users(gender);
SELECT * FROM users WHERE gender = 'F';
//let's count how many records are in the query outcome
SELECT count(*) FROM users WHERE gender = 'F';
Tracing on;
select * from users where gender = 'F'; // to see all the operations C* is doing in the
background (even for a 1 node cluster you can see loads of operations).
Can you see that there are 2 tables queried on the node in order to get
the data: the index table and the users table?
Executing single-partition query on users.users_by_gender
Executing single-partition query on users -- appears 3 times. Can you tell me why?
We will come back on this point in section Indexing in C*.
4. Collections
SETS
ALTER TABLE users ADD emails set <text>;
Let’s insert some data in the set:
insert into users (uid,name,age,emails,gender,occupation,zipcode)
values(uuid(),'Lilian',
18,{'lilian@gmail.com','lilian_briggs@oxford.com'},'F', 'student', 11503);
Select * from users;
Let’s add some data to existing entries:
UPDATE users SET emails = emails + {'elaine@yahoo.com'} WHERE uid =
89d3a708-0571-440a-8a9b-997856c17789 AND name = 'Elaine';
Let’s delete some data from existing entries:
UPDATE users SET emails = emails - {'elaine@yahoo.com'} WHERE uid =
89d3a708-0571-440a-8a9b-997856c17789 AND name = 'Elaine';
SELECT * FROM USERS;
create index on users(emails);
select * from users where emails CONTAINs 'lilian@gmail.com';
LISTS
LET’S ADD A LIST COLUMN TO TABLE USERS:
ALTER TABLE users ADD top_movies list<text>;
LET’S ADD SOME DATA TO EXISTING ENTRIES:
update users set top_movies = ['titanic','gatsby'] where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine';
Select * from users;
update users set top_movies = ['romeo and juliet']+ top_movies where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine';
Select * from users;
update users set top_movies = top_movies + ['the revenant'] where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine';
Let’s replace position 1 Titanic with Blood Diamonds:
update users set top_movies[1]='blood diamonds' where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine';
Select * from users;
Let’s delete entry 1 from the list. //blood diamonds
delete top_movies[1] from users where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine';
select name , top_movies from users where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine'; // we select only the
collection
You can index a collection, let’s query on a movie name:
create index top_movies on users(top_movies);
select * from users where top_movies contains 'gatsby';
MAPS
LET’S ADD A MAP COLUMN TO TABLE USERS:
ALTER TABLE users ADD movies map<timestamp, text>;
LET’S INSERT DATA:
update users set movies = {'2016-06-05':'rated
movie','2016-06-06':'screened trailer'} where uid =
89d3a708-0571-440a-8a9b-997856c17789 and name = 'Elaine';
Select movies from users;
You can also use a collection as (part of) a primary key, as long as it is frozen - we will try
this in section 6.1.
STOP here for theory update.
5. Copy data from/to a CSV file
Let’s start working on some real data.
5.1. Copy from a CSV file
Drop table users;
Let’s re-create table users with below syntax and import data from a CSV file.
CREATE TABLE users(
uid int,
age int,
gender text,
occupation int,
zipcode text,
PRIMARY KEY (uid));
We have the CSV files that we will be working with in the data directory (cloud or local
installations). We need Users.csv.
copy users(uid,gender,age,occupation,zipcode) from '/data/users.csv' with
header = true;
select * from users limit 10;
5.2. Copy to a CSV file
COPY users(uid,gender,age) TO '/data/yourname.csv' WITH HEADER=TRUE;
6.1 Collections as Primary Key
Did you know we can have a collection as part of the Primary Key? So if we want to
know movies by genre, we can define genres as part of the primary key.
CREATE TABLE movies_by_genres (
movieid int,
title text,
genres frozen <set<text>>,
Year int,
PRIMARY KEY (genres, movieid));
copy movies_by_genres(movieid,title,genres,year) from '/data/movies.csv' with header =
true;
Select * from movies_by_genres limit 10;
6.2. Let's also do an exercise on User Defined Types
CREATE TYPE address (
Street text,
Number int,
Flat text);
CREATE TABLE users_udt (
Uid int,
Address frozen<address>,
Primary key (uid));
// When using the frozen keyword, you cannot update parts of a
user-defined type value. The entire value must be overwritten. Cassandra
treats the value of a frozen, user-defined type like a blob.
INSERT INTO users_udt(uid,address) VALUES (1, {street : 'Pipera', number :
34, flat : '34C'});
Select * from users_udt;
//Let’s see if we can select only parts of the UDT
select address.street from users_udt where uid = 1;
// Let’s see if we can update parts of the UDT
update users_udt set address.street = 'Vitan' where uid = 1; // will not
work
Important: from Cassandra 3.6, user-defined types may be stored in a non-frozen
form, allowing individual fields to be updated and deleted in UPDATE
statements and DELETE statements, respectively (CASSANDRA-7423). Let's
test.
CREATE TABLE users_udt_1 (
Uid int,
Address address,
Primary key (uid));
INSERT INTO users_udt_1(uid,address) VALUES (1, {street : 'Pipera', number
: 34, flat : '34C'});
Select * from users_udt_1;
update users_udt_1 set address.street = 'Vitan' where uid = 1;
Select * from users_udt_1;
7. Let’s do some data deletes:
SELECT * from users limit 10;
Delete value of a cell:
select * from users where uid = 4317;
delete age from users where uid = 4317;
select * from users where uid = 4317;
Delete a partition:
delete from users where uid = 4317;
Before we move on
//see table statistics
nodetool cfstats metro_systems.users
// see number of keys = number of partitions.
nodetool cfhistograms metro_systems users
or
nodetool tablehistograms metro_systems users
Select token(uid) from users where uid=1; // we want to see which token is
allocated for this partition
Basic data model -- update theory
Exercise 8: Time series: weather stations
CREATE TABLE temperature (
w_id text,
event_time timestamp,
temperature int,
PRIMARY KEY (w_id,event_time)
) with clustering order by (event_time DESC);
Let’s import data in the temperature table:
copy temperature(w_id,event_time,temperature) from '/data/weatherdata.csv' with
header = true and maxrows = 100000;
Select * from temperature limit 30;
Let's look at the data (time & temperature) for a single weather station; what would the
query look like?
select event_time, temperature from temperature where w_id = 'PNOSCPQCNX';
What about all the temperatures between 2016-01-10 03:49:46 and 2016-01-11
16:29:59?
select temperature from temperature where w_id = 'PNOSCPQCNX' AND
event_time > '2016-01-10 03:49:46' AND event_time < '2016-01-11 16:29:59';
Average temperatures between 2016-01-10 03:49:46 and 2016-01-11 16:29:59?
select avg(temperature) from temperature where w_id = 'PNOSCPQCNX' AND
event_time > '2016-01-10 03:49:46' AND event_time < '2016-01-11 16:29:59';
Create table temperature_by_day (a solution for partitions that grow too wide):
CREATE TABLE temperature_by_day (
w_id text,
date date,
event_time timestamp,
temperature int,
PRIMARY KEY ((w_id,date),event_time)
);
Let’s import data in the new table:
copy temperature_by_day (w_id,event_time,date,temperature) from
'/data/weatherdata-ext.csv' with header = true and maxrows = 100000;
Query the weather data for a single day and then for 2 consecutive days:
SELECT * FROM temperature_by_day WHERE w_id='FFOERIRWOL' AND date='2015-03-19';
SELECT * FROM temperature_by_day WHERE w_id='FFOERIRWOL' AND date in
('2015-03-19','2015-03-18');
Let's see how to query using a non-primary-key column:
This query will not work, because we are trying to query without the partition key:
select * from temperature_by_day where temperature = 29;
In order for it to work, we either allow filtering of the data:
select * from temperature_by_day where temperature = 29 allow filtering; // use
this with caution, it brings all the table data locally in one coordinator.
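A gentler variant (a sketch, using the same table): if you also restrict the query to a single partition, ALLOW FILTERING only has to scan that one partition instead of the whole table:

```cql
-- filtering on temperature, but only inside one (w_id, date) partition
SELECT * FROM temperature_by_day
WHERE w_id = 'FFOERIRWOL' AND date = '2015-03-19' AND temperature = 29
ALLOW FILTERING;
```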
Or we create an index on temperature
Create index on temperature_by_day(temperature);
select * from temperature_by_day where temperature = 29;
Deleting old entries - WE CAN SET EXPIRING DATA AT TABLE LEVEL OR
PER ENTRY
EXPIRE DATA PER TABLE
CREATE TABLE latest_temperatures (
w_id text,
event_time timestamp,
temperature int,
PRIMARY KEY (w_id,event_time)
) WITH CLUSTERING ORDER BY (event_time DESC) AND default_time_to_live = 10;
copy latest_temperatures(w_id,event_time,temperature) from '/data/weatherdata.csv'
with header = true and maxrows = 10000;
Select * from latest_temperatures;
Repeat the select after 10 seconds - the rows will have expired.
EXPIRE DATA PER ENTRY
INSERT INTO latest_temperatures(w_id,event_time,temperature) VALUES
('12345', '2015-09-17 21:29:35', 20) USING TTL 20;
Select * from latest_temperatures; // run again after 20 s to see the row expire
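You can also watch the expiry countdown directly: the built-in TTL() function returns the remaining seconds before a cell expires (a minimal sketch on the table above):

```cql
-- remaining time-to-live, in seconds, for each temperature cell
SELECT w_id, event_time, temperature, TTL(temperature)
FROM latest_temperatures;
```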