HBASE
Non-relational, scalable database
built on HDFS
Based on Google’s BigTable
CRUD
■ Create
■ Read
■ Update
■ Delete
■ There is no query language, only CRUD APIs!
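For a concrete picture of what "CRUD only" means, here is a minimal sketch of the four operations against HBase's REST gateway using Python's requests library. The gateway address, table name ('users'), and column family ('info') are assumptions for the example; in the REST API, row keys, column names, and values travel base64-encoded inside JSON.

import base64
import json
import requests

BASE = "http://localhost:8000"   # assumed address of the HBase REST gateway

def b64(s):
    # The REST API expects keys, column names, and values base64-encoded
    return base64.b64encode(s.encode("utf-8")).decode("ascii")

def put_cell(table, row, column, value):
    # Create / Update: writing a cell again simply adds a newer version
    body = {"Row": [{"key": b64(row),
                     "Cell": [{"column": b64(column), "$": b64(value)}]}]}
    requests.put(f"{BASE}/{table}/{row}/{column}",
                 data=json.dumps(body),
                 headers={"Content-Type": "application/json"}).raise_for_status()

def get_row(table, row):
    # Read: fetch every cell of a row (values come back base64-encoded)
    r = requests.get(f"{BASE}/{table}/{row}", headers={"Accept": "application/json"})
    r.raise_for_status()
    return r.json()

def delete_row(table, row):
    # Delete: remove the entire row
    requests.delete(f"{BASE}/{table}/{row}").raise_for_status()

put_cell("users", "user1", "info:name", "Alice")   # Create
print(get_row("users", "user1"))                   # Read
put_cell("users", "user1", "info:name", "Bob")     # Update
delete_row("users", "user1")                       # Delete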
HBase architecture
[Architecture diagram: ZooKeeper and the HMaster coordinate a set of Region Servers; tables are auto-sharded into regions across the Region Servers, and all data is stored on HDFS.]
HBase data model
■ Fast access to any given ROW
■ A ROW is referenced by a unique KEY
■ Each ROW has some small number of COLUMN FAMILIES
■ A COLUMN FAMILY may contain arbitrary COLUMNS
■ You can have a very large number of COLUMNS in a COLUMN FAMILY
■ Each CELL can have many VERSIONS, each tagged with a timestamp
■ Sparse data is A-OK – missing columns in a row consume no storage.
Example: One row of a web table
Key: com.cnn.www
■ Contents column family
– contents: = “<html><head>CNN…” (three timestamped versions of the page)
■ Anchor column family
– anchor:cnnsi.com = “CNN”
– anchor:my.look.ca = “CNN.com”
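Conceptually, an HBase table is a sparse, sorted, multi-dimensional map: row key → column family → column qualifier → timestamp → value. A small Python sketch of the row above makes the nesting explicit (the timestamps are made up for illustration):

# Row key -> column family -> column qualifier -> timestamp -> value
webtable = {
    "com.cnn.www": {
        "contents": {
            "": {                      # unnamed qualifier in the contents family
                1300000003: "<html><head>CNN...",
                1300000002: "<html><head>CNN...",
                1300000001: "<html><head>CNN...",
            },
        },
        "anchor": {
            "cnnsi.com":  {1300000002: "CNN"},
            "my.look.ca": {1300000001: "CNN.com"},
        },
    },
}

# Fast access to any given row by its key:
row = webtable["com.cnn.www"]
# Newest version of a cell:
latest = max(row["anchor"]["cnnsi.com"].items())[1]   # -> "CNN"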
Some ways to access HBase
■ HBase shell
■ Java API
– Wrappers for Python, Scala, etc.
■ Spark, Hive, Pig
■ REST service
■ Thrift service
■ Avro service
LET’S PLAY WITH
HBASE
Creating an HBase table with Python via REST
What are we doing?
■ Create an HBase table for movie ratings by user
■ Then show we can quickly query it for individual users
■ Good example of sparse data
Column family: rating (one column per rated movie)

Key          rating:50    rating:33    rating:223
UserID 1     5            5            (no rating stored: sparse)
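A minimal sketch of creating and querying that table through the REST service, using the third-party starbase Python package (the package choice, gateway address, and port are assumptions; any HTTP client would work):

from starbase import Connection      # pip install starbase: a thin HBase REST client

c = Connection("127.0.0.1", "8000")  # assumed HBase REST gateway host/port
ratings = c.table("ratings")

if not ratings.exists():
    ratings.create("rating")         # single column family: rating

# Row key = user ID; one sparse column per movie that user has rated
ratings.insert("1", {"rating": {"50": "5", "33": "5"}})

print(ratings.fetch("1"))            # fast point lookup of one user's ratings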
How are we doing it?
Python client → REST service → HBase → HDFS
Let’s do this
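Roughly what the demo does end to end, again with starbase. The input path and its whitespace-separated userID / movieID / rating / timestamp layout are assumptions for illustration:

from starbase import Connection

c = Connection("127.0.0.1", "8000")          # assumed HBase REST gateway
ratings = c.table("ratings")

if ratings.exists():
    ratings.drop()                           # start the demo from a clean slate
ratings.create("rating")

batch = ratings.batch()                      # buffer many updates per round trip
with open("ml-100k/u.data") as f:            # assumed ratings file
    for line in f:
        user_id, movie_id, rating, _ = line.split()
        batch.update(user_id, {"rating": {movie_id: rating}})
batch.commit(finalize=True)

# Individual users can now be pulled back quickly by row key:
print(ratings.fetch("33"))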
HBASE / PIG
INTEGRATION
Populating HBase at scale
Integrating Pig with HBase
■ Must create the HBase table ahead of time
■ Your relation must have a unique key as its first column, followed by the remaining columns in the order you want them stored in HBase
■ The USING clause lets you STORE into an HBase table (see the sketch below)
■ Can work at scale – HBase is transactional on rows
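For reference, a hedged sketch of what that STORE looks like in Pig Latin. The input file, its field layout, and the 'users' table / 'userinfo' column family names are assumptions; the HBase table must already exist with that column family:

-- Load a delimited user file; the first field becomes the HBase row key
users = LOAD '/user/maria_dev/ml-100k/u.user' USING PigStorage('|')
        AS (userID:int, age:int, gender:chararray, occupation:chararray, zip:chararray);

-- Store into the pre-created 'users' table, mapping each remaining field
-- to a column in the 'userinfo' column family
STORE users INTO 'hbase://users'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
            'userinfo:age userinfo:gender userinfo:occupation userinfo:zip');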
Let’s do this