UNIT – IV: Apache Hive and NOSQL Database - HBase:
Features of HIVE, HIVE Architecture, HIVE Meta store, Datatypes in HIVE,
HIVEQL, Tables, File Format Types–Text, Sequence, AVRO, Parquet,
Querying Data, Types of NOSQL Database, Characteristics of HBASE,
Architecture, HBase Vs RDBMS, HBASE Shell Commands.
Apache Hive
Apache Hive is a tool for processing structured data stored in a Hadoop Distributed File
System.
It is also a platform used to develop SQL type scripts to do MapReduce operations.
Apache Hive is a data warehouse infrastructure.
It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing
easy.
Hive uses a language called HiveQL (HQL), which is similar to SQL.
HiveQL automatically translates SQL-like queries into MapReduce jobs.
Apache Hive organizes data into tables.
This provides a means for attaching the structure to data stored in HDFS.
Since HiveQL is similar to SQL, it becomes very easy for SQL developers to learn and use Hive.
Features
Hive provides data summarization, querying, and analysis in a much easier manner.
It also supports partitioning of data at the level of tables to improve performance.
Hive has a rule-based optimizer for optimizing logical plans.
Using HiveQL does not require knowledge of any programming language; knowledge of basic
SQL queries is enough.
We can easily process structured data in Hadoop using Hive.
Querying in Hive is very simple as it is similar to SQL.
It is designed for OLAP.
HIVE Architecture
User Interface
Hive is a data warehouse infrastructure software that can create interaction between user
and HDFS.
The user interfaces that Hive supports are Hive Web UI, Hive command line, and Hive HD
Insight (In Windows server).
Meta Store
Hive chooses the respective database server to store the schema or metadata of tables,
databases, columns in a table, their data types, and the HDFS mapping.
HiveQL Process Engine
HiveQL is similar to SQL and is used for querying the schema information in the Metastore.
Instead of writing a MapReduce program in Java, we can write a HiveQL query for the
MapReduce job and have it processed.
Execution Engine
The Hive Execution Engine is the component that combines the HiveQL Process Engine and MapReduce.
The execution engine processes the query and generates the same results as MapReduce
would.
HDFS or HBASE
The Hadoop Distributed File System (HDFS) or HBase is the data storage technique used to
store the data in the file system.
The major components of Apache Hive are:
❑ Hive Client
❑ Hive Services
❑ Processing and Resource Management
❑ Distributed Storage
Hive Client
Hive supports applications written in languages like Python, Java, C++, Ruby, etc.
using JDBC, ODBC, and Thrift drivers for performing queries against Hive.
Hence, one can easily write a Hive client application in a language of their own choice.
Hive clients are categorized into three types:
1. Thrift Clients
The Hive server is based on Apache Thrift so that it can serve the request from a thrift
client.
2. JDBC client
Hive allows for the Java applications to connect to it using the JDBC driver.
JDBC driver uses Thrift to communicate with the Hive Server.
3. ODBC client
The Hive ODBC driver allows applications based on the ODBC protocol to connect to Hive.
Similar to the JDBC driver, the ODBC driver uses Thrift to communicate with the Hive server.
Hive Services
To perform all queries, Hive provides various services like HiveServer2, Beeline,
etc. The various services offered by Hive are:
Beeline
Beeline is a command shell supported by HiveServer2, where users can submit their
queries and commands to the system.
It is a JDBC client based on the SQLLine CLI.
Hive Server 2
HiveServer2 is the successor of HiveServer1.
HiveServer2 enables clients to execute queries against the Hive.
It allows multiple clients to submit requests to Hive and retrieve the final results.
It is basically designed to provide the best support for open API clients like JDBC and
ODBC.
Hive Driver
The Hive driver receives the HiveQL statements submitted by the user through the command
shell.
It creates the session handles for the query and sends the query to the compiler.
Hive Compiler
The Hive compiler parses the query.
It performs semantic analysis and type-checking on the different query blocks and query
expressions using the metadata stored in the metastore, and generates an execution plan.
The execution plan created by the compiler is a DAG (Directed Acyclic Graph), where each
stage is a MapReduce job, an HDFS operation, or a metadata operation.
Optimizer
The optimizer performs transformation operations on the execution plan and splits the tasks to
improve efficiency and scalability.
Execution Engine
Execution engine, after the compilation and optimization steps, executes the execution
plan created by the compiler in order of their dependencies using Hadoop.
Metastore
Metastore is a central repository that stores the metadata information about the structure
of tables and partitions, including column and column type information.
It also stores the serializer and deserializer information required for read/write
operations, and the HDFS files where the data is stored. This metastore is generally a relational
database.
Processing and Resource Management:
Hive internally uses a MapReduce framework as a default engine for executing the
queries.
MapReduce job works by splitting data into chunks, which are processed by map-reduce
tasks.
Distributed Storage
Hive is built on top of Hadoop, so it uses the underlying Hadoop Distributed File System for
the distributed storage.
Working of Hive
Execute Query- The Hive interface such as the Command Line or Web UI sends the query to the
Driver to execute.
Get Plan- The driver takes the help of the query compiler, which parses the query to check the
syntax and prepare the query plan or the requirements of the query.
Get Metadata- The compiler sends a metadata request to the Metastore.
Send Metadata- Metastore sends metadata as a response to the compiler.
Send Plan- The compiler checks the requirement and resends the plan to the driver. Up to
here, the parsing and compiling of a query is complete.
Execute Plan- The driver sends the execution plan to the execution engine.
Execute Job-
Internally, the process of execution job is a MapReduce job.
The execution engine sends the job to the JobTracker, which is in the Name node, and it assigns
this job to the TaskTracker, which is in the Data node.
Here, the query executes the MapReduce job.
Metadata Ops-
Meanwhile in execution, the execution engine can execute metadata operations with
Metastore.
Fetch Result-
The execution engine receives the results from Data nodes.
Send Results-
The execution engine sends those resultant values to the driver. The driver then sends the results to the Hive interfaces.
HIVE Data Types
Data Types in Hive specifies the column/field type in the Hive table.
It specifies the type of values that can be inserted into the specified column.
Primitive Type
1. Numeric
2. Date/time
3. String
4. Miscellaneous
Complex Type
1. Array
2. Map
3. Struct
4. Union
Numeric Type
The Numeric data type in Hive is categorized into
Integral data type
Floating data type
Integral data type
a. TINYINT – (1-byte signed integer ranging from -128 to 127)
b. SMALLINT – (2-byte signed integer ranging from -32,768 to 32,767)
c. INTEGER – (4-byte signed integer ranging from -2,147,483,648 to 2,147,483,647)
d. BIGINT – (8-byte signed integer ranging from -9,223,372,036,854,775,808 to
9,223,372,036,854,775,807)
Floating data type
a. FLOAT - It is a 4-byte single-precision floating-point number.
b. DOUBLE - It is an 8-byte double-precision floating-point number.
c. DOUBLE PRECISION -It is an alias for DOUBLE. It is only available starting with Hive
2.2.0
d. DECIMAL - It was introduced in Hive 0.11.0. It is based on Java’s BigDecimal. DECIMAL
types support both scientific and non-scientific notations.
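For illustration, a minimal sketch (hypothetical table and column names) combining the integral and floating types in one table definition:
CREATE TABLE sales (
item_id BIGINT,
quantity SMALLINT,
rating TINYINT,
price DECIMAL(10,2),
discount FLOAT
);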
Date/Time data type:
TIMESTAMP
Timestamps were introduced in Hive 0.8.0. It supports traditional UNIX timestamp with the
optional nanosecond precision.
The supported Timestamps format is yyyy-mm-dd hh:mm:ss[.f…] in the text files.
DATE
Dates were introduced in Hive 0.12.0. DATE value describes a particular year/month/day
in the form of YYYY-MM-DD.
For example- DATE ‘2020-02-04’
INTERVAL
Hive INTERVAL data types are available starting with Hive version 1.2.
Hive accepts the interval syntax with unit specifications; we have to specify the units
along with the interval value.
For example, INTERVAL ‘1’ DAY refers to a time span of one day.
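A small sketch (hypothetical table and column names) showing the date/time types in a table definition and a query using a DATE literal:
CREATE TABLE events (
event_id INT,
event_time TIMESTAMP,
event_date DATE
);
SELECT event_id FROM events WHERE event_date = DATE '2020-02-04';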
String data type
a. STRING
In Hive, String literals are represented either with the single quotes(‘ ’) or with double-quotes(“ ”).
Hive uses C-style escaping.
b. VARCHAR
In Hive, VARCHAR data types are of different lengths, but we have to specify the maximum
number of characters allowed in the character string.
If the string value assigned to the varchar is less than the maximum length, then the remaining
space will be freed out.
Also, if the string value assigned is more than the maximum length, then the string is silently
truncated.
The length of a VARCHAR is between 1 and 65,535.
Trailing whitespace is important in varchar and will affect the comparison results.
c. CHAR
CHAR data types are fixed-length.
Values shorter than the specified length are padded with spaces.
Unlike VARCHAR, trailing spaces are not significant in CHAR types during comparisons.
The maximum length of CHAR is fixed at 255.
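A minimal sketch (hypothetical table and column names) showing the three string types together:
CREATE TABLE customers (
full_name STRING,
country_code CHAR(2),
email VARCHAR(100)
);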
Miscellaneous data type
a. BOOLEAN
Boolean types in Hive store either true or false.
b. BINARY
BINARY type in Hive is an array of bytes.
Arrays
Array in Hive is an ordered sequence of similar type elements that are indexable using the
zero-based integers.
Arrays in Hive are similar to the arrays in JAVA.
array<datatype>
Example: array(‘AIML’,’Dept’). The second element is accessed as array[1].
Maps
Map in Hive is a collection of key-value pairs, where the fields are accessed using array
notations of keys (e.g., [‘key’]).
map<primitive_type, data_type>
Example: ‘first’ -> ‘John’, ‘last’ -> ‘Deo’, represented as map(‘first’, ‘John’, ‘last’, ‘Deo’). Now
‘John’ can be accessed with map[‘first’].
Structs
STRUCT in Hive is similar to the STRUCT in C language. It is a record type that
encapsulates a set of named fields, which can be any primitive data type.
We can access the elements in STRUCT type using DOT (.) notation.
Example: address STRUCT<street:STRING, city:STRING, pincode:INT>
SELECT address.city FROM table_name;
Union
UNION type in Hive is similar to the UNION in C. UNION types at any point of time can
hold exactly one data type from its specified data types.
The full support for UNIONTYPE data type in Hive is still incomplete.
UNIONTYPE<data_type, data_type, ...>
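To see the complex types together, here is a minimal sketch (hypothetical table and column names) with the corresponding access syntax:
CREATE TABLE employee_details (
id INT,
skills ARRAY<STRING>,
phone MAP<STRING, STRING>,
address STRUCT<street:STRING, city:STRING, pincode:INT>
);
SELECT skills[0], phone['home'], address.city FROM employee_details;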
HiveQL
The Hive Query Language (HiveQL) is a query language for Hive to process and analyze
structured data in a Metastore.
HiveQL provides a familiar SQL-like interface for querying and managing large datasets
stored in HDFS or other compatible storage systems.
SQL-Like Syntax:
HiveQL syntax is similar to traditional SQL, making it accessible to users who are already
familiar with relational databases and SQL queries.
It allows users to express queries without dealing directly with MapReduce or other
underlying technologies.
Data Definition Language (DDL):
HiveQL supports DDL statements for defining and managing database objects such as
tables, views, and partitions.
Example:
CREATE TABLE my_table (
id INT,
name STRING,
age INT
);
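DDL also covers other objects such as views; for instance, a sketch defining a view over the table above:
CREATE VIEW adults AS SELECT id, name FROM my_table WHERE age >= 18;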
Data Manipulation Language (DML):
DML statements in HiveQL are used for querying, inserting, updating, and deleting data
in tables.
Example:
SELECT * FROM my_table WHERE age > 25;
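Rows can also be inserted directly (supported from Hive 0.14 onward); a sketch using the my_table defined above, with made-up values:
INSERT INTO TABLE my_table VALUES (1, 'Asha', 30);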
Data Loading and Insertion:
HiveQL provides commands for loading data into tables, either from existing files or
external sources.
Example:
LOAD DATA INPATH '/path/to/data' INTO TABLE my_table;
Complex Data Types:
HiveQL supports complex data types such as arrays, maps, and structs, allowing users to
work with semi-structured or nested data.
Example:
CREATE TABLE complex_table (
id INT,
names ARRAY<STRING>,
details MAP<STRING, INT>
);
User-Defined Functions (UDFs):
Users can define custom functions, known as User-Defined Functions (UDFs), in HiveQL to
perform operations not natively supported.
Example:
CREATE FUNCTION my_udf AS 'com.example.MyUDF' USING JAR
'hdfs:///path/to/my_udf.jar';
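Once registered, the UDF can be invoked like a built-in function; a sketch assuming my_udf accepts a single string argument:
SELECT my_udf(name) FROM my_table;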
Join Operations:
HiveQL supports various types of join operations, including INNER JOIN, LEFT JOIN,
RIGHT JOIN, and FULL OUTER JOIN.
Example:
SELECT * FROM table1 JOIN table2 ON table1.id = table2.id;
Group By and Aggregation:
HiveQL allows users to perform aggregation functions using GROUP BY, such as COUNT,
SUM, AVG, etc.
Example:
SELECT department, AVG(salary) FROM employee GROUP BY department;
Partitioning and Bucketing:
Users can define and work with partitioned tables in HiveQL, allowing for more efficient data
organization and retrieval.
Example
CREATE TABLE partitioned_table (col1 INT, col2 STRING)
PARTITIONED BY (date STRING);
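Data can then be written into a specific partition; a sketch assuming a staging table raw_table (hypothetical) whose columns match col1 and col2 (the partition column is quoted with backticks, since date is a reserved word in recent Hive versions):
INSERT INTO TABLE partitioned_table PARTITION (`date`='2024-01-01')
SELECT col1, col2 FROM raw_table;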
Types of Hive Tables
Hive has mainly two types of tables:
1. Managed Table (Internal Table)
2. External Table
Managed Table (Internal Table)
Hive controls both metadata and the data in a table.
Data is stored in Hive’s warehouse directory in HDFS, by default /user/hive/warehouse/tablename/
If you drop the table, both schema and data are deleted.
Example:
CREATE TABLE employees (id INT, name STRING, department STRING) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',';
It creates a table named employees.
The data is stored inside /user/hive/warehouse/employees/ in HDFS.
If you run: DROP TABLE employees;
Table and all its data will be deleted from HDFS.
External Table
Hive manages only the metadata of the table.
The actual data remains outside Hive's warehouse directory.
If you drop the table, metadata is deleted but the data remains.
It is useful when:
❑ You want to share data with other tools like Pig, Spark, etc.
❑ The data is already present in HDFS and you don’t want to move it
Example:
CREATE EXTERNAL TABLE employees (id INT, name STRING, department STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/xyz/employees/';
If you run: DROP TABLE employees;
Only the schema (metadata) is deleted.
The actual data at the HDFS path remains the same.
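To check whether an existing table is managed or external, you can inspect its metadata; for example:
DESCRIBE FORMATTED employees;
The output includes a Table Type field showing MANAGED_TABLE or EXTERNAL_TABLE.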
File Format Types
File formats decide how data is stored inside the files.
Hive supports several file formats:
❑ Text File
❑ SequenceFile
❑ Avro Files
❑ Parquet
TEXT FILE FORMAT:
This stores data as a plain text file and is also the default file format.
Delimited by — the separator used in the data (comma, period, tab, any complex symbol, or even
some characters).
NULL format — changes how null values in the data are represented, e.g. NULL DEFINED
AS 'NULL'. The default is '\N'.
It is plain text (like CSV or TSV), and easy to read and write.
It can be used when you want easy access for viewing/editing.
Example:
CREATE TABLE students (id INT, name STRING, marks FLOAT) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
Load Data:
LOAD DATA LOCAL INPATH '/home/cloudera/students.csv' INTO TABLE students;
students.csv
1,Ravi,80.5
2,Meena,90.0
SequenceFile Format:
Sequence files are flat files that store values in binary key-value pairs.
The sequence files are in binary format and are splittable.
They can be used when the data size is smaller than the block size (which does not happen in most cases), since many small files can be packed into a single sequence file.
It is faster to read/write than plain text.
It is used when the data is generated by MapReduce; this format is written by programs, not manually.
It is not human-readable. It is efficient for internal Hive use and not often used by analysts.
Example:
CREATE TABLE students_seq (id INT, name STRING, marks FLOAT) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',' STORED AS SEQUENCEFILE;
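Because sequence files are written by programs rather than edited by hand, such a table is usually populated from another table; a sketch assuming the students text-file table created earlier:
INSERT OVERWRITE TABLE students_seq SELECT * FROM students;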
Avro Format:
AVRO is a row-based format with a high degree of splittability.
The schema is stored in JSON format, while the data is stored in binary format, which
minimizes file size and maximizes efficiency.
It supports data serialization and is most reliable for schema evolution.
With schema evolution, you can change schemas over time.
Example:
CREATE TABLE students_avro
STORED AS AVRO
TBLPROPERTIES (
'avro.schema.literal'='{
"type":"record",
"name":"Student",
"fields":[
{"name":"id", "type":"int"},
{"name":"name", "type":"string"},
{"name":"marks", "type":"float"}
]
}'
);
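From Hive 0.14 onward, the Avro schema can also be derived from the column definitions, so an equivalent table can be declared more simply; a minimal sketch:
CREATE TABLE students_avro2 (id INT, name STRING, marks FLOAT) STORED AS AVRO;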
Parquet Format:
Parquet is a column-oriented binary file format.
Good for scanning large scale queries.
Parquet is optimized for the WORM model (write once, read many).
It's a good choice for heavy read loads.
Very fast for analytical queries that read only a few columns.
Highly compressed and optimized for large-scale queries.
Example
CREATE TABLE students_parquet (id INT, name STRING, marks FLOAT) STORED AS PARQUET;
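An analytical query such as the following only needs to read the marks column from the Parquet files, which is where the columnar layout pays off:
SELECT AVG(marks) FROM students_parquet;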
NOSQL
NoSQL is a type of database that stores and retrieves data without needing to define
its structure first; it is an alternative to relational databases.
NoSQL, which stands for “not only SQL,” is an approach to database design that provides
flexible schemas for the storage and retrieval of data beyond the traditional table
structures found in relational databases.
NoSQL database doesn't use tables for storing data.
It is generally used to store big data and real-time web applications.
Relational databases also face the problem that they cannot handle big data; due to this
problem, there was a need for a database that could handle such workloads, and the
NoSQL database was developed.
Advantages of NoSQL
It supports query language.
It provides fast performance.
It provides horizontal scalability.
Types
four main types of NoSQL databases:
Document databases
Key-value stores
Column-oriented databases
Graph databases
HBase
HBase is a distributed column-oriented database built on top of the Hadoop file
system.
It is a part of the Hadoop ecosystem that provides random real-time read/write access
to data in the Hadoop File System.
Data producers store the data in HDFS either directly or through HBase.
Data consumers read/access the data in HDFS randomly using HBase.
Storage Mechanism
HBase is a column-oriented database and the tables in it are sorted by row.
The table schema defines only column families.
A table can have multiple column families and each column family can have any number
of columns.
Features
Column-Oriented Storage: HBase stores data in column families rather than rows
Schema-less: Each row can have a different number of columns.
Distributed and Scalable: HBase supports horizontal scalability using region servers.
Strong Consistency: HBase follows a strong consistency model, meaning any read
returns the most recent write for a given key.
Automatic Sharding: Data is split into regions and automatically distributed across
servers.
Versioned Data: HBase stores multiple versions of a cell, with each write tagged by a
timestamp.
NoSQL Database: HBase is a non-relational (NoSQL) database.
HBase vs RDBMS
❑ HBase is schema-less; it does not have the concept of a fixed-column schema and defines only column families. An RDBMS is governed by its schema, which describes the whole structure of tables.
❑ HBase is built for wide tables and is horizontally scalable. An RDBMS is thin, built for small tables, and hard to scale.
❑ There are no transactions in HBase. An RDBMS is transactional.
❑ HBase has de-normalized data. An RDBMS has normalized data.
❑ HBase is good for semi-structured as well as structured data. An RDBMS is good for structured data.
Architecture
In HBase, tables are split into regions and are served by the region servers.
Regions are vertically divided by column families into Stores. Stores are saved as files
in HDFS.
HBase has three major components: the client library, a master server, and region
servers.
MasterServer
Assigns regions to the region servers and takes the help of Apache ZooKeeper for this
task.
Handles load balancing of the regions across region servers.
It unloads the busy servers and shifts the regions to less occupied servers.
Maintains the state of the cluster by negotiating the load balancing.
Is responsible for schema changes and other metadata operations such as creation of
tables and column families.
Region server
Communicates with the clients and handles data-related operations.
Handles read and write requests for all the regions under it.
Decides the size of the region by following the region size thresholds.