
Hadoop Demo

This document provides an overview of Hadoop, including:
- Hadoop is an open source framework for distributed storage and processing of large datasets across clusters of computers.
- It uses a master-slave architecture and the Hadoop Distributed File System (HDFS) for storage.
- MapReduce is the programming model for large-scale data processing, with parallel processing split between mappers and reducers.
- Hadoop services such as Namenode, Datanode, JobTracker and TaskTracker are discussed.
- Prerequisites for learning Hadoop include a Linux system, Java, disk space and RAM.
Training details on Hadoop administration and development are also provided.


Welcome to BigData

Agenda
Internet in real time
What is Hadoop
Master Slave architecture
Hadoop Distributed File System (HDFS)
Map-Reduce
Hadoop Services
What is BigData
Why Hadoop is important for career
Prerequisites to learn Hadoop
Hadoop training details
Internet in real time

http://pennystocks.la/internet-in-real-time/
What is Hadoop

An Apache product
An open source framework
Distributed storage
Distributed processing
Deals with large amounts of data

Master Slave architecture
Hadoop Distributed File System (HDFS)
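The HDFS idea (files split into fixed-size blocks, each block replicated across slave machines) can be sketched in a few lines of Python. This is purely illustrative and is not Hadoop's actual code or API; the 128 MB block size, the replication factor of 3, and the round-robin placement are assumptions standing in for what a real (rack-aware) Namenode decides, and both numbers are configurable on a real cluster.

```python
# Toy sketch (not real HDFS code) of splitting a file into blocks
# and replicating each block across slave machines.

BLOCK_SIZE = 128 * 1024 * 1024   # bytes per block (assumed default)
REPLICATION = 3                  # replicas per block (assumed default)

def plan_blocks(file_size, datanodes):
    """Return (block_index, [datanodes holding a replica]) for each block."""
    num_blocks = -(-file_size // BLOCK_SIZE)  # ceiling division
    plan = []
    for b in range(num_blocks):
        # Simple round-robin placement; real HDFS placement is rack-aware.
        replicas = [datanodes[(b + r) % len(datanodes)]
                    for r in range(REPLICATION)]
        plan.append((b, replicas))
    return plan

# A 300 MB file on a four-slave cluster needs 3 blocks, each stored 3 times.
for block, replicas in plan_blocks(300 * 1024 * 1024,
                                   ["slave1", "slave2", "slave3", "slave4"]):
    print(f"block {block}: {replicas}")
```

The point of the sketch: no single slave holds the whole file, and losing any one machine still leaves two copies of every block.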
Map-Reduce
MapReduce plays a key role in the Hadoop framework.

MapReduce is a programming model for writing applications that rapidly process large amounts of data in parallel.

Mapper – processes input data in parallel on multiple machines to generate intermediate output data.

Reducer – fetches intermediate output from all mappers and generates the final output data.
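The mapper/reducer flow above can be sketched in plain Python using word count, the classic MapReduce example. This is a single-machine illustration of the model, not Hadoop's Java API; the `shuffle` function here stands in for the grouping of intermediate keys that the framework performs between the map and reduce phases.

```python
# Minimal, pure-Python sketch of the MapReduce word-count flow.
from collections import defaultdict

def mapper(line):
    # Emit (word, 1) pairs -- the intermediate output.
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Group intermediate values by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Combine all values for one key into a final (key, total) pair.
    return (key, sum(values))

lines = ["big data big ideas", "hadoop handles big data"]
pairs = [p for line in lines for p in mapper(line)]
result = dict(reducer(k, v) for k, v in shuffle(pairs).items())
print(result)
# {'big': 3, 'data': 2, 'ideas': 1, 'hadoop': 1, 'handles': 1}
```

On a real cluster, many mappers run this logic in parallel on different blocks of the input, and reducers each receive the grouped values for a subset of the keys.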
Hadoop Services
The services which run on a Hadoop cluster:

Namenode (runs on the master machine)

Datanode (runs on every slave machine)

JobTracker (runs on the master machine)

TaskTracker (runs on every slave machine)

Note: JobTracker and TaskTracker belong to Hadoop 1.x; in Hadoop 2.x and later, their roles are taken over by YARN (ResourceManager and NodeManager).


What is BigData

BigData is the story of the 3 V's: Volume, Velocity and Variety.

Hadoop is the base technology.

Other sub-technologies include HBase, Hive, Pig, Sqoop, etc.
Why Hadoop is important for career

Better salary

Globally available

Better job opportunities

Big companies are hiring


Prerequisites to learn Hadoop
A Linux-based operating system (Red Hat, Ubuntu) or Mac OS

Java 1.6 or higher

Disk space (to hold large amounts of data)

RAM (2 GB recommended)

A group of computers (though you can install Hadoop even on a single machine)


Hadoop Training details
Hadoop Developer (needs prior Java knowledge)
Suitable for people from a Java background

Hadoop Admin (no Java knowledge needed)
Suitable for people from Database, Data Warehousing, Informatica, Mainframe and BI backgrounds

Our Training Programme: Hadoop Admin + Hadoop Developer, HBase, Hive, Pig, Sqoop & POC

Training Duration: 1 Month (5 days a week)

Monday to Friday mornings (India)

Sunday to Thursday evenings (USA)
…Thanks…
