Execute WordCount in Hadoop CDH

Uploaded by

K arun kumar Arun

How to Execute WordCount Program in MapReduce using Cloudera Distribution Hadoop (CDH)

For Lab on Wednesday (27/9/23)

The steps below show how to write MapReduce code for Word Count.

Input:

Hello I am GeeksforGeeks

Hello I am an Intern

Output:

GeeksforGeeks 1
Hello 2
I 2
Intern 1
am 2
an 1
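
To see where these counts come from before involving Hadoop, the same counting can be tried in plain Java. This is only a rough sketch: the class name LocalWordCount and the method countWords are made up for illustration, and the input is assumed to spell GeeksforGeeks as one word, as the Output listing does. It splits each line on spaces and keeps the words in sorted order, which is also the order of the Output listing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; LocalWordCount and countWords are hypothetical names.
class LocalWordCount {

    // Counts space-separated words; a TreeMap keeps the keys in sorted order.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (word.length() > 0) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed input: the two sample lines, with GeeksforGeeks as one word.
        List<String> input = Arrays.asList(
                "Hello I am GeeksforGeeks",
                "Hello I am an Intern");
        for (Map.Entry<String, Integer> e : countWords(input).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Capital letters sort before lowercase ones here, which is why Intern comes before am.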

Steps:

• First open Eclipse, then select File -> New -> Java Project, name it WordCount, and click Finish.
• Create three Java classes in the project. Name them WCDriver (which holds the main function), WCMapper, and WCReducer.
• You have to add two reference libraries: right-click the project, select Build Path, then click Configure Build Path.
• On the Libraries tab of the Java Build Path dialog, the Add External JARs option is on the right-hand side. Click it and add the files listed below. You can find them under /usr/lib/:
1. /usr/lib/hadoop-0.20-mapreduce/hadoop-core-2.6.0-mr1-cdh5.13.0.jar
2. /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.0.jar

Mapper code: copy and paste this Java program into the WCMapper class file.

// Importing libraries
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function: emits (word, 1) for every word in the input line
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter rep)
            throws IOException {

        String line = value.toString();

        // Splitting the line on spaces
        for (String word : line.split(" ")) {
            // Skip the empty strings produced by consecutive spaces
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
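
The splitting rule in the mapper can be checked on its own, without Hadoop. The sketch below is only an illustration (TokenizeSketch and tokens are hypothetical names); it applies the same split on a single space and the same length check, so you can see what happens with double spaces and with tabs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; TokenizeSketch and tokens are hypothetical names.
class TokenizeSketch {

    // Same rule as the mapper: split on a single space, drop empty tokens.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                out.add(word);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two spaces between Hello and I: the empty token is dropped.
        System.out.println(tokens("Hello  I am"));
        // A tab is not a space, so "Hello\tI" stays together as one word.
        System.out.println(tokens("Hello\tI am"));
    }
}
```

If your text file contains tabs or punctuation, words such as "Intern" and "Intern." are counted separately under this rule.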

Reducer code: copy and paste this Java program into the WCReducer class file.

// Importing libraries
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function: receives one word together with all the 1s emitted for it
    public void reduce(Text key, Iterator<IntWritable> value,
                       OutputCollector<Text, IntWritable> output, Reporter rep)
            throws IOException {

        int count = 0;

        // Counting the frequency of each word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }
}
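
Between the mapper and the reducer, Hadoop groups all values that share a key, so the reducer sees each word once with an iterator over its 1s. The sketch below imitates that grouping in plain Java to make the reducer loop easier to follow. It is not how Hadoop implements the shuffle, and ShuffleReduceSketch, shuffle and reduce are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only; not Hadoop's shuffle. All names here are hypothetical.
class ShuffleReduceSketch {

    // Stand-in for the shuffle: collect the 1 emitted for each word under that word.
    static Map<String, List<Integer>> shuffle(List<String> mappedWords) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : mappedWords) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Same loop as WCReducer.reduce: walk the iterator and add up the values.
    static int reduce(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> mapped = Arrays.asList("Hello", "I", "am", "Hello", "I", "am", "an");
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            System.out.println(e.getKey() + " " + reduce(e.getValue().iterator()));
        }
    }
}
```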

Driver code: copy and paste this Java program into the WCDriver class file.

// Importing libraries
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {

    public int run(String args[]) throws IOException {
        // args[0] is the input path, args[1] is the output path
        if (args.length < 2) {
            System.out.println("Please give valid inputs");
            return -1;
        }

        // Build the job configuration on top of the one ToolRunner passes in
        JobConf conf = new JobConf(getConf(), WCDriver.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Submit the job and wait for it to finish
        JobClient.runJob(conf);
        return 0;
    }

    // Main Method
    public static void main(String args[]) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.out.println(exitCode);
    }
}
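
If you want to rehearse what the driver does before submitting a real job, the flow can be imitated on the local file system. The sketch below is an assumption-laden stand-in, not part of the lab code: LocalDriverSketch is a hypothetical name, no Hadoop classes are used, and map and reduce are collapsed into one pass. It keeps three behaviours of the real job: two arguments are required, an output directory that already exists is rejected (Hadoop also refuses to overwrite existing output), and the result is written as word, tab, count in a file named part-00000.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Local rehearsal only; LocalDriverSketch is a hypothetical name and uses no Hadoop classes.
class LocalDriverSketch {

    // Returns 0 on success and -1 on bad arguments, like WCDriver.run.
    static int run(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("Usage: LocalDriverSketch <input file> <output dir>");
            return -1;
        }
        Path input = Paths.get(args[0]);
        Path outputDir = Paths.get(args[1]);
        if (Files.exists(outputDir)) {
            // Hadoop throws an exception in this situation; this sketch just returns -1.
            System.err.println("Output directory already exists: " + outputDir);
            return -1;
        }

        // Map and reduce collapsed into one pass; TreeMap keeps the words sorted.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : Files.readAllLines(input, StandardCharsets.UTF_8)) {
            for (String word : line.split(" ")) {
                if (word.length() > 0) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }

        // One "word<TAB>count" line per word, written to part-00000.
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.add(e.getKey() + "\t" + e.getValue());
        }
        Files.createDirectories(outputDir);
        Files.write(outputDir.resolve("part-00000"), out, StandardCharsets.UTF_8);
        return 0;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("exit code: " + run(args));
    }
}
```

Running it twice with the same output directory returns -1 the second time, which is the local equivalent of having to choose a fresh output path (or delete the old one) before re-running a job.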

• Now you have to make a JAR file. Right-click the project -> click Export -> select JAR file as the export destination -> name the JAR file (WordCCountFinal.jar) -> click Next -> finally click Finish. Then copy this file to /home/cloudera, which is where the run command in the later steps expects to find it.

• Open the terminal on CDH and change to the directory that holds the JAR file. The run command later in these steps reads the JAR from /home/cloudera; if you kept it in the Eclipse workspace instead, use the "cd workspace/" command and adjust that path. Now create a text file (WordCCountFinal.txt) and move it to HDFS as follows.

• To create the text file, type the command below:

cat >> WordCCountFinal.txt

Enter your own text here.
After finishing the text, press Ctrl+D on an empty line.

• Then create a directory in HDFS using the command below:

sudo -u hdfs hadoop fs -mkdir /WordCCount

• Add WordCCountFinal.txt to HDFS using the command below:

sudo -u hdfs hadoop fs -put WordCCountFinal.txt /WordCCount/WordCCountFinal.txt

• To view the contents:

sudo -u hdfs hadoop fs -cat /WordCCount/WordCCountFinal.txt
• Now run the JAR file using the command below:

sudo -u hdfs hadoop jar /home/cloudera/WordCCountFinal.jar WCDriver /WordCCount/WordCCountFinal.txt /WordCCount/OutputWC

• After executing the JAR file, run the command below to see the output:

hadoop fs -ls /WordCCount

• The listing shows the output directory, OutputWC. To print the word counts, type:

sudo -u hdfs hadoop fs -cat /WordCCount/OutputWC/part-00000
Thanks and Regards

Kimmi Kumari
