MapReduce

MapReduce practice.

WordCount

Counts each distinct word in the input and the number of times it occurs.
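As a rough illustration of the idea, the sketch below mimics the map, shuffle, and reduce steps of a word count in plain Java, without the Hadoop API. The class name `WordCountSketch` and its methods are hypothetical and are not taken from this repository's source; splitting words on whitespace is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical plain-Java sketch; not the Hadoop job shipped in this repository.
public class WordCountSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token in one line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(Map.entry(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // Reduce step: sum all the counts emitted for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: group the mapped pairs by word (the shuffle), then reduce each group.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(List.of("hello world", "hello hadoop"));
        // Print one "word<TAB>count" line per word, similar in shape to a part-r-00000 file.
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

In a real Hadoop job the grouping in `countWords` is done by the framework between the mapper and the reducer; it is written out here only to keep the example self-contained.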

How to run?

Set classpath according to reference document
Compile them
```
javac *.java
```
Pack them into a jar
```
jar cvf WordCount.jar *.class
```

Upload input file to HDFS

```
hadoop fs -mkdir -p /input/WordCount
hadoop fs -put input_file.txt /input/WordCount
```

Create an output path (if it already exists, skip to the next step)
```
hadoop fs -mkdir /output
```

Run it

```
hadoop jar WordCount.jar WordCount /input/WordCount /output/WordCount
```

Cat or download output file

```
hadoop fs -cat /output/WordCount/part-r-00000
hadoop fs -get /output/WordCount/part-r-00000 result.txt
```

MeanScore

Calculates the mean score from a student transcript, both by student and by subject.
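The two jobs differ only in which field is used as the grouping key. The sketch below shows that in plain Java, without the Hadoop API. The record layout `<student> <subject> <score>` is an assumption made for illustration, since the transcript format is not described here, and the class name `MeanScoreSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java sketch; not the Hadoop jobs shipped in this repository.
public class MeanScoreSketch {

    // Assumed record layout (hypothetical): "<student> <subject> <score>" per line.
    // keyIndex selects the grouping key: 0 groups by student, 1 groups by subject.
    static Map<String, Double> meanBy(List<String> lines, int keyIndex) {
        // Map and shuffle: emit (key, score) and collect the scores per key.
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 3) {
                continue; // skip blank or malformed lines
            }
            grouped.computeIfAbsent(fields[keyIndex], k -> new ArrayList<>())
                   .add(Double.parseDouble(fields[2]));
        }
        // Reduce: average the scores collected for each key.
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            double sum = 0;
            for (double score : e.getValue()) {
                sum += score;
            }
            means.put(e.getKey(), sum / e.getValue().size());
        }
        return means;
    }

    public static void main(String[] args) {
        List<String> transcript = List.of(
            "alice math 90", "alice physics 80",
            "bob math 70", "bob physics 60");
        System.out.println("By student: " + meanBy(transcript, 0));
        System.out.println("By subject: " + meanBy(transcript, 1));
    }
}
```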

StudentMean

How to run?

Set classpath according to reference document
Compile them
```
javac StudentMean/*.java
```

Pack them into a jar

```
jar cvf StudentMean.jar StudentMean/*.class
```

Upload input file to HDFS

```
hadoop fs -mkdir -p /input/ScoreMean
hadoop fs -put student_transcript.txt /input/ScoreMean
```

Create an output path (if it already exists, skip to the next step)
```
hadoop fs -mkdir /output
```

Run it

```
hadoop jar StudentMean.jar StudentMean.StudentMean /input/ScoreMean /output/StudentMean
```

Cat or download output file

```
hadoop fs -cat /output/StudentMean/part-r-00000
hadoop fs -get /output/StudentMean/part-r-00000 StudentMean.txt
```

SubjectMean

How to run?

Set classpath according to reference document
Compile them
```
javac SubjectMean/*.java
```

Pack them into a jar

```
jar cvf SubjectMean.jar SubjectMean/*.class
```

Upload input file to HDFS (if it already exists, skip to the next step)

```
hadoop fs -mkdir -p /input/ScoreMean
hadoop fs -put student_transcript.txt /input/ScoreMean
```

Create an output path (if it already exists, skip to the next step)
```
hadoop fs -mkdir /output
```

Run it

```
hadoop jar SubjectMean.jar SubjectMean.SubjectMean /input/ScoreMean /output/SubjectMean
```

Cat or download output file

```
hadoop fs -cat /output/SubjectMean/part-r-00000
hadoop fs -get /output/SubjectMean/part-r-00000 SubjectMean.txt
```

GrandchildGrandparent

Given an input file of child-parent pairs, find all grandchild-grandparent pairs. Names are assumed to be unique.
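One common way to approach this is a self-join on the middle generation: every person who appears both as a parent and as a child links their children to their parents. The sketch below shows that join in plain Java, without the Hadoop API. The record layout `<child> <parent>` and the class name `GrandchildGrandparentSketch` are assumptions for illustration, and this is not necessarily how the job in this repository is written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java sketch; not the Hadoop job shipped in this repository.
public class GrandchildGrandparentSketch {

    // Assumed record layout (hypothetical): "<child> <parent>" per line.
    // Returns one "grandchild<TAB>grandparent" string per pair found.
    static List<String> findPairs(List<String> lines) {
        // Map and shuffle: index every record twice, once under each person it mentions.
        Map<String, List<String>> childrenOf = new TreeMap<>();
        Map<String, List<String>> parentsOf = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                continue; // skip blank or malformed lines
            }
            String child = fields[0];
            String parent = fields[1];
            childrenOf.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
            parentsOf.computeIfAbsent(child, k -> new ArrayList<>()).add(parent);
        }
        // Reduce: for each person with both children and parents,
        // pair every child (grandchild) with every parent (grandparent).
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : childrenOf.entrySet()) {
            List<String> grandparents = parentsOf.get(e.getKey());
            if (grandparents == null) {
                continue; // this person has no recorded parents
            }
            for (String grandchild : e.getValue()) {
                for (String grandparent : grandparents) {
                    pairs.add(grandchild + "\t" + grandparent);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> records = List.of("tom lucy", "tom jack", "lucy mary", "lucy ben");
        findPairs(records).forEach(System.out::println);
    }
}
```

Because names are assumed to be unique, a name can be used directly as the join key; with duplicate names the records would need some other identifier.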

How to run?

Set classpath according to reference document
Compile them
```
javac *.java
```

Pack them into a jar

```
jar cvf GrandchildGrandparent.jar *.class
```

Upload input file to HDFS

```
hadoop fs -mkdir -p /input/GrandchildGrandparent
hadoop fs -put child-parent.txt /input/GrandchildGrandparent
```

Create an output path (if it already exists, skip to the next step)
```
hadoop fs -mkdir /output
```

Run it

```
hadoop jar GrandchildGrandparent.jar GrandchildGrandparent /input/GrandchildGrandparent /output/GrandchildGrandparent
```

Cat or download output file

```
hadoop fs -cat /output/GrandchildGrandparent/part-r-00000
hadoop fs -get /output/GrandchildGrandparent/part-r-00000 GrandchildGrandparent.txt
```
