import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class MaxTempDriver {

    /**
     * Configures and submits the "Max Temperature" MapReduce job.
     *
     * @param args args[0] = HDFS input path, args[1] = HDFS output directory
     *             (the output directory must not already exist)
     * @throws Exception if job configuration or submission fails
     */
    public static void main(String[] args) throws Exception {
        // Fail fast with a usage hint instead of an ArrayIndexOutOfBoundsException.
        if (args.length != 2) {
            System.err.println("Usage: MaxTempDriver <input path> <output path>");
            System.exit(2);
        }

        // Job.getInstance() replaces the deprecated `new Job()` constructor.
        Job job = Job.getInstance();
        // setJarByClass lets Hadoop locate the jar containing this class
        // when shipping the job to the cluster.
        job.setJarByClass(MaxTempDriver.class);
        job.setJobName("Max Temperature");

        // Set input and output paths. The default input format is
        // TextInputFormat, so each record handed to the mapper is one line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Mapper, combiner and reducer classes (Map and Reduce are defined
        // elsewhere in this project). Reusing the reducer as a combiner is
        // safe here because taking a maximum is commutative and associative.
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        // Types of the final (reducer) output key/value pairs.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Block until the job completes; exit 0 on success, 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Driver Program
24. After saving all 3 programs, right-click on your project in the Package Explorer pane and click Export.
33. Check if the file is moved properly by displaying the file contents.
34. Now give the txt file stored in HDFS as input to the MapReduce program.
( /maxtemp_op is the directory in which the output will be stored.)
hadoop jar /home/cloudera/MaxTemp.jar MaxTempDriver /maxtemp_ip/MaxTemp_Data.txt /maxtemp_op
35.
36. Check the contents of the output directory. It will contain a file named “part-r-00000” which contains the
output of the program.
Display the contents of that file to check the output obtained.