Hadoop Hands on Session 1
Published in: Technology, Business
  1. Hadoop Hands on Session 1
  2. Agenda
       - Installation
       - Start Services
       - Job Tracker and Name Node UI Overview
       - Interaction with HDFS
       - Mapper
       - Reducer
       - Hadoop Job
       - Run Map Reduce on Hadoop
     Impetus Proprietary
  3. Installation
       - Hadoop installation is as easy as unzipping the binary distribution.
       - Set up key-based ssh login to localhost (note that the first command deletes any existing keys in .ssh):
           rm -rf .ssh/*
           ssh-keygen -t rsa > /dev/null
           ssh-copy-id -i localhost
           ssh localhost
     Impetus Confidential
  4. Start Services
       - First format the namenode:
           hadoop namenode -format
       - Now start all 4 services:
           hadoop namenode
           hadoop datanode
           hadoop jobtracker
           hadoop tasktracker
  5. UI Overview
       - Name node: http://localhost:50070/dfshealth.jsp
       - Data Node: http://localhost:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F
       - Job Tracker: http://localhost:50030/jobtracker.jsp
       - Task Tracker: http://localhost:50060/tasktracker.jsp
  6. Interaction with HDFS
  7. Mapper
         public static class DemoMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
             // Sample input line: 11~American President, The (1995)~Comedy|Drama|Romance
             @Override
             public void map(LongWritable key, Text value, Context context)
                     throws IOException, InterruptedException {
                 String[] records = value.toString().split("~");
                 // split() takes a regex, so the pipe must be escaped to split on a literal "|"
                 String[] genres = records[2].split("\\|");
                 for (String genre : genres) {
                     // Emit (genre, 1) for every genre the movie belongs to
                     context.write(new Text(genre), new IntWritable(1));
                 }
             }
         }
  8. Reducer
         public static class DemoReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
             @Override
             protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                     throws IOException, InterruptedException {
                 // Sum the 1s emitted by the mapper for this genre
                 int count = 0;
                 for (IntWritable value : values) {
                     count += value.get();
                 }
                 context.write(key, new IntWritable(count));
             }
         }
  9. Hadoop Job
         public static void main(String[] args) throws Exception {
             Configuration conf = new Configuration();
             Job job = new Job(conf);
             job.setJarByClass(DemoMapper.class);

             job.setMapperClass(DemoMapper.class);
             job.setMapOutputKeyClass(Text.class);
             job.setMapOutputValueClass(IntWritable.class);

             job.setReducerClass(DemoReducer.class);
             job.setOutputKeyClass(Text.class);
             job.setOutputValueClass(IntWritable.class);
             job.setNumReduceTasks(1); // 1 is default

             job.setInputFormatClass(TextInputFormat.class);
             job.setOutputFormatClass(TextOutputFormat.class);
             TextInputFormat.addInputPath(job, new Path("/mldata/movies.dat"));
             TextOutputFormat.setOutputPath(job, new Path("/mldata/moviesout/"));

             boolean result = job.waitForCompletion(true);
             System.out.println("Job status: " + result);
         }
  10. Thank You
        Q&A
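
The mapper and reducer logic from the slides can be tried without a running cluster. The following is a minimal plain-Java sketch, not the Hadoop job itself: it replays the same steps (split each line on "~", split the genre field on a literal pipe, emit a 1 per genre, sum per key) using an in-memory map in place of the shuffle. The class name GenreCountSketch, the method names, and the second input record are hypothetical and exist only for illustration. It also isolates one detail: String.split takes a regular expression, so a literal pipe has to be written as "\\|".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GenreCountSketch {

    // Map step: one "id~title~genres" line becomes a list of genre keys.
    public static List<String> mapLine(String line) {
        String[] records = line.split("~");
        // "|" is a regex metacharacter, so it is escaped to split on a literal pipe.
        return Arrays.asList(records[2].split("\\|"));
    }

    // Stand-in for shuffle + reduce: group identical keys and add 1 per occurrence.
    // A TreeMap keeps the keys sorted, similar to what a single reducer would see.
    public static Map<String, Integer> countGenres(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String genre : mapLine(line)) {
                counts.merge(genre, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            // Sample record taken from the Mapper slide
            "11~American President, The (1995)~Comedy|Drama|Romance",
            // Made-up record, added only so one genre occurs twice
            "99~Hypothetical Title (2000)~Comedy");
        // Print key and count separated by a tab, one pair per line.
        countGenres(lines).forEach((genre, count) -> System.out.println(genre + "\t" + count));
    }
}
```

With the two records above, the sketch reports Comedy twice and Drama and Romance once each. Running the same input through the real job would need the Hadoop services from the Start Services slide and the input file at the path configured in the Hadoop Job slide.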