2014 International Software Testing Conference in Seoul
Introduction to test Hadoop MapReduce code and working environment


Presentation Transcript

  • Testing Big Data: Unit Test in Hadoop (Part II) Jongwook Woo (PhD) High-Performance Internet Computing Center (HiPIC) Educational Partner with Cloudera and Grants Awardee of Amazon AWS Computer Information Systems Department California State University, Los Angeles Seoul Software Testing Conference
  • Contents
    – Test in General
    – Use Cases: Big Data in Hadoop and Ecosystems
    – Unit Test in Hadoop
  • Test in General
    – Quality Assurance: TDD (Test-Driven Development)
    – Unit Test: tests the functional units of the software
    – BDD (Behavior-Driven Development): based on TDD; tests the behavior of the software
    – Integration Test: tests integrated components; a group of unit tests
    – CI (Continuous Integration) server: Hudson, Jenkins, etc.
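As a minimal illustration of what a functional unit test looks like, here is a hypothetical plain-Java sketch (not code from the talk; JUnit would normally supply the assertion machinery). The unit under test splits a line into words, mirroring what the WordCount mapper does later:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch: a functional unit and a TDD-style test for it,
// written with a plain assertion instead of JUnit for brevity.
public class TokenizeTest {
    // The unit under test: split a line into words.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<String>();
        StringTokenizer t = new StringTokenizer(line);
        while (t.hasMoreTokens()) {
            words.add(t.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        // Unit test: a known input paired with its expected output.
        if (!tokenize("Hello World Bye World").equals(
                Arrays.asList("Hello", "World", "Bye", "World"))) {
            throw new AssertionError("tokenize failed");
        }
        System.out.println("tokenize test passed");
    }
}
```

A CI server would compile and run tests like this on every commit, which is the workflow the next slide describes.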
  • CI Server
    – A Continuous Integration server, based on TDD
    – All developers commit their updates every day
    – The CI server compiles the code and runs the unit tests
    – If a test fails, everyone receives a failure email, so the team knows who committed the bad code
    – Hudson, Jenkins, etc.; they support SCM version-control tools such as CVS, Subversion, and Git
  • Test in Hadoop
    – Much harder: plain JUnit cannot exercise code that runs on a Hadoop cluster
    – The code runs on cluster servers, in parallel
  • Use Cases: Shopzilla
    – "Hadoop's Elephant In The Room": Hadoop testing
    – Quality Assurance
      • Unit Test: functional units of the software
      • Integration Test: integrated components
      • BDD Test: behavior of the software
    – Augmented development: using a dev cluster takes too long per day, hence Hadoop-In-A-Box
  • Use Cases: Shopzilla
    – Hadoop-In-A-Box: a fully compatible mock environment
      • Works without a cluster by mocking the cluster state
      • Tests locally on a single-node pseudo cluster (MiniMRCluster)
      • => can test HDFS and Pig
  • Use Cases: Yahoo
    – Developers want to run Hadoop code on their local machines, not on the Hadoop cluster
    – Yahoo HIT (Hadoop Integration Test)
      • Runs Hadoop tests across the Hadoop ecosystem
      • Deploy HIT on a single Hadoop node or on a cluster
      • Run tests in Hadoop, Pig, Hive, Oozie, …
  • Unit Test in Hadoop
    – The MRUnit testing framework
      • is based on JUnit
      • was donated to Apache by Cloudera
      • can test MapReduce programs written for Hadoop 0.20, 0.23.x, 1.0.x, and 2.x
      • can test the Mapper, the Reducer, and the full Mapper/Reducer pipeline
  • Unit Test in Hadoop
    – WordCount example: reads text files and counts how often words occur; the input and the output are text files
    – Needs three classes:
      • WordCount.java – driver class with the main function
      • WordMapper.java – Mapper class with the map method
      • SumReducer.java – Reducer class with the reduce method
  • WordCount Example
    – WordMapper.java: the Mapper class with the map function
    – For the given sample input, assuming two map nodes, the input is distributed to the maps:
      • The first map emits: <Hello, 1> <World, 1> <Bye, 1> <World, 1>
      • The second map emits: <Hello, 1> <Hadoop, 1> <Goodbye, 1> <Hadoop, 1>
  • WordCount Example
    – SumReducer.java: the Reducer class with the reduce function
    – For the input from the two mappers, the reduce method simply sums the values, which are the occurrence counts for each key
    – Thus the output of the job is: <Bye, 1> <Goodbye, 1> <Hadoop, 2> <Hello, 2> <World, 2>
  • WordCount.java (Driver)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class WordCount {
      public static void main(String[] args) throws Exception {
        if (args.length != 2) {
          System.out.println("usage: [input] [output]");
          System.exit(-1);
        }
        Job job = Job.getInstance(new Configuration());
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setJarByClass(WordCount.class);
        job.submit();
      }
    }
  • WordCount.java – the driver performs these steps:
    – Check the input and output arguments
    – Set the output (key, value) types
    – Set the Mapper/Reducer classes
    – Set the input/output format classes
    – Set the input/output paths
    – Set the driver class with setJarByClass
    – Submit the job to the master node
  • WordMapper.java (Mapper class)

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordMapper extends Mapper<Object, Text, Text, IntWritable> {
      private Text word = new Text();
      private final static IntWritable one = new IntWritable(1);

      @Override
      public void map(Object key, Text value, Context context)
          throws IOException, InterruptedException {
        // Break the line into words for processing
        StringTokenizer wordList = new StringTokenizer(value.toString());
        while (wordList.hasMoreTokens()) {
          word.set(wordList.nextToken());
          context.write(word, one);
        }
      }
    }
  • WordMapper.java – key points:
    – Extends the Mapper class, parameterized with the input/output key and value types
    – The output (key, value) types are the Text word and the IntWritable one
    – The map method takes the input (key, value) pair and writes its output through the Context type
    – It reads the words from each line of the input file
    – It counts each word by emitting (word, 1)
  • Shuffler/Sorter
    – Maps emit (key, value) pairs
    – The Hadoop framework's shuffler/sorter sorts the (key, value) pairs by key, then appends the values to form (key, list of values) pairs
    – For example, the first and second maps emit:
      • <Hello, 1> <World, 1> <Bye, 1> <World, 1>
      • <Hello, 1> <Hadoop, 1> <Goodbye, 1> <Hadoop, 1>
    – The shuffler produces the reducer's input:
      • <Bye, 1>, <Goodbye, 1>, <Hadoop, <1,1>>, <Hello, <1,1>>, <World, <1,1>>
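The grouping step above can be sketched in plain Java. This is an illustrative simulation of what the shuffle does, not Hadoop code; a TreeMap stands in for the sort-by-key step:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of the shuffler/sorter: turn the (key, value) pairs
// emitted by the maps into sorted (key, list of values) pairs.
public class ShuffleSketch {
    static Map<String, List<Integer>> shuffle(List<String[]> emitted) {
        // TreeMap keeps keys sorted, matching Hadoop's sort-by-key step.
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String[] pair : emitted) {
            String key = pair[0];
            int value = Integer.parseInt(pair[1]);
            if (!grouped.containsKey(key)) {
                grouped.put(key, new ArrayList<Integer>());
            }
            grouped.get(key).add(value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String[]> emitted = new ArrayList<String[]>();
        // Pairs from the first map: <Hello,1> <World,1> <Bye,1> <World,1>
        emitted.add(new String[]{"Hello", "1"});
        emitted.add(new String[]{"World", "1"});
        emitted.add(new String[]{"Bye", "1"});
        emitted.add(new String[]{"World", "1"});
        // Pairs from the second map: <Hello,1> <Hadoop,1> <Goodbye,1> <Hadoop,1>
        emitted.add(new String[]{"Hello", "1"});
        emitted.add(new String[]{"Hadoop", "1"});
        emitted.add(new String[]{"Goodbye", "1"});
        emitted.add(new String[]{"Hadoop", "1"});
        // Prints {Bye=[1], Goodbye=[1], Hadoop=[1, 1], Hello=[1, 1], World=[1, 1]}
        System.out.println(shuffle(emitted));
    }
}
```

The printed map matches the reducer input shown on the slide: sorted keys, each with its list of values.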
  • SumReducer.java (Reducer class)

    import java.io.IOException;
    import java.util.Iterator;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
      private IntWritable totalWordCount = new IntWritable();

      @Override
      public void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int wordCount = 0;
        Iterator<IntWritable> it = values.iterator();
        while (it.hasNext()) {
          wordCount += it.next().get();
        }
        totalWordCount.set(wordCount);
        context.write(key, totalWordCount);
      }
    }
  • SumReducer.java – key points:
    – Extends the Reducer class, parameterized with the input/output key and value types
    – totalWordCount holds the output value (an IntWritable)
    – The reduce method takes the input (key, list of values) pair and writes its output through the Context class
    – For each word, it sums the number of values
    – For each word, the total count becomes the output value
  • SumReducer
    – Input: what the shuffler produces becomes the input of the reducer
      • <Bye, 1>, <Goodbye, 1>, <Hadoop, <1,1>>, <Hello, <1,1>>, <World, <1,1>>
    – Output
      • <Bye, 1>, <Goodbye, 1>, <Hadoop, 2>, <Hello, 2>, <World, 2>
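The reduce step amounts to summing each key's value list. A plain-Java sketch of that one step (illustrative only, not the Hadoop Reducer itself):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the reduce step: for one key, sum the grouped values.
public class ReduceSketch {
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // <Hadoop, <1,1>> becomes <Hadoop, 2>
        System.out.println("Hadoop -> " + sum(Arrays.asList(1, 1)));
        // <Bye, <1>> stays <Bye, 1>
        System.out.println("Bye -> " + sum(Arrays.asList(1)));
    }
}
```

Applying this per key to the shuffler output above yields exactly the job output listed on the slide.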
  • MRUnit Test
    – How to unit-test in Hadoop: extend JUnit tests with the org.apache.hadoop.mrunit.* API
    – Test the driver, Mapper, and Reducer with MapReduceDriver, MapDriver, and ReduceDriver
    – Add input along with the expected output
  • MRUnit Test (TestWordCount setup)

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.MapDriver;
    import org.apache.hadoop.mrunit.MapReduceDriver;
    import org.apache.hadoop.mrunit.ReduceDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class TestWordCount {
      MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;
      MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
      ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

      @Before
      public void setUp() {
        WordMapper mapper = new WordMapper();
        SumReducer reducer = new SumReducer();
        mapDriver = new MapDriver<LongWritable, Text, Text, IntWritable>();
        mapDriver.setMapper(mapper);
        reduceDriver = new ReduceDriver<Text, IntWritable, Text, IntWritable>();
        reduceDriver.setReducer(reducer);
        mapReduceDriver = new MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable>();
        mapReduceDriver.setMapper(mapper);
        mapReduceDriver.setReducer(reducer);
      }
  • MRUnit Test (test methods)

      @Test
      public void testMapper() {
        mapDriver.withInput(new LongWritable(1), new Text("cat cat dog"));
        mapDriver.withOutput(new Text("cat"), new IntWritable(1));
        mapDriver.withOutput(new Text("cat"), new IntWritable(1));
        mapDriver.withOutput(new Text("dog"), new IntWritable(1));
        mapDriver.runTest();
      }

      @Test
      public void testReducer() {
        List<IntWritable> values = new ArrayList<IntWritable>();
        values.add(new IntWritable(1));
        values.add(new IntWritable(1));
        reduceDriver.withInput(new Text("cat"), values);
        reduceDriver.withOutput(new Text("cat"), new IntWritable(2));
        reduceDriver.runTest();
      }

      @Test
      public void testMapReduce() {
        mapReduceDriver.withInput(new LongWritable(1), new Text("cat cat dog"));
        mapReduceDriver.addOutput(new Text("cat"), new IntWritable(2));
        mapReduceDriver.addOutput(new Text("dog"), new IntWritable(1));
        mapReduceDriver.runTest();
      }
    }
  • TestWordCount – key points:
    – Using the MRUnit API, declare the MapReduce, Mapper, and Reducer drivers with their input/output (key, value) types
    – setUp() is annotated @Before, so it runs before each test method executes
    – It instantiates the WordCount Mapper and Reducer
    – It instantiates the Mapper driver with its input/output (key, value) types and sets its mapper
    – It instantiates the Reducer driver with its input/output (key, value) types and sets its reducer
    – It instantiates the Mapper/Reducer driver with its input/output (key, value) types and sets both
  • MRUnit Test – the three test methods:
    – testMapper(): defines sample input for the Mapper along with the expected output
    – testReducer(): defines sample input for the Reducer along with the expected output
    – testMapReduce(): defines sample input for the full Mapper/Reducer pipeline along with the expected output
  • MRUnit Test in Practice
    – How many unit tests need to be implemented? One for every Map, Reduce, and Driver class
    – Problems? They mostly work, but MRUnit does not support complicated Map/Reduce APIs
    – How many problems you can detect depends on how well you implement the MRUnit code
  • Conclusion
    – Use MRUnit for Hadoop unit-test development
    – Integrate it with the QA site through a CI server
    – You need to use it
  • Questions?
  • References
    1. Hadoop WordCount example with the new MapReduce API (http://codesfusion.blogspot.com/2013/10/hadoop-wordcount-with-new-map-reduce-api.html)
    2. Hadoop Word Count Example (http://wiki.apache.org/hadoop/WordCount)
    3. Example: WordCount v1.0, Cloudera Hadoop Tutorial (http://www.cloudera.com/content/cloudera-content/cloudera-docs/HadoopTutorial/CDH4/Hadoop-Tutorial/ht_walk_through.html)
    4. Testing Word Count (https://cwiki.apache.org/confluence/display/MRUNIT/Testing+Word+Count)
    5. Apache MRUnit Tutorial (https://cwiki.apache.org/confluence/display/MRUNIT/MRUnit+Tutorial)
    6. Hadoop Integration Test Suite, Shopzilla (https://github.com/shopzilla/hadoop-integration-test-suite)
    7. Hadoop's Elephant in the Room, Jeremy Lucas, Shopzilla (http://tech.shopzilla.com/2013/04/hadoops-elephant-in-the-room/)
    8. Facebook Test MapReduce Local (https://github.com/facebook/hadoop-20/blob/master/src/test/org/apache/hadoop/mapreduce/TestMapReduceLocal.java)
    9. Yahoo HIT, Hadoop Integrated Testing (http://www.slideshare.net/ydn/hi-tv3?from_search=1)