Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analysed with traditional computing techniques.
3. OMG!! Did he just ask me to catch rats in a
place full of snakes?
4. Agenda
1. What is Big Data
2. Characteristics of Big Data
3. Meaning of BIG DATA to “US”
4. Hadoop
6. Submitting a Map Reduce Job
5. What is BIG DATA?
• ‘Big Data’ is similar to ‘small data’, but bigger in size
• Big Data generates value from the storage and processing of very large
quantities of digital information that cannot be analyzed with traditional
computing techniques.
• Walmart handles more than 1 million customer transactions every hour.
• Facebook handles 40 billion photos from its user base.
• Decoding the human genome originally took 10 years to process; now it can
be achieved in one week.
6. Three Characteristics of Big Data: The 3 Vs
Volume
• Data quantity
Velocity
• Data Speed
Variety
• Data Types
7. What Does BIG DATA TESTING Mean to Testers?
Take into consideration these 3 perspectives:
• Data
• Infrastructure
• Validation Tools
8. Now the question comes: what technology is
needed for handling BIG DATA?
1. HADOOP
9. Hadoop & Its Components
• Hadoop is an open-source software framework for storing and processing big data
in a distributed fashion on large clusters of commodity hardware. Essentially, it
accomplishes two tasks: massive data storage and faster processing.
Source: http://www.trieuvan.com/apache/hadoop/common/
10. How is Hadoop Helping?
• HDFS: A Java-based distributed file system that can store all kinds of data
• Map Reduce: A software programming model for processing large sets of
data in parallel
• YARN: A resource management framework for scheduling and handling
resource requests from distributed applications
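The Map Reduce model above can be illustrated without a cluster. The following is a minimal in-memory sketch, not Hadoop code: the class name MiniMapReduce and the "year,temperature" record format are assumptions made for illustration only. It shows the three steps the framework performs: map each record to a key/value pair, group the values by key (the shuffle), and reduce each group to one result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the map -> shuffle -> reduce flow; no Hadoop involved.
// The "year,temperature" input format is a hypothetical simplification.
final class MiniMapReduce {

    // Map phase: turn one "year,temperature" record into a (year, temperature) pair.
    static Map.Entry<String, Integer> map(String record) {
        String[] fields = record.split(",");
        return Map.entry(fields[0].trim(), Integer.parseInt(fields[1].trim()));
    }

    // Shuffle phase: collect every mapped value under its key, sorted by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values of one key to their maximum.
    static int reduce(List<Integer> values) {
        int max = Integer.MIN_VALUE;
        for (int v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    // Runs the three phases in sequence and returns year -> maximum temperature.
    static Map<String, Integer> run(List<String> records) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String record : records) {
            pairs.add(map(record));
        }
        Map<String, Integer> result = new TreeMap<>();
        shuffle(pairs).forEach((year, temps) -> result.put(year, reduce(temps)));
        return result;
    }

    public static void main(String[] args) {
        // prints {1949=111, 1950=22}
        System.out.println(run(List.of("1949,111", "1950,0", "1949,78", "1950,22", "1950,-11")));
    }
}
```

On a real cluster the map and reduce calls run in parallel on different machines and the framework handles the grouping; the sketch only mirrors the data flow.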
11. This is our Input File: Input Sampleset.txt
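12. Map Reduce Program For Max Temperature:
Driver Class
// Expects two arguments: the input path and the output path.
Job job = Job.getInstance(); // replaces the deprecated new Job()
job.setJarByClass(MaxTemperatureDriver.class);
job.setJobName("Max Temperature");
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(MaxTemperatureMapper.class);
job.setReducerClass(MaxTemperatureReducer.class);
// Output types must match what the reducer writes.
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
// Submit the job and block until it finishes.
System.exit(job.waitForCompletion(true) ? 0 : 1);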
13. Mapper Class
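@Override
public void map(LongWritable key, Text value, Context context)
    throws IOException, InterruptedException {
  String line = value.toString();
  // Fixed-width record: year at offsets 15-18, signed temperature at 87-91.
  String year = line.substring(15, 19);
  int airTemperature;
  if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
    airTemperature = Integer.parseInt(line.substring(88, 92));
  } else {
    airTemperature = Integer.parseInt(line.substring(87, 92));
  }
  // Assumed completion (the slide is cut off here): emit (year, temperature),
  // the pair types the reducer's signature expects.
  context.write(new Text(year), new IntWritable(airTemperature));
}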
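The parsing logic of the mapper can be tried outside Hadoop. The sketch below assumes only what the slide shows: the year sits at offsets 15 to 18 and the signed temperature at offsets 87 to 91 of each line. The class TemperatureRecordParser and the helper sampleLine are hypothetical names, and the zero-padded sample record is synthetic, not taken from the real input file. Since Java 7, Integer.parseInt accepts a leading plus sign, so the explicit branch matters mainly on older runtimes; it is kept here to mirror the mapper.

```java
// Stand-alone version of the mapper's fixed-width parsing; no Hadoop types needed.
final class TemperatureRecordParser {
    static final int YEAR_START = 15;
    static final int YEAR_END = 19;   // exclusive
    static final int SIGN_INDEX = 87;
    static final int TEMP_END = 92;   // exclusive

    // Extracts the four-character year field.
    static String year(String line) {
        return line.substring(YEAR_START, YEAR_END);
    }

    // Extracts the temperature, skipping a leading '+' as the mapper does.
    static int airTemperature(String line) {
        int start = line.charAt(SIGN_INDEX) == '+' ? SIGN_INDEX + 1 : SIGN_INDEX;
        return Integer.parseInt(line.substring(start, TEMP_END));
    }

    // Hypothetical helper: builds a synthetic 93-character record filled with
    // zeros, with the given year and 5-character signed temperature in place.
    static String sampleLine(String year, String signedTemp) {
        StringBuilder sb = new StringBuilder("0".repeat(93));
        sb.replace(YEAR_START, YEAR_END, year);
        sb.replace(SIGN_INDEX, TEMP_END, signedTemp);
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = sampleLine("1950", "+0022");
        // prints 1950 -> 22
        System.out.println(year(line) + " -> " + airTemperature(line));
    }
}
```

A tester can use the same approach to feed boundary records (negative values, short lines) to the parsing code before running a full job.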
14. Reducer Class
  @Override
  public void reduce(Text key, Iterable<IntWritable> values,
      Context context)
      throws IOException, InterruptedException {
    // Start below any possible reading so negative temperatures are handled.
    int maxValue = Integer.MIN_VALUE;
    for (IntWritable value : values) {
      maxValue = Math.max(maxValue, value.get());
    }
    // Emit one (year, maximum temperature) pair per key.
    context.write(key, new IntWritable(maxValue));
  }
}
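From a tester's point of view, the reducer output still has to be validated. With Hadoop's default text output, each (year, maximum) pair is written as one line with the key and the value separated by a tab. The sketch below is one possible check, assuming that format; the class OutputValidator and the sample values are hypothetical and are not results from the job in this deck.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Compares tab-separated "year<TAB>max" output lines with expected values.
final class OutputValidator {

    // Parses "key<TAB>value" lines into a sorted map, ignoring empty lines.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> result = new TreeMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t");
            result.put(parts[0], Integer.parseInt(parts[1]));
        }
        return result;
    }

    // True when the parsed output contains exactly the expected pairs.
    static boolean matches(List<String> actualLines, Map<String, Integer> expected) {
        return parse(actualLines).equals(expected);
    }

    public static void main(String[] args) {
        // Sample lines invented for illustration.
        List<String> output = List.of("1949\t111", "1950\t22");
        // prints true
        System.out.println(matches(output, Map.of("1949", 111, "1950", 22)));
    }
}
```

The expected map would normally come from an independent calculation on a small, known input set, so that a mismatch points at the job rather than at the test data.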
15. Thank You
For more information, please:
• Contact us at info@qainfotech.com
• Visit us at www.qainfotech.com
• Read our blog at www.qainfotech.com/blog
• Follow us on Twitter at www.twitter.com/qainfotech
USA Office: Farmington Hills, Michigan, U.S.A.
International Headquarters: Noida, Uttar Pradesh, India