Apache Hadoop & MapReduce

Published in: Technology

  1. Apache Hadoop, Big Data & MapReduce
     WHY BIG DATA: "More data usually beats better algorithms."
     GOOD NEWS: "Big data is here."
     BAD NEWS: We are struggling to store and analyze it.
     KEY PROBLEM: "Storage increased, not speed."
     SOLUTION: Parallelism
     However, implementing parallelism raises some noteworthy problems:
       • Hardware failure
       • Combining data
     Hadoop overcomes these problems by using:
       • HDFS (Hadoop Distributed File System)
       • MapReduce (built on keys and values)
  2. In a nutshell, Hadoop provides:
       • Reliable shared storage (through HDFS)
       • A reliable analysis system (through MapReduce)
     MAPREDUCE:
       • The entire dataset, or a good portion of it, is processed for each query.
       • MapReduce is a batch query processor.
       • Already used by Mailtrust, Rackspace's mail division, to handle big data.
     MAPREDUCE VS RDBMS:
     CONCLUSION: This is not a thorough treatment. Further research will clarify and distinguish these ideas, and more material will be added in the coming days.
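
The key/value model behind MapReduce can be illustrated with a small word count. This is only a minimal single-process sketch in plain Python, not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for illustration. On a real cluster, the map and reduce steps would run in parallel on many machines over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) key/value pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values that share a key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over the whole input (a batch job)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input, chosen only for illustration.
counts = word_count(["big data is here", "more data beats better algorithms"])
print(counts)
```

Because each line is mapped independently and each key is reduced independently, the work can be split across machines, and a failed piece can be rerun without repeating the whole job. Grouping by key is what addresses the "combining data" problem noted in the slides.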
