Introducing the Hadoop Ecosystem

Introducing the Hadoop Ecosystem, a presentation I gave at KMO Kennisbeurs on October 25th

  1. Introducing the Hadoop Ecosystem
  2. Context: Performance Gap Trend
  3. Context: Exponential for Decades. Abundance of computing & storage, generated data (estimated 8 ZB in ’15), and things. More data provides greater value. Traditional data doesn’t scale well. It’s time for a new approach!
  4. New Hardware Approach. Traditional: exotic HW (big central servers, SAN, RAID); hardware reliability; limited scalability; expensive. Big Data: commodity HW (racks of pizza boxes, Ethernet, JBOD); unreliable HW; scales further; cost effective.
  5. New Software Approach. Traditional: monolithic (centralized, RDBMS); schema first; proprietary. Big Data: distributed (storage & compute nodes); raw data; open source.
  6. Hadoop. De facto big data industry standard (batch). Vendor adoption: IBM, Microsoft, Oracle, EMC, ... A collection of projects at Apache: HDFS, MapReduce, Hive, Pig, HBase, Flume, Oozie, ... Main components: HDFS and MapReduce. Cluster: set of machines running HDFS and MapReduce.
  7. HDFS
  8. MapReduce
  9. MapReduce
  10. MapReduce
  11. Typical Adoption Pattern. An idea that’s impractical without Hadoop. Build Hadoop-based POC. Move initial application to production. Add more datasets and users: removing data silos in organizations, permitting easy experiments on real data. Snowballs into institution’s central repository for analysis, data processing, and data service layer.
  12. Use Case 1: Truvo
  13. Use Case 2: UZ Brussel
  14. How can you use Hadoop? What data are you ignoring? How can you use it? How can you combine internal and external data? Business partners; feedback from your customers through social media; end your data silos; ...
  15. DataCrunchers - Big Data Enablers
  16. Introduction to the Hadoop Ecosystem
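
The MapReduce slides survive in this transcript only as titles, so the following is a minimal sketch of the general map, shuffle, and reduce pattern and not a reconstruction of what the slides showed. It runs in a single Python process instead of on a Hadoop cluster, and every function name in it (map_phase, shuffle, reduce_phase, word_count) is a hypothetical name chosen for illustration, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

On a real cluster the map and reduce functions would run in parallel on many machines, with the input lines read from HDFS; here the three phases simply run one after another to show the data flow.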
