Introducing the Hadoop Ecosystem

Introducing the Hadoop Ecosystem, a presentation I gave at KMO Kennisbeurs on October 25th

Transcript

  • 1. Introducing the Hadoop Ecosystem
  • 2. Context: Performance Gap Trend Introduction to the Hadoop Ecosystem 2
  • 3. Context: Exponential for Decades
      Abundance of
      - computing & storage
      - generated data (estimated 8 ZB in ’15)
      - things
      More data provides greater value
      Traditional data doesn’t scale well
      It’s time for a new approach!
  • 4. New Hardware Approach
      Traditional
      - Exotic HW: big central servers, SAN, RAID
      - Hardware reliability
      - Limited scalability
      - Expensive
      Big Data
      - Commodity HW: racks of pizza boxes, Ethernet, JBOD
      - Unreliable HW
      - Scales further
      - Cost effective
  • 5. New Software Approach
      Traditional
      - Monolithic: centralized, RDBMS
      - Schema first
      - Proprietary
      Big Data
      - Distributed: storage & compute nodes
      - Raw data
      - Open source
  • 6. Hadoop
      De facto big data industry standard (batch)
      Vendor adoption
      - IBM, Microsoft, Oracle, EMC, ...
      A collection of projects at Apache
      - HDFS, MapReduce, Hive, Pig, HBase, Flume, Oozie, ...
      Main components
      - HDFS
      - MapReduce
      Cluster
      - Set of machines running HDFS and MapReduce
  • 7. HDFS
  • 8. MapReduce
  • 9. MapReduce
  • 10. MapReduce
  • 11. Typical Adoption Pattern
      An idea that’s impractical without Hadoop
      Build Hadoop-based POC
      Move initial application to production
      Add more datasets and users
      - removing data silos in organizations
      - permitting easy experiments on real data
      Snowballs into institution’s central repository for
      - analysis
      - data processing
      - data service layer
  • 12. Use Case 1: Truvo
  • 13. Use Case 2: UZ Brussel
  • 14. How can you use Hadoop?
      What data are you ignoring?
      - How can you use it?
      How can you combine internal and external data?
      - Business partners
      - Feedback from your customers through social media
      - End your data silos
      - ...
  • 15. DataCrunchers - Big Data Enablers
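
The MapReduce slides (8 to 10) were diagrams and left no text in this transcript. As a rough illustration of the programming model they refer to, here is a minimal single-process sketch of the classic word-count job. It is an assumption about what the diagrams covered, not a reproduction of them, and it does not use the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel over data stored in HDFS.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse the values of one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence on an in-memory input."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored in HDFS.
    sample = ["big data big value", "data silos"]
    print(word_count(sample))
```

The point of the split is that each map call sees only one line and each reduce call sees only one key, which is what would let a framework distribute the work across the machines in a cluster.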