Introducing the Hadoop Ecosystem

Introducing the Hadoop Ecosystem, a presentation I gave at KMO Kennisbeurs on October 25th

Transcript

  • 1. Introducing the Hadoop Ecosystem
  • 2. Context: Performance Gap Trend Introduction to the Hadoop Ecosystem 2
  • 3. Context: Exponential for Decades. Abundance of: computing & storage; generated data (estimated 8 ZB in ’15); things. More data provides greater value. Traditional data doesn’t scale well. It’s time for a new approach!
  • 4. New Hardware Approach. Traditional: exotic HW (big central servers, SAN, RAID); hardware reliability; limited scalability; expensive. Big Data: commodity HW (racks of pizza boxes, Ethernet, JBOD); unreliable HW; scales further; cost effective.
  • 5. New Software Approach. Traditional: monolithic (centralized, RDBMS); schema first; proprietary. Big Data: distributed (storage & compute nodes); raw data; open source.
  • 6. Hadoop. De facto big data industry standard (batch). Vendor adoption: IBM, Microsoft, Oracle, EMC, ... A collection of projects at Apache: HDFS, MapReduce, Hive, Pig, HBase, Flume, Oozie, ... Main components: HDFS and MapReduce. Cluster: a set of machines running HDFS and MapReduce.
  • 7. HDFS
  • 8. MapReduce
  • 9. MapReduce
  • 10. MapReduce
  • 11. Typical Adoption Pattern. An idea that’s impractical without Hadoop. Build a Hadoop-based POC. Move the initial application to production. Add more datasets and users: removing data silos in organizations; permitting easy experiments on real data. Snowballs into the institution’s central repository for: analysis; data processing; data service layer.
  • 12. Use Case 1: Truvo
  • 13. Use Case 2: UZ Brussel
  • 14. How can you use Hadoop? What data are you ignoring? How can you use it? How can you combine internal and external data? Business partners; feedback from your customers through social media; end your data silos; ...
  • 15. DataCrunchers - Big Data Enablers
  • 16. Introduction to the Hadoop Ecosystem
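
Slides 8 to 10 are titled MapReduce, but only their titles appear in the transcript. As a rough illustration of the programming model those slides name, here is a minimal single-process word count in Python. It is a sketch only: it does not use the Hadoop API, it runs on one machine rather than a cluster, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this example and are not taken from the presentation.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, the step a framework
    performs between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, invented for this illustration.
    sample = ["big data needs big clusters", "data beats opinions"]
    print(word_count(sample))
```

In a real cluster the mappers and reducers would run in parallel on different nodes, with the input lines read from HDFS blocks. Here everything runs in one process so the three phases are easy to follow.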
