Introducing the Hadoop Ecosystem
Introducing the Hadoop Ecosystem, a presentation I gave at KMO Kennisbeurs on October 25th

Transcript

  • 1. Introducing the Hadoop Ecosystem
  • 2. Context: Performance Gap Trend Introduction to the Hadoop Ecosystem 2
  • 3. Context: Exponential for Decades
      Abundance of
      - computing & storage
      - generated data (estimated 8 ZB in ’15)
      - things
      More data provides greater value
      Traditional data doesn’t scale well
      It’s time for a new approach!
  • 4. New Hardware Approach
      Traditional
      - Exotic HW: big central servers, SAN, RAID
      - Hardware reliability
      - Limited scalability
      - Expensive
      Big Data
      - Commodity HW: racks of pizza boxes, Ethernet, JBOD
      - Unreliable HW
      - Scales further
      - Cost effective
  • 5. New Software Approach
      Traditional
      - Monolithic: centralized, RDBMS
      - Schema first
      - Proprietary
      Big Data
      - Distributed: storage & compute nodes
      - Raw data
      - Open source
  • 6. Hadoop
      De facto big data industry standard (batch)
      Vendor adoption
      - IBM, Microsoft, Oracle, EMC, ...
      A collection of projects at Apache
      - HDFS, MapReduce, Hive, Pig, HBase, Flume, Oozie, ...
      Main components
      - HDFS
      - MapReduce
      Cluster
      - Set of machines running HDFS and MapReduce
  • 7. HDFS
  • 8. MapReduce
  • 9. MapReduce
  • 10. MapReduce
  • 11. Typical Adoption Pattern
      An idea that’s impractical without Hadoop
      Build Hadoop-based POC
      Move initial application to production
      Add more datasets and users
      - removing data silos in organizations
      - permitting easy experiments on real data
      Snowballs into institution’s central repository for
      - analysis
      - data processing
      - data service layer
  • 12. Use Case 1: Truvo
  • 13. Use Case 2: UZ Brussel
  • 14. How can you use Hadoop?
      What data are you ignoring?
      - How can you use it?
      How can you combine internal and external data?
      - Business partners
      - Feedback from your customers through social media
      - End your data silos
      - ...
  • 15. DataCrunchers - Big Data Enablers
  • 16. Introduction to the Hadoop Ecosystem
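
The MapReduce slides (8 to 10) survive in this transcript only as titles, so the idea they name can be hard to picture. As a rough illustration, here is a minimal in-memory sketch of the map, shuffle, and reduce steps applied to a word count, written in plain Python. It is an assumption-laden toy and not Hadoop's API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical, and a real cluster would spread each phase over many machines reading from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored on a cluster.
    sample = ["big data big value", "data silos"]
    print(word_count(sample))
```

Because map_phase looks at one line at a time and reduce_phase at one key at a time, both steps could in principle run in parallel on separate machines, which is the property that lets the model scale on commodity hardware.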