Analytics on your data in place
Steve Watt, Red Hat

CC flickr Barta IV

@wattsteve
Hadoop at Red Hat

But tonight I have my community hat on

CC flickr wcdumonts

Hadoop in 2007
Platform Layers / Technologies
Computational Runtimes
FileSystems: HDFS or Amazon S3
Infrastructures

CC fli...
Hadoop in 2013
Platform Layers / Technologies
Computational Runtimes: YARN, Giraph, MapReduce, HBase, Phoenix, Spark/BDAS,...
Observation #1: The Hadoop FileSystem Interface is
the keystone of the entire Ecosystem

CC flickr grufnik

Observation #2: Moving data around just to analyze it
is slow and expensive, especially if it requires a redundant
reposit...
So how does this work?
By leveraging Hadoop’s pluggable FileSystem architecture
Hadoop FS Clients: MapReduce, HBase, YARN
...
Hadoop FileSystem Configuration for HDFS

Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
Hadoop FileSystem In...
What are some examples of where big
data is stored?
- Object Stores
- NoSQL Stores
- Distributed FileSystems
- Network Fil...
Network Filer Example
Hadoop FileSystem Configuration for GlusterFS
Hadoop FS Clients: MapReduce, HBase, YARN, Any Applica...
Network Filer - Apache Hadoop on GlusterFS
[Diagram labels: Hadoop Master Services (Resource Manager, Management Server), plugin, SWIFT, H...]
Object Store Example
Hadoop FileSystem Configuration for SWIFT
Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
...
NoSQL Example
Hadoop FileSystem Configuration for CassandraFS
Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
...
NoSQL - Apache Hadoop on CassandraFS

We are working on filesystem tests within
Apache Hadoop-Common and Apache Bigtop
as well as opening up ecosystem tools

CC...
@wattsteve
@wattsteve
Closing Remarks
1. The number of Hadoop FileSystems available
to you continues to increase
2. This is good! A vibrant ecos...
  • Now let's look at some examples
  • Hadoop file systems

    1. Analytics on your data in place. Steve Watt, Red Hat. CC flickr Barta IV. @wattsteve
    2. Hadoop at Red Hat
    3. But tonight I have my community hat on. CC flickr wcdumonts
    4. Hadoop in 2007. Platform Layers / Technologies. Computational Runtimes: MapReduce, HBase. FileSystems: HDFS or Amazon S3. Infrastructures: x86 or Amazon EC2. CC flickr wwarby
    5. Hadoop in 2013. Platform Layers / Technologies. Computational Runtimes: YARN, Giraph, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger. FileSystems: HDFS + 13 Other Hadoop FileSystems. Infrastructures: System on a Chip, x86, Virtualization and Cloud. CC flickr lowfatbrains
    6. Observation #1: The Hadoop FileSystem Interface is the keystone of the entire Ecosystem. CC flickr grufnik
    7. Observation #2: Moving data around just to analyze it is slow and expensive, especially if it requires a redundant repository. CC flickr traftery
    8. So how does this work? By leveraging Hadoop’s pluggable FileSystem architecture. Diagram layers: Hadoop FS Clients (MapReduce, HBase, YARN, Any Application); Hadoop FileSystem Interface; Hadoop FileSystem Plugin; Hadoop FileSystem (FileSystem Implementation)
    9. Hadoop FileSystem Configuration for HDFS. Diagram layers: Hadoop FS Clients (MapReduce, HBase, YARN, Any Application); Hadoop FileSystem Interface; HDFS Plugin; Hadoop FileSystem (HDFS)
    10. What are some examples of where big data is stored? Object Stores, NoSQL Stores, Distributed FileSystems, Network Filers, Databases. CC flickr birdwatcher63
    11. Network Filer Example: Hadoop FileSystem Configuration for GlusterFS. Diagram layers: Hadoop FS Clients (MapReduce, HBase, YARN, Any Application); Hadoop FileSystem Interface; GlusterFS Plugin; Hadoop FileSystem
    12. Network Filer - Apache Hadoop on GlusterFS. [Diagram labels: Hadoop Master Services (Resource Manager, Management Server) with plugin; Hadoop Workers (Node Managers), each with plugin; SWIFT, NFS, and FUSE access to GlusterFS; Trusted Peers Server 1, Server 2, ... Server 50, each with a DAS Brick]
    13. Object Store Example: Hadoop FileSystem Configuration for SWIFT. Diagram layers: Hadoop FS Clients (MapReduce, HBase, YARN, Any Application); Hadoop FileSystem Interface; SWIFT Plugin; Hadoop FileSystem (SWIFT)
    14. NoSQL Example: Hadoop FileSystem Configuration for CassandraFS. Diagram layers: Hadoop FS Clients (MapReduce, HBase, YARN, Any Application); Hadoop FileSystem Interface; CassandraFS Plugin; Hadoop FileSystem
    15. NoSQL - Apache Hadoop on CassandraFS
    16. We are working on filesystem tests within Apache Hadoop-Common and Apache Bigtop as well as opening up ecosystem tools. CC flickr syume
    17. @wattsteve
    18. @wattsteve
    19. Closing Remarks: (1) The number of Hadoop FileSystems available to you continues to increase. (2) This is good! A vibrant ecosystem gives you choice. (3) Evaluate the option of analyzing your data in place before deploying new environments. CC flickr zoomboy1
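
The pluggable architecture from slides 8 to 14 can be illustrated with a small sketch. This is not Hadoop code: it is a minimal Python model, under stated assumptions, of the idea that clients program against one FileSystem interface while a plugin is selected by the URI scheme of the path (in Hadoop itself this mapping is typically driven by configuration properties of the form `fs.<scheme>.impl`). The names `FileSystem`, `InMemoryFileSystem`, `FileSystemRegistry`, and `word_count` are hypothetical and exist only for this illustration.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class FileSystem(ABC):
    """Hypothetical stand-in for the Hadoop FileSystem interface."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None:
        ...


class InMemoryFileSystem(FileSystem):
    """Toy plugin; a real plugin would talk to HDFS, GlusterFS, Swift, etc."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._files = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


class FileSystemRegistry:
    """Maps a URI scheme (hdfs, glusterfs, swift, ...) to a plugin instance."""

    def __init__(self) -> None:
        self._plugins = {}

    def register(self, scheme: str, fs: FileSystem) -> None:
        self._plugins[scheme] = fs

    def resolve(self, uri: str) -> FileSystem:
        scheme = urlparse(uri).scheme
        if scheme not in self._plugins:
            raise ValueError(f"no FileSystem plugin registered for {scheme!r}")
        return self._plugins[scheme]


def word_count(registry: FileSystemRegistry, uri: str) -> int:
    """A 'client' that only knows the interface, not the storage behind it."""
    fs = registry.resolve(uri)
    data = fs.read(urlparse(uri).path)
    return len(data.decode("utf-8").split())


if __name__ == "__main__":
    registry = FileSystemRegistry()
    registry.register("hdfs", InMemoryFileSystem("hdfs"))
    registry.register("glusterfs", InMemoryFileSystem("glusterfs"))

    # The data already lives in the "glusterfs" store; the client analyzes it there.
    registry.resolve("glusterfs://vol/").write("/logs/app.log", b"error warn error")
    print(word_count(registry, "glusterfs://vol/logs/app.log"))  # prints 3
```

The point of the design is that `word_count` never changes when a new store is added; only a new plugin is registered, which mirrors how the same MapReduce, HBase, or YARN clients sit on top of the HDFS, GlusterFS, SWIFT, or CassandraFS plugins in the slides.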