Dynamic Namespace Partitioning with the Giraffa File System



Konstantin V. Shvachko            Plamen Jeliazkov
Founder, Altoscale                UC San Diego




 June 14, 2012                    Hadoop Summit 2012


      AltoScale
Introduction


Plamen
    Fresh grad from UCSD
    Internship with Hadoop Platform Team at eBay
    Wrote Giraffa prototype

Konstantin
    Founder of Altoscale. Primary focus
      1. Altoscale Workbench
         Hadoop & HBase cluster on a public or a private cloud
      2. Giraffa
    Apache Hadoop PMC
    HDFS scalability


Contents


Background
Motivation
Architecture
Main problems and solutions
    Bootstrapping
    Namespace Partitioning
    Rename




Giraffa



Giraffa is a distributed,
 highly available file system


Utilizes features of
 HDFS and HBase


New open source project
 in experimental stage


Origin:


Giraffe. Latin: Giraffa camelopardalis
    Family      Giraffidae
    Genus       Giraffa
    Species     Giraffa camelopardalis

Other languages
    Arabic      Zarafa
    Spanish     Jirafa
    Bulgarian   жирафа
    Italian     Giraffa

Favorites of my daughter
        o As the Hadoop traditions require

Apache Hadoop


A reliable, scalable, high performance distributed
 computing system
The Hadoop Distributed File System (HDFS)
    Reliable storage layer

MapReduce – distributed computation framework
    Simple computational model

Hadoop scales computation capacity, storage capacity,
 and I/O bandwidth by adding commodity servers.




The Design Principles


Linear scalability
    More nodes can do more work within the same time
    On Data size and Compute resources

Reliability and Availability
    1 drive fails in 3 years. Probability of failing today 1/1000.
    Several drives fail on a cluster with thousands of drives

Move computation to data
    Minimize expensive data transfers

Sequential data processing
    Avoid random reads
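
The reliability arithmetic on this slide can be checked directly. A minimal sketch, assuming independent failures spread evenly over a three-year drive lifetime; the 4,000-drive cluster is a hypothetical size chosen for illustration, not a figure from the slides.

```python
# Assumption: each drive fails once in about 3 years, and failures are
# independent and uniformly spread over that lifetime.
DAYS_PER_YEAR = 365
LIFETIME_YEARS = 3


def daily_failure_probability(lifetime_years: float = LIFETIME_YEARS) -> float:
    """Probability that a given drive fails on a given day."""
    return 1.0 / (lifetime_years * DAYS_PER_YEAR)


def expected_daily_failures(num_drives: int) -> float:
    """Expected number of drive failures per day across the whole cluster."""
    return num_drives * daily_failure_probability()


if __name__ == "__main__":
    # Roughly 1/1000, as stated on the slide.
    print(f"per-drive daily failure probability: {daily_failure_probability():.5f}")
    # 4000 drives is a hypothetical cluster size.
    print(f"expected failures per day: {expected_daily_failures(4000):.2f}")
```

With thousands of drives the expected count is several failures per day, which is why failure handling has to be automatic rather than exceptional.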


Collocated Hadoop Clusters


HDFS – a distributed file system
    NameNode – namespace and block management
    DataNodes – block replica container

MapReduce – a framework for distributed computations
    JobTracker – job scheduling, resource management, lifecycle
     coordination
    TaskTracker – task execution module

                       NameNode                 JobTracker




                   TaskTracker    TaskTracker        TaskTracker

                   DataNode       DataNode            DataNode



Hadoop Distributed File System


The namespace is a hierarchy of files and directories
    Files are divided into large blocks (128 MB)

Namespace (metadata) is decoupled from data
    Fast namespace operations, not slowed down by data streaming
    Direct data streaming from the source storage

Single NameNode keeps the entire name space in RAM
DataNodes store block replicas as files on local drives
    Blocks replicated on 3 DataNodes for redundancy & availability

HDFS client – point of entry to HDFS
    Contacts NameNode for metadata
    Serves data to applications directly from DataNodes

Scalability Limits


Single-master architecture: a constraining resource
NameNode space limit
     100 million files and 200 million blocks with 64GB RAM
     Restricts storage capacity to 20 PB
     Small file problem: block-to-file ratio is shrinking

Single NameNode limits linear performance growth
     A handful of clients can saturate NameNode

MapReduce framework scalability limit: 40,000 clients
     Corresponds to a 4,000-node cluster with 10 MapReduce slots

“HDFS Scalability: The limits to growth” USENIX ;login: 2010
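
The limits quoted on this slide follow from simple multiplication. A rough sketch, assuming decimal units and the 128 MB block size mentioned earlier in the deck; the function names are illustrative. The capacity figure is an upper bound that assumes every block is full, and the 20 PB on the slide sits below that bound.

```python
MB = 10**6
PB = 10**15


def max_clients(nodes: int, slots_per_node: int) -> int:
    """Concurrent MapReduce clients if every slot on every node is busy."""
    return nodes * slots_per_node


def full_block_capacity_pb(num_blocks: int, block_size_mb: int = 128) -> float:
    """Upper bound on stored data (before replication) if all blocks are full."""
    return num_blocks * block_size_mb * MB / PB


def blocks_per_file(num_blocks: int, num_files: int) -> float:
    """Average block-to-file ratio."""
    return num_blocks / num_files


if __name__ == "__main__":
    print(max_clients(4000, 10))                      # 4,000 nodes x 10 slots
    print(full_block_capacity_pb(200_000_000))        # 200 million full blocks
    print(blocks_per_file(200_000_000, 100_000_000))  # block-to-file ratio
```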

Horizontal to Vertical Scaling


Horizontal scaling is limited by single-master architecture
Natural growth of compute power and storage density
     Clusters composed of more powerful servers

Vertical scaling leads to cluster size shrinking
Storage capacity, Compute power, and Cost
 remain constant




Shrinking Clusters



Chart: resources per node (cores, disks, RAM) vs. cluster size (number of nodes)
     2008 Yahoo!: 4000-node cluster
     2010 Facebook: 2000 nodes
     2011 eBay: 1000 nodes
     2013: cluster of 500 nodes


Scalability for Hadoop 2.0


HDFS Federation
     Independent NameNodes sharing a common pool of DataNodes
     Cluster is a family of volumes with shared block storage layer
     User sees volumes as isolated file systems
     ViewFS: the client-side mount table

YARN: new MapReduce framework
     Dynamic partitioning of cluster resources: no fixed slots
     Separation of JobTracker functions
     1. Job scheduling and resource allocation: centralized
     2. Job monitoring and job life-cycle coordination: decentralized
        o Delegate coordination of different jobs to other nodes



Namespace Partitioning

Static: Federation
     Directory sub-trees are statically assigned to
      disjoint volumes
     Relocating sub-trees without copying is
      challenging
     Scale x10: billions of files
Dynamic:
     Files, directory sub-trees can move automatically
      between nodes based on their utilization or load
      balancing requirements
     Files can be relocated without copying data blocks
     Scale x100: hundreds of billions of files
Orthogonal independent approaches.
     Federation of distributed namespaces is possible


Distributed Namespaces Today


Ceph
     Metadata stored on OSD
     MDS cache metadata: Dynamic Partitioning

Lustre
     Plans to release (2.4) distributed namespace
     Code ready

Colossus: from Google S.Quinlan and J.Dean
     100 million files per metadata server
     Hundreds of servers

VoldFS, CassandraFS, KTHFS (MySQL)
     Prototypes



HBase Overview


Table: big, sparse, loosely structured
     Collection of rows, sorted by row keys
     Rows can have arbitrary number of columns

Table is split Horizontally into Regions
     Dynamic Table partitioning!
     Region Servers serve regions to applications

Columns grouped into Column families
     Vertical partition of tables

Distributed Cache: Regions are loaded in nodes’ RAM
     Real-time access to data
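
The idea of a sorted table that splits itself into regions can be modeled in a few lines. This is a toy sketch, not HBase code: the `Table` class, the row-count split threshold, and the split-in-half policy are all assumptions made for illustration.

```python
import bisect
from typing import List


class Table:
    """Toy model of a table whose sorted row keys are split into regions."""

    def __init__(self, max_rows_per_region: int = 4) -> None:
        self.max_rows = max_rows_per_region
        self.regions: List[List[str]] = [[]]  # each region is a sorted key list

    def region_index(self, key: str) -> int:
        """Index of the region whose key range covers `key`."""
        # Region i (i >= 1) starts at its first key; region 0 takes anything smaller.
        starts = [region[0] for region in self.regions[1:]]
        return bisect.bisect_right(starts, key)

    def put(self, key: str) -> None:
        i = self.region_index(key)
        region = self.regions[i]
        pos = bisect.bisect_left(region, key)
        if pos < len(region) and region[pos] == key:
            return  # row already present
        region.insert(pos, key)
        if len(region) > self.max_rows:
            mid = len(region) // 2  # split an oversized region in half
            self.regions[i:i + 1] = [region[:mid], region[mid:]]
```

Rows stay globally sorted across regions, so a range of adjacent keys usually lands in one region or a few neighboring ones.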




HBase API


HBaseAdmin: administrative functions
     Create, delete, list tables
     Create, update, delete columns, column families
     Split, compact, flush

HTable: access table data
     Result HTable.get(Get g) // get cells of a row
     void HTable.put(Put p) // update a row
     void HTable.put(List<Put> puts) // batch update of rows
     void HTable.delete(Delete d) // delete cells/row
     ResultScanner HTable.getScanner(byte[] family) // scan col family

Coprocessors:
     Custom actions triggered by update events
     Like database triggers and stored procedures
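
The trigger analogy for coprocessors can be illustrated with a toy table that runs registered hooks after each update. This is a conceptual sketch only; `ToyTable` and `register_post_put` are hypothetical names and do not correspond to the HBase coprocessor API.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Hook = Callable[[str, Row], None]


class ToyTable:
    """In-memory table that fires hooks after each put, like a database trigger."""

    def __init__(self) -> None:
        self.rows: Dict[str, Row] = {}
        self.post_put_hooks: List[Hook] = []

    def register_post_put(self, hook: Hook) -> None:
        """Register a custom action to run after every update."""
        self.post_put_hooks.append(hook)

    def put(self, key: str, row: Row) -> None:
        """Merge the given cells into the row, then run the hooks."""
        self.rows.setdefault(key, {}).update(row)
        for hook in self.post_put_hooks:
            hook(key, row)


# Usage: keep an audit log of every updated row key.
audit_log: List[str] = []
table = ToyTable()
table.register_post_put(lambda key, row: audit_log.append(key))
table.put("/user/alice/a.txt", {"replication": "3"})
```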

HBase Architecture




Giraffa File System


HDFS + HBase = Giraffa
     Goal: build from existing building blocks
     Minimize changes to existing components

1. Store file & directory metadata in HBase table
     Dynamic table partitioning into regions
     Cached in RegionServer RAM for fast access

2. Store file data in HDFS DataNodes: data streaming
3. Block management
     Handle communication with DataNodes
     Perform block replication


Giraffa Requirements


More files & more data
Availability
     Load balancing of metadata traffic
     Same data streaming speed to / from DataNodes
     No SPOF

Cluster operability, management
     Cost of running larger clusters same as smaller ones

                      HDFS          Federated HDFS   Giraffa
Space                 25 PB         120 PB           1 EB = 1000 PB
Files + blocks        200 million   1 billion        100 billion
Concurrent Clients    40,000        100,000          1 million


FAQ: Why HDFS and HBase?


Building new FS from scratch – Really hard, Takes years
HDFS a reliable, scalable block storage
     Efficient Data Streaming
     Automatic Data Recovery

HBase a natural metadata service
     Distributed Cache …
     Dynamic Partitioning
     Automatic Metadata Recovery

Same breed, should be “compatible”
     HBase stores data in HDFS: same storage for data and metadata


FAQ: Why not store whole files in HBase tables?

Defeats the main concept of Distributed File Systems:
     Decoupling of data and metadata

Small files can be stored as rows
     Row size is limited by Region size
     Large files must be split

Technically possible to split any information into rows
        o Log files: into events
        o Video files: into frames
        o Random bits: into 1K blobs with an offset as a row key
     Different level of abstraction
     Requires data conversion
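
The "random bits into 1K blobs" case can be sketched as follows. The zero-padded decimal offset used as the row key is an assumption made so that keys sort in offset order; the slide only says that the offset serves as the row key.

```python
from typing import Dict

BLOB_SIZE = 1024  # "1K blobs" from the slide


def split_into_rows(data: bytes, blob_size: int = BLOB_SIZE) -> Dict[str, bytes]:
    """Map row keys (zero-padded byte offsets) to blobs of at most blob_size bytes."""
    return {
        f"{offset:016d}": data[offset:offset + blob_size]
        for offset in range(0, len(data), blob_size)
    }


def reassemble(rows: Dict[str, bytes]) -> bytes:
    """Rebuild the file; zero-padded keys sort lexicographically by offset."""
    return b"".join(rows[key] for key in sorted(rows))
```

The conversion works, but every reader and writer now has to know the row layout, which is the data conversion cost the slide refers to.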

FAQ: My Dataset is Only 1 PB. Do I Still Need Giraffa?



Availability
     Distributed access to namespace for many concurrent clients
     Not bottlenecked by single NameNode performance



“Small files”
     Block-to-file ratio is decreasing: 2 -> 1.5 -> 1.2
     No need to aggregate small files into large archives




Building Blocks


Single Table called “Namespace” stores
     File ID (row key) and file attributes:
       o name, replication, block-size, permissions, times
     List of blocks
     Block locations
Giraffa client: FileSystem implementation
     Obtains metadata from HBase
     Data exchange with DataNodes
Block manager: maintain flat namespace of blocks
     Block allocation, replication, removal
     DataNode management
     Storage for the HBase table

Giraffa Architecture


Diagram components
     App with NamespaceAgent
     HBase: Namespace table (path, attrs, block[], DN[][], BM-node), Block Management Agent
     Block Management Layer: BM servers
     DataNodes (DN)

1. Giraffa client gets files and blocks from HBase
2. May directly query Block Manager
3. Stream data to or from DataNodes




Namespace Table

Row keys
     Identify files and directories as rows in the table
     Different key definitions based on locality requirement
     Key definition is chosen during formatting of the file system
     Full-path-key is the default
Columns
     File attributes:
        o Local name, owner, group, permissions, access-time,
          modification-time, block-size, replication, isDir, length
     List of blocks of a file
        o Persisted in the table
     List of block locations for each block
        o Not persisted, but discovered from block reports
     Directory table maps dir-entry name to corresponding row key
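
A row of the Namespace table can be pictured as a small record. The sketch below is a toy model rather than Giraffa's schema code: the class name, the field defaults, and the `apply_block_report` helper are assumptions. It shows the distinction the slide draws between the persisted block list and the block locations that are rebuilt from block reports.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NamespaceRow:
    """One file or directory row (toy model)."""
    key: str                 # row key; the full path under the default key definition
    local_name: str
    is_dir: bool
    owner: str = ""
    group: str = ""
    permissions: int = 0o644              # assumed default
    replication: int = 3
    block_size: int = 128 * 1024 * 1024
    length: int = 0
    blocks: List[str] = field(default_factory=list)                # persisted
    locations: Dict[str, List[str]] = field(default_factory=dict)  # not persisted

    def persisted_columns(self) -> dict:
        """Columns that would be written to the table; locations are left out."""
        columns = dict(vars(self))
        del columns["locations"]
        return columns


def apply_block_report(row: NamespaceRow, datanode: str, reported: List[str]) -> None:
    """Record that `datanode` holds replicas of the reported blocks of this file."""
    for block_id in reported:
        if block_id in row.blocks:
            holders = row.locations.setdefault(block_id, [])
            if datanode not in holders:
                holders.append(datanode)
```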


Giraffa Client


GiraffaFileSystem implements FileSystem
     fs.defaultFS = grfa:///
     fs.grfa.impl = o.a.giraffa.GiraffaFileSystem

GiraffaClient extends DFSClient
     NamespaceAgent replaces NameNode RPC

Diagram: GiraffaFileSystem, GiraffaClient, DFSClient, and NamespaceAgent, with connections to NameNode and to DataNodes

Block Management


Block Manager
     Block allocation, deletion, replication

DataNode Manager
     Process DataNode block reports, heartbeats. Identify lost nodes

Provide storage for HBase table
     Small file system to store HFiles

BMServer paired on the same node with RegionServer
     Distributed cluster of BMServers
     Mostly local communication between Region and BM servers

NameNode is an initial implementation of BMServer
     Giraffa block is a single block file with the same name as block id


Three Problems



Bootstrapping
     HBase stores tables as files in HDFS


Namespace Partitioning
     Retain locality



Atomic Renames



Bootstrapping



Block Manager Server

HBase Volume: table layout, rare updates
     / with hbase/ and giraffa/ directories holding .log, region1, region2

Block Volume: flat namespace of blocks
     blk_123_001    dn-1      dn-2      dn-3
     blk_234_002    dn-11     dn-12     dn-13
     blk_345_003    dn-101    dn-102    dn-103




Locality of Reference



Row keys
     Define sorting of files and directories in the table
     Tree structured namespace is flattened into linear array
Ordered list of files is self-partitioned into regions
Retain locality in linearized structure
Files in the same directory - adjacent in the table
     Belong to the same region with some exclusions
Files of the same directory should be on the same node
     Avoid jumping across regions for simple “ls”




Partitioning Example 1


Straightforward partitioning based on random hashing
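
Hash-based placement can be sketched as below. The choice of MD5, the modulo mapping, and the function names are assumptions for illustration. The point is that a directory listing touches many regions, because sibling files hash to unrelated places.

```python
import hashlib
from typing import Iterable, Set


def region_for(path: str, num_regions: int) -> int:
    """Assign a path to a region by hashing the full path."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_regions


def regions_touched_by_ls(directory: str, paths: Iterable[str], num_regions: int) -> Set[int]:
    """Regions that must be contacted to list the direct children of `directory`."""
    children = [p for p in paths if p.rsplit("/", 1)[0] == directory]
    return {region_for(p, num_regions) for p in children}
```

Hashing balances load well, but it gives up the locality that the previous slide asks for.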


Diagram: namespace tree with nodes 1; 2, 3, 4; 15, 16 and files T1, T2, T3, T4, mapped to id1, id2, id3




Partitioning Example 2


Partitioning based on lexicographic full-path ordering
     The default
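
With full-path keys in lexicographic order, a directory's subtree occupies one contiguous key range, so "ls" becomes a range scan. A minimal sketch, assuming plain string keys and using `"\uffff"` as an illustrative upper bound for the scan; the example paths are made up.

```python
import bisect
from typing import List


def ls(sorted_keys: List[str], directory: str) -> List[str]:
    """List direct children of `directory` with one range scan over sorted keys."""
    prefix = directory.rstrip("/") + "/"
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")  # assumed upper bound
    subtree = sorted_keys[lo:hi]
    # Direct children have no further "/" after the directory prefix.
    return [key for key in subtree if "/" not in key[len(prefix):]]


keys = sorted(["/a", "/a/b", "/a/b/x", "/a/b/y", "/a/c", "/d"])
children = ls(keys, "/a")
```

Here the siblings `/a/b` and `/a/c` are separated in the key order by the contents of `/a/b`, so a large subdirectory can push files of the same directory into different regions.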

Diagram: namespace tree with nodes 1; 2, 3, 4; 15, 16 and files T1, T2, T3, T4, flattened into a row of full-path keys




Partitioning Example 3


Partitioning based on fixed depth neighborhoods
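
The slide does not define the key format for fixed depth neighborhoods. The sketch below shows one possible reading, a depth-1 neighborhood in which the key is the pair (parent directory, local name); this key definition is an assumption made for illustration, not Giraffa's documented scheme.

```python
from typing import List, Tuple


def neighborhood_key(path: str) -> Tuple[str, str]:
    """Key a path by (parent directory, local name); assumed depth-1 neighborhood."""
    parent, _, name = path.rpartition("/")
    return (parent or "/", name)


def linearize(paths: List[str]) -> List[str]:
    """Order paths so that entries of the same directory are contiguous."""
    return sorted(paths, key=neighborhood_key)


order = linearize(["/a/b/x", "/a/c", "/d", "/a", "/a/b/y", "/a/b"])
```

Under this ordering the children of one directory sit next to each other, and the contents of a subdirectory no longer fall between its siblings.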


Diagram: namespace tree with nodes 1; 2, 3, 4; 15, 16 and files T1, T2, T3, T4, with key columns (1), (1, 2), (1, 3), (1, 4), (2, 15), (2, 16)




Atomic Rename


Giraffa will implement atomic in-place rename
     No support for atomic file move from one directory to another

A move can then be implemented on application level
     Non-atomically move the target file from the source directory to a
      temporary file in the target directory
     Atomically rename the temporary file to its original name
     On failure use simple 3-step recovery procedure
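
The application-level move described above can be sketched with a toy in-memory namespace. `ToyNamespace`, the lock standing in for atomicity, and the temporary file name are all hypothetical. The recovery procedure is not specified on the slide and is not modeled here.

```python
import threading
from typing import Dict


class ToyNamespace:
    """In-memory namespace keyed by full path (illustrative only)."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}
        self._lock = threading.Lock()

    def rename_in_place(self, directory: str, old_name: str, new_name: str) -> None:
        """Atomic rename within one directory; the lock stands in for atomicity."""
        src, dst = f"{directory}/{old_name}", f"{directory}/{new_name}"
        with self._lock:
            if src not in self.rows:
                raise FileNotFoundError(src)
            if dst in self.rows:
                raise FileExistsError(dst)
            self.rows[dst] = self.rows.pop(src)

    def move(self, src_dir: str, name: str, dst_dir: str) -> None:
        """Move a file across directories in two steps."""
        src = f"{src_dir}/{name}"
        tmp_name = f".{name}.moving"  # hypothetical temporary name
        # Step 1 (not atomic): create a temporary row in the target
        # directory, then remove the source row.
        self.rows[f"{dst_dir}/{tmp_name}"] = dict(self.rows[src])
        del self.rows[src]
        # Step 2 (atomic): rename the temporary file to its original name.
        self.rename_in_place(dst_dir, tmp_name, name)
```

A crash between the two steps leaves the temporary file behind in the target directory, which is the state a recovery procedure would need to detect.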

Eventually implement atomic moves
     Paxos
     Simplified synchronization algorithms



History


(2008) Idea. Study of distributed systems
     AFS, Lustre, Ceph, PVFS, GPFS, Farsite, …
     Partitioning of the namespace: 4 types of partitioning

(2009) Study on scalability limits
     NameNode optimization

(2010) Design with Michael Stack
     Presentation at HDFS contributors meeting

(2011) Plamen implements POC
(2012) Rewrite open sourced as Apache Extras project
     http://code.google.com/a/apache-extras.org/p/giraffa/

Status


 Design stage
 One node cluster running
     Live demo with Plamen




Thank You!




Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Dynamic Namespace Partitioning with Giraffa File System

  • 1. Dynamic Namespace Partitioning with The Giraffa File System Konstantin V. Shvachko Plamen Jeliazkov Founder, Altoscale UC San Diego June 14, 2012 Hadoop Summit 2012 AltoScale
  • 2. AltoScale Introduction Plamen Fresh grad from UCSD Internship with Hadoop Platform Team at eBay Wrote Giraffa prototype Konstantin Founder of Altoscale. Primary focus 1. Altoscale Workbench Hadoop & HBase cluster on a public or a private cloud 2. Giraffa Apache Hadoop PMC HDFS scalability 2
  • 3. AltoScale Contents Background Motivation Architecture Main problems and solutions Bootstrapping Namespace Partitioning Rename 3
  • 4. AltoScale Giraffa Giraffa is a distributed, highly available file system Utilizes features of HDFS and HBase New open source project in experimental stage 4
  • 5. AltoScale Origin: Giraffe. Latin: Giraffa camelopardalis Family Giraffidae Genus Giraffa Species Giraffa camelopardalis Other languages Arabic Zarafa Spanish Jirafa Bulgarian жирафа Italian Giraffa Favorites of my daughter o As the Hadoop traditions require 5
  • 6. AltoScale Apache Hadoop A reliable, scalable, high performance distributed computing system The Hadoop Distributed File System (HDFS) Reliable storage layer MapReduce – distributed computation framework Simple computational model Hadoop scales computation capacity, storage capacity, and I/O bandwidth by adding commodity servers. 6
  • 7. AltoScale The Design Principles Linear scalability More nodes can do more work within the same time On Data size and Compute resources Reliability and Availability 1 drive fails in 3 years. Probability of failing today 1/1000. Several drives fail on a cluster with thousands of drives Move computation to data Minimize expensive data transfers Sequential data processing Avoid random reads 7
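The failure figures on slide 7 can be checked with a short back-of-the-envelope sketch. It assumes that "1 drive fails in 3 years" translates into a constant per-day failure probability; the 3,000-drive cluster size is a hypothetical value chosen only for illustration.

```python
# Back-of-the-envelope check of the drive-failure figures on slide 7.
# Assumption: "1 drive fails in 3 years" is modelled as a constant
# per-day failure probability of 1 / (3 * 365).
DAYS_PER_YEAR = 365


def daily_failure_probability(lifetime_years: float) -> float:
    """Probability that a single drive fails on a given day."""
    return 1.0 / (lifetime_years * DAYS_PER_YEAR)


def expected_daily_failures(num_drives: int, lifetime_years: float) -> float:
    """Expected number of drives failing per day across the cluster."""
    return num_drives * daily_failure_probability(lifetime_years)


p = daily_failure_probability(3)             # roughly 1/1000
failures = expected_daily_failures(3000, 3)  # hypothetical 3,000-drive cluster
print(f"p(fail today) = {p:.5f}, expected failures/day = {failures:.2f}")
```

Under these assumptions a cluster with a few thousand drives should expect several failures every day, which is why reliability has to be designed in rather than treated as an exception.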
  • 8. AltoScale Collocated Hadoop Clusters HDFS – a distributed file system NameNode – namespace and block management DataNodes – block replica container MapReduce – a framework for distributed computations JobTracker – job scheduling, resource management, lifecycle coordination TaskTracker – task execution module NameNode JobTracker TaskTracker TaskTracker TaskTracker DataNode DataNode DataNode 8
  • 9. AltoScale Hadoop Distributed File System The namespace is a hierarchy of files and directories Files are divided into large blocks (128 MB) Namespace (metadata) is decoupled from data Fast namespace operations, not slowed down by data streaming Direct data streaming from the source storage Single NameNode keeps the entire namespace in RAM DataNodes store block replicas as files on local drives Blocks replicated on 3 DataNodes for redundancy & availability HDFS client – point of entry to HDFS Contacts NameNode for metadata Serves data to applications directly from DataNodes 9
  • 10. AltoScale Scalability Limits Single-master architecture: a constraining resource NameNode space limit 100 million files and 200 million blocks with 64GB RAM Restricts storage capacity to 20 PB Small file problem: block-to-file ratio is shrinking Single NameNode limits linear performance growth A handful of clients can saturate NameNode MapReduce framework scalability limit: 40,000 clients Corresponds to a 4,000-node cluster with 10 MapReduce slots “HDFS Scalability: The limits to growth” USENIX ;login: 2010 10
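The limits on slide 10 follow from simple arithmetic, sketched below. The decimal units (1 GB = 10**9 bytes, 1 PB = 10**15 bytes) are an assumption, and the 128 MB block size is taken from slide 9. The capacity figure is an upper bound that treats every block as full, so it lands somewhat above the 20 PB quoted on the slide.

```python
# Rough arithmetic behind the scalability limits on slide 10.
# Assumptions (not from the slide): decimal units and full 128 MB blocks.
FILES = 100_000_000
BLOCKS = 200_000_000
NAMENODE_RAM_BYTES = 64 * 10**9
BLOCK_SIZE_BYTES = 128 * 10**6

# Average NameNode memory available per namespace object (file or block).
bytes_per_object = NAMENODE_RAM_BYTES / (FILES + BLOCKS)

# Upper bound on stored data if every block were completely full.
max_capacity_pb = BLOCKS * BLOCK_SIZE_BYTES / 10**15

# Client limit: a 4,000-node cluster with 10 MapReduce slots per node.
max_clients = 4_000 * 10

print(round(bytes_per_object), max_capacity_pb, max_clients)
```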
  • 11. AltoScale Horizontal to Vertical Scaling Horizontal scaling is limited by single-master architecture Natural growth of compute power and storage density Clusters composed of more powerful servers Vertical scaling leads to cluster size shrinking Storage capacity, Compute power, and Cost remain constant 11
  • 12. AltoScale Shrinking Clusters
      [Chart; axis labels: "Cluster Size: Number of Nodes" and "Resources per node: Cores, Disks, RAM"]
      2008 Yahoo!: 4000-node cluster
      2010 Facebook: 2000 nodes
      2011 eBay: 1000 nodes
      2013: cluster of 500 nodes
  • 13. AltoScale Scalability for Hadoop 2.0 HDFS Federation Independent NameNodes sharing a common pool of DataNodes Cluster is a family of volumes with shared block storage layer User sees volumes as isolated file systems ViewFS: the client-side mount table YARN: New MapReduce framework Dynamic partitioning of cluster resources: no fixed slots Separation of JobTracker functions 1. Job scheduling and resource allocation: centralized 2. Job monitoring and job life-cycle coordination: decentralized o Delegate coordination of different jobs to other nodes 13
  • 14. AltoScale Namespace Partitioning Static: Federation Directory sub-trees are statically assigned to disjoint volumes Relocating sub-trees without copying is challenging Scale x10: billions of files Dynamic: Files and directory sub-trees can move automatically between nodes based on their utilization or load balancing requirements Files can be relocated without copying data blocks Scale x100: 100s of billions of files Orthogonal independent approaches. Federation of distributed namespaces is possible 14
  • 15. AltoScale Distributed Namespaces Today Ceph Metadata stored on OSD MDS cache metadata: Dynamic Partitioning Lustre Plans to release (2.4) distributed namespace Code ready Colossus: from Google S.Quinlan and J.Dean 100 million files per metadata server Hundreds of servers VoldFS, CassandraFS, KTHFS (MySQL) Prototypes 15
  • 16. AltoScale HBase Overview Table: big, sparse, loosely structured Collection of rows, sorted by row keys Rows can have arbitrary number of columns Table is split Horizontally into Regions Dynamic Table partitioning! Region Servers serve regions to applications Columns grouped into Column families Vertical partition of tables Distributed Cache: Regions are loaded in nodes’ RAM Real-time access to data 16
  • 17. AltoScale HBase API
      HBaseAdmin: administrative functions
        Create, delete, list tables
        Create, update, delete columns, column families
        Split, compact, flush
      HTable: access table data
        Result HTable.get(Get g)                  // get cells of a row
        void HTable.put(Put p)                    // update a row
        void HTable.put(List<Put> puts)           // batch update of rows
        void HTable.delete(Delete d)              // delete cells/row
        ResultScanner HTable.getScanner(family)   // scan col family
      Coprocessors: custom actions triggered by update events
        Like database triggers and stored procedures
  • 18. AltoScale HBase Architecture 18
  • 19. AltoScale Giraffa File System HDFS + HBase = Giraffa Goal: build from existing building blocks Minimize changes to existing components 1. Store file & directory metadata in HBase table Dynamic table partitioning into regions Cached in RegionServer RAM for fast access 2. Store file data in HDFS DataNodes: data streaming 3. Block management Handle communication with DataNodes Perform block replication 19
  • 20. AltoScale Giraffa Requirements More files & more data Availability Load balancing of metadata traffic Same data streaming speed to / from DataNodes No SPOF Cluster operability, management Cost of running larger clusters same as smaller ones
                          HDFS          Federated HDFS   Giraffa
      Space               25 PB         120 PB           1 EB = 1000 PB
      Files + blocks      200 million   1 billion        100 billion
      Concurrent clients  40,000        100,000          1 million
  • 21. AltoScale FAQ: Why HDFS and HBase? Building new FS from scratch – Really hard, Takes years HDFS a reliable, scalable block storage Efficient Data Streaming Automatic Data Recovery HBase a natural metadata service Distributed Cache … Dynamic Partitioning Automatic Metadata Recovery Same breed, should be “compatible” HBase stores data in HDFS: same storage for data and metadata 21
  • 22. AltoScale FAQ: Why not store whole files in HBase tables? Defeats the main concept of Distributed File Systems: Decoupling of data and metadata Small files can be stored as rows Row size is limited by Region size Large files must be split Technically possible to split any information into rows o Log files: into events o Video files: into frames o Random bits: into 1K blobs with an offset as a row key Different level of abstraction Requires data conversion 22
  • 23. AltoScale FAQ: My Dataset is Only 1 PB Do I Still Need Giraffa? Availability Distributed access to namespace for many concurrent clients Not bottlenecked by single NameNode performance “Small files” Block-to-file ratio is decreasing: 2 -> 1.5 -> 1.2 No need to aggregate small files into large archives 23
  • 24. AltoScale Building Blocks Single Table called “Namespace” stores File ID (row key) and file attributes: o name, replication, block-size, permissions, times List of blocks Block locations Giraffa client: FileSystem implementation Obtains metadata from HBase Data exchange with DataNodes Block manager: maintain flat namespace of blocks Block allocation, replication, removal DataNode management Storage for the HBase table 24
  • 25. AltoScale Giraffa Architecture
      1. Giraffa client gets files and blocks from HBase
      2. May directly query Block Manager
      3. Stream data to or from DataNodes
      [Diagram labels: App (NamespaceAgent, Block Management Agent); HBase Namespace table (path, attrs, block[], DN[][], BM-node); Block Management Layer (BM servers); DataNodes (DN)]
  • 26. AltoScale Namespace Table Row keys Identify files and directories as rows in the table Different key definitions based on locality requirement Key definition is chosen during formatting of the file system Full-path-key is the default Columns File attributes: o Local name, owner, group, permissions, access-time, modification-time, block-size, replication, isDir, length List of blocks of a file o Persisted in the table List of block locations for each block o Not persisted, but discovered from block reports Directory table maps dir-entry name to corresponding row key 26
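The row layout described on slide 26 can be pictured with a small in-memory model. This is only an illustrative sketch: the `FileRow` class, the `full_path_key` helper, and the plain dict standing in for the HBase table are hypothetical names, not part of Giraffa. Block locations are left out of the row because the slide says they are discovered from block reports rather than persisted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileRow:
    """One row of the namespace table; attribute names follow slide 26."""
    name: str
    owner: str
    group: str
    permissions: int
    is_dir: bool
    length: int = 0
    block_size: int = 128 * 1024 * 1024
    replication: int = 3
    access_time: float = 0.0
    modification_time: float = 0.0
    blocks: List[str] = field(default_factory=list)  # persisted in the table


def full_path_key(path: str) -> str:
    """Hypothetical full-path row key: the normalized absolute path."""
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts)


# A plain dict stands in for the HBase table: row key -> row.
namespace: Dict[str, FileRow] = {}
namespace[full_path_key("/user//plamen/")] = FileRow(
    "plamen", "plamen", "users", 0o755, True)
namespace[full_path_key("/user/plamen/data.txt")] = FileRow(
    "data.txt", "plamen", "users", 0o644, False,
    length=42, blocks=["blk_123_001"])

print(sorted(namespace))
```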
  • 27. AltoScale Giraffa Client GiraffaFileSystem implements FileSystem fs.defaultFS = grfa:/// fs.grfa.impl = o.a.giraffa.GiraffaFileSystem GiraffaClient extends DFSClient NamespaceAgent replaces NameNode RPC [Diagram labels: GiraffaFileSystem, GiraffaClient, NamespaceAgent, DFSClient, to Namespace, to NameNode, to DataNodes] 27
  • 28. AltoScale Block Management Block Manager Block allocation, deletion, replication DataNode Manager Process DataNode block reports, heartbeats. Identify lost nodes Provide storage for HBase table Small file system to store HFiles BMServer paired on the same node with RegionServer Distributed cluster of BMServers Mostly local communication between Region and BM servers NameNode is an initial implementation of BMServer Giraffa block is a single block file with the same name as block id 28
  • 29. AltoScale Three Problems Bootstrapping HBase stores tables as files in HDFS Namespace Partitioning Retain locality Atomic Renames 29
  • 30. AltoScale Bootstrapping
      Block Manager Server
      HBase Volume: table layout, rare updates
        [Diagram labels: /, hbase/, giraffa/, .log, region1, region2]
      Block Volume: flat namespace of blocks
        blk_123_001   dn-1    dn-2    dn-3
        blk_234_002   dn-11   dn-12   dn-13
        blk_345_003   dn-101  dn-102  dn-103
  • 31. AltoScale Locality of Reference Row keys Define sorting of files and directories in the table Tree structured namespace is flattened into linear array Ordered list of files is self-partitioned into regions Retain locality in linearized structure Files in the same directory - adjacent in the table Belong to the same region with some exclusions Files of the same directory should be on the same node Avoid jumping cross regions for simple “ls” 31
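The flattening described on slide 31 can be illustrated with a toy namespace. The sketch assumes full-path row keys sorted lexicographically, which the deck names as the default; the helper names and sample paths are hypothetical. It also shows one of the "exclusions": the contents of a subdirectory sort between two siblings, so the direct children of a directory are close together but not always strictly adjacent.

```python
from typing import List


def linearize(paths: List[str]) -> List[str]:
    """Flatten a namespace tree into the sorted row-key order of the table."""
    return sorted(paths)


def children(keys: List[str], directory: str) -> List[str]:
    """Direct children of a directory, in table order."""
    prefix = directory.rstrip("/") + "/"
    return [k for k in keys
            if k.startswith(prefix) and "/" not in k[len(prefix):]]


keys = linearize(["/a/d", "/b", "/a", "/a/b/c", "/a/b"])
print(keys)

# Children of /a are /a/b and /a/d, but the row for /a/b/c sits between them.
kids = children(keys, "/a")
positions = [keys.index(k) for k in kids]
print(kids, positions)
```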
  • 32. AltoScale Partitioning Example 1 Straightforward partitioning based on random hashing [Diagram: namespace tree with numbered nodes assigned to table regions T1-T4 by id] 32
  • 33. AltoScale Partitioning Example 2 Partitioning based on lexicographic full-path ordering The default [Diagram: namespace tree with numbered nodes assigned to table regions T1-T4] 33
  • 34. AltoScale Partitioning Example 3 Partitioning based on fixed depth neighborhoods [Diagram: namespace tree with numbered nodes assigned to table regions T1-T4] 34
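The difference between Examples 1 and 2 can be made concrete by counting how many regions a directory listing has to contact under each scheme. This is a simulation sketch, not Giraffa code: the namespace, the use of MD5 as the hash, and the equal-sized regions are all assumptions made for illustration, and Example 3 (fixed depth neighborhoods) is not modelled.

```python
import hashlib
from typing import Callable, List, Set

NUM_REGIONS = 4

# Hypothetical namespace: 4 directories with 15 files each (64 rows in total).
paths: List[str] = []
for d in range(4):
    paths.append(f"/d{d}")
    paths.extend(f"/d{d}/f{i:02d}" for i in range(15))


def hash_region(key: str) -> int:
    """Example 1: region chosen by a hash of the row key (MD5 as a stand-in)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_REGIONS


sorted_keys = sorted(paths)
rows_per_region = len(sorted_keys) // NUM_REGIONS


def path_region(key: str) -> int:
    """Example 2: region chosen by position in lexicographic full-path order."""
    return sorted_keys.index(key) // rows_per_region


def regions_for_ls(directory: str, region_of: Callable[[str], int]) -> Set[int]:
    """Regions that a listing of the directory has to contact."""
    prefix = directory + "/"
    return {region_of(k) for k in paths if k.startswith(prefix)}


hashed = regions_for_ls("/d2", hash_region)
ordered = regions_for_ls("/d2", path_region)
print(f"hashing touches {len(hashed)} regions, "
      f"path ordering touches {len(ordered)}")
```

In this toy layout the path-ordered listing stays inside a single region, while hashing scatters the same files across regions, which is the "jumping across regions" cost that slide 31 wants to avoid for a simple "ls".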
  • 35. AltoScale Atomic Rename Giraffa will implement atomic in-place rename No support for atomic file move from one directory to another A move can then be implemented on application level Non-atomically move the target file from the source directory to a temporary file in the target directory Atomically rename the temporary file to its original name On failure use simple 3-step recovery procedure Eventually implement atomic moves PAXOS Simplified synchronization algorithms 35
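The application-level move described on slide 35 can be sketched against a toy namespace. Everything here is hypothetical: the `Namespace` class, the dict of rows, and the temporary-name convention are illustrative only, and `rename_in_place` merely stands in for the atomic in-place rename that Giraffa plans to provide. The 3-step recovery procedure is not modelled because the slide does not spell it out.

```python
from typing import Dict


class Namespace:
    """Toy namespace: row key (full path) -> file payload."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}

    def rename_in_place(self, directory: str, old: str, new: str) -> None:
        """Rename within one directory; stands in for the atomic primitive."""
        src, dst = f"{directory}/{old}", f"{directory}/{new}"
        if src not in self.rows:
            raise FileNotFoundError(src)
        if dst in self.rows:
            raise FileExistsError(dst)
        self.rows[dst] = self.rows.pop(src)

    def move(self, src_dir: str, name: str, dst_dir: str) -> None:
        """Move a file between directories in two steps, as on slide 35."""
        tmp = f".{name}.moving"  # hypothetical temporary-name convention
        # Step 1 (non-atomic): relocate the file to a temporary name
        # in the target directory.
        self.rows[f"{dst_dir}/{tmp}"] = self.rows[f"{src_dir}/{name}"]
        del self.rows[f"{src_dir}/{name}"]
        # Step 2 (atomic): rename the temporary file to its original name.
        self.rename_in_place(dst_dir, tmp, name)


ns = Namespace()
ns.rows["/src/report.txt"] = "data"
ns.move("/src", "report.txt", "/dst")
print(ns.rows)
```

Keeping the only atomic step inside a single directory means the two rows involved sit next to each other in the table, which is presumably why an in-place rename is the easier primitive to provide first.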
  • 36. AltoScale History (2008) Idea. Study of distributed systems AFS, Lustre, Ceph, PVFS, GPFS, Farsite, … Partitioning of the namespace: 4 types of partitioning (2009) Study on scalability limits NameNode optimization (2010) Design with Michael Stack Presentation at HDFS contributors meeting (2011) Plamen implements POC (2012) Rewrite open sourced as Apache Extras project http://code.google.com/a/apache-extras.org/p/giraffa/ 36
  • 37. AltoScale Status  Design stage  One node cluster running Live demo with Plamen 37
  • 38. AltoScale Thank You! 38