Design for a Distributed Name Node
A proposed design for a distributed HDFS NameNode.

Presentation Transcript

  • Reaching 10,000. Aaron Cordova, Booz Allen Hamilton | Hadoop Meetup DC | Sep 7 2010 | cordova_aaron@bah.com
  • Lots of Applications Require Scalability: Machine Learning, Text, Defense Intelligence, Graph Analytics, Bio-Metrics, Video, Bio-Informatics, Network Security, Images, Structured Data
  • Hadoop Scales
  • Linear Scalability [chart: cost vs. data size, comparing Shared Nothing and Shared Disk]
  • Massive Parallelism
  • MapReduce: Simplified Distributed Programming Model; Fault Tolerant; Designed to Scale to Thousands of Servers; Many Algorithms Easily Expressed as Map and Reduce
  • HDFS: Distributed File System; Optimized for High-Throughput; Fault Tolerant Through Replication, Checksumming; Designed to Scale to 10,000 servers
  • Hadoop is a Platform
  • Pig, MapReduce, HBase, Cascading, Flume, HDFS, Nutch, Mahout, Hive
  • HBase: Scalable Structured Store; Fast Lookups; Durable, Consistent Writes; Automatic Partitioning
  • Mahout: Scalable Machine Learning Algorithms; Clustering; Classification
  • Fuzzy Table: Low-Latency Parallel Search; Generalized Fuzzy Matching; Images, Biometrics, Audio
  • One Major Problem
  • HDFS Single NameNode: Single NameSpace (easy to serialize operations); NameSpace stored entirely in memory; Changes written to transaction log first; Single Point of Failure; Performance Bottleneck?
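The "log first, then memory" ordering on this slide can be illustrated with a minimal sketch. It is not the HDFS edit log format: `LoggedNamespace`, its record tuples, and the two operations shown are hypothetical names chosen for illustration, and a Python list stands in for the on-disk transaction log.

```python
class LoggedNamespace:
    """Toy namespace that records each change before applying it in memory."""

    VALID_OPS = ("create", "delete")

    def __init__(self):
        self.log = []    # stand-in for the on-disk transaction log (assumption)
        self.files = {}  # in-memory namespace: path -> list of block ids

    def apply(self, op, path, blocks=None):
        if op not in self.VALID_OPS:
            raise ValueError(f"unknown operation: {op}")
        record = (op, path, tuple(blocks or ()))
        self.log.append(record)   # 1. record the change first
        self._mutate(record)      # 2. then update the in-memory state

    def _mutate(self, record):
        op, path, blocks = record
        if op == "create":
            self.files[path] = list(blocks)
        else:  # "delete"
            self.files.pop(path, None)

    @classmethod
    def recover(cls, log):
        """Rebuild the in-memory namespace by replaying a log from the start."""
        ns = cls()
        for record in log:
            ns.log.append(record)
            ns._mutate(record)
        return ns
```

Because every change reaches the log before memory, replaying the log after a restart reproduces the same namespace, which is also why a single server can serialize operations so easily.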
  • NameNode Scalability: “100,000 HDFS clients on a 10,000-node HDFS cluster will exceed the throughput capacity of a single name-node. ... any solution intended for single namespace server optimization lacks scalability. ... the most promising solutions seem to be based on distributing the namespace server ...” (Konstantin Shvachko, ;login:, Apr 2010)
  • Goal [chart: writes/second (thousands), axis 0 to 50; bars: Single NN, Target]
  • HDFS Single NameNode: Server grade machine; Lots of memory; Reliable components; RAID; Hot-Failover
  • Needs Parallelism
  • Scaling NameNode: Grow memory; Read-only Replicas of NameNode; Multiple static namespace partitions; Distributed name server, partition namespace dynamically
  • Distributed NameNode Features: Fast Lookups; Durable, Consistent writes; Automatic Partitioning
  • Can we use HBase?
  • Mappings as HBase Tables: NameSpace (filename : blocks); DataNodes (node : blocks); Blocks (block : nodes)
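As a rough illustration of the three mappings, the sketch below keeps them as plain in-memory dictionaries rather than HBase tables. The class `MappingTables` and its method names are hypothetical and not taken from the presentation's code; the point is only that one block report touches all three mappings, and that the node-to-blocks mapping makes it cheap to find what a failed DataNode held.

```python
from collections import defaultdict


class MappingTables:
    """In-memory stand-in for the NameSpace, DataNodes and Blocks mappings."""

    def __init__(self):
        self.namespace = defaultdict(list)  # filename -> ordered block ids
        self.datanodes = defaultdict(set)   # node -> block ids stored on it
        self.blocks = defaultdict(set)      # block id -> nodes holding a replica

    def add_block(self, filename, block_id, nodes):
        """Append a block to a file and record where its replicas live."""
        self.namespace[filename].append(block_id)
        for node in nodes:
            self.datanodes[node].add(block_id)
            self.blocks[block_id].add(node)

    def locate(self, filename):
        """Return (block id, sorted replica nodes) pairs in file order."""
        return [(b, sorted(self.blocks[b])) for b in self.namespace.get(filename, [])]

    def node_lost(self, node):
        """Forget a node and return the blocks that lost a replica on it."""
        lost = self.datanodes.pop(node, set())
        for block_id in lost:
            self.blocks[block_id].discard(node)
        return sorted(lost)
```

In a real table store each dictionary would be a separate table keyed by filename, node, or block id, so each lookup is a single-row read.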
  • How to order namespace?
  • Depth First Search Order: /, /dir1, /dir1/subdir, /dir1/subdir/file, /dir2/file1, /dir2/file2
  • Depth First Operations: Delete (Recursive); Move / Rename
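One way to see why depth-first ordering suits recursive delete and rename: if full paths are the row keys and keys are kept sorted, everything under a directory forms one contiguous range. The sketch below assumes a sorted Python list in place of a sorted table; `subtree_range`, `delete_recursive`, and `rename` are hypothetical helper names.

```python
import bisect


def subtree_range(sorted_keys, directory):
    """Return [start, end) indexes of all keys strictly below `directory`."""
    prefix = directory.rstrip("/") + "/"
    start = bisect.bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return start, end


def delete_recursive(sorted_keys, directory):
    """Drop the directory's own key and the contiguous range below it."""
    start, end = subtree_range(sorted_keys, directory)
    own = directory.rstrip("/") or "/"
    return [k for k in sorted_keys[:start] + sorted_keys[end:] if k != own]


def rename(sorted_keys, src, dst):
    """Rewrite the key prefix of `src` and its subtree to `dst`."""
    src, dst = src.rstrip("/"), dst.rstrip("/")
    start, end = subtree_range(sorted_keys, src)
    moved = [dst + k[len(src):] for k in sorted_keys[start:end]]
    rest = sorted_keys[:start] + sorted_keys[end:]
    if src in rest:
        rest = [k for k in rest if k != src]
        moved.append(dst)
    return sorted(rest + moved)
```

With the keys from the slide, `delete_recursive(keys, "/dir1")` leaves only `/`, `/dir2/file1`, and `/dir2/file2`. Both operations touch a single key range, which in a range-partitioned store usually means few partitions.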
  • Breadth First Search Order: 0/, 1/dir1, 2/dir2/file1, 2/dir2/file2, 2/dir1/subdir, 3/dir1/subdir/file
  • Breadth First Operations: List
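The breadth-first keys can be sketched the same way: prefixing each path with its depth puts all direct children of a directory next to each other, so listing becomes a prefix scan that never touches grandchildren. This is a minimal sketch assuming a sorted Python list as the table; `depth`, `bfs_key`, and `list_dir` are hypothetical names, and the unpadded depth prefix follows the slide (it would need zero padding to keep whole levels in numeric order past depth 9).

```python
import bisect


def depth(path):
    """Depth of an absolute path: 0 for '/', 1 for '/dir1', and so on."""
    return 0 if path == "/" else path.count("/")


def bfs_key(path):
    """Row key in breadth-first order: depth prefix followed by the full path."""
    return f"{depth(path)}{path}"


def list_dir(sorted_keys, directory):
    """Names of the direct children of `directory`, via one prefix scan."""
    d = directory.rstrip("/")
    prefix = f"{depth(d or '/') + 1}{d}/"
    start = bisect.bisect_left(sorted_keys, prefix)
    children = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        children.append(key[len(prefix):])
    return children
```

For example, with keys built from the slide's paths, `list_dir(keys, "/dir2")` scans only the `2/dir2/` prefix and returns `file1` and `file2`.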
  • Current Architecture [diagram: NameNode, DataNodes, DFSClients]
  • Proposed Architecture [diagram: RServers, DNNProxies, DataNodes, DFSClients]
  • 100k clients -> 41k writes/s
  • Anticipated Performance [chart: writes/second (thousands), 0 to 50, vs. # machines hosting namespace, 100 to 250; series: Single NN, Distributed NN, Target]
  • Issues: Synchronization (multiple writers, changes); Name distribution hotspots
  • Current Status: Working code exists that uses HBase with a slightly modified DFSClient and DataNode for create, write, close, open, read, mkdirs, delete. New component: HealthServer monitors DataNodes and does garbage collection. It is more like the BigTable master: it can die and restart without affecting clients.
  • Code: Will be at http://code.google.com/p/hdfs-dnn (available under the Apache license, whichever is compatible with Hadoop)
  • Doesn’t HBase run on HDFS?
  • Self-Hosted HBase: May be possible to have HBase use the same HDFS instance it’s supporting. Some recursion and self-reference already exists: the HBase Metadata table is itself a table in HBase. Have to work out bootstrapping and failure recovery to resolve any potential circular dependencies.