Johan Oskarsson

   Developer at Last.fm
Hadoop and Hive committer
What is HDFS?

 HDFS - the Hadoop Distributed File System
 Two server types
     Namenode - keeps track of block locations
     Datanode - stores blocks
 Files commonly split into 128 MB blocks
 Replicated to 3 datanodes by default
 Scales well: ~4000 nodes
 Write once
 Large files
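The block-size and replication figures above are easy to turn into arithmetic. A quick sketch using the defaults from this slide (128 MB blocks, 3 replicas); the 1 TB file is a made-up example:

```python
# Rough arithmetic for HDFS block splitting and replication.
# Assumes the defaults from the slide: 128 MB blocks, replication factor 3.
import math

BLOCK_SIZE = 128 * 1024 * 1024   # 128 MB in bytes
REPLICATION = 3

def blocks_for(file_size_bytes):
    """Number of HDFS blocks a file of the given size occupies."""
    return max(1, math.ceil(file_size_bytes / BLOCK_SIZE))

def raw_storage_for(file_size_bytes, replication=REPLICATION):
    """Raw cluster bytes consumed once every block is replicated."""
    return file_size_bytes * replication

one_tb = 1024 ** 4  # a hypothetical 1 TB source file
print(blocks_for(one_tb))                    # 8192 blocks
print(raw_storage_for(one_tb) / 1024 ** 4)   # 3.0 TB of raw storage
```

The same factor of 3 explains why the cluster sizes later in this deck are quoted as "raw storage": usable capacity is roughly a third of the raw number.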
"Can you use HDFS in production?"
Yes

We have used it in production since
2006, but then again we are insane.
Who is using HDFS in production?

  Yahoo! Largest cluster 4000 nodes (14 PB raw storage)
  Facebook. 600 nodes (2 PB raw storage)

  Powerset (Microsoft). "up to 400 instances"

  Last.fm. 31 nodes (110 TB raw storage)

  ... see more at http://wiki.apache.org/hadoop/PoweredBy
What do they use Hadoop for?

  Yahoo! search index, Yahoo! anti-spam, etc

  Facebook ad, profile and application monitoring, etc

  Powerset search index, heavy HBase users

  Last.fm charts, A/B testing stats, site metrics and reporting
"Does HDFS meet people's needs? If not, what can we do?"
Use case - MR batch jobs

Scenario
1. Large source data files are inserted into HDFS
2. MapReduce job is run
3. Output is saved to HDFS

   HDFS is a great choice for this use case
   Short periods of downtime are acceptable
   Backups for important data
   Permissions + trash to avoid user error
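The three numbered steps above follow the classic MapReduce shape: load input, run map and reduce phases, store the output. As a language-agnostic sketch (plain in-memory Python, not the Hadoop Java API; the word-count job is a made-up example):

```python
# Minimal in-memory sketch of the batch pattern on this slide:
# 1. source data goes in, 2. a MapReduce-style job runs, 3. output comes back.
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) pairs, like a word-count mapper.
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    # Sum counts per key, like a word-count reducer.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

source = ["hdfs stores blocks", "hdfs replicates blocks"]  # stand-in for HDFS input
output = reduce_phase(map_phase(source))
print(output)  # {'hdfs': 2, 'stores': 1, 'blocks': 2, 'replicates': 1}
```

In a real deployment the `source` and `output` would be HDFS paths and the two phases would run as a distributed Hadoop job.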
Use case - Serving files to a website

Scenario
1. User visits a website to browse photos
2. Lots of image files are requested from HDFS

Potential issues and solutions
   HDFS isn't written for many small files
       Namenode RAM limits the number of files
       Use HBase or similar
   Namenode goes down
       Crazy "double cluster" solution
       Standby namenode HADOOP-4539
   HDFS isn't really written for low response times
       Work is being done, not a high priority
   Use GlusterFS or MogileFS instead
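The "namenode RAM limits the number of files" point comes down to the namenode keeping all metadata in memory. A back-of-envelope sketch, assuming the commonly cited rule of thumb of roughly 150 bytes of namenode heap per filesystem object (file, directory, or block); the 16 GB heap is a made-up example:

```python
# Back-of-envelope: why namenode RAM caps the file count.
# Assumes ~150 bytes of namenode heap per filesystem object
# (a commonly cited rule of thumb, not an exact figure).
BYTES_PER_OBJECT = 150

def max_objects(heap_bytes):
    """Approximate number of files/dirs/blocks a namenode heap can hold."""
    return heap_bytes // BYTES_PER_OBJECT

heap = 16 * 1024 ** 3  # hypothetical 16 GB namenode heap
print(max_objects(heap))  # on the order of 100 million objects
```

Millions of small photo files would exhaust that budget quickly, which is why the slide points at HBase or similar stores instead.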
Use case - Reliable, realtime log storage

Scenario
1. A stream of logging events is generated
2. The stream is written directly to HDFS

Potential issues and solutions
   Problems with long write sessions
       HDFS-200, HADOOP-6099, HDFS-278
   Namenode goes down
       Crazy "double cluster" solution
       Standby namenode HADOOP-4539
   Appends not stable
       HDFS-265
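Until appends stabilise, a common workaround for the long-write-session problems above is to buffer events locally and move complete, closed files into HDFS in batches. A sketch of that rolling buffer (plain Python writing local files as a stand-in for an HDFS put; the batch size is arbitrary):

```python
# Buffer log events and flush them as closed, complete files,
# sidestepping long-lived HDFS write sessions (the issues cited above).
import os
import tempfile

class RollingLogWriter:
    def __init__(self, out_dir, batch_size=1000):
        self.out_dir = out_dir
        self.batch_size = batch_size  # arbitrary; tune for your event rate
        self.buffer = []
        self.rolled = 0

    def write(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.roll()

    def roll(self):
        """Write buffered events as one complete file and start fresh.
        In production this closed file would then be copied into HDFS whole."""
        if not self.buffer:
            return
        path = os.path.join(self.out_dir, "events-%06d.log" % self.rolled)
        with open(path, "w") as f:
            f.write("\n".join(self.buffer) + "\n")
        self.rolled += 1
        self.buffer = []

out_dir = tempfile.mkdtemp()
writer = RollingLogWriter(out_dir, batch_size=2)
for i in range(5):
    writer.write("event %d" % i)
writer.roll()  # flush the final partial batch
print(sorted(os.listdir(out_dir)))  # three rolled files
```

The trade-off is latency: events only become visible in HDFS once their batch rolls, so this suits "reliable" better than "realtime".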
Potential dealbreakers

  Small files problem™
     Use archives, SequenceFiles or HBase
  Appends/sync not stable
  Namenode not highly available
  Relatively high latency reads
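The archive/SequenceFile workaround for the small files problem boils down to packing many small logical files into one large physical file, keyed by name, so the namenode tracks one object instead of thousands. A toy illustration (a JSON dict in one file, not the real SequenceFile format; the file names and contents are made up):

```python
# Toy version of the SequenceFile/archive idea: many small logical
# files stored as key/value records inside one large physical file.
import json
import os
import tempfile

def pack(small_files, container_path):
    """Write a {name: contents} mapping as one container file."""
    with open(container_path, "w") as f:
        json.dump(small_files, f)

def lookup(container_path, name):
    """Read one logical file back out of the container."""
    with open(container_path) as f:
        return json.load(f)[name]

container = os.path.join(tempfile.mkdtemp(), "container.json")
pack({"img001": "...jpeg bytes...", "img002": "...jpeg bytes..."}, container)
print(lookup(container, "img001"))  # '...jpeg bytes...'
```

The real formats add splittability and compression so MapReduce jobs can still process the container in parallel, which this toy skips.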
Improvements

In progress or completed
    HADOOP-4539 - Streaming edits to a standby NN
    HDFS-265 - Appends
    HDFS-245 - Symbolic links


Wish list
   HDFS-209 - Tool to edit namenode metadata files
   HDFS-220 - Transparent data archiving off HDFS
   HDFS-503 - Reduce disk space used with erasure coding
Competitors

  Hadoop MapReduce compatible
    CloudStore - http://kosmosfs.sourceforge.net/

  Low response time
     MogileFS - http://www.danga.com/mogilefs/
     GlusterFS - http://www.gluster.org/
