HDFS Issues


A look at things you should be aware of before you go live with an HDFS cluster.

Published in: Business, Technology

    1. HDFS Issues
        Steve Loughran, Julio Guijarro, Paolo Castagna
    2. HDFS: Hadoop Distributed Filesystem
        • Filesystem to scale to tens of petabytes
        • High IO bandwidth to code near the data
        • Replication across machines
        • Commodity hardware: SATA disks, CPU blades and gigabit LAN
        • Not POSIX: no read/write files, no locks
    3. Locality is key for Hadoop performance
        • Where does data live?
        • Each site provides a shell script for this
        • Future improvement: inline JavaScript/regexp pattern?
        • Have a simple IP -> rack/switch mapping
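
The IP-to-rack mapping on this slide can be sketched as a small script. Hadoop runs the site's topology script with one or more IP addresses or hostnames as arguments and reads whitespace-separated rack paths from its standard output, one per argument. The sketch below is written in Python rather than shell; the subnet table, rack names and the `rack_for` helper are invented for illustration and assume one /24 subnet per rack.

```python
#!/usr/bin/env python3
"""Sketch of a rack-mapping script: prints one rack path per argument."""
import ipaddress
import sys

# Hypothetical site layout: one /24 subnet per rack/switch.
SUBNET_TO_RACK = {
    "10.1.1.0/24": "/dc1/rack1",
    "10.1.2.0/24": "/dc1/rack2",
}
DEFAULT_RACK = "/default-rack"

_NETWORKS = [(ipaddress.ip_network(net), rack)
             for net, rack in SUBNET_TO_RACK.items()]


def rack_for(host):
    """Return the rack path for an IP address, or the default if unknown."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames are not resolved in this sketch.
        return DEFAULT_RACK
    for network, rack in _NETWORKS:
        if addr in network:
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # One rack path per argument, in argument order.
    for arg in sys.argv[1:]:
        print(rack_for(arg))
```

Falling back to a default rack keeps the script from failing on unknown nodes, at the cost of hiding gaps in the table; a site may prefer to log those instead.
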
    4. Append
        • "Critical" for HBase performance
        • Not yet stable or reliable (HADOOP-5332)
        • Disabled in Hadoop 0.20; can be turned on with dfs.support.append=true
        • Stable in 0.21? Let Yahoo! find out first.
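
Given the advice to leave append off, a pre-flight check can confirm that a configuration file does not enable it. This is a minimal sketch that assumes the usual Hadoop `<configuration>/<property>` XML layout; the sample text and the `append_enabled` helper are illustrative, and a real check would read the site's own configuration file.

```python
import xml.etree.ElementTree as ET

# Example configuration text (an assumption for illustration).
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>
"""


def append_enabled(config_xml):
    """Return True only if dfs.support.append is explicitly set to true."""
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "dfs.support.append":
            return prop.findtext("value", "").strip().lower() == "true"
    # Property absent: treated as disabled, matching the 0.20 default
    # described on the slide.
    return False


if __name__ == "__main__":
    if append_enabled(SAMPLE_HDFS_SITE):
        print("WARNING: dfs.support.append is enabled")
```
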
    5. Data Loss
        • HADOOP-4810: data lost at cluster startup time
        • HADOOP-4702: failure to clean up failed copies; added invalid tmp blocks
        • HADOOP-4663: bad import of data in tmp files; added invalid tmp blocks to the filesystem
        • => Need a good backup strategy
    6. Handling of full disks
        • Joost: namenode HDD overflow corrupted the edit log, leading to a crash on every cluster restart
        • HADOOP-3574: better datanode DiskOutOfSpaceException handling
        • Ongoing problem on the mailing lists
        • No good tests yet
        • => Don't let the namenode disks fill up
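
One way to act on "don't let the namenode disks fill up" is a periodic check on the volumes that hold the namenode's image and edit log. The sketch below uses only the Python standard library; the 80% threshold, the example path and the helper names are assumptions, not Hadoop settings.

```python
import shutil


def fill_ratio(path):
    """Fraction of the volume holding `path` that is in use (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def nearly_full(paths, warn_at=0.80):
    """Return (path, ratio) for every path whose volume is at or above warn_at."""
    report = []
    for path in paths:
        ratio = fill_ratio(path)
        if ratio >= warn_at:
            report.append((path, ratio))
    return report


if __name__ == "__main__":
    # Hypothetical location; substitute the site's namenode directories.
    for path, ratio in nearly_full(["."]):
        print("WARNING: %s is %.0f%% full" % (path, ratio * 100))
```

Running such a check from cron or a monitoring agent gives warning well before the edit log can no longer be written.
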
    7. Under-replication / bad handling of corrupt data
        • HADOOP-4543, HADOOP-3314: inadequate detection of truncated or incorrectly sized blocks (could be picked up on startup; otherwise only the slower checksum scanner will eventually find them)
        • HADOOP-5133: when the block lengths are all inconsistent, which to choose?
        • => Min replication of 3, and stay close to the Yahoo! configuration
        • Better handling of missing blocks?
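
The two conditions on this slide, too few replicas and replicas that disagree on length, can be illustrated with a small audit over reported replica lengths. This is a sketch of the idea only, not how HDFS implements it: the input format and the `audit_block` helper are hypothetical, and it deliberately does not try to pick the "right" length, which is the open question raised by HADOOP-5133.

```python
MIN_REPLICAS = 3  # the minimum replication suggested on the slide


def audit_block(replica_lengths, min_replicas=MIN_REPLICAS):
    """Return a list of problems for one block, given its replicas' lengths.

    An empty list means neither problem was seen.
    """
    problems = []
    if len(replica_lengths) < min_replicas:
        problems.append("under-replicated")
    if len(set(replica_lengths)) > 1:
        problems.append("inconsistent-lengths")
    return problems


if __name__ == "__main__":
    # Hypothetical block IDs and lengths in bytes.
    blocks = {
        "blk_1": [67108864, 67108864, 67108864],
        "blk_2": [67108864, 1024, 67108864],
        "blk_3": [67108864, 67108864],
    }
    for block_id, lengths in sorted(blocks.items()):
        print(block_id, audit_block(lengths))
```
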
    8. Limits to scale of Namenode
        • Everything is in memory
        • Yahoo! run machines with 32+ GB of RAM and a big block size
        • Run a secondary for faster restart times
        • Secondary namenode memory should be the same as that of the primary
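
Because the namenode keeps every file and block in memory, block size directly affects how much heap it needs. The arithmetic can be sketched as below. The figure of roughly 150 bytes per in-memory object is a commonly quoted rule of thumb and is an assumption here, not a number from the slides; the data volume and file size in the example are also made up.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def object_counts(total_bytes, avg_file_bytes, block_size):
    """Return (files, blocks) for a dataset of equally sized files."""
    files = ceil_div(total_bytes, avg_file_bytes)
    blocks = files * ceil_div(avg_file_bytes, block_size)
    return files, blocks


def heap_estimate(files, blocks, bytes_per_object=150):
    """Rough namenode heap in bytes; bytes_per_object is an assumed rule of thumb."""
    return (files + blocks) * bytes_per_object


if __name__ == "__main__":
    MB, GB, PB = 2**20, 2**30, 2**50
    # Hypothetical workload: 1 PB stored as 1 GB files.
    for block_size in (64 * MB, 128 * MB):
        files, blocks = object_counts(PB, GB, block_size)
        heap_gb = heap_estimate(files, blocks) / GB
        print("block size %d MB: %d blocks, ~%.1f GB heap"
              % (block_size // MB, blocks, heap_gb))
```

Doubling the block size halves the number of block objects for large files, which is why a big block size helps the namenode scale; many small files defeat this, since each still costs at least one file and one block entry.
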
    9. Failover handling
        • Secondary namenode is not a failover server, it is a log server
        • Need to restart the primary namenode and replay actions
        • Dynamic hosts? All JVMs (currently) cache the DNS entry of the namenode.
        • Worker nodes don't reload their configurations when waiting for masters to come back up
    10. Rate of change of filesystem/APIs
        • One-way upgrades 3x a year?
        • Yahoo! run the 0.x.1 releases live, though they skipped 0.19 entirely
        • Most people are on 0.18.3. 0.20.1 looks good (with append disabled)
        • Rollback via DistCp to another cluster
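
Since upgrades are one-way, the rollback option on this slide amounts to keeping a copy of the data on a second cluster, which `hadoop distcp <source> <destination>` can produce. The sketch below only builds that command line; the host names, ports and path are hypothetical. When the two clusters run different Hadoop versions, the DistCp documentation suggests reading the source over `hftp://` instead, which is why the source scheme is a parameter.

```python
import subprocess


def distcp_command(src_namenode, dst_namenode, path, src_scheme="hdfs"):
    """Build the argument list for copying `path` between two clusters."""
    src = "%s://%s%s" % (src_scheme, src_namenode, path)
    dst = "hdfs://%s%s" % (dst_namenode, path)
    return ["hadoop", "distcp", src, dst]


if __name__ == "__main__":
    # Hypothetical namenode addresses and path.
    cmd = distcp_command("nn-live:8020", "nn-backup:8020", "/data")
    print(" ".join(cmd))
    # To actually run the copy on a machine with Hadoop installed:
    # subprocess.check_call(cmd)
```
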
    11. Security: none
        • User identification added after last.fm deleted a filesystem by accident
        • Caller provides name, which is taken on trust
        • Working towards running MR jobs with restricted user rights
        • There is no security, just defence against accidents
    12. Is HDFS ready for production?
        • Maybe, but with care
    13. 7 May 2009