SlideShare a Scribd company logo
Hadoop and Performance

       Todd Lipcon – Cloudera
        todd@cloudera.com

    Yanpei Chen – UC Berkeley
    ychen2@eecs.berkeley.edu



   Copyright 2011 Cloudera Inc. All rights reserved
Measures of performance

• MapReduce/HDFS
  • Per-job latency (measure job with a stop watch)
  • Cluster throughput (measure slot seconds)
     • Single-job latency and cluster throughput may be at odds! (eg
       speculative execution)
  • Efficiency (CPU-seconds used, bytes spilled, bytes shuffled)

• HBase
  • Average, 99th percentile latency
  • Write/read throughput
  • In-cache vs out-of-cache
                     Copyright 2011 Cloudera Inc. All rights reserved   2
How do we make our favorite elephant handle
like a Ferrari?




Photo credit: Telethon Italia, http://www.flickr.com/photos/telethonitalia/6171706697/
                            Copyright 2011 Cloudera Inc. All rights reserved             3
Performance from Hadoop developer’s POV
• What do we measure?
   • Overhead = effort spent doing work you don’t care about

• CPU overhead, lock contention:
   • CPU-seconds per HDFS GB written/read
   • Microseconds of latency to read 64K from OS cache via HDFS
   • Number of concurrent reads per second from many threads
• Scheduler/framework overhead
   • No-op MR jobs (sleep –mt 1 –rt 1 –m 5000 –r 1)
• Disk utilization
   • iostat: avg sectors/req, %util, MB/sec R/W

                     Copyright 2011 Cloudera Inc. All rights reserved   4
Hadoop Performance myths

• Java is slow?
   • Most of Hadoop is IO or network bound, not CPU
   • In the few CPU hot spots, we can use JNI or sun.misc.Unsafe
• Java doesn’t give you enough low-level system access?
   • JNI allows us to call any syscall that C can!
   • We can integrate tight assembly code too!
• Hadoop’s IO path has too many layers
   • We trust that the 60+ developers of ext4, XFS, and the Linux
     IO scheduler know more about low-level IO than we do!
   • Every system has a file layout and IO scheduler; those that do
     their own usually do it for portability (some RDBMS)
                     Copyright 2011 Cloudera Inc. All rights reserved   5
Lies, damn lies, and statistics benchmarks




           Copyright 2011 Cloudera Inc. All rights reserved   6
HDFS/MR Improvements: IO/Caching
• MapReduce is designed to do primarily sequential IO
• Linux IO scheduler read-ahead heuristics are weak
   • HDFS 0.20 does many seeks even for sequential access
• Many datasets are significantly larger than memory
  capacity; caching them is a waste of memory
   • Doing MR jobs on large input sets evicts useful data from
     memory (dentries, inodes, HBase HFiles)
• Linux employs lazy page writeback of written data in
  case it is re-written (which HDFS never does!)
   • If dirty pages take up more than dirty_ratio of memory, forces
     blocking of all writes! (IO stalls even on underutilized drives)

                     Copyright 2011 Cloudera Inc. All rights reserved   7
HDFS/MR Improvements: IO/Caching Solutions
• Linux provides three useful syscalls:
   • posix_fadvise(POSIX_FADV_WILLNEED): hint to start reading in a
     given range of a file (e.g. forced readahead)
   • posix_fadvise(POSIX_FADV_DONTNEED): hint to drop a range of
     file from cache (e.g. after we’ve read it)
   • sync_file_range(SYNC_FILE_RANGE_WRITE): enqueue dirty pages
     for immediate (but asynchronous) writeback
• Explicit readahead short-circuits bad Linux
  heuristics, causes larger IOs to device. Marked async in
  kernel to allow better reordering.
• Dropping unnecessary data from cache leaves more room
  for more important data.
• Explicit writeback avoids blocking behavior when too many
  pages are dirty.
                    Copyright 2011 Cloudera Inc. All rights reserved   8
HDFS/MR Improvements: IO/Caching Results

• ~20% improvement on terasort wall-clock
• Disk utilization much higher
• CPU utilization graph much smoother




          Before                                                      After
                   Copyright 2011 Cloudera Inc. All rights reserved           9
MR CPU Improvements: Sort

• CPU inefficiency:
   • Many CPU cycles in WritableComparator.compareBytes
   • Naïve looping over parallel byte arrays: slow!
   • Optimization: Use sun.misc.Unsafe for ~4x reduction in CPU
     usage of this function! (compare 64 bits at a time)
• CPU cache usage:
   • “Indirect QuickSort” is used for the sort algorithm in MR
   • Sort operates on an array of pointers to data, but bad cache-
     locality properties
   • Optimization: include 4-byte key prefix with pointer, avoiding
     memory indirection for most comparisons
                    Copyright 2011 Cloudera Inc. All rights reserved   10
Benchmark results
CDH3u2 vs optimized build with cache-conscious sort and IO
scheduling advice.
1TB sort, 10 nodes, 24G RAM, 6 disks (lower is better)
                 1

                0.9

                0.8
Relative time




                0.7

                0.6

                0.5
                                                                                                      CDH3u2
                0.4
                                                                                                      Optimized
                0.3

                0.2

                0.1
                      30m 24m                      31h 21h                            19h 15h
                 0


                      Wall-clock               Map slot-hours                     Reduce slot-hours

                                   Copyright 2011 Cloudera Inc. All rights reserved                               11
MR Improvements: Scheduler

• 0.20.2:
   • TaskTrackers heartbeat once every 3 seconds, even on a small
     cluster
   • Each heartbeat assigns exactly 1 task to a TT
   • If tasks run for less than 3*slotCount seconds, TT will be
     underutilized (tasks finish faster than assigned)
• Trunk/0.20.205/CDH3:
   • Assign multiple tasks per heartbeat
   • Heartbeat rate dropped to 0.3 seconds on small clusters
     (adaptive for larger)
   • “Out-of-band heartbeats” on any task completion
                    Copyright 2011 Cloudera Inc. All rights reserved   12
MR Improvements: Scheduler results

                                                                                30.3
            24


                                                                                         0.20.2
                        12
                                                                                         Current

                                                                     3.3


       Minimum job latency (s)                                 Tasks scheduled/sec
          (lower is better)                                         (higher is better)

10 node cluster, 10 map slots per machine.
hadoop jar examples.jar sleep –mt 1 –rt 1 –m 5000 –r 1
                        Copyright 2011 Cloudera Inc. All rights reserved                           13
HDFS CPU Improvements: Checksumming

• HDFS checksums every piece of data in/out
• Significant CPU overhead
   • Measure by putting ~1G in HDFS, cat file in a loop
   • 0.20.2: ~30-50% of CPU time is CRC32 computation!
• Optimizations:
   • Switch to “bulk” API: verify/compute 64KB at a time instead
     of 512 bytes (better instruction cache locality, amortize JNI
     overhead)
   • Switch to CRC32C polynomial, SSE4.2, highly tuned assembly
     (~8 bytes per cycle with instruction level parallelism!)

                    Copyright 2011 Cloudera Inc. All rights reserved   14
Checksum improvements
                                 (lower is better)
          1360us
100%
 90%
 80%
 70%
 60%               760us
 50%
                                                                                           CDH3u0
 40%
                                                                                           Optimized
 30%
 20%
 10%
  0%
       Random-read latency        Random-read CPU                    Sequential-read CPU
                                       usage                               usage

 Post-optimization: only 16% overhead vs un-checksummed access
 Maintain ~800MB/sec from a single thread reading OS cache
                           Copyright 2011 Cloudera Inc. All rights reserved                         15
HDFS Random access

• 0.20.2:
   • Each individual read operation reconnects to DataNode
   • Much TCP Handshake overhead, thread creation, etc
• 0.23:
   • Clients cache open sockets to each datanode (like HTTP
     Keepalive)
   • HBase-like access patterns use a finite small number of
     sockets
   • Rewritten BlockReader to eliminate a data copy
   • Eliminated lock contention in DataNode’s FSDataset class

                    Copyright 2011 Cloudera Inc. All rights reserved   16
Random-read micro benchmark (higher is better)
                  700
                  600
 Speed (MB/sec)




                  500
                  400
                  300
                  200
                  100
                        106 253 299                       247 488 635                      187 477 633
                    0
                        4 threads, 1 file               16 threads, 1 file                8 threads, 2 files
                             0.20.2         Trunk (no native)                   Trunk (native)
        TestParallelRead benchmark, modified to 100% random read proportion.
        Quad core Core i7 Q820@1.73Ghz

                                    Copyright 2011 Cloudera Inc. All rights reserved                           17
Random-read macro benchmark (HBase YCSB)




             Copyright 2011 Cloudera Inc. All rights reserved   18
MR2 Improvements: Shuffle

• Trunk/MR2 features many shuffle improvements:
  • No more io.sort.record.percent (important but obscure
    tunable): now auto-tuned
  • Reducers now fetch several map outputs on one TCP
    connection instead of reconnecting (better fetch throughput)
  • MR2: shuffle server rewritten with Netty, zero-copy sendfile
    support (less CPU usage on TaskTracker, fewer timeouts)


• 30% improvement in shuffle throughput


                   Copyright 2011 Cloudera Inc. All rights reserved   19
Summary

• Hadoop is now faster than ever…
   • 2-3x random read performance boost vs 0.20.2
   • Significantly less CPU usage for a given workload
   • Up to 2x faster wall-clock time for shuffle-intensive jobs
• …and getting faster every day!




                     Copyright 2011 Cloudera Inc. All rights reserved   20
Upstream JIRAs

•   Random read keepalive: HDFS-941
•   Faster Checksums: HDFS-2080
•   fadvise/sync_file_range: HADOOP-7714
•   Faster compareBytes: HADOOP-7761
•   MR sort cache locality: MAPREDUCE-3235
•   Rewritten map-side shuffle: MAPREDUCE-64
•   Rewritten reduce side shuffle: MAPREDUCE-318
•   Others still in progress:
    • http://tiny.cloudera.com/perf-jiras

                     Copyright 2011 Cloudera Inc. All rights reserved   21
Coming to a Hadoop near you!

• Some of these improvements will be in CDH3u3:
   • (~25% faster terasort, ~2x faster HBase random read)
   • 10-node terasort, 6x7200RPM, 24G RAM: 24.5 minutes
• Almost all of these improvements will be in
  Apache Hadoop 0.23.1

• Many improvements come from upstream Linux itself!
   • ext4 vs ext3 (less fragmentation, faster unlink())
   • per-block-device writeback (Linux 2.6.32)


                     Copyright 2011 Cloudera Inc. All rights reserved   22
Next: Advances in MapReduce performance
measurement




                Copyright 2011 Cloudera Inc. All rights reserved   Slide 23
State of the art in MapReduce performance
• Essential but insufficient benchmarks

                 Gridmix2 Hive BM Hibench PigMix Gridmix3




                       Copyright 2011 Cloudera Inc. All rights reserved   Slide 24
State of the art in MapReduce performance
 • Essential but insufficient benchmarks

                       Gridmix2 Hive BM Hibench PigMix Gridmix3
Data sizes
Intensity variations
Job types
Cluster independent
Synthetic workloads



                             Copyright 2011 Cloudera Inc. All rights reserved   Slide 25
State of the art in MapReduce performance
 • Essential but insufficient benchmarks
Representative?
                       Gridmix2 Hive BM Hibench PigMix Gridmix3
Data sizes
Intensity variations
Job types
Cluster independent
Synthetic workloads
           Replay?


                             Copyright 2011 Cloudera Inc. All rights reserved   Slide 26
State of the art in MapReduce performance
 • Essential but insufficient benchmarks
Representative?
                       Gridmix2 Hive BM Hibench PigMix Gridmix3
Data sizes                                                                           √
Intensity variations                                                                 √
Job types                                                                            √
Cluster independent       √                  √                     √             √
Synthetic workloads
           Replay?


                              Copyright 2011 Cloudera Inc. All rights reserved       Slide 27
Need to measure what production
MapReduce systems REALLY do

From generic large elephant               to realistic elephant herd
        (benchmarks)                               (workloads)




                       Copyright 2011 Cloudera Inc. All rights reserved   Slide 28
Need to advance the state of the art
• Develop scientific knowledge
   - Workload description, synthesis, replay

• Apply to system engineering
   - Design, measurement, QA, certification, design




                         Copyright 2011 Cloudera Inc. All rights reserved   Slide 29
First comparison of production MapReduce
• Facebook trace (FB)
   - 6 months in 2009, approx. 6000 jobs per day

• Cloudera customer trace (CC-b)
   - 1 week in 2011, approx. 3000 jobs per day

• Not in this talk – Traces from 4 other customers
  - E-commerce, media/telecomm, retail


                        Copyright 2011 Cloudera Inc. All rights reserved   Slide 30
Three dimensions to compare

     Representative?

      Data sizes
      Intensity variations
      Job types
      Cluster independent
      Synthetic workloads




                       Copyright 2011 Cloudera Inc. All rights reserved   Slide 31
Different job sizes

  CDF                       CDF                    CDF
  1                         1                      1
0.8                       0.8                    0.8
                                                                     CC-b
0.6                       0.6                    0.6
0.4                       0.4                    0.4                 FB
0.2                       0.2                    0.2
  0                         0                      0
  1E+0E+3E+6GB TB
    0 1 1 1E+9
      KB MB       1E+12     1E+0E+3E+6GB TB
                              0 KB 1 1E+9
                                1 MB                 0 KB MB 1 1E+12
                                                       1 1 GB TB
                                            1E+12 1E+0E+3E+6E+9
       Input size                Shuffle size            Output size




                               Copyright 2011 Cloudera Inc. All rights reserved   Slide 32
Different workload time variation
                                  FB                                                          CC-b
              1000                                                300
   Number                                                         200
   of jobs     500
                                                                  100
                 0                                                    0
 Input +     2.E+13                                           4.E+13
shuffle +
  output     1.E+13                                           2.E+13
data size
             0.E+00                                           0.E+00
             6.E+06                                           6.E+07
  Map +                                                       4.E+07
  reduce   3.E+06
                                                              2.E+07
task times
           0.E+00                                             0.E+00
                      0   1    2 3 4 5 6 7                                  0      1      2 3 4 5 6 7
                              Days since a Sunday                                         Days since a Tuesday


                                       Copyright 2011 Cloudera Inc. All rights reserved                  Slide 33
Different job types – identify by kmeans

                                                 Map time Red. time
   Input   Shuffle   Output   Duration
                                                (task sec) (task sec)




                              Copyright 2011 Cloudera Inc. All rights reserved   Slide 34
Different job types – identify by kmeans

% of                                                    Map time Red. time
        Input    Shuffle   Output    Duration
jobs                                                   (task sec) (task sec)

 95.8    21 KB         0 871 KB        32 s                      20                     0
  3.3   381 KB         0 1.9 GB      21 min                   6097                      0
  0.2    10 KB         0 4.2 GB 1 hr 50 min                  26,321                     0
  0.1   405 KB         0 447 GB 1 hr 10 min                  66,657                     0
 0.02   446 KB         0 1.1 TB 5 hrs 5 min                 125,662                     0
 0.05   230 KB    8.8 GB 491 MB      15 min                 104,338                66,760
 0.03   1.9 TB   502 MB     2.6 GB       30 min            348, 942                76,738

 0.01 418 GB      2.5 TB    45 GB 1 hr 25 min            1,076,089               974,395
 0.07 255 GB     788 GB    1.6 GB        35 min           384, 562               338,050
0.002 7.6 TB      51 GB    104 KB        55 min          4,843,452               853,911


                                     Copyright 2011 Cloudera Inc. All rights reserved       Slide 35
Different job types – identify by kmeans

% of                                                    Map time Red. time
        Input    Shuffle   Output    Duration                                Description
jobs                                                   (task sec) (task sec)

 95.8    21 KB         0 871 KB        32 s                      20                     0   Small jobs
  3.3   381 KB         0 1.9 GB      21 min                   6097                      0   Load data, fast
  0.2    10 KB         0 4.2 GB 1 hr 50 min                  26,321                     0   Load data, slow
  0.1   405 KB         0 447 GB 1 hr 10 min                  66,657                     0   Load data, large
 0.02   446 KB         0 1.1 TB 5 hrs 5 min                 125,662                     0   Load data, huge
 0.05   230 KB    8.8 GB 491 MB      15 min                 104,338                66,760   Aggregate, fast
                                                                                            Aggregate and
 0.03   1.9 TB   502 MB     2.6 GB       30 min            348, 942                76,738
                                                                                            expand
                                                                                            Expand and
 0.01 418 GB      2.5 TB    45 GB 1 hr 25 min            1,076,089               974,395
                                                                                            aggregate
 0.07 255 GB     788 GB    1.6 GB        35 min           384, 562               338,050    Data transform
0.002 7.6 TB      51 GB    104 KB        55 min          4,843,452               853,911    Data summary


                                     Copyright 2011 Cloudera Inc. All rights reserved                  Slide 36
Different job types – summary
           FB                                        CC-b
% of jobs Description          % of jobs Description
    95.8   Small jobs           88.9 Small job
     3.3   Load data, fast       7.7 Data transform “small”
     0.2   Load data, slow       2.8 Data transform “med”
     0.1   Load data, large          Data transform “large”,
                                 0.5
    0.02   Load data, huge           map heavy
    0.05   Aggregate, fast       0.1 Aggregate
    0.03   Aggregate and expand
    0.01   Expand and aggregate
    0.07   Data transform
   0.002   Data summary

                         Copyright 2011 Cloudera Inc. All rights reserved   Slide 37
Measure performance by replaying workload

• Major advance over artificial benchmarks
  - “Performance is X for workload A, Y for workload B.”

• Challenges:
   - Turn long trace (6 months) into short workload (1 day)
   - Do so while keeping the workload representative




                       Copyright 2011 Cloudera Inc. All rights reserved   Slide 38
Replay concatenated trace samples

• Problem: Need short and representative workloads

• Solution: concatenate continuous window trace samples
 E.g. Day long workload = Concat(24 x continuous hourly samples)
                    or = Concat(6 x continuous 4-hourly samples)

• Strength: Reproduce all statistics within a sample window

• Limits: Introduces (controllable) sampling error



                         Copyright 2011 Cloudera Inc. All rights reserved   Slide 39
Verify that we get representative workloads
        CDFs for synthetic
           workloads
  CDF                          CDF                                              CDF
 1                            1                                                1
0.8                          0.8                                            0.8
0.6                          0.6                                            0.6
0.4                          0.4                                            0.4
0.2                          0.2                                            0.2
 0                    CDF for  0                           0
                    whole race
    0   KB MB GB TB               0   KB MB GB TB             0    KB MB GB TB
  1E+0 1E+3 1E+6 1E+91E+12 1E+0 1E+3 1E+6 1E+91E+12 1E+0 1E+3 1E+6 1E+91E+12
   Output size – 4hr samples     Output size – 1hr samples   Output size – 15min samples




                                     Copyright 2011 Cloudera Inc. All rights reserved   Slide 40
Case study: Compare FIFO vs. fair scheduler

                        Replay on
                        System 1
Fixed workload,                                                        Performance
Fixed metrics,                                                         comparison
  Fixed env.            Replay on
                        System 2


   Workload = Facebook-like synthetic workload, 24 x 1hr samples
      Metrics = average job latency
 Environment = 200-machines EC2 cluster, m1.large instances
   System 1 = FIFO scheduler
   System 2 = Fair scheduler

                        Copyright 2011 Cloudera Inc. All rights reserved             Slide 41
Choice of scheduler depends on workload!
                       100000                                                                        Fair scheduler           Better
 Completion time (s)



                                                                                                     FIFO scheduler

                                      1000


                                        10
                                         4800        4820        4840           4860               4880               4900
                                                   Job index (out of 6000+ total jobs in a day)
                                      1000                                                             Fair scheduler
                 Completion time(s)




                                                                                                       FIFO scheduler         Better

                                       100


                                        10
                                             100      120        140         160          180                           200
                                                    Job index (out of 6000+ total jobs in a day)
                                                                   Copyright 2011 Cloudera Inc. All rights reserved               Slide 42
Takeaways

• Need to consider workload-level performance!
  • Data sizes, arrival patterns, job types

• Understanding performance is a community effort!
  • Not everyone is like Google/Yahoo/Facebook
  • Different features and metrics relevant to each use case

• Check out our evolving MapReduce workload repository
  • www.eecs.berkeley.edu/~ychen2/SWIM.html


                                 Thanks!!!
                          Copyright 2011 Cloudera Inc. All rights reserved   Slide 43
Backup slides




                Copyright 2011 Cloudera Inc. All rights reserved   Slide 44
Indirect quicksort
                                      Metadata entries are swapped as sort proceeds
  Metadata buffer
    Metadata 1        Metadata 2                        …                     Metadata M   Metadata N



  Partition #    Key Ptr    Value Ptr           Value Len




        Key 1              Value 1                    …                       Key N        Value N
 Data buffer


Layer of indirection means that every compare(), even of adjacent
metadata, causes CPU cache misses!

                           Copyright 2011 Cloudera Inc. All rights reserved                             45
Indirect quicksort with “comparison proxies”
  Metadata buffer
    Metadata 1         Metadata 2                         …                     Metadata M   Metadata N



  Partition #   Key prefix      Key Ptr            Value Ptr            Value Len
       Copy first 4
      bytes of key
    into metadata
                                   Key 1                       Value 1               …
                      Data buffer


Many comparisons avoid the indirect memory lookup into the data
buffer: 1.6x improvement in total map side CPU usage for terasort!

                             Copyright 2011 Cloudera Inc. All rights reserved                             46

More Related Content

What's hot

Hadoop scheduler
Hadoop schedulerHadoop scheduler
Hadoop scheduler
Subhas Kumar Ghosh
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
Kathleen Ting
 
Hw09 Monitoring Best Practices
Hw09   Monitoring Best PracticesHw09   Monitoring Best Practices
Hw09 Monitoring Best Practices
Cloudera, Inc.
 
Hortonworks.Cluster Config Guide
Hortonworks.Cluster Config GuideHortonworks.Cluster Config Guide
Hortonworks.Cluster Config Guide
Douglas Bernardini
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
Great Wide Open
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
Cloudera, Inc.
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
Alex Moundalexis
 
Pig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataPig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big Data
DataWorks Summit
 
Nn ha hadoop world.final
Nn ha hadoop world.finalNn ha hadoop world.final
Nn ha hadoop world.final
Hortonworks
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
Cloudera, Inc.
 
Lessons learned from scaling YARN to 40K machines in a multi tenancy environment
Lessons learned from scaling YARN to 40K machines in a multi tenancy environmentLessons learned from scaling YARN to 40K machines in a multi tenancy environment
Lessons learned from scaling YARN to 40K machines in a multi tenancy environment
DataWorks Summit
 
Chicago spark meetup-april2017-public
Chicago spark meetup-april2017-publicChicago spark meetup-april2017-public
Chicago spark meetup-april2017-public
Guru Dharmateja Medasani
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
Vitthal Gogate
 
Tune up Yarn and Hive
Tune up Yarn and HiveTune up Yarn and Hive
Tune up Yarn and Hive
rxu
 
ha_module5
ha_module5ha_module5
ha_module5
Gurmukh Singh
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via Linux
Alex Moundalexis
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
Adam Kawa
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
mundlapudi
 
Architectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop DistributionArchitectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop Distribution
mcsrivas
 
YARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerYARN - Hadoop's Resource Manager
YARN - Hadoop's Resource Manager
VertiCloud Inc
 

What's hot (20)

Hadoop scheduler
Hadoop schedulerHadoop scheduler
Hadoop scheduler
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Hw09 Monitoring Best Practices
Hw09   Monitoring Best PracticesHw09   Monitoring Best Practices
Hw09 Monitoring Best Practices
 
Hortonworks.Cluster Config Guide
Hortonworks.Cluster Config GuideHortonworks.Cluster Config Guide
Hortonworks.Cluster Config Guide
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Pig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataPig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big Data
 
Nn ha hadoop world.final
Nn ha hadoop world.finalNn ha hadoop world.final
Nn ha hadoop world.final
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
Lessons learned from scaling YARN to 40K machines in a multi tenancy environment
Lessons learned from scaling YARN to 40K machines in a multi tenancy environmentLessons learned from scaling YARN to 40K machines in a multi tenancy environment
Lessons learned from scaling YARN to 40K machines in a multi tenancy environment
 
Chicago spark meetup-april2017-public
Chicago spark meetup-april2017-publicChicago spark meetup-april2017-public
Chicago spark meetup-april2017-public
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
 
Tune up Yarn and Hive
Tune up Yarn and HiveTune up Yarn and Hive
Tune up Yarn and Hive
 
ha_module5
ha_module5ha_module5
ha_module5
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via Linux
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Architectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop DistributionArchitectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop Distribution
 
YARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerYARN - Hadoop's Resource Manager
YARN - Hadoop's Resource Manager
 

Similar to Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera

[B4]deview 2012-hdfs
[B4]deview 2012-hdfs[B4]deview 2012-hdfs
[B4]deview 2012-hdfs
NAVER D2
 
Hadoop on a personal supercomputer
Hadoop on a personal supercomputerHadoop on a personal supercomputer
Hadoop on a personal supercomputer
Paul Dingman
 
Deployment
DeploymentDeployment
Deployment
rogerbodamer
 
LUG 2014
LUG 2014LUG 2014
LUG 2014
Hitoshi Sato
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016
Colin Charles
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
MongoDB
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
aaronmorton
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
MongoDB
 
Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)
Yoshinori Matsunobu
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
MongoDB
 
Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and Virtualization
Bigstep
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
Chris Adkin
 
TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)
TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)
TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)
Alex Rasmussen
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
shinolajla
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
Nicolas Poggi
 
Best Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on SolarisBest Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on Solaris
Jignesh Shah
 
R&D work on pre exascale HPC systems
R&D work on pre exascale HPC systemsR&D work on pre exascale HPC systems
R&D work on pre exascale HPC systems
Joshua Mora
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
PyData
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
Mike Pittaro
 
Accelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheAccelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cache
David Grier
 

Similar to Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera (20)

[B4]deview 2012-hdfs
[B4]deview 2012-hdfs[B4]deview 2012-hdfs
[B4]deview 2012-hdfs
 
Hadoop on a personal supercomputer
Hadoop on a personal supercomputerHadoop on a personal supercomputer
Hadoop on a personal supercomputer
 
Deployment
DeploymentDeployment
Deployment
 
LUG 2014
LUG 2014LUG 2014
LUG 2014
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and Virtualization
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
 
TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)
TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)
TritonSort: A Balanced Large-Scale Sorting System (NSDI 2011)
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
 
Best Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on SolarisBest Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on Solaris
 
R&D work on pre exascale HPC systems
R&D work on pre exascale HPC systemsR&D work on pre exascale HPC systems
R&D work on pre exascale HPC systems
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
Accelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheAccelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cache
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Recently uploaded

What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
Jason Yip
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
Antonios Katsarakis
 
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Pitangent Analytics & Technology Solutions Pvt. Ltd
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
Ivo Velitchkov
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
BibashShahi
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
operationspcvita
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 

Recently uploaded (20)

What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
Artificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic WarfareArtificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic Warfare
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 

Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera

  • 1. Hadoop and Performance Todd Lipcon – Cloudera todd@cloudera.com Yanpei Chen – UC Berkeley ychen2@eecs.berkeley.edu Copyright 2011 Cloudera Inc. All rights reserved
  • 2. Measures of performance • MapReduce/HDFS • Per-job latency (measure job with a stop watch) • Cluster throughput (measure slot seconds) • Single-job latency and cluster throughput may be at odds! (eg speculative execution) • Efficiency (CPU-seconds used, bytes spilled, bytes shuffled) • HBase • Average, 99th percentile latency • Write/read throughput • In-cache vs out-of-cache Copyright 2011 Cloudera Inc. All rights reserved 2
  • 3. How do we make our favorite elephant handle like a Ferrari? Photo credit: Telethon Italia, http://www.flickr.com/photos/telethonitalia/6171706697/ Copyright 2011 Cloudera Inc. All rights reserved 3
  • 4. Performance from Hadoop developer’s POV • What do we measure? • Overhead = effort spent doing work you don’t care about • CPU overhead, lock contention: • CPU-seconds per HDFS GB written/read • Microseconds of latency to read 64K from OS cache via HDFS • Number of concurrent reads per second from many threads • Scheduler/framework overhead • No-op MR jobs (sleep –mt 1 –rt 1 –m 5000 –r 1) • Disk utilization • iostat: avg sectors/req, %util, MB/sec R/W Copyright 2011 Cloudera Inc. All rights reserved 4
  • 5. Hadoop Performance myths • Java is slow? • Most of Hadoop is IO or network bound, not CPU • In the few CPU hot spots, we can use JNI or sun.misc.Unsafe • Java doesn’t give you enough low-level system access? • JNI allows us to call any syscall that C can! • We can integrate tight assembly code too! • Hadoop’s IO path has too many layers • We trust that the 60+ developers of ext4, XFS, and the Linux IO scheduler know more about low-level IO than we do! • Every system has a file layout and IO scheduler; those that do their own usually do it for portability (some RDBMS) Copyright 2011 Cloudera Inc. All rights reserved 5
  • 6. Lies, damn lies, and statistics benchmarks Copyright 2011 Cloudera Inc. All rights reserved 6
  • 7. HDFS/MR Improvements: IO/Caching • MapReduce is designed to do primarily sequential IO • Linux IO scheduler read-ahead heuristics are weak • HDFS 0.20 does many seeks even for sequential access • Many datasets are significantly larger than memory capacity; caching them is a waste of memory • Doing MR jobs on large input sets evicts useful data from memory (dentries, inodes, HBase HFiles) • Linux employs lazy page writeback of written data in case it is re-written (which HDFS never does!) • If dirty pages take up more than dirty_ratio of memory, forces blocking of all writes! (IO stalls even on underutilized drives) Copyright 2011 Cloudera Inc. All rights reserved 7
  • 8. HDFS/MR Improvements: IO/Caching Solutions • Linux provides three useful syscalls: • posix_fadvise(POSIX_FADV_WILLNEED): hint to start reading in a given range of a file (e.g. forced readahead) • posix_fadvise(POSIX_FADV_DONTNEED): hint to drop a range of file from cache (e.g. after we’ve read it) • sync_file_range(SYNC_FILE_RANGE_WRITE): enqueue dirty pages for immediate (but asynchronous) writeback • Explicit readahead short-circuits bad Linux heuristics, causes larger IOs to device. Marked async in kernel to allow better reordering. • Dropping unnecessary data from cache leaves more room for more important data. • Explicit writeback avoids blocking behavior when too many pages are dirty. Copyright 2011 Cloudera Inc. All rights reserved 8
  • 9. HDFS/MR Improvements: IO/Caching Results • ~20% improvement on terasort wall-clock • Disk utilization much higher • CPU utilization graph much smoother Before After Copyright 2011 Cloudera Inc. All rights reserved 9
  • 10. MR CPU Improvements: Sort • CPU inefficiency: • Many CPU cycles in WritableComparator.compareBytes • Naïve looping over parallel byte arrays: slow! • Optimization: Use sun.misc.Unsafe for ~4x reduction in CPU usage of this function! (compare 64 bits at a time) • CPU cache usage: • “Indirect QuickSort” is used for the sort algorithm in MR • Sort operates on an array of pointers to data, but bad cache- locality properties • Optimization: include 4-byte key prefix with pointer, avoiding memory indirection for most comparisons Copyright 2011 Cloudera Inc. All rights reserved 10
  • 11. Benchmark results CDH3u2 vs optimized build with cache-conscious sort and IO scheduling advice. 1TB sort, 10 nodes, 24G RAM, 6 disks (lower is better) 1 0.9 0.8 Relative time 0.7 0.6 0.5 CDH3u2 0.4 Optimized 0.3 0.2 0.1 30m 24m 31h 21h 19h 15h 0 Wall-clock Map slot-hours Reduce slot-hours Copyright 2011 Cloudera Inc. All rights reserved 11
  • 12. MR Improvements: Scheduler • 0.20.2: • TaskTrackers heartbeat once every 3 seconds, even on a small cluster • Each heartbeat assigns exactly 1 task to a TT • If tasks run for less than 3*slotCount seconds, TT will be underutilized (tasks finish faster than assigned) • Trunk/0.20.205/CDH3: • Assign multiple tasks per heartbeat • Heartbeat rate dropped to 0.3 seconds on small clusters (adaptive for larger) • “Out-of-band heartbeats” on any task completion Copyright 2011 Cloudera Inc. All rights reserved 12
  • 13. MR Improvements: Scheduler results 30.3 24 0.20.2 12 Current 3.3 Minimum job latency (s) Tasks scheduled/sec (lower is better) (higher is better) 10 node cluster, 10 map slots per machine. hadoop jar examples.jar sleep –mt 1 –rt 1 –m 5000 –r 1 Copyright 2011 Cloudera Inc. All rights reserved 13
  • 14. HDFS CPU Improvements: Checksumming • HDFS checksums every piece of data in/out • Significant CPU overhead • Measure by putting ~1G in HDFS, cat file in a loop • 0.20.2: ~30-50% of CPU time is CRC32 computation! • Optimizations: • Switch to “bulk” API: verify/compute 64KB at a time instead of 512 bytes (better instruction cache locality, amortize JNI overhead) • Switch to CRC32C polynomial, SSE4.2, highly tuned assembly (~8 bytes per cycle with instruction level parallelism!) Copyright 2011 Cloudera Inc. All rights reserved 14
  • 15. Checksum improvements (lower is better) 1360us 100% 90% 80% 70% 60% 760us 50% CDH3u0 40% Optimized 30% 20% 10% 0% Random-read latency Random-read CPU Sequential-read CPU usage usage Post-optimization: only 16% overhead vs un-checksummed access Maintain ~800MB/sec from a single thread reading OS cache Copyright 2011 Cloudera Inc. All rights reserved 15
  • 16. HDFS Random access • 0.20.2: • Each individual read operation reconnects to DataNode • Much TCP Handshake overhead, thread creation, etc • 0.23: • Clients cache open sockets to each datanode (like HTTP Keepalive) • HBase-like access patterns use a finite small number of sockets • Rewritten BlockReader to eliminate a data copy • Eliminated lock contention in DataNode’s FSDataset class Copyright 2011 Cloudera Inc. All rights reserved 16
  • 17. Random-read micro benchmark (higher is better) 700 600 Speed (MB/sec) 500 400 300 200 100 106 253 299 247 488 635 187 477 633 0 4 threads, 1 file 16 threads, 1 file 8 threads, 2 files 0.20.2 Trunk (no native) Trunk (native) TestParallelRead benchmark, modified to 100% random read proportion. Quad core Core i7 Q820@1.73Ghz Copyright 2011 Cloudera Inc. All rights reserved 17
  • 18. Random-read macro benchmark (HBase YCSB) Copyright 2011 Cloudera Inc. All rights reserved 18
  • 19. MR2 Improvements: Shuffle • Trunk/MR2 features many shuffle improvements: • No more io.sort.record.percent (important but obscure tunable): now auto-tuned • Reducers now fetch several map outputs on one TCP connection instead of reconnecting (better fetch throughput) • MR2: shuffle server rewritten with Netty, zero-copy sendfile support (less CPU usage on TaskTracker, fewer timeouts) • 30% improvement in shuffle throughput Copyright 2011 Cloudera Inc. All rights reserved 19
  • 20. Summary • Hadoop is now faster than ever… • 2-3x random read performance boost vs 0.20.2 • Significantly less CPU usage for a given workload • Up to 2x faster wall-clock time for shuffle-intensive jobs • …and getting faster every day! Copyright 2011 Cloudera Inc. All rights reserved 20
  • 21. Upstream JIRAs • Random read keepalive: HDFS-941 • Faster Checksums: HDFS-2080 • fadvise/sync_file_range: HADOOP-7714 • Faster compareBytes: HADOOP-7761 • MR sort cache locality: MAPREDUCE-3235 • Rewritten map-side shuffle: MAPREDUCE-64 • Rewritten reduce side shuffle: MAPREDUCE-318 • Others still in progress: • http://tiny.cloudera.com/perf-jiras Copyright 2011 Cloudera Inc. All rights reserved 21
  • 22. Coming to a Hadoop near you! • Some of these improvements will be in CDH3u3: • (~25% faster terasort, ~2x faster HBase random read) • 10-node terasort, 6x7200RPM, 24G RAM: 24.5 minutes • Almost all of these improvements will be in Apache Hadoop 0.23.1 • Many improvements come from upstream Linux itself! • ext4 vs ext3 (less fragmentation, faster unlink()) • per-block-device writeback (Linux 2.6.32) Copyright 2011 Cloudera Inc. All rights reserved 22
  • 23. Next: Advances in MapReduce performance measurement Copyright 2011 Cloudera Inc. All rights reserved Slide 23
  • 24. State of the art in MapReduce performance • Essential but insufficient benchmarks Gridmix2 Hive BM Hibench PigMix Gridmix3 Copyright 2011 Cloudera Inc. All rights reserved Slide 24
  • 25. State of the art in MapReduce performance • Essential but insufficient benchmarks Gridmix2 Hive BM Hibench PigMix Gridmix3 Data sizes Intensity variations Job types Cluster independent Synthetic workloads Copyright 2011 Cloudera Inc. All rights reserved Slide 25
  • 26. State of the art in MapReduce performance • Essential but insufficient benchmarks Representative? Gridmix2 Hive BM Hibench PigMix Gridmix3 Data sizes Intensity variations Job types Cluster independent Synthetic workloads Replay? Copyright 2011 Cloudera Inc. All rights reserved Slide 26
  • 27. State of the art in MapReduce performance • Essential but insufficient benchmarks Representative? Gridmix2 Hive BM Hibench PigMix Gridmix3 Data sizes √ Intensity variations √ Job types √ Cluster independent √ √ √ √ Synthetic workloads Replay? Copyright 2011 Cloudera Inc. All rights reserved Slide 27
  • 28. Need to measure what production MapReduce systems REALLY do From generic large elephant to realistic elephant herd (benchmarks) (workloads) Copyright 2011 Cloudera Inc. All rights reserved Slide 28
  • 29. Need to advance the state of the art • Develop scientific knowledge - Workload description, synthesis, replay • Apply to system engineering - Design, measurement, QA, certification, design Copyright 2011 Cloudera Inc. All rights reserved Slide 29
  • 30. First comparison of production MapReduce • Facebook trace (FB) - 6 months in 2009, approx. 6000 jobs per day • Cloudera customer trace (CC-b) - 1 week in 2011, approx. 3000 jobs per day • Not in this talk – Traces from 4 other customers - E-commerce, media/telecomm, retail Copyright 2011 Cloudera Inc. All rights reserved Slide 30
  • 31. Three dimensions to compare Representative? Data sizes Intensity variations Job types Cluster independent Synthetic workloads Copyright 2011 Cloudera Inc. All rights reserved Slide 31
  • 32. Different job sizes CDF CDF CDF 1 1 1 0.8 0.8 0.8 CC-b 0.6 0.6 0.6 0.4 0.4 0.4 FB 0.2 0.2 0.2 0 0 0 1E+0E+3E+6GB TB 0 1 1 1E+9 KB MB 1E+12 1E+0E+3E+6GB TB 0 KB 1 1E+9 1 MB 0 KB MB 1 1E+12 1 1 GB TB 1E+12 1E+0E+3E+6E+9 Input size Shuffle size Output size Copyright 2011 Cloudera Inc. All rights reserved Slide 32
  • 33. Different workload time variation FB CC-b 1000 300 Number 200 of jobs 500 100 0 0 Input + 2.E+13 4.E+13 shuffle + output 1.E+13 2.E+13 data size 0.E+00 0.E+00 6.E+06 6.E+07 Map + 4.E+07 reduce 3.E+06 2.E+07 task times 0.E+00 0.E+00 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 Days since a Sunday Days since a Tuesday Copyright 2011 Cloudera Inc. All rights reserved Slide 33
  • 34. Different job types – identify by kmeans Map time Red. time Input Shuffle Output Duration (task sec) (task sec) Copyright 2011 Cloudera Inc. All rights reserved Slide 34
  • 35. Different job types – identify by kmeans % of Map time Red. time Input Shuffle Output Duration jobs (task sec) (task sec) 95.8 21 KB 0 871 KB 32 s 20 0 3.3 381 KB 0 1.9 GB 21 min 6097 0 0.2 10 KB 0 4.2 GB 1 hr 50 min 26,321 0 0.1 405 KB 0 447 GB 1 hr 10 min 66,657 0 0.02 446 KB 0 1.1 TB 5 hrs 5 min 125,662 0 0.05 230 KB 8.8 GB 491 MB 15 min 104,338 66,760 0.03 1.9 TB 502 MB 2.6 GB 30 min 348, 942 76,738 0.01 418 GB 2.5 TB 45 GB 1 hr 25 min 1,076,089 974,395 0.07 255 GB 788 GB 1.6 GB 35 min 384, 562 338,050 0.002 7.6 TB 51 GB 104 KB 55 min 4,843,452 853,911 Copyright 2011 Cloudera Inc. All rights reserved Slide 35
  • 36. Different job types – identify by kmeans % of Map time Red. time Input Shuffle Output Duration Description jobs (task sec) (task sec) 95.8 21 KB 0 871 KB 32 s 20 0 Small jobs 3.3 381 KB 0 1.9 GB 21 min 6097 0 Load data, fast 0.2 10 KB 0 4.2 GB 1 hr 50 min 26,321 0 Load data, slow 0.1 405 KB 0 447 GB 1 hr 10 min 66,657 0 Load data, large 0.02 446 KB 0 1.1 TB 5 hrs 5 min 125,662 0 Load data, huge 0.05 230 KB 8.8 GB 491 MB 15 min 104,338 66,760 Aggregate, fast Aggregate and 0.03 1.9 TB 502 MB 2.6 GB 30 min 348, 942 76,738 expand Expand and 0.01 418 GB 2.5 TB 45 GB 1 hr 25 min 1,076,089 974,395 aggregate 0.07 255 GB 788 GB 1.6 GB 35 min 384, 562 338,050 Data transform 0.002 7.6 TB 51 GB 104 KB 55 min 4,843,452 853,911 Data summary Copyright 2011 Cloudera Inc. All rights reserved Slide 36
  • 37. Different job types – summary FB CC-b % of jobs Description % of jobs Description 95.8 Small jobs 88.9 Small job 3.3 Load data, fast 7.7 Data transform “small” 0.2 Load data, slow 2.8 Data transform “med” 0.1 Load data, large Data transform “large”, 0.5 0.02 Load data, huge map heavy 0.05 Aggregate, fast 0.1 Aggregate 0.03 Aggregate and expand 0.01 Expand and aggregate 0.07 Data transform 0.002 Data summary Copyright 2011 Cloudera Inc. All rights reserved Slide 37
  • 38. Measure performance by replaying workload • Major advance over artificial benchmarks - “Performance is X for workload A, Y for workload B.” • Challenges: - Turn long trace (6 months) into short workload (1 day) - Do so while keeping the workload representative Copyright 2011 Cloudera Inc. All rights reserved Slide 38
  • 39. Replay concatenated trace samples • Problem: Need short and representative workloads • Solution: concatenate continuous window trace samples E.g. Day long workload = Concat(24 x continuous hourly samples) or = Concat(6 x continuous 4-hourly samples) • Strength: Reproduce all statistics within a sample window • Limits: Introduces (controllable) sampling error Copyright 2011 Cloudera Inc. All rights reserved Slide 39
  • 40. Verify that we get representative workloads CDFs for synthetic workloads CDF CDF CDF 1 1 1 0.8 0.8 0.8 0.6 0.6 0.6 0.4 0.4 0.4 0.2 0.2 0.2 0 CDF for 0 0 whole race 0 KB MB GB TB 0 KB MB GB TB 0 KB MB GB TB 1E+0 1E+3 1E+6 1E+91E+12 1E+0 1E+3 1E+6 1E+91E+12 1E+0 1E+3 1E+6 1E+91E+12 Output size – 4hr samples Output size – 1hr samples Output size – 15min samples Copyright 2011 Cloudera Inc. All rights reserved Slide 40
  • 41. Case study: Compare FIFO vs. fair scheduler Replay on System 1 Fixed workload, Performance Fixed metrics, comparison Fixed env. Replay on System 2 Workload = Facebook-like synthetic workload, 24 x 1hr samples Metrics = average job latency Environment = 200-machines EC2 cluster, m1.large instances System 1 = FIFO scheduler System 2 = Fair scheduler Copyright 2011 Cloudera Inc. All rights reserved Slide 41
  • 42. Choice of scheduler depends on workload! 100000 Fair scheduler Better Completion time (s) FIFO scheduler 1000 10 4800 4820 4840 4860 4880 4900 Job index (out of 6000+ total jobs in a day) 1000 Fair scheduler Completion time(s) FIFO scheduler Better 100 10 100 120 140 160 180 200 Job index (out of 6000+ total jobs in a day) Copyright 2011 Cloudera Inc. All rights reserved Slide 42
  • 43. Takeaways • Need to consider workload-level performance! • Data sizes, arrival patterns, job types • Understanding performance is a community effort! • Not everyone is like Google/Yahoo/Facebook • Different features and metrics relevant to each use case • Check out our evolving MapReduce workload repository • www.eecs.berkeley.edu/~ychen2/SWIM.html Thanks!!! Copyright 2011 Cloudera Inc. All rights reserved Slide 43
  • 44. Backup slides Copyright 2011 Cloudera Inc. All rights reserved Slide 44
  • 45. Indirect quicksort Metadata entries are swapped as sort proceeds Metadata buffer Metadata 1 Metadata 2 … Metadata M Metadata N Partition # Key Ptr Value Ptr Value Len Key 1 Value 1 … Key N Value N Data buffer Layer of indirection means that every compare(), even of adjacent metadata, causes CPU cache misses! Copyright 2011 Cloudera Inc. All rights reserved 45
  • 46. Indirect quicksort with “comparison proxies” Metadata buffer Metadata 1 Metadata 2 … Metadata M Metadata N Partition # Key prefix Key Ptr Value Ptr Value Len Copy first 4 bytes of key into metadata Key 1 Value 1 … Data buffer Many comparisons avoid the indirect memory lookup into the data buffer: 1.6x improvement in total map side CPU usage for terasort! Copyright 2011 Cloudera Inc. All rights reserved 46