Transcript

  • 1. Profiling and Modeling Resource Usage of Virtualized Applications. Timothy Wood¹, Lucy Cherkasova², Kivanc Ozonat², and Prashant Shenoy¹ (¹University of Massachusetts, Amherst; ²HP Labs, Palo Alto)
  • 2. Virtualized Data Centers
    • Benefits
      • Lower hardware and energy costs through server consolidation
      • Capacity on demand, agile and dynamic IT
    • Challenges
      • Apps are characterized by a collection of resource usage traces collected in the native environment
      • Virtualization overheads
      • Effects of consolidating multiple VMs to one host
    • Important for capacity planning and efficient server consolidation
  • 3. Application Virtualization Overhead
    • Many research papers measure virtualization overhead but do not predict it in a general way:
      • A particular hardware platform
      • A particular app/benchmark, e.g., netperf, SPEC or SPECweb, disk benchmarks
      • Max throughput/latency/performance is X% worse
      • Showing Y% increase in CPU resources
    • How do we translate these measurements into “What is the virtualization overhead for a given application?”
    New performance models are needed
  • 4. Predicting Resource Requirements
    • Most overhead caused by I/O
      • Network and Disk activity
    • Xen I/O model has two components: the guest VM and Domain-0 (Dom0), which handles I/O on the guest's behalf
    • Must predict CPU needs of:
      • 1. Virtual machine running the application
      • 2. Domain 0 performing I/O on behalf of the app
    Requires several prediction models based on multiple resources
  • 5. Problem Definition
    [Figure: given a native application trace of CPU, network, and disk usage per interval (T1, T2, ...), predict the corresponding VM CPU and Dom0 CPU usage of the virtualized application]
  • 6. Why Bother?
    • More accurate cost/benefit analysis
      • Capacity planning and VM placement
    • Impossible to pre-test some critical services
    • Hypervisor comparisons
      • Different platforms or versions
    [Figure: CPU utilization of App 1 and App 2, native vs. virtualized (VM 1, VM 2, and Dom0)]
  • 7. Our Approach
    • Automated robust model generation
    • Run benchmark set on native and virtual platforms
      • Performs a range of I/O and CPU intensive tasks
      • Gather resource traces
    • Build model of Native --> Virtual relationship
      • Use linear regression techniques (a fitting sketch follows this slide)
      • Model is specific to platform, but not applications
    • Automate all the steps in the process
    Can apply this general model to any application's traces to predict its requirements
    [Figure: native system usage profile → model → virtual system usage profile]
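As an illustration of this step, here is a minimal sketch (not the authors' implementation) of fitting such a native-to-virtual model with ordinary least squares; the file names, metric layout, and array shapes are assumptions made for the example.

```python
import numpy as np

# Native usage profile from the microbenchmark runs: one row per interval,
# one column per metric (assumed layout: CPU, network, and disk counters).
native_metrics = np.loadtxt("native_profile.csv", delimiter=",")     # (n, k)

# CPU utilization observed for Dom0 (or for the VM) when the same
# benchmarks run on the virtualized platform, aligned interval by interval.
dom0_cpu = np.loadtxt("dom0_cpu.csv", delimiter=",")                 # (n,)

# Add an intercept column and solve for the coefficients with least squares.
X = np.column_stack([np.ones(len(native_metrics)), native_metrics])
coeffs, *_ = np.linalg.lstsq(X, dom0_cpu, rcond=None)

# The same model can now be applied to any application's native trace
# to predict its Dom0 CPU needs after virtualization.
app_trace = np.loadtxt("app_native_trace.csv", delimiter=",")
predicted = np.column_stack([np.ones(len(app_trace)), app_trace]) @ coeffs
```

A second model of the same form, trained against the VM's observed CPU usage instead of Dom0's, gives the VM-side prediction.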
  • 8. Microbenchmark Suite
    • Focus on CPU-intensive and different types of I/O-intensive client-server apps
    • Benchmark activities:
      • Network-intensive : download and upload files
      • Disk-intensive : read and write files
      • CPU-intensive
    • Need to break correlations between resources
      • High correlation between packets/sec and CPU time
    • Simplicity of implementation
      • Based on httperf, Apache JMeter, the Apache web server, and PHP (an illustrative invocation follows this slide)
    Microbenchmarks are easy to run in a traditional data center environment
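For the network-intensive benchmarks, a request generator such as httperf can drive a PHP page at varying rates. The snippet below is illustrative only; the hostname, URI, rates, and connection counts are assumptions, not the authors' actual configuration.

```python
import subprocess

SERVER = "testbed-host"           # assumed hostname of the web server under test
URI = "/serve_file.php?kb=64"     # hypothetical PHP script returning a 64 KB file

# Sweep request rates so that CPU and network usage are not perfectly correlated.
for rate in (50, 100, 200, 400):
    subprocess.run([
        "httperf",
        "--server", SERVER,
        "--port", "80",
        "--uri", URI,
        "--rate", str(rate),       # requests per second
        "--num-conns", "2000",     # total connections for this run
    ], check=True)
```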
  • 9. Model Generation
    [Figure: the native and virtual benchmark profiles yield two models, one for the VM and one for Dom0; each corresponds to a set of equations to solve, one per benchmark interval; a reconstructed form follows this slide]
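A hedged reconstruction of the model form (the notation below is assumed, as the slide's equations do not survive in this transcript): each model expresses the virtualized CPU usage in benchmark interval j as a weighted combination of the 11 native metrics M_{i,j},

$$ U^{\mathrm{Dom0}}_{j} = c_0 + \sum_{i=1}^{11} c_i \, M_{i,j} \qquad\qquad U^{\mathrm{VM}}_{j} = d_0 + \sum_{i=1}^{11} d_i \, M_{i,j} $$

with one such equation per interval; solving the two resulting systems by regression yields the Dom0 coefficients {c_i} and the VM coefficients {d_i}.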
  • 10. Building Robust Models
    • Outliers can considerably impact regression models
      • Must use robust regression techniques to eliminate outliers
      • The robust fit creates a model that minimizes absolute error
    • Not all metrics are equally significant
      • Start with 11 metrics: 3 CPU, 4 network, and 4 disk
      • Use stepwise regression to find the most significant metrics (a sketch follows this slide)
    • Evaluate outcome of microbenchmark runs and eliminate erroneous and corrupted data
    A correct data set is a prerequisite for building an accurate model
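A minimal sketch of how robust fitting and stepwise metric selection could be combined. This is not the authors' exact procedure (they use stepwise regression with statistical significance tests); the library calls and the greedy selection criterion here are assumptions for illustration.

```python
import numpy as np
import statsmodels.api as sm

# native_metrics: (n_intervals x 11) matrix of native usage metrics
# virtual_cpu:    observed VM (or Dom0) CPU usage for the same intervals
native_metrics = np.loadtxt("native_metrics.csv", delimiter=",")
virtual_cpu = np.loadtxt("virtual_cpu.csv", delimiter=",")

def robust_fit(columns):
    """Fit a robust (Huber) regression on the chosen metric columns."""
    X = sm.add_constant(native_metrics[:, columns])
    return sm.RLM(virtual_cpu, X, M=sm.robust.norms.HuberT()).fit()

# Forward stepwise selection: greedily add the metric that most reduces the
# mean absolute prediction error, stopping when no remaining metric helps.
selected, remaining = [], list(range(native_metrics.shape[1]))
best_err = np.inf
while remaining:
    errs = {m: np.mean(np.abs(robust_fit(selected + [m]).resid)) for m in remaining}
    m_best = min(errs, key=errs.get)
    if errs[m_best] >= best_err:
        break                      # no remaining metric improves the model
    best_err = errs[m_best]
    selected.append(m_best)
    remaining.remove(m_best)

final_model = robust_fit(selected)
print("significant metrics:", selected)
```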
  • 11. Performance Evaluation: Testbed Details
    • Two hardware platforms
      • HP ProLiant DL385: 2-way AMD Opteron, 2.6 GHz, 64-bit
      • HP ProLiant DL580: 4-way Intel Xeon, 1.6 GHz, 32-bit
    • Two applications:
      • RUBiS (auction site, modeled after eBay)
      • TPC-W (e-commerce site, modeled after Amazon.com)
    • Monitoring
      • Native: sysstat
      • Virtual: xenmon and xentop (an illustrative collection sketch follows this slide)
      • Measurements taken at 30-second intervals
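A sketch of how the measurements might be collected with these tools; the exact flags, run lengths, and output handling are assumptions, not the authors' scripts. Each function would run on its respective platform while a benchmark or application executes.

```python
import subprocess

INTERVAL = "30"      # seconds, matching the 30-second measurement windows
COUNT = "120"        # number of samples per run (assumed)

def profile_native(outfile="native_profile.txt"):
    # sysstat's sar samples CPU (-u), network (-n DEV), and disk (-d)
    # activity every INTERVAL seconds on the native platform.
    with open(outfile, "w") as out:
        subprocess.run(["sar", "-u", "-n", "DEV", "-d", INTERVAL, COUNT],
                       stdout=out, check=True)

def profile_virtual(outfile="virtual_profile.txt"):
    # xentop in batch mode reports per-domain CPU usage, including Dom0,
    # at the same interval on the virtualized platform.
    with open(outfile, "w") as out:
        subprocess.run(["xentop", "-b", "-d", INTERVAL, "-i", COUNT],
                       stdout=out, check=True)
```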
  • 12. Questions
    • Why this set of metrics?
    • Why these benchmarks?
    • Why this process of model creation?
    • Model accuracy
  • 13. Importance of Modeling I/O
    • Is it necessary to look at resources other than just total CPU?
    • How accurate is such a simplified, CPU-only model at predicting the CPU requirement of the VM?
    Definitely need multiple resources! [Figure: prediction error comparison, 5% vs. 65%]
  • 14. Benchmark Coverage
    Why these benchmarks? Using only a subset of the benchmarks leads to a model with poor accuracy
  • 15. Automated Benchmark Error Detection
    • Some benchmarks run incorrectly
      • Rates too high
      • Background activity
    • Remove benchmarks with abnormally high error rates
    Automatically remove bad benchmarks without eliminating useful data (a sketch follows this slide)
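One way to automate this, sketched below under assumed thresholds (the authors' actual error criterion is not given on the slide): fit the model, drop benchmark intervals whose relative prediction error is abnormally high, and refit on the remaining data.

```python
import numpy as np

def prune_and_refit(X, y, error_threshold=0.05, max_rounds=5):
    """Iteratively drop benchmark intervals with abnormally high prediction
    error and refit; the threshold and round limit are illustrative."""
    keep = np.ones(len(y), dtype=bool)
    for _ in range(max_rounds):
        Xc = np.column_stack([np.ones(keep.sum()), X[keep]])
        coeffs, *_ = np.linalg.lstsq(Xc, y[keep], rcond=None)
        pred = np.column_stack([np.ones(len(y)), X]) @ coeffs
        err = np.abs(pred - y) / np.maximum(y, 1e-6)   # relative error
        bad = keep & (err > error_threshold)
        if not bad.any():
            break                 # no remaining high-error intervals
        keep &= ~bad              # drop the suspicious intervals and refit
    return coeffs, keep
```

Runs removed this way correspond to benchmarks that could not sustain their target rates or were perturbed by background activity.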
  • 16. Model Accuracy
    • Intel hardware platform
    • Train the model using simple benchmarks
    • Apply to RUBiS web application
    90% of Dom0 predictions are within 4% error; 90% of VM predictions are within 11% error
  • 17. Second Hardware Platform
      • AMD, 64-bit, dual CPU, 2.6 GHz
    Produces different model parameters; predictions are just as accurate
  • 18. Different Platforms' Virtualization Overhead
    Different platforms exhibit different amounts of CPU overhead; predicting virtualization overhead on a different hardware platform requires building a model for that platform. [Figure: measured overheads of roughly 1.7x and 1.4x native CPU on the two platforms]
  • 19. Summary
    • Proposed approach builds a model for each hardware and virtualization platform.
    • It enables comparison of application resource requirements on different hardware platforms.
    • An interesting additional application: it helps to assess and compare the performance overhead of different virtualization software releases.
  • 20. Future Work
    • Refine the set of microbenchmarks and related measurements (what is a practical minimal set?)
    • Repeat the experiments for the VMware platform
    • Linear models – are they enough?
      • Create multiple models for resources with different overheads at different rates
    • Evaluation of virtual device capacity
    • Define composition rules for estimating resource requirements of collocated virtualized applications
  • 21. Questions?
