Operating Your OpenStack Private Cloud (Rackspace)

  • Goal: The goal of this talk is to give the operations community an idea of how we are running our private clouds. The private cloud team(s) at Rackspace have approached our design with the following philosophies:
      • Stable code only: we only deploy code that we feel is stable enough for prime time.
      • Easy to deploy with repeatable processes (supportable): Chef, standardized network design.
      • All open source (Linux, KVM).
  • In April, we were performing simple CDM agent-based monitoring.
    Now we run an integrated solution using open source tools: collectd, statsd, monit, graphite.
    I won’t go into much detail here, but check our Chef recipes (later) to see how it’s designed.
  • It’s getting better but still rough.
  • Can do things like:
      • list all VMs on a host
      • get a usage report for an availability zone
    We want this to be THE ops tool.
    I want this tool to gain instance audit and remediation features.
  • Should be safe to run on the master (InnoDB uses row-level locking), but it is always best to back up off of a slave.
    Not yet a Chef cookbook.
  • qcow2 may make better “business” sense: faster to spin up, cheaper to store.
    Moral: benchmark for your workloads and plan accordingly.
    Mitigate with Cinder.
    A major benefit of private clouds is being able to customize these pieces.
  • Chunking sets the point at which Glance breaks apart a large file, and the size of the resulting chunks.
  • Describe environment.
  • Is there a formula for ideal chunk size based on overall Swift cluster size?
    Offer discussion about ideal chunk size at end.
  • Downside to not removing images: disk space on compute nodes.
    Best of both worlds: a “smart” cache system that pre-caches often-used images.
  • Offer discussion at end about scaling the scheduler.
  • Repeatable process: a requirement for supporting multiple environments.
    Great for creating test environments.
  • Ops tools: perhaps an additional section in the dashboard.
    Multiple layer 3 networks on one layer 2 VLAN.
    Network devices (firewalls, load balancers, etc.) may be a limiting factor.
  • Discuss Glance caching, pros/cons of raw vs qcow2, and scaling the scheduler.
  • Operating Your OpenStack Private Cloud (Rackspace)
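
The chunking note above describes Glance breaking a large image into pieces of a set size once the image passes a set point. A minimal sketch of that arithmetic, under a deliberately simple model: `plan_chunks`, `threshold_mb`, and `chunk_mb` are hypothetical names for illustration (not Glance option names), and the 16794 MB figure is just 16.4GB rounded to whole megabytes.

```python
def plan_chunks(image_mb, threshold_mb, chunk_mb):
    """Return the chunk sizes (in MB) for one image under a simple model.

    Assumption: an image at or below threshold_mb is stored whole; a larger
    image is cut into chunk_mb pieces plus one smaller remainder piece.
    """
    if image_mb <= threshold_mb:
        return [image_mb]
    full, rest = divmod(image_mb, chunk_mb)
    sizes = [chunk_mb] * full
    if rest:
        sizes.append(rest)
    return sizes


if __name__ == "__main__":
    image_mb = 16794  # roughly a 16.4GB image
    # Chunk sizes discussed in the slides: 200MB, 1GB, 5GB.
    for chunk_mb in (200, 1024, 5120):
        sizes = plan_chunks(image_mb, threshold_mb=chunk_mb, chunk_mb=chunk_mb)
        print(f"chunk={chunk_mb}MB -> {len(sizes)} objects, last={sizes[-1]}MB")
```

Fewer, larger chunks mean fewer objects to track but coarser placement across the Swift cluster, which is the trade-off the chunk size question in the notes is about.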

    1. Operating your OpenStack Private Cloud
       Ryan Richard, OpenStack Engineer
       ryan.richard@rackspace.com, @rackninja
       October 12, 2012
    2. Monitoring and Reporting
       Where we were (April 2012): Basic CDM
       RACKSPACE® HOSTING | WWW.RACKSPACE.COM
    3. Monitoring and Reporting
       Where we were (April 2012): Basic CDM
       Now
    5. Tools
       There is no good way to get the following info:
       • I need a list of instances on a host and their IPs
       • I need to gracefully start/stop all instances on a host
       • Some tools need hostname, some need id (decimal or hex), some need uuid
    6. Tools
       There is no good way to get the following info:
       • I need a list of instances on a host and their IPs
       • I need to gracefully start/stop all instances on a host
       • Some tools need hostname, some need id (decimal or hex), some need uuid

       SELECT instances.id, instances.hostname, instances.project_id,
              fixed_ips.address AS fixed_address,
              floating_ips.address AS floating_address
       FROM instances
       LEFT JOIN fixed_ips ON instances.id = fixed_ips.instance_id
       LEFT JOIN floating_ips ON floating_ips.fixed_ip_id = fixed_ips.id
       WHERE instances.deleted = 0
         AND instances.host = "<hostname of physical machine>"
       ORDER BY instances.id;
    7. Tools
       WE NEED BETTER OPS TOOLS!
    8. Tools
       WE NEED BETTER OPS TOOLS!
       Pulsar: https://github.com/rsoprivatecloud/pulsar
       • “nova swiss army knife”
       • requires direct nova database access
    10. Tools
        Holland (open source database backup framework)
        • Written by Rackspace DBAs
        • http://wiki.hollandbackup.org/
    11. Tools
        dsh
          dsh -Mcg compute uname -a
    13. Tools
        dsh
          dsh -Mcg compute uname -a
        bashfoo
          # add the single-compute role to every node whose name contains "cpu"
          for i in $(knife node list | grep cpu); do knife node run_list add "$i" "role[single-compute]"; done
          # disable nova-compute and nova-network on computevm01 through computevm020
          for k in $(seq 1 20); do for i in compute network; do nova-manage service disable "computevm0$k" "nova-$i"; done; done
    14. Performance and Scale Considerations
        Disk IO
        • For high performance use remote block storage
        • For “local” disk IO, raw image type is only slightly faster than qcow2
        • IO will degrade while Glance copies images between machines
        • scheduler=cfq, KVM cache=none
    15. Performance and Scale Considerations
        Disk IO
        • For high performance use remote block storage
        • For “local” disk IO, raw image type is only slightly faster than qcow2
        • IO will degrade while Glance copies images between machines
        • scheduler=cfq, KVM cache=none
        [Chart: Async Random IO. randR, randW, randR (direct), and randW (direct) for the compute host and for test instances under different IO schedulers (cfq, deadline, noop), KVM cache modes (none, writeback), and with or without hyperthreading.]
        [Chart: Host vs. Instance. randR, randW, seqR, and seqW, each also with direct IO, for the compute host and for a test instance with cfq and cache=none.]
    16. Performance and Scale Considerations
        Glance chunk size
        • 200MB chunk size
        • 1GB chunk size
        • 5GB chunk size
    20. Performance and Scale Considerations
        Swift disk usage with different chunk sizes
        • 5 zones, 4 x 1TB disks per zone
        • 20TB raw, 6.67TB usable
    21. Performance and Scale Considerations
        Swift disk usage with different chunk sizes
    22. Performance and Scale Considerations
        Glance chunk size
        • Too high and Swift can become unbalanced
        • What are the downsides to being too low?
    23. Performance and Scale Considerations
        Glance Disk Tuning (Swift)
        • read ahead on your block device(s): no noticeable gain
        • deadline scheduler: no noticeable gain
        Best thing for Glance performance: caching
    24. Performance and Scale Considerations
        Glance Disk Tuning (Swift)
        • read ahead on your block device(s): no noticeable gain
        • deadline scheduler: no noticeable gain
        Best thing for Glance performance: caching

        Image Size   Not Cached      Cached
        1.4GB        20 secs         1 sec
        16.4GB       2 min 21 secs   1 sec
    25. Performance and Scale Considerations
        Glance Disk Tuning (Swift)
        • read ahead on your block device(s): no noticeable gain
        • deadline scheduler: no noticeable gain
        Best thing for Glance performance: caching

        Image Size   Not Cached      Cached
        1.4GB        20 secs         1 sec
        16.4GB       2 min 21 secs   1 sec

        * times from “creating image” to “qemu-img create”
    26. Performance and Scale Considerations
        Scheduler
        • What we use by default:
        • Scheduler tasks are not processed in parallel
        • Adding additional schedulers helps provide HA, but they don’t speed up overall times to complete requests
    27. Automated Config Management
        Chef: http://github.com/rcbops/chef-cookbooks
        Time to stand up:
        • controller: less than 20 minutes
        • compute node: less than 2 minutes
    28. Day to Day Tasks
        Dealing with new issues
        • resize: all nova-compute processes need to be able to log into all other compute nodes via ssh keys
        Hardware failures
        • We’re still managing infrastructure; failures happen
    29. Lessons Learned
        • We need better operations tools!
        • Network design can be confusing for people used to “the old way”
        • OpenStack is still relatively new; help your organization understand it.
        • It’s easy to forget we’re working with Linux machines
        • It’s not you, it’s a bug :)
    30. But....
        But this is a design summit also.
        Open to discussions/thoughts/questions
    31. Rackspace is hiring
        www.rackertalent.com
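
The Swift environment on slide 20 (5 zones, 4 x 1TB disks per zone, 20TB raw, 6.67TB usable) can be checked with a few lines of arithmetic. This is a sketch only: the slide does not state the replica count, so the three-replica figure below is an assumption (it is the usual Swift default and it reproduces the 6.67TB number), and `swift_capacity_tb` is a hypothetical helper name.

```python
def swift_capacity_tb(zones, disks_per_zone, disk_tb, replicas=3):
    """Return (raw_tb, usable_tb) for a Swift cluster.

    Assumption: every object is stored `replicas` times, so usable space is
    raw space divided by the replica count. Filesystem and ring overhead are
    ignored.
    """
    raw_tb = zones * disks_per_zone * disk_tb
    return raw_tb, raw_tb / replicas


if __name__ == "__main__":
    raw, usable = swift_capacity_tb(zones=5, disks_per_zone=4, disk_tb=1)
    print(f"raw={raw}TB usable={usable:.2f}TB")
```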

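
The "instances on a host and their IPs" query from slide 6 can be tried without a nova database. The sketch below runs the same joins against an in-memory SQLite database with a toy schema. Everything about the schema and sample rows is an assumption made for illustration (only the columns the query touches are modelled, and `deleted` is assumed to be 0 for live instances); a real nova database has many more columns and runs on MySQL.

```python
import sqlite3

# Same shape as the slide's query, with the host passed as a parameter.
QUERY = """
SELECT instances.id, instances.hostname, instances.project_id,
       fixed_ips.address AS fixed_address,
       floating_ips.address AS floating_address
FROM instances
LEFT JOIN fixed_ips ON instances.id = fixed_ips.instance_id
LEFT JOIN floating_ips ON floating_ips.fixed_ip_id = fixed_ips.id
WHERE instances.deleted = 0 AND instances.host = ?
ORDER BY instances.id
"""


def build_toy_db():
    """Create a hypothetical, heavily simplified stand-in for the nova tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE instances (id INTEGER PRIMARY KEY, hostname TEXT,
                                project_id TEXT, host TEXT, deleted INTEGER);
        CREATE TABLE fixed_ips (id INTEGER PRIMARY KEY, address TEXT,
                                instance_id INTEGER);
        CREATE TABLE floating_ips (id INTEGER PRIMARY KEY, address TEXT,
                                   fixed_ip_id INTEGER);
        INSERT INTO instances VALUES (1, 'web01', 'proj-a', 'compute01', 0);
        INSERT INTO instances VALUES (2, 'db01',  'proj-a', 'compute01', 0);
        INSERT INTO instances VALUES (3, 'old01', 'proj-b', 'compute01', 1);
        INSERT INTO instances VALUES (4, 'app01', 'proj-b', 'compute02', 0);
        INSERT INTO fixed_ips VALUES (10, '10.0.0.2', 1);
        INSERT INTO fixed_ips VALUES (11, '10.0.0.3', 2);
        INSERT INTO fixed_ips VALUES (12, '10.0.0.4', 4);
        INSERT INTO floating_ips VALUES (100, '192.0.2.10', 10);
    """)
    return conn


def instances_on_host(conn, host):
    """Return live instances on `host` with their fixed and floating IPs."""
    return conn.execute(QUERY, (host,)).fetchall()


if __name__ == "__main__":
    for row in instances_on_host(build_toy_db(), "compute01"):
        print(row)
```

The LEFT JOINs matter here: an instance without a floating IP still appears, with an empty floating address, rather than dropping out of the list.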