Operations Playbook: Monitoring and Automation - RightScale Compute 2013
 

Speaker: Chris Deutsch - Systems Administrator, RightScale

As a systems administrator, what is the best way to ensure that you don’t get paged in your sleep or on your days off? The RightScale operations team manages hundreds of cloud servers, as well as a host of other cloud services, to deliver always-on production applications. The RightScale Ops Team will share tips as power users of RightScale, including running batch updates, automating scaling, adding custom monitoring graphs, and troubleshooting configuration and performance issues.

Presentation Transcript

  • Automation + Monitoring
    Chris Deutsch, RightScale Operations
    Cloud Management Platform
  • What I'll be talking about
    • Meet RightScale Operations
    • Monitoring
      o How monitoring works on RightScale
      o How to build a custom monitor
      o How we monitor web servers and Cassandra
    • Automation
      o The RightScale API
      o The chimp command line tool
      o How we automate releases
    • Tips from Ops
  • RightScale Operations
    • Deployed over 5 continents
    • Over 700 cloud servers administered
    • RightScale runs on RightScale
  • collectd: what is it?
    • open source metric collection tool
    • modular architecture
    • uses the ubiquitous rrdtool
    • more information: http://collectd.org/
  • collectd: built-in plugins
    • host monitoring
      o cpu
      o disk space
      o disk I/O
      o memory
      o network
    • application monitoring
      o process state
      o memory use
      o cpu usage
  • How does monitoring work?
  • collectd: custom plugins
    • Custom plugins are written using the Exec plugin
    • Can be written in any language
    • Ruby, Python and Perl are common
    • Simple
  • collectd: custom plugins
    What we're going to look at:
    • building an example monitor using the Exec plugin
    • HTTP error code monitor
    • Cassandra database server monitor
  • collectd: custom plugins: example
    Reference: https://collectd.org/wiki/index.php/Plugin:Exec
    /etc/init.d/collectd/example.conf:
        <Plugin exec>
          Exec "nobody" "example.rb"
        </Plugin>
    example.rb:
        #!/usr/bin/ruby
        # Emit a constant value of 1 every 20 seconds in the Exec plugin's PUTVAL format.
        $stdout.sync = true  # flush each line immediately so collectd sees it
        while true do
          time = Time.now.to_i
          puts "PUTVAL \"host/cpu-0/cpu_overview\" interval=20 #{time}:1"
          sleep 20
        end
  • collectd: custom plugins: http codes
    https://gist.github.com/christopherdeutsch/db2380a47b62730ddf69
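The gist linked above holds the speaker's own monitor. As an independent illustration (not the gist's code), here is a minimal sketch of how an HTTP status monitor could feed the Exec plugin. It assumes access-log lines where the status code is the first field after the quoted request, as in the common log format. The names `count_status_classes` and `putval_lines`, and the `http_codes` / `gauge-<class>` identifiers, are hypothetical choices made for this example.

```ruby
# Count HTTP responses by status class (2xx, 4xx, ...) from access-log lines.
# Assumes the status code is the first field after the quoted request,
# as in the common log format; adjust the pattern for other layouts.
STATUS_PATTERN = /"\s(\d)\d\d\s/

def count_status_classes(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = STATUS_PATTERN.match(line)
    counts["#{match[1]}xx"] += 1 if match
  end
  counts
end

# Format one PUTVAL line per status class for collectd's Exec plugin.
# "http_codes" and "gauge-<class>" are illustrative identifiers.
def putval_lines(host, counts, interval, time)
  counts.sort.map do |klass, n|
    "PUTVAL \"#{host}/http_codes/gauge-#{klass}\" interval=#{interval} #{time}:#{n}"
  end
end

sample = [
  '127.0.0.1 - - [10/Oct/2013:13:55:36 -0700] "GET / HTTP/1.1" 200 2326',
  '127.0.0.1 - - [10/Oct/2013:13:55:37 -0700] "GET /missing HTTP/1.1" 404 209',
  '127.0.0.1 - - [10/Oct/2013:13:55:38 -0700] "POST /api HTTP/1.1" 500 0'
]
counts = count_status_classes(sample)
puts putval_lines("host", counts, 20, Time.now.to_i)
```

In a real plugin the counting would run inside the same `while true ... sleep` loop as the example monitor, reading only the log lines written since the previous interval.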
  • collectd: custom plugins: cassandra
    • Cassandra is a key-value data store (aka NoSQL) server
    • data is stored on a ring
    • a ring consists of nodes
  • automation: the rightscale api
    • RightScale API is RESTful and easy to traverse
    • right_api_client - ruby client library
    • CloudFlows - the future
  • automation: the command line
    • needed a tool that would let us be lazy
    • the "chimp" executes commands on servers
    • let's jump into a demo
  • automation: chimp
    • select what to update using tags
    • update across multiple deployments
    • update one server at a time so service isn't disrupted
    • track success/failure
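The behaviour listed on this slide (select by tag, update one server at a time, track success and failure) can be sketched in a few lines of Ruby. This is not chimp's implementation: the `Server` struct, `select_by_tag`, `rolling_update`, and the `role=web` style tags are all hypothetical, and the update itself is a block supplied by the caller rather than a real remote command.

```ruby
# Hypothetical server record; chimp itself works against the RightScale API.
Server = Struct.new(:name, :tags)

# Select the servers that carry a given tag.
def select_by_tag(servers, tag)
  servers.select { |s| s.tags.include?(tag) }
end

# Run the block against one server at a time and record each outcome.
# Stops at the first failure when halt_on_failure is true, so a bad
# change does not roll across the whole tier.
def rolling_update(servers, halt_on_failure: true)
  results = { succeeded: [], failed: [] }
  servers.each do |server|
    ok = begin
      yield(server)
    rescue StandardError
      false
    end
    if ok
      results[:succeeded] << server.name
    else
      results[:failed] << server.name
      break if halt_on_failure
    end
  end
  results
end

fleet = [
  Server.new("web1", ["role=web"]),
  Server.new("web2", ["role=web"]),
  Server.new("db1",  ["role=db"])
]
web = select_by_tag(fleet, "role=web")
report = rolling_update(web) { |s| puts "updating #{s.name}"; true }
puts report.inspect
```

Halting on the first failure is a design choice for this sketch: it leaves the remaining servers on the old version and still serving traffic.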
  • automation: scripting languages
    • having a command line tool lets us use scripting languages like bash or ruby to automate common tasks
    • we ended up using Ruby rake files to tie it all together
  • automation: a RightScale release
    • chimp used to run commands on servers
    • supports "rolling" operations
    • uses tag service to scope operations
    • we use rake to organise tasks that make up a release
    • developed chimpd so we could run more commands in parallel
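The last bullet mentions running more commands in parallel. The slides do not describe how chimpd does this, so the following is only a generic sketch of bounded parallelism using Ruby threads and a queue. The function name `run_in_parallel` and the server names are hypothetical, and the work done per item is a block supplied by the caller.

```ruby
require "thread"

# Run the block over every item using a fixed number of worker threads.
# Results are returned in the same order as the input items.
def run_in_parallel(items, concurrency)
  queue = Queue.new
  items.each_with_index { |item, i| queue << [item, i] }
  results = Array.new(items.size)

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          item, index = queue.pop(true) # non-blocking; raises ThreadError when empty
        rescue ThreadError
          break
        end
        results[index] = yield(item)
      end
    end
  end

  workers.each(&:join)
  results
end

servers = %w[web1 web2 web3 web4]
outcomes = run_in_parallel(servers, 2) { |name| "#{name}: ok" }
puts outcomes
```

Capping the worker count keeps the number of simultaneous operations predictable, which matters when each one is a command running against a production server.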
  • automation: chimp release
    • RightScale has released chimp as open source!
    • gem install right_chimp
  • tips
    • assume instances will die eventually
    • always reboot test ServerTemplates
    • use tags. everywhere. all the time.
    • use chimp to make ad-hoc queries
    • monitor not just host metrics but system metrics
    • design everything to be runnable in a server array
  • Thanks!
    Chris Deutsch, RightScale Operations
    christopher@rightscale.com
    @ispeakdeutsch