Nagios Conference 2013 - Eric Loyd - Dynamic AWS Server Usage Using Nagios Core
Eric Loyd's presentation on Dynamic AWS Server Usage Using Nagios Core.
The presentation was given during the Nagios World Conference North America held Sept 20-Oct 2nd, 2013 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna

Published in: Technology, Business

Transcript

  • 1. Dynamic AWS Server Usage Using Nagios Core, or How to Pay Only for What You Need
      Eric Loyd, eric@bitnetix.com, 877.33.VOICE, @Bitnetix, @SmartVox
  • 2. About Bitnetix
  • 3. About Eric Loyd and Bitnetix
      • Founder and CEO of Bitnetix Incorporated
          • VoIP services and IT/network consulting
      • Over 25 years in IT and management at places like:
          • Eastman Kodak
          • Frontier Communications / Global Crossing
          • Rochester Institute of Technology
      • Bitnetix started its eighth year in July 2013
      • Digital Rochester GREAT Award finalist in:
          • 2012 for Communications Technology
          • 2013 for Rising Star
      • Using Nagios since 2004
      © 2013 Bitnetix Incorporated
  • 4. History of SmartVox: Bitnetix’s VoIP Platform
  • 5. History of SmartVox, our VoIP Platform
      • Pre-2012: not yet called SmartVox
          • Bitnetix primarily focused on IT consulting
          • VoIP service was ~10% of business, with servers located primarily at client sites
          • Custom Asterisk-based servers running FreePBX
          • We ran the customer's network, so we had control over VoIP
      • 2012: focus switched to VoIP
          • Focused now on hosted VoIP solutions
          • Made use of Amazon Web Services EC2 VPS
          • One per customer, with no proxies* or media servers
          • Network/bandwidth was the only customer responsibility
  • 6. History of SmartVox, our VoIP Platform
      • 2013: SmartVox name born
          • Copyright, trademark, domain name, business cards, etc.
          • Third generation born with multiple proxies, registrars, configuration servers, and media servers
          • June: started Mission Matrix program and sales
      • AWS architecture leveraged for geography
          • Each customer gets its own EC2 server
          • Proxies to closest zone, secondary "to the west"
          • Media servers located in zones based on number of simultaneous calls, conferences, etc.
          • Voicemails (VMs) and call detail records (CDRs) stored in database
  • 7. Brief Overview of AWS
  • 8. AWS EC2 Concepts
      • AWS: Amazon Web Services
          • Collection of cloud-based services: storage (S3), DNS (Route 53), CDN, servers (EC2)
      • EC2: Elastic Compute Cloud
          • Virtual servers in AWS datacenters (zones)
          • US (3 = VA, CA, OR), EU (1), Asia (3), SA (1)
          • Persistent storage and flexible IP address assignment
          • Pay by the hour that it's up, plus storage and bandwidth
      • Spot instances: "temporary" EC2 servers
          • Bring online as needed; terminated when shut down
  • 9. AWS EC2 Costs
      • LOTS of variables, but reasonable potential costs:
          • Reserved servers cost about $2.00 per day
              • Reserved instance pricing is contractual and static, based on size
          • Spot servers cost between $0.50 and $2.50 per day
              • Spot instance pricing is dynamic; we assume ~$0.10 per hour
      • We quantize concurrent calls into 50-call blocks
          • One media server = 50 calls = 1 spot instance
          • Two media servers = 100 calls = 2 spot instances
          • Bandwidth and storage will add ~10%
      • Reducing AWS usage reduces cost
          • We keep these savings for ourselves. Shhhh!!!
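The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under the slide's stated assumptions (50 calls per media server, roughly $0.10 per spot-instance hour, about 10% extra for bandwidth and storage); the function and constant names are made up for the example and are not from the presentation.

```python
import math

CALLS_PER_SERVER = 50        # from the slide: one media server per 50-call block
SPOT_PRICE_PER_HOUR = 0.10   # the slide's assumed spot price, in dollars
OVERHEAD = 0.10              # bandwidth and storage add roughly 10%


def servers_needed(concurrent_calls: int) -> int:
    """Round concurrent calls up to whole 50-call blocks."""
    if concurrent_calls <= 0:
        return 0
    return math.ceil(concurrent_calls / CALLS_PER_SERVER)


def hourly_cost(concurrent_calls: int) -> float:
    """Estimated dollars per hour for media servers, including overhead."""
    servers = servers_needed(concurrent_calls)
    return round(servers * SPOT_PRICE_PER_HOUR * (1 + OVERHEAD), 4)


if __name__ == "__main__":
    for calls in (40, 50, 51, 100, 500):
        print(calls, servers_needed(calls), hourly_cost(calls))
```

Because calls are rounded up to whole blocks, the 51st call costs as much as the 100th, which is why trimming idle blocks quickly matters.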
  • 10. Why Nagios?
  • 11. Why Nagios?
      • Extensive experience using it for clients
          • Bitnetix is a Nagios reseller
      • Needed centralized monitoring software
          • Integrate with Twitter for notifications
          • Integrate with Eventum via email for trouble tickets
      • Zero cost
      • Framework
          • Leverage SSH, HTTP, check_mk and livestatus!!
          • Custom checks and notifications (very important)
          • Ability to "cookie cutter" installs for AWS
  • 12. Initial Hurdles
      • Customer Premise Equipment (CPE)
          • No real control over CPE choices
          • Routers block some traffic, "help" other traffic incorrectly
          • Need to be able to remotely [re-]configure phones
      • Figure out how to "cookie-cutter" EC2 servers
          • Customer boxes and SIP endpoints
          • Proxies and media servers
      • Wanted to monitor upstream providers as well
      • How to separate apparent from actual failure
          • Something's broken, but overall service is functional
  • 13. SmartVox Provisioning Process and Automation
  • 14. SmartVox Network
      • DNS SRV records are key to redundant servers
      • [Diagram: calls flow from the provider through border proxies to customer proxies]
          • Provider: sends incoming calls to one or more border proxies
          • Border proxy: figures out which customer should receive the calls
          • Customer proxy: sends the call on to the correct phone/media server (VM, etc.)
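As a rough illustration of how SRV records give clients an ordered list of servers to try, the sketch below sorts records by priority. The record values and hostnames are hypothetical, and the tie-breaking rule is a deliberate simplification: RFC 2782 specifies weighted random selection among records of equal priority, whereas this example sorts by descending weight so the output is deterministic.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int   # lower value is tried first
    weight: int     # relative share among records of equal priority
    port: int
    target: str


def failover_order(records: List[SrvRecord]) -> List[str]:
    """Return "host:port" strings in the order a simple client might try them.

    Simplification: equal-priority records are ordered by descending weight
    rather than by the weighted random selection described in RFC 2782.
    """
    ordered = sorted(records, key=lambda r: (r.priority, -r.weight, r.target))
    return [f"{r.target}:{r.port}" for r in ordered]


if __name__ == "__main__":
    # Hypothetical records for _sip._udp.example.com
    records = [
        SrvRecord(20, 10, 5060, "proxy-west.example.com"),
        SrvRecord(10, 10, 5060, "proxy-east.example.com"),
    ]
    print(failover_order(records))
```

If the first target stops answering, the client moves to the next one in the list, which is the redundancy the slide refers to.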
  • 15. Provisioning Process
      • SmartVox AWS EC2 provisioning database
          • Customer information
          • Account (location/division/etc.) information
          • Number of phones*, VM boxes, etc.
      • Computes how many proxies the customer needs
          • DNS SRV records created for batch updates
          • Media server/VM entries created automatically
          • Phone provisioning info created automatically
      • Automatically places order for phones* (+ some)
          • Phones drop-shipped to customer in about 3 days
  • 16. AWS EC2 Automation: Spot Instance API
      • Create spot instance -> gives request ID
          • Instance created with SmartVox-created base image
      • Wait a bit -> query request ID -> get instance ID
      • Query instance -> get IP address
          • Update DNS with server information and IP
          • Update Nagios with server information and IP
      • When spot instances shut down, they terminate
          • No more expense for "burstable resources"
      • This sounds like a Nagios event handler…
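The request, wait, and query sequence on this slide is essentially a polling loop. Here is a minimal sketch of that shape in Python. It does not call AWS itself: the three callables (`request_spot`, `lookup_instance_id`, `lookup_public_ip`) are hypothetical hooks that a real script would implement with whatever EC2 tooling it uses, and the retry counts are arbitrary.

```python
import time
from typing import Callable, Optional


def wait_for(fetch: Callable[[], Optional[str]],
             attempts: int = 30,
             delay_seconds: float = 20.0,
             sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch() until it returns a non-empty value or attempts run out."""
    for attempt in range(attempts):
        value = fetch()
        if value:
            return value
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"no result after {attempts} attempts")


def bring_up_spot_instance(request_spot: Callable[[], str],
                           lookup_instance_id: Callable[[str], Optional[str]],
                           lookup_public_ip: Callable[[str], Optional[str]],
                           sleep: Callable[[float], None] = time.sleep) -> dict:
    """Request -> request ID -> instance ID -> IP address, as on the slide."""
    request_id = request_spot()
    instance_id = wait_for(lambda: lookup_instance_id(request_id), sleep=sleep)
    ip_address = wait_for(lambda: lookup_public_ip(instance_id), sleep=sleep)
    return {"request_id": request_id, "instance_id": instance_id, "ip": ip_address}
```

The returned dictionary carries what the later steps need: the IP address for the DNS update and the Nagios host entry.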
  • 17. AWS EC2 Automation: Our Custom Image
      • SmartVox media server image includes Asterisk
          • Asterisk told to exit after waiting for calls to terminate
          • Startup script shuts down system after Asterisk exits
      • Instant "spot instance"
          • Bring it online when needed, and terminate as required
      • Same basic idea for starting/stopping proxies
          • These tend to be more static than media servers
      • Platform can be adjusted automatically
          • COGS adjusts appropriately
      • Hey, let's hook this up to Nagios!!
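One way the "exit, then power off" behaviour might look is sketched below. This is not the actual SmartVox startup script, which the slides do not show. It assumes Asterisk is run in the foreground with `asterisk -f`, that `core stop gracefully` is the CLI command used to let calls finish before exiting, and that `shutdown -h now` is available on the image. The helper names are invented for the example.

```python
import shlex
import subprocess
from typing import Callable, List, Sequence

# Assumed commands; adjust paths and flags for the actual image.
ASTERISK_FOREGROUND = ["asterisk", "-f"]                      # run without forking
GRACEFUL_STOP = ["asterisk", "-rx", "core stop gracefully"]   # exit once calls end
POWER_OFF = ["shutdown", "-h", "now"]


def run_until_asterisk_exits(
        run: Callable[[Sequence[str]], int] = subprocess.call) -> List[List[str]]:
    """On the media server: block while Asterisk runs, then power off.

    Powering off a one-time spot instance terminates it, so billing stops.
    """
    executed = []
    for command in (ASTERISK_FOREGROUND, POWER_OFF):
        run(command)
        executed.append(list(command))
    return executed


def graceful_stop_command(host: str) -> List[str]:
    """On the monitoring side: SSH command asking a remote Asterisk to stop.

    The remote command is quoted as one string because ssh joins its
    arguments with spaces before handing them to the remote shell.
    """
    return ["ssh", host, shlex.join(GRACEFUL_STOP)]
```

The `run` parameter exists only so the sequence can be exercised without starting Asterisk or shutting anything down.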
  • 18. AWS EC2 Automation: More ideas
      • Quick aside about spot instances. Useful for:
      • Database dumps
          • Spot instance turned up to do MySQL copies
          • Run reports, dump, compress, purge, etc., and terminate
      • Distributing web server load
          • Pop up another server and add to DNS
          • Instant on-demand capacity
      • Anything that you only want to do repeatedly but not for a long time, and only when you want to (or maybe if you have to)
  • 19. Use Nagios for: Provisioning, Monitoring, Capacity Planning
  • 20. Provisioning
      • Rather than create EC2s, we just update Nagios
          • Automatically regenerate SIP proxy and media server dynamic_hosts.cfg file as part of provisioning process
          • Nagios looks for host up, doesn't find it, fires off handler
          • Event handler queries EC2 to see if it's being turned up (~10 min) or just not running. If it's not running, it starts it.
          • DNS is batch updated every hour, with 59-minute TTLs
      • Phone provisioning handled via automatic extract from database to create HTTP-served configuration files
          • Master/slave "config servers" (also in AWS) send all this to customers, with a URL to activate phones
      • Entire process from signature to functional: < 1 week
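The host event handler's decision can be sketched as a small pure function. The host state and state type are the values Nagios exposes through its `$HOSTSTATE$` and `$HOSTSTATETYPE$` macros; the instance-state strings follow EC2's naming (`pending`, `running`, and so on). Acting only on HARD states and the action names `start`, `wait`, and `none` are assumptions made for this example, not details from the talk.

```python
from typing import Optional


def decide_action(host_state: str,
                  state_type: str,
                  instance_state: Optional[str]) -> str:
    """Decide what a host event handler should do.

    host_state / state_type come from Nagios ($HOSTSTATE$, $HOSTSTATETYPE$).
    instance_state is what EC2 reports, or None if no instance exists yet.
    """
    # Assumption: only act once Nagios has confirmed the host is down.
    if host_state != "DOWN" or state_type != "HARD":
        return "none"
    # The instance exists but is still booting (the slide allows ~10 minutes).
    if instance_state in ("pending", "running"):
        return "wait"
    # Nothing is running, so kick off the spot instance request.
    return "start"


if __name__ == "__main__":
    print(decide_action("DOWN", "HARD", None))
    print(decide_action("DOWN", "HARD", "pending"))
    print(decide_action("UP", "HARD", "running"))
```

Keeping the decision separate from the code that talks to EC2 makes it easy to check each case without touching real infrastructure.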
  • 21. Monitoring
      • Nagios looks for hosts (see previous slide)
          • Automatically creates them if needed
          • Note that SIP proxies are not spot instances
          • Dedicated to lifespan of customer/account, so they are only terminated as part of the de-provisioning process
      • Nagios looks at health of services
          • Determine if we have faults, outages, etc.
          • Can potentially reroute automatically (DNS SRV!)
          • Store performance info for capacity calculations
      • Notifications via Twitter and email
          • Come back tomorrow at 10:30 for how this works
  • 22. Capacity Planning
      • Quantize by 50 simultaneous calls per server
          • Perf data used to calculate historical usage
          • Can use cron to automatically add/remove servers
      • Nagios figures out "deltac" in current usage
          • If deltac = 0, we are just right (OK)
          • If deltac < 0, we have too much capacity (WARN)
          • If deltac > 0, we need more capacity (CRITICAL)
      • Event handler looks at state and either does nothing, tells least used box to stop Asterisk, or adds another box to the mix (see provisioning)
      • Capacity (and costs) dynamically adjust with usage
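The three-way choice the event handler makes can be sketched as follows. The service state strings match Nagios's `$SERVICESTATE$` values. The `active_calls` mapping (host name to current call count) and the action names are hypothetical, and how "least used" is measured in the real system is not stated on the slide.

```python
from typing import Dict, Optional, Tuple


def capacity_action(service_state: str,
                    active_calls: Dict[str, int]) -> Tuple[str, Optional[str]]:
    """Map the deltac check state to an action and, if relevant, a target host."""
    if service_state == "WARNING" and active_calls:
        # Too much capacity: pick the box with the fewest calls to drain.
        # Ties are broken by host name so the choice is repeatable.
        least_used = min(active_calls, key=lambda host: (active_calls[host], host))
        return ("stop_asterisk", least_used)
    if service_state == "CRITICAL":
        # Not enough capacity: hand off to the provisioning process.
        return ("add_server", None)
    return ("nothing", None)


if __name__ == "__main__":
    calls = {"media-01": 42, "media-02": 3, "media-03": 17}
    print(capacity_action("WARNING", calls))
    print(capacity_action("CRITICAL", calls))
    print(capacity_action("OK", calls))
```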
  • 23. Capacity Planning: DeltaC
      • deltac: custom Nagios module
          • Looks at the last three times it ran on a particular host
          • Quantized by 50 calls = change in 50-call volumes
      • If deltac = 0 then we return an OK state
      • If deltac < 0 then we are dropping call volumes and can SSH to a box and tell Asterisk to stop
          • This will then stop the spot instance and reduce cost
      • If deltac > 0 then we are gaining call volumes and trigger the provisioning process
          • This will start a spot instance and increase cost
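The slides do not give the deltac formula, so the following is only one plausible reading, not the actual module: take the peak of the last three call-count samples, round it up to 50-call blocks, and subtract the number of media servers currently running. The state mapping (0 is OK, negative is WARNING, positive is CRITICAL) is from the slides, and the numeric return codes are the standard Nagios plugin exit codes.

```python
import math
from typing import Sequence, Tuple

CALLS_PER_SERVER = 50
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3   # standard Nagios plugin exit codes


def blocks(calls: int) -> int:
    """Number of 50-call blocks needed to carry this many calls."""
    return math.ceil(max(calls, 0) / CALLS_PER_SERVER)


def deltac(recent_calls: Sequence[int], running_servers: int) -> int:
    """Assumed formula: blocks needed for the recent peak minus servers running."""
    peak = max(recent_calls[-3:])
    return blocks(peak) - running_servers


def check(recent_calls: Sequence[int], running_servers: int) -> Tuple[int, str]:
    """Return (exit code, plugin output line) for a deltac check."""
    if not recent_calls:
        return UNKNOWN, "DELTAC UNKNOWN - no samples"
    delta = deltac(recent_calls, running_servers)
    if delta == 0:
        state, label = OK, "OK"
    elif delta < 0:
        state, label = WARNING, "WARNING"
    else:
        state, label = CRITICAL, "CRITICAL"
    # Text after the pipe is performance data that Nagios can store.
    return state, f"DELTAC {label} - deltac={delta} | deltac={delta}"


if __name__ == "__main__":
    print(check([40, 120, 90], running_servers=3))
    print(check([40, 60, 90], running_servers=3))
    print(check([160], running_servers=3))
```

Using the peak of several samples rather than the latest one is a design choice made here to avoid shutting a server down on a momentary dip; the real module may weigh the samples differently.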
  • 24. Event Handler: DeltaC
  • 25. How DeltaC Works
      • Let's assume we're creating a new host
      • ec2-request-spot-instances ami-58296831 -p 0.04 --key "BTC EC2" --group Asterisk --instance-type m1.medium -n 1 --type one-time
          • Get back a "spotInstanceRequestId" (sir-722f4e34)
      • ec2-describe-spot-instance-requests sir-722f4e34
          • Get back an "instanceId" (i-6488e31f)
      • ec2-describe-instances i-6488e31f
          • Get back public IP address (ipAddress) of this machine
      • Now we have IP address and (internal) name
          • Populate DNS batch update queue
          • Regenerate /usr/local/nagios/etc/objects/dynamic_hosts.cfg
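Regenerating dynamic_hosts.cfg could look roughly like the sketch below, which renders standard Nagios `define host` stanzas and swaps the file into place atomically. The template name `generic-host`, the host names, and the helper names are assumptions for illustration. The slides do not show the real file's contents, and Nagios would still need to be reloaded afterwards to pick up the change.

```python
import os
import tempfile
from typing import Iterable, Tuple

HOST_TEMPLATE = "generic-host"   # assumed template name; use your own


def render_hosts(hosts: Iterable[Tuple[str, str]]) -> str:
    """Render (host_name, address) pairs as Nagios host definitions."""
    stanzas = []
    for name, address in sorted(hosts):
        stanzas.append(
            "define host {\n"
            f"    use        {HOST_TEMPLATE}\n"
            f"    host_name  {name}\n"
            f"    address    {address}\n"
            "}\n"
        )
    return "\n".join(stanzas)


def write_config(path: str, text: str) -> None:
    """Write to a temp file in the same directory, then rename over the target.

    The rename is atomic, so Nagios never reads a half-written file.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)    # mkstemp creates the file owner-only
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # Hypothetical host name and a documentation-range IP address.
    print(render_hosts([("media-01", "203.0.113.10")]))
```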
  • 26. DeltaC Saves Lives Money
      • Small percentage changes in usage result in large changes in Cost Of Goods
      • For example:
          Calls   Boxes   Cost/hour   Cost/year
          100     2       $0.20       ~$75
          500     10      $1.00       ~$375
          2000    20      $2.00       ~$750
          5000    50      $5.00       ~$2000
  • 27. Questions?
      Eric Loyd, eric@bitnetix.com, 877.33.VOICE, @Bitnetix, @SmartVox
