Nagios Conference 2013 - Eric Loyd - Dynamic AWS Server Usage Using Nagios Core

Eric Loyd's presentation on Dynamic AWS Server Usage Using Nagios Core.
The presentation was given during the Nagios World Conference North America held Sept 20-Oct 2nd, 2013 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna

Published in: Technology, Business

  1. Dynamic AWS Server Usage Using Nagios Core, or How to pay only for what you need
     Eric Loyd, eric@bitnetix.com, 877.33.VOICE, @Bitnetix @SmartVox
  2. About Bitnetix
  3. About Eric Loyd and Bitnetix
     • Founder and CEO of Bitnetix Incorporated
       • VoIP services and IT/network consulting
     • Over 25 years in IT and management at places like:
       • Eastman Kodak
       • Frontier Communications / Global Crossing
       • Rochester Institute of Technology
     • Bitnetix started its eighth year in July 2013
     • Digital Rochester GREAT Award finalist in:
       • 2012 for Communications Technology
       • 2013 for Rising Star
     • Using Nagios since 2004
     © 2013 Bitnetix Incorporated
  4. History of SmartVox: Bitnetix’s VoIP Platform
  5. History of SmartVox, our VoIP Platform
     • Pre-2012: not yet called SmartVox
       • Bitnetix primarily focused on IT consulting
       • VoIP service was ~10% of business, with servers located primarily at client sites
       • Custom Asterisk-based servers running FreePBX
       • We ran the customer’s network, so we had control over VoIP
     • 2012: focus switched to VoIP
       • Focused now on hosted VoIP solutions
       • Made use of Amazon Web Services EC2 VPS
       • One per customer, with no proxies* or media servers
       • Network/bandwidth was the only customer responsibility
  6. History of SmartVox, our VoIP Platform
     • 2013: SmartVox name born
       • Copyright, trademark, domain name, biz cards, etc.
       • Third generation born with multiple proxies, registrars, configuration servers, and media servers
       • June: started Mission Matrix program & sales
     • AWS architecture leveraged for geography
       • Each customer gets own EC2 server
       • Proxies to closest zone, secondary “to the west”
       • Media servers located in zones based on number of simultaneous calls, conferences, etc.
       • VMs and CDRs stored in database
  7. Brief Overview of AWS
  8. AWS EC2 Concepts
     • AWS: Amazon Web Services
       • Collection of cloud-based services: Storage (S3), DNS (Route 53), CDN, Server (EC2)
     • EC2: Elastic Compute Cloud
       • Virtual servers in AWS datacenters (zones)
       • US (3 = VA, CA, OR), EU (1), Asia (3), SA (1)
       • Persistent storage & flexible IP address assignment
       • Pay by the hour that it’s up, plus storage and bandwidth
     • Spot instances: “temporary” EC2 servers
       • Bring online as needed, terminated when shut down
  9. AWS EC2 Costs
     • LOTS of variables, but reasonable potential costs:
     • Reserved servers cost about $2.00 per day
       • Reserved instance pricing is contractual and static, based on size
     • Spot servers cost between $0.50 and $2.50 per day
       • Spot instance pricing is dynamic; we assume ~$0.10 per hour
     • We quantize concurrent calls into 50-call blocks
       • One media server = 50 calls = 1 spot instance
       • Two media servers = 100 calls = 2 spot instances
     • Bandwidth and storage will add ~10%
     • Reducing AWS usage reduces cost
       • We keep these savings for ourselves. Shhhh!!!
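
The quantization and pricing rules on this slide can be sketched as a small calculation. The function and constant names are illustrative and not from the talk; the 50-call blocks, the ~$0.10 per hour spot price, and the ~10% overhead are the slide's own working assumptions.

```python
import math

CALLS_PER_SERVER = 50        # slide: one media server = 50 calls
SPOT_PRICE_PER_HOUR = 0.10   # slide: assumed average spot price (USD)
OVERHEAD = 0.10              # slide: bandwidth and storage add ~10%


def servers_needed(concurrent_calls: int) -> int:
    """Quantize concurrent calls into 50-call blocks, one spot instance each."""
    if concurrent_calls <= 0:
        return 0
    return math.ceil(concurrent_calls / CALLS_PER_SERVER)


def hourly_cost(concurrent_calls: int) -> float:
    """Estimated hourly cost in USD, including the ~10% overhead."""
    base = servers_needed(concurrent_calls) * SPOT_PRICE_PER_HOUR
    return round(base * (1 + OVERHEAD), 4)


if __name__ == "__main__":
    for calls in (50, 51, 100):
        print(calls, servers_needed(calls), hourly_cost(calls))
```

Because of the rounding up, the 51st concurrent call costs as much as the next 49 combined, which is why trimming idle servers matters.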
  10. Why Nagios?
  11. Why Nagios?
      • Extensive experience using it for clients
      • Bitnetix is a Nagios reseller
      • Needed centralized monitoring software
        • Integrate with Twitter for notifications
        • Integrate with Eventum via email for trouble tickets
      • Zero cost
      • Framework
        • Leverage SSH, HTTP, check_mk and livestatus!!
        • Custom checks and notifications (very important)
        • Ability to “cookie cutter” installs for AWS
  12. Initial Hurdles
      • Customer Premise Equipment
        • No real control over CPE choices
        • Routers block some traffic, “help” other traffic incorrectly
        • Need to be able to remotely [re-]configure phones
      • Figure out how to “cookie-cutter” EC2 servers
        • Customer boxes and SIP endpoints
        • Proxies and media servers
      • Wanted to monitor upstream providers as well
      • How to separate apparent from actual failure
        • Something’s broken, but overall service functional
  13. SmartVox Provisioning: Process and Automation
  14. SmartVox Network
      • DNS SRV records are key to redundant servers
      • Provider: sends incoming calls to one or more border proxies
      • Border proxy: figures out which customer should receive the calls
      • Customer proxy: sends the call on to the correct phone/media server (VM, etc.)
      [Diagram: one provider, two border proxies, three customer proxies]
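
As a rough illustration of why SRV records help with redundancy: a client tries targets in ascending priority order, so a backup proxy only receives traffic when the preferred ones fail. The sketch below orders hypothetical records that way. RFC 2782 also specifies weighted random selection within a priority, which is simplified here to a descending-weight sort. The host names are made up.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int   # lower value is tried first
    weight: int     # share of traffic among records of equal priority
    port: int
    target: str


def failover_order(records: List[SrvRecord]) -> List[str]:
    """Return 'host:port' targets in the order a client would try them.

    Simplification: within one priority, higher weight simply goes first
    instead of the weighted random pick described in RFC 2782.
    """
    ordered = sorted(records, key=lambda r: (r.priority, -r.weight))
    return [f"{r.target}:{r.port}" for r in ordered]


if __name__ == "__main__":
    records = [
        SrvRecord(20, 0, 5060, "proxy-west.example.com"),
        SrvRecord(10, 50, 5060, "proxy-east.example.com"),
    ]
    print(failover_order(records))
```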
  15. Provisioning Process
      • SmartVox AWS EC2 Provisioning Database
        • Customer information
        • Account (location/division/etc.) information
        • Number of phones*, VM boxes, etc.
      • Computes how many proxies customer needs
      • DNS SRV records created for batch updates
      • Media server/VM entries created automatically
      • Phone provisioning info created automatically
      • Automatically places order for phones* (+some)
        • Phones drop-shipped to customer in about 3 days
  16. AWS EC2 Automation: Spot Instance API
      • Create spot instance -> gives request ID
        • Instance created with SmartVox-created base image
      • Wait a bit -> query request ID -> get instance ID
      • Query instance -> get IP address
      • Update DNS with server information and IP
      • Update Nagios with server information and IP
      • When spot instances shut down, they terminate
        • No more expense for “burstable resources”
      • This sounds like a Nagios event handler…
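
The request, poll, and query sequence on this slide can be sketched as follows. This is not the talk's implementation, which used the `ec2-*` command-line tools. It assumes a boto3-style EC2 client object is passed in (boto3 postdates the talk), and the function name, polling interval, and return shape are illustrative. The DNS and Nagios updates are left to the caller.

```python
import time
from typing import Any, Callable


def launch_spot_instance(
    ec2: Any,
    launch_spec: dict,
    max_price: str,
    poll_seconds: float = 15.0,
    max_polls: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Request one spot instance; return its request ID, instance ID, and IP."""
    # Step 1: create the spot request; AWS answers with a request ID.
    resp = ec2.request_spot_instances(
        SpotPrice=max_price,
        InstanceCount=1,
        Type="one-time",
        LaunchSpecification=launch_spec,
    )
    request_id = resp["SpotInstanceRequests"][0]["SpotInstanceRequestId"]

    # Step 2: wait a bit, then query the request until an instance ID appears.
    instance_id = None
    for _ in range(max_polls):
        desc = ec2.describe_spot_instance_requests(
            SpotInstanceRequestIds=[request_id]
        )
        instance_id = desc["SpotInstanceRequests"][0].get("InstanceId")
        if instance_id:
            break
        sleep(poll_seconds)
    if not instance_id:
        raise TimeoutError(f"spot request {request_id} was not fulfilled")

    # Step 3: query the instance until it reports a public IP address.
    for _ in range(max_polls):
        inst = ec2.describe_instances(InstanceIds=[instance_id])
        ip = inst["Reservations"][0]["Instances"][0].get("PublicIpAddress")
        if ip:
            return {"request_id": request_id, "instance_id": instance_id, "ip": ip}
        sleep(poll_seconds)
    raise TimeoutError(f"instance {instance_id} never reported a public IP")
```

Passing the client and the `sleep` function in as arguments keeps the sketch testable without AWS credentials.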
  17. AWS EC2 Automation: Our Custom Image
      • SmartVox media server image includes Asterisk
        • Asterisk told to exit after waiting for calls to terminate
        • Startup script shuts down system after Asterisk exits
      • Instant “spot instance”
        • Bring it online when needed, and terminate as required
      • Same basic idea for starting/stopping proxies
        • These tend to be more static than media servers
      • Platform can be adjusted automatically
        • COGS adjusts appropriately
      • Hey, let’s hook this up to Nagios!!
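
A minimal sketch of the startup-script idea, assuming the script runs as root on the media server image. The wrapper functions are hypothetical; the commands they issue (`asterisk -f` to stay in the foreground, `asterisk -rx "core stop gracefully"` to exit once calls have ended, and `shutdown -h now`) are standard. The talk does not show the actual script.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.call(list(cmd))


def drain_asterisk(run: Runner = _run) -> int:
    """Ask Asterisk to exit once all active calls have ended."""
    return run(["asterisk", "-rx", "core stop gracefully"])


def run_until_asterisk_exits(run: Runner = _run) -> int:
    """Run Asterisk in the foreground, then power off when it exits.

    Powering off a one-time spot instance terminates it, so billing stops.
    """
    exit_code = run(["asterisk", "-f"])   # blocks until Asterisk exits
    run(["shutdown", "-h", "now"])
    return exit_code
```

With this split, the monitoring side only ever needs to call `drain_asterisk` over SSH; the machine takes care of shutting itself down.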
  18. AWS EC2 Automation: More ideas
      • Quick aside about spot instances. Useful for:
      • Database dumps
        • Spot instance turned up to do MySQL copies
        • Run reports, dump, compress, purge, etc., and terminate
      • Distributing web server load
        • Pop up another server and add to DNS
        • Instant on-demand capacity
      • Anything that you only want to do repeatedly but not for a long time, and only when you want to (or maybe if you have to)
  19. Use Nagios for: Provisioning, Monitoring, Capacity Planning
  20. Provisioning
      • Rather than create EC2s, we just update Nagios
        • Automatically regenerate SIP proxy and media server dynamic_hosts.cfg file as part of provisioning process
        • Nagios looks for host up, doesn’t find it, fires off handler
        • Event handler queries EC2 to see if it’s being turned up (~10 min) or just not running. If it’s not running, it starts it.
      • DNS is batch updated every hour, with 59-minute TTLs
      • Phone provisioning handled via automatic extract from database to create HTTP-served configuration files
      • Master/slave “config servers” (also in AWS) to send all this stuff to customers, with a URL to activate phones
      • Entire process from signature to functional: < 1 week
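
The event handler's decision on this slide can be sketched as a pure function. Nagios passes the host state and state type to a handler through the `$HOSTSTATE$` and `$HOSTSTATETYPE$` macros; the function name, the returned action labels, and the choice to act only on HARD states are assumptions made for illustration.

```python
from typing import Optional


def handler_action(host_state: str, state_type: str, ec2_state: Optional[str]) -> str:
    """Decide what a host event handler should do.

    host_state: Nagios $HOSTSTATE$ (UP, DOWN, UNREACHABLE)
    state_type: Nagios $HOSTSTATETYPE$ (SOFT, HARD)
    ec2_state:  EC2 instance state name, or None if no instance exists
    Returns 'none', 'wait', or 'start'.
    """
    # Assumption: ignore healthy hosts and unconfirmed (SOFT) failures.
    if host_state == "UP" or state_type != "HARD":
        return "none"
    # The instance exists but is still being turned up (~10 min): wait.
    if ec2_state in ("pending", "running"):
        return "wait"
    # Nothing is running for this host: start it.
    return "start"


if __name__ == "__main__":
    print(handler_action("DOWN", "HARD", None))
```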
  21. Monitoring
      • Nagios looks for hosts (see previous slide)
        • Automatically creates them if needed
      • Note that SIP proxies are not spot instances
        • Dedicated to lifespan of customer/account, so they are only terminated as part of de-provisioning process
      • Nagios looks at health of services
        • Determine if we have faults, outages, etc.
        • Can potentially reroute automatically (DNS SRV!)
        • Store performance info for capacity calculations
      • Notifications via Twitter and email
        • Come back tomorrow at 10:30 for how this works
  22. Capacity Planning
      • Quantize by 50 simultaneous calls per server
      • Perf data used to calculate historical usage
        • Can use cron to automatically add/remove servers
      • Nagios figures out “deltac” in current usage
        • If deltac = 0, we are just right (OK)
        • If deltac < 0, we have too much capacity (WARN)
        • If deltac > 0, we need more capacity (CRITICAL)
      • Event handler looks at state and either does nothing, tells least used box to stop Asterisk, or adds another box to the mix (see provisioning)
      • Capacity (and costs) dynamically adjust with usage
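
The three-way choice made by the event handler might look like the sketch below. The function name, the action labels, and the `calls_by_host` mapping are hypothetical; only the state-to-action mapping comes from the slide.

```python
from typing import Dict, Optional, Tuple


def plan_capacity_action(
    service_state: str, calls_by_host: Dict[str, int]
) -> Tuple[str, Optional[str]]:
    """Map the deltac service state to an action and an optional target host.

    CRITICAL -> add another box (provisioning picks the host)
    WARNING  -> tell the least used box to stop Asterisk
    OK       -> do nothing
    """
    if service_state == "CRITICAL":
        return ("add_server", None)
    if service_state == "WARNING" and calls_by_host:
        # Sort names first so ties are broken deterministically.
        least_used = min(sorted(calls_by_host), key=lambda h: calls_by_host[h])
        return ("stop_asterisk", least_used)
    return ("nothing", None)


if __name__ == "__main__":
    print(plan_capacity_action("WARNING", {"media-1": 12, "media-2": 3}))
```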
  23. Capacity Planning: DeltaC
      • deltac: custom Nagios module
        • Looks at the last three times it ran on particular host
        • Quantized by 50 calls = change in 50-call volumes
      • If deltac = 0 then we return an OK state
      • If deltac < 0 then we are dropping call volumes and can SSH to a box and tell Asterisk to stop
        • This will then stop the spot instance and reduce cost
      • If deltac > 0 then we are gaining call volumes and trigger provisioning process
        • This will start a spot instance and increase cost
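
The talk does not show the deltac source, so the following is one plausible reading of the slide rather than the actual module: deltac is taken as the change in 50-call blocks between the oldest and newest of the last three samples. The exit codes 0, 1, and 2 are the standard Nagios plugin return codes for OK, WARNING, and CRITICAL; all names are illustrative.

```python
import math
from typing import Sequence, Tuple

CALLS_PER_BLOCK = 50
OK, WARNING, CRITICAL = 0, 1, 2   # standard Nagios plugin exit codes


def blocks(calls: int) -> int:
    """Number of 50-call blocks needed to carry this many calls."""
    return math.ceil(max(calls, 0) / CALLS_PER_BLOCK)


def deltac(last_three: Sequence[int]) -> int:
    """Assumed definition: block count of newest sample minus oldest."""
    if len(last_three) != 3:
        raise ValueError("expected exactly three samples, oldest first")
    return blocks(last_three[-1]) - blocks(last_three[0])


def check(last_three: Sequence[int]) -> Tuple[int, str]:
    """Return (exit code, plugin output line with perfdata)."""
    d = deltac(last_three)
    if d > 0:
        code, label = CRITICAL, "CRITICAL"   # gaining volume: need capacity
    elif d < 0:
        code, label = WARNING, "WARNING"     # dropping volume: too much capacity
    else:
        code, label = OK, "OK"
    return code, f"DELTAC {label} - deltac={d} | deltac={d}"


if __name__ == "__main__":
    code, line = check([40, 60, 120])
    print(line)
    raise SystemExit(code)
```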
  24. Event Handler: DeltaC
  25. How DeltaC Works
      • Let’s assume we’re creating a new host
      • ec2-request-spot-instances ami-58296831 -p 0.04 --key "BTC EC2" --group Asterisk --instance-type m1.medium -n 1 --type one-time
        • Get back a “spotInstanceRequestId” (sir-722f4e34)
      • ec2-describe-spot-instance-requests sir-722f4e34
        • Get back an “instanceId” (i-6488e31f)
      • ec2-describe-instances i-6488e31f
        • Get back public IP address (ipAddress) of this machine
      • Now we have IP address and (internal) name
        • Populate DNS batch update queue
        • Regenerate /usr/local/nagios/etc/objects/dynamic_hosts.cfg
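
The final step, regenerating `dynamic_hosts.cfg`, could be sketched as rendering one Nagios host definition per name and IP pair. The `define host { ... }` object syntax is standard Nagios; the template name `smartvox-media-server` is hypothetical and would need to be defined elsewhere in the configuration, and the host names are made up.

```python
from typing import Iterable, Tuple


def render_host(host_name: str, address: str,
                template: str = "smartvox-media-server") -> str:
    """Render one Nagios host object; the template name is a placeholder."""
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {host_name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_dynamic_hosts(hosts: Iterable[Tuple[str, str]]) -> str:
    """Render the whole file from (host_name, address) pairs, sorted by name."""
    return "\n".join(render_host(name, addr) for name, addr in sorted(hosts))


if __name__ == "__main__":
    print(render_dynamic_hosts([("media-2", "203.0.113.11"),
                                ("media-1", "203.0.113.10")]))
```

After writing the file, Nagios has to reload its configuration before the new hosts are checked.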
  26. DeltaC Saves Lives Money
      • Small percentage changes in usage result in large changes in Cost Of Goods
      • For example:

        Calls   Boxes   Cost/hour   Cost/year
        100     2       $0.20       ~$75
        500     10      $1.00       ~$375
        2000    20      $2.00       ~$750
        5000    50      $5.00       ~$2000
  27. Questions?
      Eric Loyd, eric@bitnetix.com, 877.33.VOICE, @Bitnetix @SmartVox
