Operations Driven Web Services
A Case Study of Service Evolution at Rent the Runway

Camille Fournier, Head of Engineering (@skamille)
Carlo Barbara, Senior Engineer (@CarloBarbara)
Rent The Runway: Transitioning to Operations Driven Webservices

  1. Operations Driven Web Services: A Case Study of Service Evolution at Rent the Runway
      Camille Fournier, Head of Engineering (@skamille)
      Carlo Barbara, Senior Engineer (@CarloBarbara)
  2. In The Beginning, There Was Drupal
  3. There were also all of these folks…
  4. Can't Just Burn the World Down
  5. Hollow It Out!
  6. Hollow It Out!
  7. Hollow It Out!
  8. Hollow It Out!
  9. Complexity
      [Chart: Number of Services in Production, by month from Dec-11 to Jul-13; axis from 0 to 14]
  10. Operations first…
      • Availability and performance of our services are critical to running our business
      • The software we develop has to make delivering on our SLAs possible
      • How (besides sane design):
        • Healthchecks + Nagios
        • Measurements
        • Historical Data with Graphs
  11. Metrics
      • Gauges – instantaneous value
      • Counters – counter with +/-
      • Meters – rate over time (mean, plus 1-, 5-, & 15-minute moving avg.)
      • Histograms – distribution of data (mean, median, max, std. dev., 75th, 90th, 95th, 98th, 99th, & 99.9th percentiles)
      • Timers – Meter of requests & Histogram of duration (frequency & latency)
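To make the metric types on the Metrics slide concrete, here is a minimal sketch in plain Java. It is not the Metrics library's API or implementation: the class names are hypothetical, the histogram keeps every sample and uses a simple nearest-rank percentile, and meters and timers are left out. It only illustrates what a counter, a gauge, and a histogram percentile report.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical, stdlib-only sketch; these are not the Metrics library's classes.
class MetricTypesSketch {

    // Counter: a value that can be moved up or down.
    static final class Counter {
        private final AtomicLong count = new AtomicLong();
        void inc() { count.incrementAndGet(); }
        void dec() { count.decrementAndGet(); }
        long getCount() { return count.get(); }
    }

    // Histogram: records values and reports percentiles of their distribution.
    // Assumption: every sample is kept in memory, which a real library avoids.
    static final class Histogram {
        private final List<Long> samples = new ArrayList<>();

        synchronized void update(long value) { samples.add(value); }

        // Nearest-rank percentile for p in (0, 100].
        synchronized long percentile(double p) {
            if (samples.isEmpty()) throw new IllegalStateException("no samples");
            List<Long> sorted = new ArrayList<>(samples);
            Collections.sort(sorted);
            int rank = (int) Math.ceil(p * sorted.size() / 100.0);
            return sorted.get(Math.max(rank, 1) - 1);
        }
    }

    public static void main(String[] args) {
        Counter activeJobs = new Counter();
        activeJobs.inc();
        activeJobs.inc();
        activeJobs.dec();

        // Gauge: an instantaneous value, read on demand from live state.
        List<String> queue = new ArrayList<>(List.of("a", "b", "c"));
        Supplier<Integer> queueDepth = queue::size;

        Histogram latencyMs = new Histogram();
        for (long ms = 1; ms <= 100; ms++) latencyMs.update(ms);

        System.out.println("counter=" + activeJobs.getCount());
        System.out.println("gauge=" + queueDepth.get());
        System.out.println("p95=" + latencyMs.percentile(95));
    }
}
```

A meter would add a rate over time on top of the counter, and a timer would combine a meter with a histogram of durations, as the slide describes.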
  12. Metrics - Reporting
      • HTTP
      • JMX
      • Graphite
  13. Dropwizard: What is it?
      • Quality open source Java webservice components glued together in a modular way
      • Eliminates the need for picking a platform stack; it's all there
      • It's opinionated. If you don't like a Dropwizard core component, that's too bad: don't use Dropwizard
      • Developers focus on business logic, not framework
      • It's easy, maintainable, and it works!
  14. A Few Words from Coda…
      “I had no one I had to toss a WAR to. I had no one to stand up a Tomcat server and fiddle with it until their eyes bled. I had no one who didn't trust me to spin up my own threads or connection pools. So I wrote something which worked as simply and in as straightforward a manner as possible because my own ass was on the line if it didn't work.”
  15. Dropwizard: The Ingredients
      • Jersey for REST
      • Jackson for JSON
      • Jetty for a webserver
      • Metrics for measuring
      • YAML for configuring
      • Dropwizard for weaving everything together
  16. Dropwizard – Healthchecks
      • Register hooks that check the health of your app
      • An HTTP endpoint that iterates over all the hooks
      • “The meaning of healthy” is decided by you (e.g. database connections, client connections, deadlock count)
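The healthcheck idea above (register named hooks, then have one endpoint run them all) can be sketched in plain Java as follows. This is an illustration only and does not use Dropwizard's HealthCheck classes; the class name, the "return null when healthy" convention, and the check names are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of the healthcheck idea; not Dropwizard's HealthCheck API.
class HealthCheckSketch {

    // Named hooks. Convention assumed here: return null when healthy,
    // or a short message describing the problem.
    private final Map<String, Supplier<String>> checks = new LinkedHashMap<>();

    void register(String name, Supplier<String> check) {
        checks.put(name, check);
    }

    // Runs every hook, as the HTTP endpoint would.
    // A hook that throws is reported as unhealthy rather than aborting the run.
    Map<String, String> runAll() {
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<String>> e : checks.entrySet()) {
            String problem;
            try {
                problem = e.getValue().get();
            } catch (RuntimeException ex) {
                problem = "check threw: " + ex.getMessage();
            }
            results.put(e.getKey(), problem == null ? "OK" : "UNHEALTHY: " + problem);
        }
        return results;
    }

    boolean allHealthy() {
        return runAll().values().stream().allMatch("OK"::equals);
    }

    public static void main(String[] args) {
        HealthCheckSketch health = new HealthCheckSketch();
        health.register("database", () -> null);                    // pretend the query succeeded
        health.register("deadlocks", () -> "2 deadlocked threads"); // pretend failure
        health.runAll().forEach((name, status) -> System.out.println(name + ": " + status));
        System.out.println("healthy=" + health.allHealthy());
    }
}
```

Because each service decides what its hooks test, a monitor such as Nagios only needs to poll one endpoint per service instance and does not need to know what "healthy" means for that service.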
  17. Dropwizard + Metrics
      • Dropwizard has lots of platform instrumentation baked in using Metrics, so it happens for free (e.g. Jetty, JVM, log counts)
      • Ability to add Timers to your endpoints with @Timed
      • Ability to add arbitrary metrics as you see fit
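As a rough picture of what timing an endpoint records, the sketch below wraps a handler call and tracks how often it ran and how long it took. It is plain Java with hypothetical names, not the @Timed annotation or the Metrics Timer class, and it reports only a call count and a mean where a real timer also reports rates and percentiles.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical sketch of what timing an endpoint records; not the Metrics Timer class.
class TimedSketch {
    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    // Runs the handler and records one call plus its duration, even if it throws.
    <T> T time(Supplier<T> handler) {
        long start = System.nanoTime();
        try {
            return handler.get();
        } finally {
            calls.incrementAndGet();
            totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    long count() { return calls.get(); }

    double meanMillis() {
        long n = calls.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1e6 / n;
    }

    public static void main(String[] args) {
        TimedSketch getDress = new TimedSketch(); // hypothetical endpoint name
        for (int i = 0; i < 3; i++) {
            getDress.time(() -> "ok"); // stand-in for the real handler body
        }
        System.out.println("calls=" + getDress.count() + " meanMs=" + getDress.meanMillis());
    }
}
```

The appeal of an annotation is that the wrapping happens around the resource method automatically, so the handler code itself stays free of timing logic.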
  18. Other Frameworks
      • Play 1.X
        • Abandonware for Play 2.X, which was still beta
        • Magic
      • Glassfish
        • OSGi hell
        • “standards”
      • Spring
        • Everything and the kitchen sink
        • Also I hate XML
  19. What do I get out of it? Dev agenda
      • Story telling: causation & correlation
      • Integral piece of the operational excellence puzzle
      • State of the world – Dashboards
      • Developers focus on features, operations is mostly free lunch
      • Code review & demo
      Disclaimer: You need Graphite to really harness the value
  20. Story telling
      • The grid is slow. Why?
        • Is it load?
        • Is it dependent service latency?
        • How does that compare to yesterday?
      • JVM throws out of memory, what's the problem?
        • What does the GC jigsaw look like?
        • When did it change?
        • Is it correlated with increased load?
      • How is that new 'performance' tweak?
        • If you never measured, then you didn't tune. True story!
      • What does my 5XX graph look like?
  21. Operational Excellence: The ingredients
      • Application Instrumentation (Dropwizard)
      • Time Series Data & Graphing (Graphite, D3)
      • Centralized logging & log parsing (Rsyslog, Logstash, Nagios)
      • Automated alerting & escalation (PagerDuty)
      DW & Graphite will get you very far, but if you want total control & visibility you need the rest. This is the stack that RTR is moving towards, rather than relying on basic Java logging SMTP appenders
  22. OMG, we are on GMA, are we OK?
      • 10+ services
      • Each service runs in a cluster behind an LB
      • 'OK' is somewhat service specific
      Basically you need a lot of info at your fingertips. Pictures are worth a thousand words. Get yourself some dashboards!
  23. Graphite Dashboard
  24. Tasseo dashboard (D3)
      • Red, Yellow, & Green Lights
      • Realtime
      • Endless cool things: Graphite + D3
      If we see yellow or red, start diagnosing
  25. Free Lunch? Really
      • DB connection pool monitoring
      • HTTP client connection pool monitoring
      • JVM Heap & GC info
      • HTTP server response counts
      • HTTP server connection info
      • Endpoint duration & throughput stats
  26. Where do I sign up?
      • You install Graphite: a one-time hit + some TLC. Medium difficulty
      • You annotate your endpoints and maybe add finer telemetry. Easy
      • You configure your service to feed into Graphite, hopefully consistently across services, via a 'Bundle'. Easy
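For a sense of what "feeding into Graphite" means on the wire, the sketch below formats and sends a sample in Graphite's plaintext protocol: one line per sample of the form "path value timestamp", with the timestamp in epoch seconds, sent to Carbon's plaintext listener (port 2003 by default). This is not how Dropwizard's reporter is configured or implemented; the class name, metric path, and host name are made up for the example.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of pushing one sample to Graphite; not Dropwizard's reporter.
class GraphiteLineSketch {

    // One sample in the plaintext format: "<path> <value> <epoch seconds>\n".
    static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    // Opens a TCP connection to Carbon's plaintext listener and writes one line.
    static void send(String host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(line);
            out.flush();
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        // "services.checkout.requests.p95" is a made-up metric path.
        String line = formatLine("services.checkout.requests.p95", 42.5, now);
        System.out.print(line);
        // With a reachable Carbon host (name assumed), this would ship the sample:
        // send("graphite.example.internal", 2003, line);
    }
}
```

Sharing one naming scheme for metric paths across services is what makes cross-service dashboards practical, which is the point of configuring it once in a shared bundle.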
  27. Demo
      • Show a simple Dropwizard codebase
        • 0.6.2 Slim Example: https://github.com/cab222/choco
        • 0.7.0-SNAPSHOT Complete: https://github.com/dropwizard/dropwizard/tree/master/dropwizard-example
      • Do some curls
      • Show the admin endpoints
  28. References
      • dropwizard.codahale.com
      • metrics.codahale.com
      • graphite.wikidot.com
  29. Presenters
      • @CarloBarbara (www.cabkata.com)
      • @Skamille (whilefalse.blogspot.com)
      • Rent The Runway is hiring! (renttherunway.com/careers)
