Operations-Driven Web Services at Rent the Runway
Upcoming SlideShare
Loading in...5
×

Like this? Share it with your network

Share

Operations-Driven Web Services at Rent the Runway

  • 4,219 views
Uploaded on

From a meetup on 8/26, Rent the Runway's operations-driven service infrastructure including an overview of why we use Dropwizard

From a meetup on 8/26, Rent the Runway's operations-driven service infrastructure including an overview of why we use Dropwizard

More in: Technology , Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
4,219
On Slideshare
1,922
From Embeds
2,297
Number of Embeds
15

Actions

Shares
Downloads
31
Comments
0
Likes
2

Embeds 2,297

http://www.hakkalabs.co 1,495
http://g33ktalk.com 729
https://www.hakkalabs.co 41
http://www.feedspot.com 8
https://twitter.com 6
http://g33ktalk.staging.wpengine.com 3
http://www.hakkalabs.com 3
http://cloud.feedly.com 3
http://moderation.local 2
http://www.newsblur.com 2
http://feeds.feedburner.com 1
https://hakka.herokuapp.com 1
http://newsblur.com 1
http://localhost 1
http://digg.com 1

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Operations Driven Web Services -A Case Study of Service Evolution at Rent the Runway Camille Fournier, Head of Engineering @skamille Carlo Barbara, Senior Systems Engineer @CarloBarbara
  • 2. In The Beginning, There Was Drupal
  • 3. There was also all of these folks…
  • 4. Can‟t Just Burn the World Down
  • 5. Hollow It Out!
  • 6. Hollow It Out!
  • 7. Hollow It Out!
  • 8. Hollow It Out!
  • 9. Complexity 0 2 4 6 8 10 12 14 Dec-11 Jan-12 Feb-12 Mar-12 Apr-12 May-12 Jun-12 Jul-12 Aug-12 Sep-12 Oct-12 Nov-12 Dec-12 Jan-13 Feb-13 Mar-13 Apr-13 May-13 Jun-13 Jul-13 Number of Services in Production
  • 10. Operations first…  Availability and performance of our services is critical to running our business  The software we develop has to make delivering on our SLAs possible  How (besides sane design):  Healthchecks + Nagios  Measurements  Historical Data with Graphs
  • 11. Metrics  Gauges – instantaneous value  Counters – counter with +/-  Meters – rate over time (mean, 1, 5, & 15 moving avg.)  Histograms – distribution of data (mean, median, max, std. div., 75th, 90th, 95th, 98th, 99th, & 99.9th percentiles)  Timers – Meter of requests & Histogram of duration (frequency & latency)
  • 12. Metrics - Healthchecks  Verify that your service is running correctly
  • 13. Metrics - Reporting  HTTP  JMX  Graphite
  • 14. Dropwizard: What is it?  Quality open source Java webservice components glued together in a modular way  Eliminates the need for picking a platform stack, it‟s all there  It‟s opinionated. If you don‟t like a Dropwizard core component, that‟s too bad, don‟t use Dropwizard  Developers focus on business logic, not framework  It‟s easy, maintainable, and it works!
  • 15. A Few Words from Coda… “I had no one I had to toss a WAR to. I had no one to stand up a Tomcat server and fiddle with it until their eyes bled. I had no one who didn't trust me to spin up my own threads or connection pools. So I wrote something which worked as simply and in as straight- forward a manner as possible because my own ass was on the line if it didn't work.”
  • 16. Dropwizard: The Ingredients  Jersey for REST  Jackson for JSON  Jetty for a webserver  Metrics for measuring  YAML for configuring  Dropwizard for weaving everything together
  • 17. Dropwizard – Healthchecks  Register hooks that check the health of your app  An HTTP endpoint that iterates over all the hooks  “The meaning of healthy” is decided by you (i. e. Database Connections, Client Connections, DeadLock Count)
  • 18. Dropwizard + Metrics  Dropwizard has lots of platform instrumentation baked in using Metrics, happens for free! (i.e. Jetty, JVM, Log Counts, etc…)  Ability to add Timers to your endpoints with @Timed  Ability to add arbitrary metrics as you see fit
  • 19. Other Frameworks  Play 1.X  Abandonware for Play 2.X, which was still beta  Magic  Glassfish  OSGI hell  “standards”  Spring  Everything and the kitchen sink  Also I hate XML
  • 20. What do I get out of it? Dev agenda  Story telling: causation & correlation  Integral piece of the operational excellence puzzle  State of the world – Dashboards  Developers focus on features, operations is mostly free lunch  Code review & demo Disclaimer: You need graphite to really harness the value
  • 21. Story telling  The grid is slow why?  Is it load?  Is it dependent service latency?  How does that compare to yesterday  JVM throws out of memory, what‟s the problem?  What does the GC jigsaw look?  When did it change?  Is it correlated with increased load?  How is that new „performance‟ tweak?  If you never measured, then you didn‟t tune. True story!  What does my 5XX graph look like?
  • 22. Operational Excellence: The ingredients  Application Instrumentation (Dropwizard)  Time Series Data & Graphing (Graphite, D3)  Centralized logging & log parsing (Rsyslog, Logstash, Nagios)  Automated alerting & escalation (Pagerduty) DW & Graphite will get you very far, but if you want total control & visibility you need the rest. This is the stack that RTR is moving towards, rather than relying on basic java logging smtp appenders
  • 23. OMG, we are on GMA, are we OK?  10+ services  Each services runs in a cluster behind an LB  „OK‟ is somewhat service specific Basically you need a lot of info at your fingertips. Pictures are worth a thousand words. Get yourself some dashboards!
  • 24. Graphite Dashboard
  • 25. Tasseo dashboard (D3) • Red, Yellow, & Green Lights • Realtime • Endless cool things: graphite + D3 If we see yellow or red, start diagnosing
  • 26. Free Lunch? Not really  DB connection pool monitoring  Http client connection pool monitoring  JVM Heap & GC info  Http Server response counts  Http Server connection info  Endpoint duration & throughput stats
  • 27. Where do I sign up?  You install Graphite, one time hit + some TLC. Medium Difficulty  You annotate your endpoints and maybe add finer telemetry. Easy  You configure so your service is feeding into graphite. Hopefully consistently across services, via a „Bundle‟. Easy
  • 28. Demo  Show a simple dropwizard codebase  Do some curls  Show the admin endpoints
  • 29. References  dropwizard.codahale.com  metrics.codahale.com  graphite.wikidot.com
  • 30. Presenters  @CarloBarbara (www.cabkata.com)  @Skamille (whilefalse.blogspot.com)  Rent The Runway is hiring! (renttherunway.com/careers)