CodeFest 2014. Vedran Mikulic — Booking Fast Development

  1. Booking fast development
     Vedran Mikulic @ Codefest 2014
  2. About me
     ● Vedran – a developer
     ● @ Booking.com for 2 years
     ● Team lead and product owner of the Event System team
  3. About Booking.com
     ● We sell room nights
     ● Over 650 000 per day
     ● More than 400 000 hotels
     ● We employ over 6500 people (globally)
     ● Hundreds of developers and designers working on code and templates
     ● Thousands in: customer care, hotels department, content department, etc.
  4. ● Dozens of deploy-able systems
     ● And we really like rolling them out...
     ● Yesterday: 50 roll-outs
     ● (+ experiments)
     ● This is a regular day at Booking.com
  5. Organization & Structure
     ● Small teams …
     ● … that own the systems they work on.
     ● Flat hierarchy
     ● Self steering
     ● Frontend and Backend
  6. Workflow
     ● Beyond Scrum
     ● Standup and “Scrum of Scrums”
     ● Small steps
     ● Progress and priorities constantly tracked and managed
     ● Shared codebase
     ● No formal QA step (before production)
     ● Failure is OK!
  7. Development and Testing
     ● DQS – Our test environment
     ● Virtual machines mirroring production environments
     ● Managed by a “self-build” tool
     ● Backed by great tooling
     ● No QA to get in the way!
     ● Next stop: production!
  8. Deployment
     ● Get new code to servers reliably and quickly
     ● Sometimes: get old code back even faster
     ● git-deploy: github.com/git-deploy
     ● Integrated with other systems - simple and robust
     ● Flexible
  9. Diagram: git-deploy start → Test on staging → git-deploy sync → New code live
  10. ● Ownership of deploy
      ● Hotfix
      ● Incident handling
      ● Communication, communication, communication
  11. Monitoring
      ● We want to monitor everything
      ● Focus on relevant metrics
      ● Event-logging system – information aggregator
      ● Information is conveyed by events
      ● Free-form and accessible data feed
  12. ● Multiple destinations for processed data feed (diagram label: Graphite)
  13. Monitoring - Graphite
      ● Cluster nodes in the hundreds
      ● Tens of millions of metrics per minute
      ● Custom modifications:
        ● github.com/grobian/carbon-c-relay
        ● github.com/dgryski/carbonzipper
  14. ● Graphite is: versatile, simple, fast...
      ● …but not a silver bullet
      ● It kills SSDs!
      ● Application level metrics + full server health monitoring
  15. Monitoring - goals
      ● Visibility and accessibility
      ● Flexibility
      ● Robustness (isolation)
      ● Correctness (trust)
  16. Monitoring – tools
      ● Time-proven tools used daily:
        ● Live application monitoring (Landweg / Graphite dashboards)
        ● Server health monitoring (graphite tools)
        ● Error/warning aggregation (show_errors)
        ● Real-time business reporting (Controlrooms)
      ● Constantly improving
  17. ● The Experiment Tool
      ● Backbone tool for Frontend teams
      ● Real-time experiment data
      ● Complex analytics and breakdowns
  18. ● Shadowapp
      ● Our “canary in the coalmine”
      ● Runs code from “trunk” against real requests
      ● Smokes out subtle bugs and issues
      ● Edge cases by real users, not developers
  19. Challenges
      ● Keeping it flat
      ● Deployment complexity
      ● Deployment speed
      ● Guiding (rather than controlling) the constant creation of new data and monitoring systems
      ● Scaling event-logging and Graphite
  20. Thank you for your attention! Questions?
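
The monitoring slides (11 to 13) describe free-form events being aggregated and fed to Graphite. The sketch below is one way that step could look, not the Booking.com implementation: it counts events per type and per minute and renders them in Carbon's plaintext line format (`path value timestamp`). The event fields `type` and `epoch`, the `app.events` prefix, and all function names are assumptions made for illustration; port 2003 is Carbon's default plaintext listener.

```python
import socket


def format_metric(path, value, timestamp):
    """Render one data point in Carbon's plaintext format: 'path value timestamp\\n'."""
    return f"{path} {value} {timestamp}\n".encode("ascii")


def events_to_metrics(events, prefix="app.events", bucket_seconds=60):
    """Count events per (type, time bucket) and return one metric line per count.

    Each event is assumed to be a dict with a 'type' string and an 'epoch'
    integer (hypothetical field names, not taken from the slides).
    """
    counts = {}
    for event in events:
        # Round the event time down to the start of its bucket.
        bucket = event["epoch"] - event["epoch"] % bucket_seconds
        key = (event["type"], bucket)
        counts[key] = counts.get(key, 0) + 1
    return [
        format_metric(f"{prefix}.{etype}.count", n, bucket)
        for (etype, bucket), n in sorted(counts.items())
    ]


def send_to_carbon(lines, host="localhost", port=2003):
    """Send pre-formatted metric lines to a Carbon plaintext listener over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"".join(lines))


if __name__ == "__main__":
    sample = [
        {"type": "booking", "epoch": 120},
        {"type": "booking", "epoch": 150},
        {"type": "error", "epoch": 130},
    ]
    # Print instead of sending, so the sketch runs without a Carbon server.
    for line in events_to_metrics(sample):
        print(line.decode("ascii"), end="")
```

Aggregating before sending keeps the number of data points proportional to the number of event types rather than the number of events, which matters at the metric volumes slide 13 mentions.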
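
Slide 18 describes Shadowapp as running trunk code against real requests to surface subtle bugs. The slides do not say how results are checked, so the following is only a sketch under one assumption: each captured request is replayed through both the live handler and the trunk handler, and any crash or differing response is recorded. The names `shadow_compare`, `live_handler` and `trunk_handler` are hypothetical.

```python
def shadow_compare(requests, live_handler, trunk_handler):
    """Replay requests through both handlers and collect trunk-only problems.

    Returns a list of findings; each finding records the request, whether the
    trunk code raised ('exception') or answered differently ('mismatch'), and
    a detail payload.
    """
    findings = []
    for request in requests:
        expected = live_handler(request)
        try:
            actual = trunk_handler(request)
        except Exception as exc:
            # A crash in trunk must never affect the real user, so it is only logged.
            findings.append(
                {"request": request, "kind": "exception", "detail": repr(exc)}
            )
            continue
        if actual != expected:
            findings.append(
                {"request": request, "kind": "mismatch", "detail": (expected, actual)}
            )
    return findings


if __name__ == "__main__":
    def live(request):
        return request["n"] * 2

    def trunk(request):
        # Deliberately buggy stand-in for unreleased code.
        if request["n"] == 0:
            raise ValueError("zero not handled")
        return request["n"] * 2 + (1 if request["n"] >= 3 else 0)

    for finding in shadow_compare([{"n": 1}, {"n": 3}, {"n": 0}], live, trunk):
        print(finding["kind"], finding["request"])
```

The point of the pattern is that the inputs come from real traffic, so edge cases appear without a developer having to think of them first.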
