Ruxit - How we launched a global monitoring platform on AWS in 80 days.

How Ruxit developed a global monitoring solution on AWS within 80 days: a talk about architecture, processes and tools.

Published in: Technology

  1. ruxit theme 2014.05.15. Behind the scenes @ ruxit: Running a global monitoring infrastructure on AWS. Alois Reitbauer, ruxit, @aloisreitbauer
  2. Ruxit – what we do: SaaS-based Monitoring and Management Solution
  8. A bit of history: How we moved to a global AWS deployment in 80 days
  10. How we moved to the Cloud in 80 days. June 2014: Beta Cloud Deployment; July 2014: Open Beta Offering to Public; August 2014: Full automation; September 2014: Official Product Launch; October 2014: >1000 active companies
  12. Our architecture: Lessons learned building a global cloud platform
  13. Cluster
  14. Cluster (diagram labels): Cassandra DB Cluster; Server Cluster; Public Security Gateways; Availability Zone (x3); Amazon EC2; HAProxy; Elastic Load Balancer
  15. Cluster (diagram labels): 3rdP (x5); cloudcontrol.ruxit.com; account.ruxit.com; *.live.ruxit.com (x3)
  16. Ruxit is built on AWS: How we solve challenges using the AWS technology stack
  17. Challenge: Growth. Being one of the fastest growing B2B SaaS companies
  18. Challenge: Usability. Real-time provisioning of DNS names
  19. Challenge: Reliability. Zero downtime without manual intervention
  20. Challenge: Delivery. Manage deployment artifacts globally
  21. How we achieve zero downtime: Your application will break; your users should not notice
  22. Key Guiding Principles: Over Provisioning, Quarantine Mode, Rolling Updates, Soft Stickiness
  23. We never run above two thirds of capacity: Over provisioning is built into our architecture.
  24. Quarantine and Diagnose in Production (diagram labels): Cassandra DB Cluster; Server Cluster; Public Security Gateways; Availability Zone (x3); HAProxy; Elastic Load Balancer
  25. How we handle upgrades: We have to be able to upgrade without any downtime
  26. Rolling update (diagram labels): Server Cluster; Public Security Gateways; HAProxy; Elastic Load Balancer; Cloud Control; AWS S3
  27. Soft Stickiness: Combining Data Locality with Transparent Failover
  28. Dynamic Traffic Routing (diagram labels): Server Cluster; Public Security Gateways; HAProxy; Elastic Load Balancer; A B C
  29. Constant Failover Mode (diagram labels): Server Cluster; Public Security Gateways; HAProxy; Elastic Load Balancer; A B C
  30. Routing with Wishlist (diagram labels): Server Cluster; Public Security Gateways; HAProxy; Elastic Load Balancer; A B C B
  31. Routing with Failover (diagram labels): Server Cluster; Public Security Gateways; HAProxy; Elastic Load Balancer; A C B B
  32. Our road from DevOps to NoOps: We don’t have a dedicated Operations team and we don’t want one
  33. Key Guiding Principles: Autonomous Operations, Feedback and Transparency, Everything is production, Data-Driven Operations
  34. Run books become backlogs: If you describe what to do, you can also code it into the platform
  35. Ruxit needs to be able to manage itself
  36. Feedback and Transparency: Everybody has access to our production monitoring data.
  37. Full Transparency on Quality
  38. We treat all environments like production: Everybody has access to our production monitoring data.
  40. Data-Driven Operations: There is no decision without data.
  41. Understand the impact of deployments (diagram labels): Java, OS, Apache, IIS, .NET
  42. Information on Agent Deployment (chart labels): 1.57, 1.59, 1.61, 1.63, 1.65, 1.67, 1.69, 1.58, 1.54
  43. Questions?
  44. Member of
  45. Alois Reitbauer, @aloisreitbauer, alois.reitbauer@ruxit.com, blog.ruxit.com
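
Slides 27 to 31 name "soft stickiness" as combining data locality with transparent failover, and mention routing with a wishlist and routing with failover, but the transcript does not show how this is implemented. The Python sketch below is one possible reading, not Ruxit's actual code: the `Router` class, its `wishlist` mapping from tenant to preferred node, and the byte-sum fallback rule are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical soft-sticky router: prefer a tenant's wished-for node,
    and fall back to any healthy node when that node is unavailable."""

    nodes: list[str]
    healthy: set[str] = field(default_factory=set)
    # Assumed structure: tenant name -> node that holds its data locally.
    wishlist: dict[str, str] = field(default_factory=dict)

    def route(self, tenant: str) -> str:
        preferred = self.wishlist.get(tenant)
        if preferred in self.healthy:
            # Normal case: data locality wins.
            return preferred
        # Failover case: pick a healthy node deterministically so repeated
        # requests from the same tenant land on the same fallback node.
        candidates = sorted(n for n in self.nodes if n in self.healthy)
        if not candidates:
            raise RuntimeError("no healthy node available")
        return candidates[sum(tenant.encode()) % len(candidates)]


router = Router(nodes=["A", "B", "C"],
                healthy={"A", "B", "C"},
                wishlist={"acme": "B"})
home = router.route("acme")          # preferred node while B is healthy
router.healthy.discard("B")          # B fails or is quarantined
fallback = router.route("acme")      # served by A or C instead
router.healthy.add("B")              # B recovers
back_home = router.route("acme")     # stickiness resumes without reconfiguration
```

The stickiness is "soft" in this sketch because the wishlist entry is never rewritten during failover, so traffic returns to the preferred node as soon as it is healthy again.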
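
Slide 23 states that the platform never runs above two thirds of capacity, and the architecture slides show three availability zones. The slides do not say how the two-thirds figure was chosen; one plausible reading, offered here only as an assumption, is that it lets two equally sized zones absorb the full load when the third is lost. The function name `survives_zone_loss` and the equal-capacity model are hypothetical.

```python
from fractions import Fraction

# Threshold taken from slide 23.
MAX_UTILIZATION = Fraction(2, 3)


def survives_zone_loss(zones: int, utilization: Fraction) -> bool:
    """Return True if the remaining zones can carry the total load after
    one zone fails. Assumes every zone has the same capacity (1 unit) and
    runs at the same utilization."""
    if zones < 2:
        return False
    total_load = zones * utilization      # load spread over all zones
    remaining_capacity = zones - 1        # capacity left after one zone fails
    return total_load <= remaining_capacity


at_limit = survives_zone_loss(3, MAX_UTILIZATION)     # 3 * 2/3 = 2 <= 2
over_limit = survives_zone_loss(3, Fraction(3, 4))    # 3 * 3/4 = 9/4 > 2
```

Under these assumptions, three zones at exactly two thirds utilization sit right at the boundary: any higher and the loss of one zone would overload the other two.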
