
31 - IDNOG03 - Bergas Bimo Branarto (GOJEK) - Scaling Gojek


  1. Bergas Bimo Branarto, Arlinda Juwitasari, Rama Notowidigdo: SCALING GOJEK
  2. WHAT MAKES US WHO WE ARE
  3. OUR VALUES: SPEED, INNOVATION, SOCIAL IMPACT
  4. Product Releases. 2011: go-ride, go-send, go-shop (all phone order). January 2015: app release with go-ride, go-send, go-shop. April 2015: go-food. September 2015: go-mart. October 2015: go-box, go-massage, go-clean, go-glam, go-busway. December 2015: go-tix. January 2016: go-kilat (e-commerce partnership, not on app). April 2016: go-car.
  5. 11,000,000 downloads in 15 months (chart: cumulative total app downloads, January through August)
  6. WHERE ARE WE? JABODETABEK: 123,500 drivers. BANDUNG: 37,000 drivers. SURABAYA: 23,000 drivers. BALI: 11,200 drivers. BALIKPAPAN: 11,200 drivers. MAKASSAR: 7,100 drivers. YOGYAKARTA: 690 drivers. PALEMBANG: 510 drivers. MEDAN: 440 drivers. SEMARANG: 370 drivers.
  7. The Growth of Gojek Drivers (chart, monthly, October through July)
  8. The Growth of Customers (chart, monthly)
  9. The Growth of Orders (chart: number of order requests, with growth multipliers)
  10. CONWAY'S LAW: Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.
  11. SOCIAL, the monolithic backend, serving Driver, Customer, and Internal User clients (diagram)
  12. Issues: • Bugs • Unexpected load growth • Long process times, which crashed servers during peak hours. Challenges: • Small tech team (5 devs covering the mobile apps for customer and driver, 3 different portals, and the backend) versus 4 or more business divisions • Keeping servers alive under unexpectedly high load • Many features (and products) that need to be released to keep the business running
  13. Transform 1: a proxy in front of n backend copies (backend-1, backend-2, ..., backend-n) sharing one disk (diagram)
  14. Transform 1. Plus: at least we're still alive. Minus: long process times; DB bottleneck.
  15. Transform 2: proxy and core backend in front, with services A, B, and C split out and fed one-way through a Redis queue; all still share one disk (diagram)
  16. Transform 2. Plus: • Splitting functionality into services makes the code more efficient (at least for the new services) • The queue lets the core backend push and forget for one-way communication • Process time reduced • Queue workers can be throttled. Minus: • Process time is still too long for the incoming traffic, since only non-transactional functionality was split out • DB bottleneck.
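The two queue properties the slide names, push-and-forget and worker throttling, can be sketched in a few lines. A buffered Go channel stands in for the Redis queue here, and the job type and names are illustrative, not Gojek's:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// job is a one-way unit of work, e.g. sending a notification.
type job struct{ orderID int }

// startWorkers drains the queue with a fixed pool of n goroutines.
// The pool size is the throttle: it caps concurrency no matter how
// fast the core backend pushes.
func startWorkers(n int, queue <-chan job, handled *int64) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range queue {
				atomic.AddInt64(handled, 1) // stand-in for the real work
			}
		}()
	}
	return &wg
}

func main() {
	// Buffered channel standing in for the Redis queue.
	queue := make(chan job, 100)
	var handled int64
	wg := startWorkers(3, queue, &handled)

	// Core backend side: push and forget, no reply expected.
	for i := 1; i <= 10; i++ {
		queue <- job{orderID: i}
	}
	close(queue)
	wg.Wait()
	fmt.Println("handled:", handled) // handled: 10
}
```

The core backend returns to its caller as soon as the push lands on the queue, which is where the reduced process time comes from; transactional work that must answer in the same request cannot take this path, hence the Minus.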
  17. Transform 3: services A through E around the core backend; some services now own their own disk, talk over REST APIs, and share a Redis cache; the Redis queue remains (diagram)
  18. Transform 3. Plus: • Some transactional processes moved into separate services: load is split and process time reduced • Redis cache reduces the DB bottleneck • Each service owns its own DB, further reducing the DB bottleneck. Minus: • API calls in flows spanning more than 2 services cause cascading failures.
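The "Redis cache reduces the DB bottleneck" point is the classic cache-aside read path: check the cache before the database, and populate it on a miss. A minimal sketch, with a map standing in for Redis, a counter standing in for the database, and illustrative key names:

```go
package main

import "fmt"

// service owns its cache and (notionally) its own database,
// as in Transform 3.
type service struct {
	cache  map[string]string
	dbHits int
}

// dbLookup stands in for a real database query; every call here is
// one query the cache failed to absorb.
func (s *service) dbLookup(key string) string {
	s.dbHits++
	return "value-for-" + key
}

// get is the cache-aside read path: cache first, DB on miss,
// then populate the cache so the next read is cheap.
func (s *service) get(key string) string {
	if v, ok := s.cache[key]; ok {
		return v // cache hit: no DB query
	}
	v := s.dbLookup(key)
	s.cache[key] = v
	return v
}

func main() {
	s := &service{cache: map[string]string{}}
	// Five reads of a hot key cost exactly one DB query.
	for i := 0; i < 5; i++ {
		s.get("driver:42")
	}
	fmt.Println("db hits:", s.dbHits) // db hits: 1
}
```

Hot read paths (driver profiles, fare lookups) stop touching the database entirely after the first read, which is why the bottleneck shrinks even before each service gets its own DB.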
  19. Transform 4: a Kafka queue replaces the Redis queue, an inline Redis cache is shared across services, a new service F is added, each service owns its own disk, and services talk over gRPC (HTTP/2) (diagram)
  20. Transform 4. Plus: • Asynchronous communication between services via Kafka reduces API calls between services and therefore cascading failures • A shared (inline) Redis cache reduces DB queries, API calls between services, and cascading failures • gRPC (over HTTP/2) should reduce network time. Minus: ?
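Transform 4's key shift is that instead of service A calling B and C over REST (so B's failure cascades into A), A publishes an event and each subscriber consumes it independently. A minimal sketch of that decoupling, with a tiny in-memory broker standing in for Kafka; the topic and service names are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// broker is an in-memory stand-in for Kafka: one buffered channel
// per subscriber, loosely mirroring independent consumer groups.
type broker struct {
	mu   sync.Mutex
	subs map[string][]chan string
}

func newBroker() *broker {
	return &broker{subs: map[string][]chan string{}}
}

// subscribe registers an independent, buffered channel for one
// consumer on a topic.
func (b *broker) subscribe(topic string) <-chan string {
	b.mu.Lock()
	defer b.mu.Unlock()
	ch := make(chan string, 10)
	b.subs[topic] = append(b.subs[topic], ch)
	return ch
}

// publish is fire-and-forget: the producer never waits on its
// consumers, so a slow or dead consumer cannot stall the order flow.
func (b *broker) publish(topic, event string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for _, ch := range b.subs[topic] {
		ch <- event
	}
}

func main() {
	b := newBroker()
	// Two downstream services consume the same event independently.
	billing := b.subscribe("order-created")
	notify := b.subscribe("order-created")

	b.publish("order-created", "order#1001")

	fmt.Println("billing got:", <-billing) // billing got: order#1001
	fmt.Println("notify got:", <-notify)   // notify got: order#1001
}
```

The "Minus: ?" on the slide is honest: this style trades the cascading-failure problem for eventual consistency and harder-to-trace flows, but the deck stops before evaluating that.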
  21. Stack: • Java (Spring MVC and Spark) • Go • JRuby on Rails • AngularJS • MySQL • PostgreSQL • MongoDB • Elasticsearch • Redis • Kafka • RabbitMQ
  22. Response Time vs Throughput (chart: response time in ms against throughput in rpm, monthly)
  23. Order Growth vs Response Time (chart: orders against response time in ms)
  24. Order Growth vs Throughput (chart: orders against throughput in rpm)
  25. TRUE HAPPINESS IS THE JOURNEY, NOT THE DESTINATION. THANK YOU
