Breaking the Monolith
Organizing Your Team to Embrace Microservices
Paul Osman, 500px
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/500px-services
Presented at QCon San Francisco
www.qconsf.com
“Any organization that designs a system
(defined broadly) will produce a design whose
structure is a copy of the organization’s
communication structure.”
- Melvin Conway, 1967
A Story
500px - Made of people
500px Team
• Approx 40 full-time employees

• Approx 20 full-time engineers + 6 designers / UX researchers

• Product Teams

- iOS & Android (3)

- Prime (2)

- 500px Web (4)

- Platform Engineering (Operations + Dev) (7)

- QA / Release Engineering (3)
Microservices
Our Journey
January 2014
• Classic web architecture (LB, App Servers, DB Servers).

• Many data stores, one large monolithic codebase. 

• Scaling: all or nothing.

• Cascading failures. Lots of SPOFs.

• Large, unwieldy codebase.

• “Get in line” approach to deployments.
Search & Indexing
• Indexing is naturally asynchronous.

• Introduced dependency on RabbitMQ.

• Wrote new indexer in Go. First service and first Go project.

- Huge performance improvement. 20 hours to 20 minutes.

• Search service - HTTP service that translates query strings into ES queries.

• Eliminated major SPOF. ElasticSearch / Search outages no longer affect customers.

• Isolated knowledge of data store (ElasticSearch).
Uploads & Resizing
• Perfect for first service to move to EC2.

• Continued in Go, because it was working well for us.

• Performance improvement. Upload times dropped significantly. 

- Bulk uploads via Lightroom plugin.

• Eliminated another SPOF. Next step: front-end work to gracefully communicate failures to the user.

• Eliminated write-half of S3 knowledge. Read-half still to be done.
Activities & Notifications
• Ongoing.

• Replacing poorly designed service built on top of MongoDB.

• Will eliminate another SPOF. Fix a lot of reliability complaints.

• Get to revisit architecture, centre around data needs (both real-time and ETL workflows).
3 Lessons
Lesson # 1 - DevOps
12 Factor Apps *
* http://12factor.net/
Monitor Everything
On Call Rotation
Measure Incident Response
* MTTD
* MTTR
Gameday Exercises
Automation
Lesson # 2 - Service Design
Where to Start
500px in January
Search & Indexing
Uploads & Resizing
Use Circuit Breakers
https://github.com/500px/ruby-service-client
Where to Start
• Pick a business capability that is:

• Off the Critical Path

• Can be double-written to easily (e.g. indexing, aggregating, etc.)

• Look out for natural integration points between disparate systems; they are also good candidates
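
"Double-written" can be read as sending each write to both the existing code path and the new service while the new one is being proven out. A rough sketch under that reading follows; every type and name here is hypothetical, and a buffered channel stands in for a real message queue.

```go
package main

import (
	"errors"
	"fmt"
)

type Photo struct {
	ID    int
	Title string
}

// Indexer is anything that can accept a photo for indexing.
type Indexer interface {
	Index(p Photo) error
}

// legacyIndexer stands in for the existing in-monolith code path.
type legacyIndexer struct{ indexed []int }

func (l *legacyIndexer) Index(p Photo) error {
	l.indexed = append(l.indexed, p.ID)
	return nil
}

// queueIndexer stands in for publishing to the new service's queue.
type queueIndexer struct{ queue chan Photo }

func (q *queueIndexer) Index(p Photo) error {
	select {
	case q.queue <- p:
		return nil
	default:
		return errors.New("queue full")
	}
}

// dualWriter sends every write to both paths. Only the primary path can
// fail the request; shadow failures are counted and otherwise ignored.
type dualWriter struct {
	primary      Indexer
	shadow       Indexer
	shadowErrors int
}

func (d *dualWriter) Index(p Photo) error {
	if err := d.primary.Index(p); err != nil {
		return err
	}
	if err := d.shadow.Index(p); err != nil {
		d.shadowErrors++
	}
	return nil
}

func main() {
	legacy := &legacyIndexer{}
	shadow := &queueIndexer{queue: make(chan Photo, 1)}
	w := &dualWriter{primary: legacy, shadow: shadow}
	for id := 1; id <= 2; id++ {
		if err := w.Index(Photo{ID: id, Title: "untitled"}); err != nil {
			fmt.Println("index failed:", err)
		}
	}
	fmt.Println("legacy:", len(legacy.indexed), "queued:", len(shadow.queue), "shadow errors:", w.shadowErrors)
}
```

Keeping the new path off the critical path this way means a bug in the new service shows up as a counter, not as a customer-facing error.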
Design for Business Capabilities
• Break your application into business capabilities (e.g. search, recommendations, uploads).

• Tempting to design around technology layers (front-end, back-end). Don’t.

• Even if you don’t (yet) have the numbers, treat each business capability as a team.
Lesson # 3 - Teams
API Facades
Team Structure
• Web App Team - Own front-end code, web-bff API facade

• Search Team - Own search infrastructure & service

• Media Team - Own upload service, resizing & converting services, watermarking, etc

• Mobile Team - Own mobile apps, mobile-bff API facade

• Platform Eng - Everything else, operations, devtools, QA / Release engineering
Cross Functional Teams
Move People Around
Thank you!
Paul Osman

@paulosman

paul@500px.com
https://500px.com/jobs
