Synchronized video and slides, plus MP3 and slide downloads, are available at http://bit.ly/1bJ6PjQ.
Paul Osman discusses their experiences evolving 500px from a single, monolithic Ruby on Rails application to a series of composable microservices written in Ruby and Go. Filmed at qconsf.com.
Paul Osman is the Director of Platform Engineering at 500px, the world’s premier photography community.
Watch the video with slide synchronization on InfoQ.com: http://www.infoq.com/presentations/500px-services
Presented at QCon San Francisco
www.qconsf.com
5. “Any organization that designs a system
(defined broadly) will produce a design whose
structure is a copy of the organization’s
communication structure.”
- Melvin Conway, 1967
11. January 2014
• Classic web architecture (LB, App Servers, DB Servers).
• Many data stores, one large monolithic codebase.
• Scaling: all or nothing.
• Cascading failures. Lots of SPOFs.
• Large, unwieldy codebase.
• “Get in line” approach to deployments.
12. Search & Indexing
• Indexing is naturally asynchronous.
• Introduced dependency on RabbitMQ.
• Wrote new indexer in Go. First service and first Go project.
- Huge performance improvement. 20 hours to 20 minutes.
• Search service - HTTP service that translates query strings into ES queries.
• Eliminated major SPOF. ElasticSearch / Search outages no longer affect customers.
• Isolated knowledge of data store (ElasticSearch).
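
The search service described above, an HTTP service that translates query strings into Elasticsearch queries, can be sketched roughly as follows. This is a minimal illustration and not 500px's actual code: the parameter names `q` and `page`, the `title` field, the page size of 20, and the `esQuery` type are all assumptions made for the example. Only the standard library is used, and the function stops at producing the request body rather than calling Elasticsearch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// esQuery is a hypothetical, minimal subset of an Elasticsearch search body.
type esQuery struct {
	Query map[string]interface{} `json:"query"`
	From  int                    `json:"from"`
	Size  int                    `json:"size"`
}

// buildQuery translates a raw query string such as "q=sunset&page=2" into
// a JSON search body. The parameter and field names are assumptions.
func buildQuery(raw string) ([]byte, error) {
	params, err := url.ParseQuery(raw)
	if err != nil {
		return nil, err
	}

	const pageSize = 20
	page, err := strconv.Atoi(params.Get("page"))
	if err != nil || page < 1 {
		page = 1 // fall back to the first page on missing or invalid input
	}

	body := esQuery{
		Query: map[string]interface{}{
			"match": map[string]interface{}{"title": params.Get("q")},
		},
		From: (page - 1) * pageSize,
		Size: pageSize,
	}
	return json.Marshal(body)
}

func main() {
	body, err := buildQuery("q=sunset&page=2")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Because callers only ever see the query-string interface, knowledge of the data store stays inside this one service, which is the isolation the slide refers to.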
13. Uploads & Resizing
• Perfect for first service to move to EC2.
• Continued in Go, because it was working well for us.
• Performance improvement. Upload times dropped significantly.
- Bulk uploads via Lightroom plugin.
• Eliminated another SPOF. Next step: front-end work to communicate upload problems gracefully to the user.
• Eliminated write-half of S3 knowledge. Read-half still to be done.
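
To give a feel for what a resizing step in Go can look like, here is a small sketch using only the standard `image` package. It is not the 500px implementation: the `fitWithin`, `resizeNearest`, and `solid` helpers are hypothetical, nearest-neighbour sampling is chosen only because it is the simplest technique to show (a production service would likely use a higher-quality filter), and `limit` is assumed to be positive.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// fitWithin scales (w, h) down so the longer side equals limit, preserving
// the aspect ratio. Sizes that already fit are returned unchanged.
func fitWithin(w, h, limit int) (int, int) {
	if w <= limit && h <= limit {
		return w, h
	}
	if w >= h {
		nh := h * limit / w
		if nh < 1 {
			nh = 1
		}
		return limit, nh
	}
	nw := w * limit / h
	if nw < 1 {
		nw = 1
	}
	return nw, limit
}

// resizeNearest returns a downscaled copy using nearest-neighbour sampling.
func resizeNearest(src image.Image, limit int) *image.RGBA {
	sb := src.Bounds()
	w, h := fitWithin(sb.Dx(), sb.Dy(), limit)
	dst := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			// Map each destination pixel back to its source pixel.
			sx := sb.Min.X + x*sb.Dx()/w
			sy := sb.Min.Y + y*sb.Dy()/h
			dst.Set(x, y, src.At(sx, sy))
		}
	}
	return dst
}

// solid builds a w x h image filled with one opaque colour (sample input).
func solid(w, h int, r, g, b uint8) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.SetRGBA(x, y, color.RGBA{R: r, G: g, B: b, A: 255})
		}
	}
	return img
}

func main() {
	w, h := fitWithin(4000, 3000, 2048)
	fmt.Println(w, h)

	thumb := resizeNearest(solid(400, 300, 255, 0, 0), 100)
	fmt.Println(thumb.Bounds().Dx(), thumb.Bounds().Dy())
}
```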
14. Activities & Notifications
• Ongoing.
• Replacing poorly designed service built on top of MongoDB.
• Will eliminate another SPOF. Fix a lot of reliability complaints.
• Get to revisit architecture, centre around data needs (both real-time and ETL workflows).
31. Where to Start
• Pick a business capability that:
- Is off the critical path
- Can easily be double-written to (e.g. indexing, aggregating)
• Look out for natural integration points between disparate systems; these are also good candidates.
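
The "double-written" idea above can be sketched as a thin wrapper that sends each write to both the existing code path and the new service. This is an illustration under assumptions, not the talk's code: the `Indexer` interface, `memoryIndexer`, and `doubleWriter` are hypothetical names, and a simple error counter stands in for whatever logging or metrics a real system would use.

```go
package main

import (
	"errors"
	"fmt"
)

// Indexer is a hypothetical interface shared by the legacy code path and
// the new service being introduced alongside it.
type Indexer interface {
	Index(photoID int) error
}

// memoryIndexer records indexed IDs in memory; it stands in for a real backend.
type memoryIndexer struct {
	ids  []int
	fail bool // when true, every write fails (simulates an outage)
}

func (m *memoryIndexer) Index(photoID int) error {
	if m.fail {
		return errors.New("indexer unavailable")
	}
	m.ids = append(m.ids, photoID)
	return nil
}

// doubleWriter sends every write to both systems. The legacy system stays
// the source of truth; failures in the candidate are counted, not returned.
type doubleWriter struct {
	legacy       Indexer
	candidate    Indexer
	shadowErrors int
}

func (d *doubleWriter) Index(photoID int) error {
	if err := d.legacy.Index(photoID); err != nil {
		return err // the existing path behaves exactly as before
	}
	if err := d.candidate.Index(photoID); err != nil {
		d.shadowErrors++ // record the failure for later investigation
	}
	return nil
}

func main() {
	legacy := &memoryIndexer{}
	candidate := &memoryIndexer{fail: true}
	w := &doubleWriter{legacy: legacy, candidate: candidate}

	err := w.Index(42)
	fmt.Println(err, legacy.ids, w.shadowErrors)
}
```

Since a failure in the new service never reaches the caller, the new system can be run and compared against the old one before any reads are switched over.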
32. Design for Business Capabilities
• Break your application into business capabilities (e.g. search, recommendations, uploads).
• Tempting to design around technology layers (front-end, back-end). Don’t.
• Even if you don’t (yet) have the headcount, treat each business capability as if it had its own team.
35. Team Structure
• Web App Team - Own front-end code, web-bff API facade
• Search Team - Own search infrastructure & service
• Media Team - Own upload service, resizing & converting services, watermarking, etc
• Mobile Team - Own mobile apps, mobile-bff API facade
• Platform Eng - Everything else, operations, devtools, QA / Release engineering
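
As one possible reading of how an API facade such as web-bff might call a service owned by another team, the sketch below wraps a search call in a deadline and falls back to a degraded page when the call fails or times out. Everything here is an assumption for illustration: the `searchFunc` signature, the `page` type, the `renderSearch` helper, and the stand-in `healthy`, `failing`, and `slow` backends are hypothetical and do not come from the talk.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// searchFunc stands in for a client call to the search service.
type searchFunc func(ctx context.Context, q string) ([]string, error)

// page is what the facade hands to the front end.
type page struct {
	Results  []string
	Degraded bool // true when search was unavailable or too slow
}

// renderSearch calls the search service with a deadline and degrades
// instead of failing the whole request.
func renderSearch(ctx context.Context, search searchFunc, q string, timeout time.Duration) page {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	type result struct {
		photos []string
		err    error
	}
	ch := make(chan result, 1) // buffered so the goroutine never blocks
	go func() {
		photos, err := search(ctx, q)
		ch <- result{photos, err}
	}()

	select {
	case r := <-ch:
		if r.err != nil {
			return page{Degraded: true}
		}
		return page{Results: r.photos}
	case <-ctx.Done():
		return page{Degraded: true}
	}
}

func healthy(ctx context.Context, q string) ([]string, error) {
	return []string{q + "-1", q + "-2"}, nil
}

func failing(ctx context.Context, q string) ([]string, error) {
	return nil, errors.New("search unavailable")
}

func slow(ctx context.Context, q string) ([]string, error) {
	select {
	case <-time.After(time.Second):
		return []string{"late"}, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// demo exercises the three cases: healthy, failing, and timed out.
func demo() (ok, failed, timedOut page) {
	ctx := context.Background()
	ok = renderSearch(ctx, healthy, "sunset", time.Second)
	failed = renderSearch(ctx, failing, "sunset", time.Second)
	timedOut = renderSearch(ctx, slow, "sunset", 10*time.Millisecond)
	return
}

func main() {
	ok, failed, timedOut := demo()
	fmt.Println(ok.Results, failed.Degraded, timedOut.Degraded)
}
```

Keeping this fallback logic in the facade means the team that owns the front end decides how an outage in another team's service is presented to users.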