2. Shutterfly Inc.
• Founded in December 1999
• Public company (NASDAQ: SFLY)
• Millions of customers have billions of pictures on Shutterfly
• Photo site, books, sharing, prints, gifts
• Only photo sharing site that doesn’t downsample, compress, or force delete photos
• > 6B photos, adding 400TB/mo
April 30, 2010 Business Confidential 2
3. Existing Metadata Storage Architecture
• Metadata is persisted in RDBMS
• Images/media stored outside DB
• Java/Spring, C#, .NET
• Oracle™ RDBMS
• Sun™ servers and storage
• Vertically partitioned by function
• Hot Standbys used for availability
• > 20 TB of RDBMS storage
• > 10,000 ex/sec
• Extreme uptime requirements
4. Problems
• Time to Market
• Cost
• Performance
• Scalability
5. New Metadata Storage Architecture
• Performance
  • Reduce complexity
  • Partition data
• Scalability
  • Move to clustered system
• Time to Market
  • Simple API
• Cost
  • OSS software
  • Simple hardware
6. New Data Architecture Fundamentals
• Partition data
• Relax consistency (where applicable)
• Data locality
• Highly available configuration
• Keep design simple/fast
• Keep hardware simple/cheap
• Keep software simple/cheap
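
A minimal sketch of "partition data" combined with "data locality", assuming a stable hash of a partition key decides where a record lives. The shard names and the choice of a user id as the key are hypothetical; the slides do not describe the actual layout.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(user_id, shards=SHARDS):
    """Map a partition key to a shard using a stable hash.

    Every record keyed by the same user_id maps to the same shard,
    which keeps one user's metadata together (data locality).
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


if __name__ == "__main__":
    for uid in ("user-1001", "user-1002", "user-1003"):
        print(uid, "->", shard_for(uid))
```

Simple modulo hashing like this moves most keys when the shard count changes, so it is only a starting point for reasoning about the idea.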
7. MongoDB
• Open Source
• Best of RDBMS, yet not quite a key-value store
• Features we need
• Commercial support
• Active community
• Performance
8. MongoDB Development
• Data modeling
• Java, .NET
• Simple, fast development
• JSON just makes sense
• Data access layer
• GridFS
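
To illustrate why a JSON-style document suits metadata modeling, here is a small sketch of what one photo's metadata might look like as a single nested document. All field names and values are hypothetical and are not taken from Shutterfly's schema.

```python
import json

# Hypothetical photo-metadata document; fields are illustrative only.
photo = {
    "_id": "photo-0001",
    "owner_id": "user-1001",
    "filename": "beach.jpg",
    "size_bytes": 4194304,
    "tags": ["vacation", "family"],
    "exif": {"camera": "ExampleCam", "iso": 200},
}


def to_json(doc):
    """Serialize a document to a JSON string with stable key order."""
    return json.dumps(doc, sort_keys=True)


def from_json(text):
    """Parse a JSON string back into a Python dict."""
    return json.loads(text)


if __name__ == "__main__":
    print(to_json(photo))
```

Nested attributes and lists stay in one document, where a relational model would typically spread them across several joined tables.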
9. MongoDB in production
• Simple use case, simple project
• Primary and 2 replica DBs, 1 ‘lagged’
• Manual failover
• Monitoring: HTTP interface
• Tools: mongostat, custom RRD graphs
• Linux on Intel™
• MongoDB 1.4.2 (stable)
10. Going Live Plan
• Walk before you run
• Shutterfly project/product selection
• Write-through architecture
• Good metrics
• Subset of MongoDB features
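
One possible reading of the write-through bullet, sketched under assumptions: during rollout, a data access layer writes every record to both the existing store and the new one, while reads stay on the existing store until the new one is trusted. `DictStore` and `WriteThroughDAO` are hypothetical names, and the in-memory dicts stand in for real database clients.

```python
class DictStore:
    """In-memory stand-in for a real database client (hypothetical)."""

    def __init__(self):
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = value

    def get(self, key):
        return self.rows.get(key)


class WriteThroughDAO:
    """Write to both stores; read from the legacy system of record."""

    def __init__(self, legacy, new):
        self.legacy = legacy
        self.new = new

    def save(self, key, value):
        self.legacy.put(key, value)  # system of record is written first
        self.new.put(key, value)     # then the write goes through to the new store

    def load(self, key):
        return self.legacy.get(key)

    def in_sync(self, key):
        """Compare both copies, useful as a rollout metric."""
        return self.legacy.get(key) == self.new.get(key)


if __name__ == "__main__":
    dao = WriteThroughDAO(DictStore(), DictStore())
    dao.save("photo-0001", {"owner_id": "user-1001"})
    print(dao.load("photo-0001"), dao.in_sync("photo-0001"))
```

Keeping reads on the old store means a problem in the new one cannot affect customers, which fits the "walk before you run" theme.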
11. So how did we do?
• Time to Market
  • Application developed in 1 sprint
• Cost
  • 500% improvement
• Performance
  • 900% improvement
  • 18ms to 2ms avg latency for inserts
• Scalability
  • Shard on demand
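
For reference, the insert latency figures can be turned into a ratio and a percentage reduction. That the 900% figure corresponds to the 18 ms / 2 ms ratio is an assumption; the slides do not say how it was computed.

```python
def speedup(before_ms, after_ms):
    """Ratio of old latency to new latency."""
    return before_ms / after_ms


def latency_reduction_pct(before_ms, after_ms):
    """Percentage by which latency dropped relative to the old value."""
    return (before_ms - after_ms) / before_ms * 100


if __name__ == "__main__":
    print(speedup(18, 2))                          # 9.0
    print(round(latency_reduction_pct(18, 2), 1))  # 88.9
```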
12. The future
• More MongoDB
• Replication as durability (getLastError with w=2)
• Replica sets
• Excitement from developers
• Lots of attribute and media metadata types
• Object mapper
• New projects and old systems
• Evaluate as they come up
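
The "replication as durability" idea can be sketched as a small simulation: a write counts as durable only once at least `w` nodes, primary included, hold it. The `Node` class and `replicated_write` function are hypothetical and do not talk to MongoDB. The dict shows the general shape of the `getlasterror` command with a `w` option as used by MongoDB releases of that era.

```python
# Shape of the command a client would send after a write (not executed here).
GLE_COMMAND = {"getlasterror": 1, "w": 2}


class Node:
    """Simulated database node (hypothetical, in-memory only)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def apply(self, key, value):
        """Store the write if the node is up; report whether it succeeded."""
        if not self.up:
            return False
        self.data[key] = value
        return True


def replicated_write(primary, replicas, key, value, w=2):
    """Return True if at least `w` nodes (primary included) hold the write.

    Real replication is asynchronous; this sketch applies the write to
    every replica synchronously just to count acknowledgements.
    """
    if not primary.apply(key, value):
        return False
    acks = 1 + sum(1 for replica in replicas if replica.apply(key, value))
    return acks >= w


if __name__ == "__main__":
    primary = Node("primary")
    replicas = [Node("replica1"), Node("replica2", up=False)]
    print(replicated_write(primary, replicas, "photo-0001", {"ok": True}))
```

With w=2 the write survives the loss of any single node, which is what lets replication stand in for per-node disk durability.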
13. Lessons Learned
• Keep it simple
• Data Modeling
• Walk before you run
• Use Jira for MongoDB issues
• There is life after Larry
14. Q&A
Questions?
Contact:
kg@kennygorman.com
http://www.kennygorman.com
http://github.com/kgorman
http://www.shutterfly.com
kgorman@shutterfly.com