Implementing MongoDB at Shutterfly (Kenny Gorman)

M
MongoSFMongoSF
Implementing MongoDB

MongoSF
April 2010
Kenny Gorman, Data Architect
Shutterfly Inc.

 •  Founded in December 1999
 •  Public company (NASDAQ: SFLY)
 •  Millions of customers have billions of pictures on
    Shutterfly
 •  Photo site, books, sharing, prints, gifts
 •  Only photo sharing site that doesn’t down-
    sample, compress, or force delete photos
 •  > 6B photos, adding 400TB/mo



April 30, 2010                                Business Confidential   2
Existing Metadata Storage Architecture

 •       Metadata is persisted in RDBMS
 •       Images/media stored outside DB
 •       Java/Spring, C#,.Net
 •       Oracle™ RDBMS
 •       Sun™ servers and storage
 •       Vertically partitioned by function
 •       Hot Standbys used for availability
 •       > 20tb of RDBMS storage
 •       > 10000 ex/sec
 •       Extreme uptime requirements

April 30, 2010                                Business Confidential   3
Problems

 •       Time to Market
 •       Cost
 •       Performance
 •       Scalability




April 30, 2010            Business Confidential   4
New Metadata Storage Architecture

 •  Performance
                 ! Reduce complexity
                 ! Partition data
 •  Scalability
                 ! Move to clustered system
 •  Time to Market
                 ! Simple API
 •  Cost
                 ! OSS software
                 ! Simple hardware


April 30, 2010                                Business Confidential   5
New Data Architecture Fundamentals

 •       Partition data
 •       Relax consistency (where applicable)
 •       Data locality
 •       Highly available configuration
 •       Keep design simple/fast
 •       Keep hardware simple/cheap
 •       Keep software simple/cheap




April 30, 2010                                  Business Confidential   6
MongoDB

 •       Open Source
 •       Best of RDBMS, yet not quite k,v store
 •       Features we need
 •       Commercial support
 •       Active community
 •       Performance




April 30, 2010                                    Business Confidential   7
MongoDB Development

 •       Data modeling
 •       Java, .Net
 •       Simple, fast development
 •       JSON just makes sense
 •       Data access layer
 •       GridFS




April 30, 2010                      Business Confidential   8
MongoDB in production

 •       Simple use case, simple project
 •       Primary and 2 replica DB’s, 1 ‘lagged’
 •       Manual failover
 •       Monitoring: http interface
 •       Tools: mongostat, custom rrd graphs
 •       Linux on Intel™
 •       MongoDB 1.4.2 (stable)




April 30, 2010                                    Business Confidential   9
Going Live Plan

 •  Walk before you run
 •  Shutterfly project/product selection
 •  Write through architecture

 •  




 •  Good metrics
 •  Subset of MongoDB features

April 30, 2010                             Business Confidential   10
So how did we do?

 •  Time to Market
           •  Application developed in 1 sprint
 •  Cost
           •  500% improvement
 •  Performance
           •  900% improvement
           •  18ms to 2ms avg latency for inserts
 •  Scalability
           •  Shard on demand


April 30, 2010                                      Business Confidential   11
The future

 •  More MongoDB
           •  Replication as durability (getLasterror(w=2))
           •  Replica sets
 •  Excitement from developers
    •  Lots of attribute and media metadata types
    •  Object mapper
 •  New projects and old systems
           •  Evaluate as they come up




April 30, 2010                                          Business Confidential   12
Lessons Learned

 •  Keep it simple
 •  Data Modeling
 •  Walk before you run
 •  Use Jira for MongoDB issues
 •  There is life after Larry




April 30, 2010                    Business Confidential   13
Q&A

 Questions?

 Contact:
 kg@kennygorman.com
 http://www.kennygorman.com
 http://github.com/kgorman
 http://www.shutterfly.com
 kgorman@shutterfly.com




April 30, 2010                Business Confidential   14
1 of 14

More Related Content

What's hot(20)

Implementing MongoDB at Shutterfly (Kenny Gorman)

  • 2. Shutterfly Inc. •  Founded in December 1999 •  Public company (NASDAQ: SFLY) •  Millions of customers have billions of pictures on Shutterfly •  Photo site, books, sharing, prints, gifts •  Only photo sharing site that doesn’t down- sample, compress, or force delete photos •  > 6B photos, adding 400TB/mo April 30, 2010 Business Confidential 2
  • 3. Existing Metadata Storage Architecture •  Metadata is persisted in RDBMS •  Images/media stored outside DB •  Java/Spring, C#,.Net •  Oracle™ RDBMS •  Sun™ servers and storage •  Vertically partitioned by function •  Hot Standbys used for availability •  > 20tb of RDBMS storage •  > 10000 ex/sec •  Extreme uptime requirements April 30, 2010 Business Confidential 3
  • 4. Problems •  Time to Market •  Cost •  Performance •  Scalability April 30, 2010 Business Confidential 4
  • 5. New Metadata Storage Architecture •  Performance ! Reduce complexity ! Partition data •  Scalability ! Move to clustered system •  Time to Market ! Simple API •  Cost ! OSS software ! Simple hardware April 30, 2010 Business Confidential 5
  • 6. New Data Architecture Fundamentals •  Partition data •  Relax consistency (where applicable) •  Data locality •  Highly available configuration •  Keep design simple/fast •  Keep hardware simple/cheap •  Keep software simple/cheap April 30, 2010 Business Confidential 6
  • 7. MongoDB •  Open Source •  Best of RDBMS, yet not quite k,v store •  Features we need •  Commercial support •  Active community •  Performance April 30, 2010 Business Confidential 7
  • 8. MongoDB Development •  Data modeling •  Java, .Net •  Simple, fast development •  JSON just makes sense •  Data access layer •  GridFS April 30, 2010 Business Confidential 8
  • 9. MongoDB in production •  Simple use case, simple project •  Primary and 2 replica DB’s, 1 ‘lagged’ •  Manual failover •  Monitoring: http interface •  Tools: mongostat, custom rrd graphs •  Linux on Intel™ •  MongoDB 1.4.2 (stable) April 30, 2010 Business Confidential 9
  • 10. Going Live Plan •  Walk before you run •  Shutterfly project/product selection •  Write through architecture •  •  Good metrics •  Subset of MongoDB features April 30, 2010 Business Confidential 10
  • 11. So how did we do? •  Time to Market •  Application developed in 1 sprint •  Cost •  500% improvement •  Performance •  900% improvement •  18ms to 2ms avg latency for inserts •  Scalability •  Shard on demand April 30, 2010 Business Confidential 11
  • 12. The future •  More MongoDB •  Replication as durability (getLasterror(w=2)) •  Replica sets •  Excitement from developers •  Lots of attribute and media metadata types •  Object mapper •  New projects and old systems •  Evaluate as they come up April 30, 2010 Business Confidential 12
  • 13. Lessons Learned •  Keep it simple •  Data Modeling •  Walk before you run •  Use Jira for MongoDB issues •  There is life after Larry April 30, 2010 Business Confidential 13
  • 14. Q&A Questions? Contact: kg@kennygorman.com http://www.kennygorman.com http://github.com/kgorman http://www.shutterfly.com kgorman@shutterfly.com April 30, 2010 Business Confidential 14