App Sharding to Autosharding at Sailthru

App Sharding to Autosharding at Sailthru Presentation Transcript

  • 1. App Sharding to Autosharding • Ian White, CTO and Co-Founder, Sailthru • @eonwhite • ian@sailthru.com • www.sailthru.com
  • 2. • Every user is unique • Email, onsite, mobile, social, offline personalization on an individual level • Optimizes conversion and drives retention for eCommerce and media • Founded in 2008 by three engineers • 170 employees in NYC, SF, LA, London
  • 3. Sailthru Using MongoDB • Primary datastore since 2009 • 120 replica set nodes • 40 TB • On metal infrastructure • 25,000 writes/second
  • 4. Basic Sailthru Objects • 850 million user profiles • 75 million content documents • 2.5 billion messages per month
  • 5. The Challenge • Sailthru is both • Some apps are read-heavy • Some apps are write-heavy
  • 6. Why Shard? • Using MongoDB since 2009 • No autosharding capabilities at the time • Too much data for a single node
  • 7. Application Sharding? • Application-level sharding • Partition data by client • Db class examines query and routes to an appropriate replica set and collection
  • 8. Application Sharding
    Query: db['profile'].find({"client_id": 450, "email": "ian@sailthru.com"})
    Routed query: db['profile.450'].find({"email": "ian@sailthru.com"})
    Shard map config file: {"profile": {"shard_key": "client_id", "shards": {"450": "profile1", "766": "profile2"}}}
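The routing step on slides 7 and 8 can be sketched as follows. This is a minimal illustration, not Sailthru's actual Db class: the function name `routeQuery` and the shape of its return value are assumptions, and only the shard map mirrors the config file shown on the slide.

```javascript
// Shard map mirroring the config file on the slide.
const shardMap = {
  profile: {
    shard_key: "client_id",
    shards: { "450": "profile1", "766": "profile2" },
  },
};

// Hypothetical router: look up the replica set for the shard key value,
// fold that value into the collection name, and drop it from the filter.
function routeQuery(collection, filter) {
  const entry = shardMap[collection];
  if (!entry) {
    throw new Error(`no shard map entry for collection "${collection}"`);
  }
  const keyValue = filter[entry.shard_key];
  if (keyValue === undefined) {
    throw new Error(`query must include shard key "${entry.shard_key}"`);
  }
  const replicaSet = entry.shards[String(keyValue)];
  if (!replicaSet) {
    throw new Error(`no shard assigned for ${entry.shard_key}=${keyValue}`);
  }
  // Remove the shard key from the filter; the collection name now carries it.
  const { [entry.shard_key]: _dropped, ...rest } = filter;
  return { replicaSet, collection: `${collection}.${keyValue}`, filter: rest };
}

const routed = routeQuery("profile", { client_id: 450, email: "ian@sailthru.com" });
console.log(routed);
```

Every query must carry the shard key in this scheme, which is why the sketch rejects filters without `client_id` rather than fanning out to every replica set.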
  • 9. App Sharding: Advantages • Smaller indexes due to collection partitioning • Ability to add specific indices per client (not done much in practice)
  • 10. App Sharding: Problems • Uneven load distribution • Writes bottlenecked by capacity of single server • Manual rebalancing and allocation = lots of work for DB team
  • 11. Solution: Autosharding (Since MongoDB 1.6)
  • 12. Selecting a Shard Key • Individual reads • Individual writes • Cursored reads
  • 13. Shard Key Options • client_id? Uneven distribution • email? Hard to handle null bucket • _id? Uneven time-based distribution
  • 14. Best Option: sh.shardCollection("profile", { _id: "hashed" }) • Hash of _id • Available since MongoDB 2.4
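Why a hashed _id spreads load more evenly than a raw time-based _id can be illustrated with a small simulation. It is a sketch under loud assumptions: a plain counter stands in for time-ordered ids, the range boundaries are invented, and MD5 modulo the shard count is only a stand-in, not MongoDB's actual hashed index or chunk placement.

```javascript
const crypto = require("crypto");

const SHARDS = 4;

// Stand-in for time-ordered ids: a monotonically increasing counter.
const ids = Array.from({ length: 10000 }, (_, i) => 1000000 + i);

// Range-style placement: each shard owns a fixed contiguous slice of the
// key space (hypothetical boundaries every 500,000 ids).
function rangeShard(id) {
  return Math.min(SHARDS - 1, Math.floor(id / 500000));
}

// Hash-style placement: hash the id, then map the hash onto a shard.
function hashedShard(id) {
  const digest = crypto.createHash("md5").update(String(id)).digest();
  return digest.readUInt32BE(0) % SHARDS;
}

// Count how many ids each shard receives under a placement function.
function distribution(place) {
  const counts = new Array(SHARDS).fill(0);
  for (const id of ids) {
    counts[place(id)] += 1;
  }
  return counts;
}

const rangeCounts = distribution(rangeShard);
const hashedCounts = distribution(hashedShard);

// All recent ids fall into one range, so a single shard takes every write.
console.log("range: ", rangeCounts);
// Hashing scatters consecutive ids across all shards.
console.log("hashed:", hashedCounts);
```

The trade-off is that a hashed key gives up range queries on _id, since consecutive ids no longer live together.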
  • 15. What about lookups by email? Don’t want to hit every shard on every lookup
  • 16. Solution: key collection {'_id': '<client> <keytype> <sha256_of_value>', 'sid': <mongoid>} [Diagram: profile.key and profile collections, linked by _id] • Two quick lookups to individual shards is more scalable than hitting all. • And autoshard that!
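The two-step lookup through the key collection can be sketched in memory as below. The `Map` objects stand in for the sharded `profile` and `profile.key` collections, and the helper names (`keyId`, `saveProfile`, `findByEmail`) and the sample profile are hypothetical; only the `_id` layout and the `sid` field come from the slide.

```javascript
const crypto = require("crypto");

// In-memory stand-ins for the two collections.
const profile = new Map();    // profile _id -> profile document
const profileKey = new Map(); // key _id -> { sid: profile _id }

// Build the key document _id: "<client> <keytype> <sha256_of_value>".
function keyId(clientId, keyType, value) {
  const digest = crypto.createHash("sha256").update(value).digest("hex");
  return `${clientId} ${keyType} ${digest}`;
}

// Write the profile and its email key entry.
function saveProfile(doc) {
  profile.set(doc._id, doc);
  profileKey.set(keyId(doc.client_id, "email", doc.email), { sid: doc._id });
}

// Lookup 1 resolves the email to a profile _id; lookup 2 fetches the profile.
// Each lookup is by _id, so each can be routed to a single shard.
function findByEmail(clientId, email) {
  const key = profileKey.get(keyId(clientId, "email", email));
  if (!key) {
    return null;
  }
  return profile.get(key.sid) || null;
}

saveProfile({ _id: "p1", client_id: 450, email: "ian@sailthru.com" });
console.log(findByEmail(450, "ian@sailthru.com"));
console.log(findByEmail(766, "ian@sailthru.com"));
```

Because the key _id is fully determined by client, key type, and value, the first lookup never needs a scatter-gather query.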
  • 17. How We Did The Move • Uptime is critical: we cannot bring the service down for infrastructure changes
  • 18. Solution: Mongo-Connector Created by MongoDB interns two summers ago. The Swiss army knife of moving data from set to set.
  • 19. Solution: Mongo-Connector • Tail oplog in legacy replica set • Pipe data into autoshard cluster with mongo-connector • Repoint app to read/write autoshard • Zero downtime
  • 20. Solution: Mongo-Connector • Our fork contains some improvements • ts (timestamp) and ns (namespace) are stored in a separate collection instead of in the target document • https://github.com/sailthru/mongo-connector
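The oplog-tailing idea behind the migration can be sketched as a replay loop. This is not mongo-connector's code (that project is written in Python); it is an in-memory illustration that assumes simplified oplog entries with an integer `ts`, handles only inserts, `$set` or full-replacement updates, and deletes, and uses hypothetical names throughout.

```javascript
// Target "cluster": namespace -> Map(_id -> document).
const target = new Map();

// Apply one simplified oplog entry: op is "i" (insert), "u" (update) or "d" (delete).
function applyOplogEntry(entry) {
  if (!target.has(entry.ns)) {
    target.set(entry.ns, new Map());
  }
  const coll = target.get(entry.ns);
  switch (entry.op) {
    case "i":
      coll.set(entry.o._id, { ...entry.o });
      break;
    case "u": {
      // o2 identifies the document; o holds either $set or a replacement.
      const id = entry.o2._id;
      const current = coll.get(id) || { _id: id };
      const next = entry.o.$set
        ? { ...current, ...entry.o.$set }
        : { ...entry.o, _id: id };
      coll.set(id, next);
      break;
    }
    case "d":
      coll.delete(entry.o._id);
      break;
    default:
      break; // other entry types are skipped in this sketch
  }
}

// Replay entries newer than the checkpoint and return the new checkpoint.
function replay(entries, afterTs = 0) {
  let lastTs = afterTs;
  for (const entry of entries) {
    if (entry.ts <= lastTs) {
      continue;
    }
    applyOplogEntry(entry);
    lastTs = entry.ts;
  }
  return lastTs;
}

const oplog = [
  { ts: 1, op: "i", ns: "app.profile", o: { _id: "p1", email: "a@example.com" } },
  { ts: 2, op: "i", ns: "app.profile", o: { _id: "p2", email: "b@example.com" } },
  { ts: 3, op: "u", ns: "app.profile", o2: { _id: "p1" }, o: { $set: { vip: true } } },
  { ts: 4, op: "d", ns: "app.profile", o: { _id: "p2" } },
];

const checkpoint = replay(oplog);
console.log(checkpoint, [...target.get("app.profile").values()]);
```

Keeping the checkpoint separate from the replicated documents is in the same spirit as the fork's choice to store ts and ns in their own collection.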
  • 21. But Wait! There’s More • Mongo-Connector can also be used to: • Pipe data into alternate data stores (Hadoop, Solr, etc.) • Change autoshard keys if you made a mistake
  • 22. In Conclusion • Autosharding is helpful • Think about shard key early • Start by writing to a mongos, even when it's just one set
  • 23. Q&A • www.sailthru.com • sales@sailthru.com • 817.812.8689 • @sailthru • NYC HQ: 160 Varick St., 12th Floor, New York, NY 10013 • San Francisco: 25 Taylor St., Room 724, San Francisco, CA 94102 • London: 18 Soho Square, London, UK, W1D 3QL • Los Angeles: 7083 Hollywood Blvd, Los Angeles, CA 90028 • Ian White, CTO and Co-Founder, Sailthru, @eonwhite, ian@sailthru.com