2. what is foursquare?
location-based social network - “check-in” to bars, restaurants, museums, parks, etc.
friend-finder (where are my friends right now?)
virtual game (badges, points, mayorships)
city guide (local, personalized recommendations)
location diary + stats engine (where was I a year ago?)
specials (get rewards at your favorite restaurant)
4. foursquare: the tech
Nginx, HAProxy
Scala, Lift
MongoDB, PostgreSQL (legacy)
(Kestrel, Munin, Ganglia, Python, Memcache, ...)
All on EC2
5. foursquare <3’s mongodb
fast
indexes & rich queries
sharding, auto-balancing
replication (see: http://engineering.foursquare.com/)
geo-indexes
amazing support
6. mongodb: our numbers
8 clusters
some sharded, some not
some master/slave, some replica set
~40 machines (m2.4xlarge on EC2, 68.4 GB RAM each)
2.3 billion records
~15k QPS
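A rough back-of-envelope reading of these numbers, as a sketch only. It assumes the ~40 machines are identical, that data and queries are spread perfectly evenly (in practice they are not), and that 1 GB means 2**30 bytes. The variable names are illustrative.

```python
# Back-of-envelope figures derived from the numbers on this slide.
# Assumptions (not from the deck): identical machines, perfectly even
# distribution of data and queries, and 1 GB = 2**30 bytes.

MACHINES = 40              # "~40 machines"
RAM_GB_PER_MACHINE = 68.4  # m2.4xlarge memory
RECORDS = 2.3e9            # "2.3 billion records"
QPS = 15_000               # "~15k QPS"

total_ram_bytes = MACHINES * RAM_GB_PER_MACHINE * 2**30
ram_bytes_per_record = total_ram_bytes / RECORDS
qps_per_machine = QPS / MACHINES

print(f"total RAM:      {total_ram_bytes / 2**40:.2f} TiB")
print(f"RAM per record: {ram_bytes_per_record:.0f} bytes")
print(f"QPS per machine: {qps_per_machine:.0f}")
```

If the average record plus its share of index data is larger than the per-record figure printed here, the full data set cannot sit in RAM, so it is the working set that has to fit.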
7. mongodb: lessons learned
keep working set in memory
avoid long-running queries (reads or writes)
monitor everything (especially per-collection stats)
shard from day 1
beware EBS
use small field names for large collections
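Why small field names matter: BSON stores every field name inside every document, so the length of a name is paid once per record. A minimal sketch of the effect, assuming flat documents that hold only integers and strings. The field names and the `bson_size` helper are hypothetical, not Foursquare's schema or a driver API.

```python
def bson_size(doc):
    """Size in bytes of a flat dict encoded as BSON (int and str values only)."""
    size = 4 + 1  # int32 document length prefix + trailing 0x00 terminator
    for name, value in doc.items():
        # one type byte, then the field name as a NUL-terminated cstring
        size += 1 + len(name.encode("utf-8")) + 1
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise TypeError(f"unsupported value type for {name!r}")
        if isinstance(value, int):
            # int32 when it fits, otherwise int64
            size += 4 if -2**31 <= value < 2**31 else 8
        else:
            # int32 length prefix + UTF-8 bytes + NUL terminator
            size += 4 + len(value.encode("utf-8")) + 1
    return size


# Hypothetical check-in record with long and short field names.
long_names = {"user_id": 12345, "venue_id": 67890, "created_at_epoch": 1300000000}
short_names = {"u": 12345, "v": 67890, "c": 1300000000}

saved_per_doc = bson_size(long_names) - bson_size(short_names)
print(bson_size(long_names), bson_size(short_names), saved_per_doc)
```

Under these assumptions the renamed document is 28 bytes smaller (54 versus 26 bytes). Multiplied across a collection with hundreds of millions of documents, that is memory the working set no longer needs.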
14. mongodb: pain points
mongos -- failover and thundering herds
index creation -- production impact unclear
auto-balancing -- getting there
replication chains -- use replica sets
15. rogue: a scala dsl for mongo
type-safe
all mongo query features
logging & validation hooks
pagination
index-aware
http://github.com/foursquare/rogue
17. rogue: code example
Venue where (_.mayorid <= 100)
  and (_.venuename eqs "Starbucks")
  and (_.tags contains "wifi")
  and (_.latlng near (39.0, -74.0, Degrees(0.2)))
  orderDesc (_._id)
  fetch (5)
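For readers unfamiliar with the DSL, here is roughly what the query above stands for in raw MongoDB terms, written as plain Python dicts so that it runs without a driver. This is a hand translation, not Rogue's actual output. The legacy `$near` / `$maxDistance` form, the coordinate order (kept as on the slide), and the `venue_query` helper are all assumptions.

```python
def venue_query(max_mayor_id, name, tag, lat, lng, max_degrees, limit):
    """Build filter, sort and limit for the slide's query (hand translation)."""
    query_filter = {
        "mayorid": {"$lte": max_mayor_id},  # _.mayorid <= 100
        "venuename": name,                  # _.venuename eqs "Starbucks"
        "tags": tag,                        # _.tags contains "wifi" (array membership)
        "latlng": {                         # _.latlng near (39.0, -74.0, Degrees(0.2))
            "$near": [lat, lng],
            "$maxDistance": max_degrees,
        },
    }
    sort = [("_id", -1)]                    # orderDesc (_._id)
    return query_filter, sort, limit


query_filter, sort, limit = venue_query(100, "Starbucks", "wifi", 39.0, -74.0, 0.2, 5)
print(query_filter)
print(sort, limit)
```

Nothing checks the field names or value types in these string-keyed dicts until the query runs, which is the gap the type-safe DSL is meant to close.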