MongoDB at Yle

Published in: Technology

Transcript

  • 1. MongoDB @ Yle.fi An Evening with MongoDB Helsinki, June 11th Kalle Ylä-Anttila | Jussi Pöri
  • 2. Agenda • About Yle.fi • Technology stack • MongoDB • Use cases • Why • How • Lessons learned
  • 3. Yle.fi • Some stats • ~3.5 million unique browsers per week • yle.fi/uutiset 2.4 - Areena 1.6 • Lots of traffic during big events • sports, elections, etc. • ~40 developers (mostly external)
  • 4. Yle.fi • Main focus at the moment: Yle API • End user device fragmentation is a challenge • Provides “pure” data without presentation info • Enables flexibility and rapid changes • Hides the complexity of our legacy backend services (mainly broadcast services) • All new customer services are powered by Yle API
  • 5. Yle.fi • Yle API • Consists of several small APIs • Technology stacks vary between APIs • All APIs are used via public REST endpoints • No direct linking, no direct reads of another API's data store, no back doors whatsoever • = “The Jeff Bezos way” :) • We eat our own dog food
  • 6. Technology stack
  • 7. Yle API architecture
  • 8. Yle API architecture
  • 9. Architecture of a single API
  • 10. Why MongoDB • Ease of use • High availability • Automatic failover • Horizontal scaling • Widely used • Open source
  • 11. How • MongoDB 2.4.8 • 8 replica set clusters • 3 + 1 nodes per replica set • Overall 32 production nodes • Configuration with Puppet • MMS for monitoring • New Relic for server monitoring and alerts • Casbah (2.6.2) and ReactiveMongo (0.10.2)
  • 12. Case - uutisvahti • Yle news mobile app • Users can assign weights to topics that interest them
  • 13. Case - uutisvahti • Over 100 000 subscribers • 2.7M topic preferences • 12.6M articles read • 25M push notifications sent • Lead time from published news to app notification only ~2 s • 50 - 100 ms average API response time • Backend processing for news ratings
  • 14. Case - uutisvahti Stats • 15 449 826 objects • Avg. object size 1462 bytes • Data size 21 GB • Index size 14 GB
  • 15. Case - uutisvahti Lessons learned • Watch network storage IOPS • Use separate databases for heavy writes • A hidden (backup) replica set node still affects voting. Do not shut down 2 of 4 nodes. • Watch index sizes so that they do not fill your memory • Fewer getMores • Be careful when reindexing or updating indexes • Use secondary reads where possible • A slow network will affect replication
  • 16. Case - metrics • Provide metrics for applications about publications • Social media metrics, likes, tweets • View/play counts • Store historical data • Background processing to fetch data from different sources
  • 17. Case - metrics • some-data: 10 collections, 3.5 M objects, avg. object size 164 bytes, index size 650 MB, data size 557 MB • somedata: 5 collections, 176 M objects, avg. object size 167 bytes, index size 32 GB, data size 27.6 GB • somedata-comscore: 4 collections, 57 M objects, avg. object size 167 bytes, index size 10.3 GB, data size 8.2 GB • somedata-facebook: 4 collections, 3.5 M objects, avg. object size 164 bytes, index size 154 MB, data size 131 MB
  • 18. Case - metrics Lessons learned • No collection- or database-level isolation: be careful when updating big batches of data, because clients will see an incomplete view. • When querying medium amounts of data, be careful with built-in functions: mapReduce, distinct, and aggregate have fairly small built-in memory limits, so your query can fail to complete. • Sharding a cluster is expensive (lots of nodes). • When experimenting, use a graphical query browser (e.g. Robomongo); queries are sometimes easier to write when you include underscore.js.
  • 19. Future • Upgrade to 2.6.1 • SSD disks • Architecture with public cloud • Improve backups
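
The warning on slide 15 about not shutting down 2 of 4 nodes follows from majority arithmetic: a replica set can only elect a primary while a strict majority of its voting members is reachable. The sketch below is plain Python and does not talk to MongoDB. It assumes that all four members of a "3 + 1" replica set vote, including the hidden backup node, and the function names are hypothetical.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members: int, members_down: int) -> bool:
    """True if the members still up can form a majority."""
    members_up = voting_members - members_down
    return members_up >= majority(voting_members)


# Assumption: 3 data-bearing nodes + 1 hidden backup node, all voting.
VOTING_MEMBERS = 4

for down in range(3):
    print(down, "down ->", can_elect_primary(VOTING_MEMBERS, down))

# Slide 11: 8 replica sets of 3 + 1 nodes each.
TOTAL_NODES = 8 * (3 + 1)
```

With four voting members the majority is three, so losing one node is tolerated but losing two leaves the set without a primary. The same arithmetic gives the 32 production nodes quoted on slide 11.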

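The statistics on slides 14 and 17 can be cross-checked by dividing data size by object count. The sketch below is plain Python and assumes the reported "GB" values are binary gigabytes (1024³ bytes), which the slides do not state. Under that assumption the results land close to the average object sizes on the slides if those figures are read as bytes.

```python
GIB = 1024 ** 3  # assumption: the slides' "GB" means binary gigabytes


def avg_object_size_bytes(data_size_bytes: float, object_count: int) -> float:
    """Average object size implied by total data size and object count."""
    return data_size_bytes / object_count


# Slide 14, uutisvahti: 21 GB of data in 15 449 826 objects.
uutisvahti_avg = avg_object_size_bytes(21 * GIB, 15_449_826)

# Slide 17, somedata: 27.6 GB of data in roughly 176 M objects.
somedata_avg = avg_object_size_bytes(27.6 * GIB, 176_000_000)

print(f"uutisvahti: ~{uutisvahti_avg:.0f} bytes per object")
print(f"somedata:   ~{somedata_avg:.0f} bytes per object")
```

Both results are in the range of one to two kilobytes or less per object, so the data sizes and object counts on the slides are mutually consistent only with small documents.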