
MongoDB@sfr.fr

Published in: Technology, Design

  1. MongoDB @ SFR (sfr.fr)
  2. Welcome: Antoine Raith, technical team leader @ SFR
  3. Web development at Internet Direction: Apache, Tomcat, JEE; 1 shared platform; 30 physical application servers; 150 Tomcat instances deployed
  4. What do we face? 22M pageviews per day, 4.5M of them on the homepage alone; 8M customer authentications per day. We NEED to scale!
  5. NoSQL? Increase our scalability; avoid schema/table/column dependency; closer to the developer team than to the sysadmin or DBA teams
  6. Why MongoDB? Scalable; complex queries; schema-less; easy deployment and monitoring; open source
  7. Our projects based on MongoDB: [Live project] customer data; [Live project] sfr.fr targeted ads; [Development project] products catalog
  8. MongoDB @ SFR: Customer Data
  9. Hello! Jérôme Leleu, web architect @ SFR, in charge of SSO and the user profile service
  10. Context: User profile service (UPS). Web services (SOAP or JSON) that return the profile of SFR clients. Data are aggregated from many backends of the information system.
  11. Data in UPS
      Java 1.6, mongo driver 2.6.5, replica set + sharding
      Technical data: « local storage » collection
      ■ only 1 collection in a database
      ■ « last connection date » of web account
      ■ 14 million documents
      ■ reads/writes by identifier of the web account (shard key)
      Some functional data are coming: « internautes » collection (6 million) …
  12. Implementation in MongoDB
      My choice: read on slave and write (without acknowledgement) on master.
      The « local storage » collection needs to be readable immediately after a write
      -> not really compatible with asynchronous replication and reads on slave
      -> use memcached (as for most data in UPS) as a cache for reads (letting replication happen)
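The read path described on slide 12 can be sketched roughly as follows. This is a minimal illustration only: plain `Map` objects stand in for the master, the slave and memcached, and the function names (`writeLastConnection`, `readLastConnection`, `replicate`) are hypothetical, not taken from the UPS code.

```javascript
// Toy stand-ins: plain Maps play the roles of the master, the slave and memcached.
// Nothing here talks to a real MongoDB or memcached instance.
const master = new Map();
const slave = new Map(); // lags behind the master until replicate() runs
const cache = new Map();

// Write to the master (unacknowledged in the real setup) and refresh the cache.
function writeLastConnection(accountId, date) {
  master.set(accountId, date);
  cache.set(accountId, date);
}

// Read from the cache first; fall back to the (possibly stale) slave.
function readLastConnection(accountId) {
  if (cache.has(accountId)) {
    return cache.get(accountId);
  }
  return slave.get(accountId);
}

// Simulates asynchronous replication catching up.
function replicate() {
  for (const [id, date] of master) {
    slave.set(id, date);
  }
}
```

With this arrangement a value is readable right after the write even though the slave has not received it yet; once the cache entry expires, replication is assumed to have caught up.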
  13. Some figures
      2 GB of data and 2 GB of index for 14 million documents (from « db.stats(); »)
      Inserts/updates: 600 k each day; communication exceptions: 6 k each day
      Average insert/update time: 56 ms
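To put the figures on slide 13 in proportion, here is a small back-of-the-envelope calculation. It assumes each communication exception corresponds to one failed insert/update, which the slide does not state, and it gives a daily average rather than a peak rate.

```javascript
const writesPerDay = 600000;    // inserts/updates per day (from the slide)
const exceptionsPerDay = 6000;  // communication exceptions per day (from the slide)
const secondsPerDay = 24 * 60 * 60;

// Assumption: one exception = one failed insert/update.
const exceptionRate = exceptionsPerDay / writesPerDay;
// Daily average, not a peak figure.
const avgWritesPerSecond = writesPerDay / secondsPerDay;

console.log((exceptionRate * 100).toFixed(1) + "% of writes raise an exception");
console.log(avgWritesPerSecond.toFixed(1) + " writes per second on average");
```

Under that assumption, about 1% of writes fail, at an average of roughly 7 writes per second.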
  14. Problems & pending questions
      Default values of the Java mongo driver are inappropriate: unlimited connect timeout, unlimited read timeout, a 120-second wait to get a connection from the pool!
      Can't make an « AND » query on the same field before mongo 2.0
      Is it a good choice to read on slave / write on master? Replication time? Is it a real use case?
      To replace by: force acknowledgement on writes and read on slave? OR don't acknowledge writes and read on master?
  15. Mongo @ SFR: Targeted ads application
  16. Hi! Matthieu Blanc, web architect @ Degetel, contractor for SFR
  17. Context
      Present targeted ads to www.sfr.fr web visitors, based on:
      ● Their profile
      ● Their web browsing history
      ● Date/time of the day
      ● etc.
  18. Ex: a web visitor views a smartphone @ www.sfr.fr
  19. A smartphone ad is shown when he goes back to the homepage
  20. Ex: a web visitor arrives at www.sfr.fr from a search engine
  21. An ad related to his search is shown
  22. Problem
      Need to keep each web visitor's browsing history
      Need to track every:
      ● Ad view
      ● Click
      ● Conversion
      MongoDB to the rescue!
  23. The D.U.N.C.E. principle: everything by default (image from http://www.flickr.com/photos/cayusa/)
  24. The D.U.N.C.E. principle: everything by default
      Java 1.6
      Spring Data for MongoDB 1.0.0 (uses mongo driver 2.7.1)
      Read/write on master
      No sharding
      WriteConcern.NORMAL
  25. Case Study: Event Logging with MongoDB
  26. Event Logging: capped collections
      db.createCollection("mycoll", {capped: true, size:100000})
      Old log data automatically ages out, oldest entries first
      No risk of filling up a disk
      No need to write log archival / deletion scripts
      Good performance for a high number of writes compared to reads
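The eviction behaviour of a capped collection can be pictured with a toy in-memory analogy. The class below is purely illustrative (`ToyCappedLog` is a hypothetical name) and is capped by document count for simplicity, whereas the `size` option in the slide's `createCollection` call is a limit in bytes.

```javascript
// Toy in-memory analogy of a capped collection, capped by document count.
// (A real capped collection is limited by size in bytes.)
class ToyCappedLog {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // drop the oldest entry to make room
    }
  }

  find() {
    return this.docs.slice(); // insertion order, oldest first
  }
}

const toyLog = new ToyCappedLog(3);
for (const event of ["view-1", "click-1", "view-2", "view-3"]) {
  toyLog.insert({ event });
}
console.log(toyLog.find().map((d) => d.event));
```

After the fourth insert the first event is gone, which is why no archival or deletion script is needed.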
  27. Log Analysis
      Map Reduce <- we are bad at this
      Cron job 1 -> server-side log aggregation by minute and by ad
      Aggregated logs are persisted in a dedicated collection
      Cron job 2 consolidates aggregated logs by hour every day
      Cron job 3 consolidates aggregated logs by day every week
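The first cron job's aggregation "by minute and by ad" might look something like the sketch below. The event shape (`adId`, `type`, `ts` in milliseconds) and the function name `aggregateByMinute` are assumptions for illustration; the slides do not show the real log format.

```javascript
// Hypothetical raw event shape: { adId, type, ts } with ts in milliseconds.
function aggregateByMinute(events) {
  const buckets = new Map();
  for (const { adId, type, ts } of events) {
    const minute = Math.floor(ts / 60000) * 60000; // start of the minute
    const key = adId + "|" + minute;
    if (!buckets.has(key)) {
      buckets.set(key, { adId, minute, views: 0, clicks: 0, conversions: 0 });
    }
    const bucket = buckets.get(key);
    if (type === "view") bucket.views += 1;
    else if (type === "click") bucket.clicks += 1;
    else if (type === "conversion") bucket.conversions += 1;
  }
  // One document per (ad, minute), ready to store in a dedicated collection.
  return Array.from(buckets.values());
}

const sampleEvents = [
  { adId: "ad-1", type: "view", ts: 0 },
  { adId: "ad-1", type: "click", ts: 30000 },
  { adId: "ad-1", type: "view", ts: 61000 },
  { adId: "ad-2", type: "view", ts: 5000 },
];
const perMinute = aggregateByMinute(sampleEvents);
```

The hourly and daily consolidations could reuse the same idea with a larger bucket width, applied to the already aggregated documents.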
  28. Event Logging
  29. The Result
  30. The Result
  31. The Result
  32. Some Data
      Main collection (visitors' web browsing history): 36 million documents and growing
      Avg. document size: 430 bytes
      80 million events processed in less than 3 months
      Per second: 60 reads, 50 writes (60 finds, 30 updates, 20 inserts)
  33. Conclusion
      It works! :)
      Default properties are good enough even for a high-traffic website (for now...)
  34. Mongo @ SFR: Products catalog
  35. Good morning! David Rault, web architect @ SFR, in charge of the MarketPlace project. @squat80, http://fr.linkedin.com/pub/david-rault/37/722/963
  36. Context
      ● Products classified by categories
      ● Categories determine product features
      ● Multiple sellers
        ○ can create new products (based on EAN/MPN)
          ■ can modify the products they created
          ■ can only refer to products created by other sellers
        ○ publish offers (product id + price)
      ● Order management is out of scope
        ○ delegated to the existing order-management system
      ● Still in development
  37. Why Mongo?
      ● Schema-less: products are structured documents
        ○ Different properties depending on product category (TVs, phone protections, wires, ...)
        ○ No JOIN required: documents load in a single call
        ○ New categories will come: no migration required
      ● Searching capabilities
        ○ Empowers navigating through the store
        ○ Complex queries on product features
      ● Performance
        ○ Our Ops forbid intensive writes into Oracle DB (!)
  38. Technical choices
      Java 7, Tomcat 7
      Direct use of the Java driver (2.7.2)
      Replica set (2 replicas + 1 arbiter)
      Sharding enabled
      Writes are replica-safe
  39. "Back-office" design
      ● WS for creation/update of products and offers
      ● Triggers (scheduled) to consolidate data
        ○ for each product: valid offers on a 2-day window are aggregated into the product
        ○ for each category: product counts, pseudo-enumerated field values (e.g. list of brands) are aggregated into the product
      ● "Live streaming" into Google Search Appliance
        ○ feed for both internal keyword searches & portal-wide searches (within *.sfr.fr sites)
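The per-product consolidation trigger could be sketched as below. The document shapes, the field names (`productId`, `sellerId`, `updatedAt`, `min_price`) and the reading of "valid on a 2-day window" as "updated within the last two days" are all assumptions made for this illustration.

```javascript
const TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000;

// Hypothetical shapes: offer = { productId, sellerId, price, updatedAt (ms) }.
// Assumption: an offer is "valid" if it was updated within the last two days.
function consolidateOffers(product, offers, now) {
  const valid = offers.filter(
    (o) => o.productId === product._id && now - o.updatedAt <= TWO_DAYS_MS
  );
  const prices = valid.map((o) => o.price);
  return {
    ...product,
    // Embed the valid offers so the product loads in a single call.
    offers: valid.map(({ sellerId, price }) => ({ sellerId, price })),
    min_price: prices.length > 0 ? Math.min(...prices) : null,
  };
}

const DAY_MS = 24 * 60 * 60 * 1000;
const nowMs = 10 * DAY_MS;
const consolidated = consolidateOffers(
  { _id: "p1", name: "Phone case" },
  [
    { productId: "p1", sellerId: "s1", price: 20, updatedAt: nowMs - 1 * DAY_MS },
    { productId: "p1", sellerId: "s2", price: 15, updatedAt: nowMs - 3 * DAY_MS }, // too old
    { productId: "p1", sellerId: "s3", price: 18, updatedAt: nowMs },
    { productId: "p2", sellerId: "s1", price: 5, updatedAt: nowMs },               // other product
  ],
  nowMs
);
```

Precomputing a field such as `min_price` on the product is what would let the front office filter on price without joining offers at read time.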
  40. "Front-office" design
      ● Straightforward queries
        ○ mostly READs
        ○ by product id, by category
        ○ filtering (min/max price, by brand, by color, ...)
          ■ filters are category-specific
      ● Customer-activity tracking
        ○ build knowledge base for future features:
          ■ recommendation engine
        ○ products viewed, previous orders, wish-list, etc.
        ○ both for identified and anonymous visitors
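A front-office filter such as "category + min/max price + brand" maps naturally onto a query document. The builder below is a sketch under assumed field names (`category`, `min_price`, `brand`); only the `$gte`, `$lte` and `$in` operators are standard MongoDB query syntax.

```javascript
// Builds a MongoDB-style query document from optional front-office filters.
// Field names (category, min_price, brand) are illustrative assumptions.
function buildProductQuery({ category, minPrice, maxPrice, brands } = {}) {
  const query = {};
  if (category !== undefined) query.category = category;

  // Both bounds go under the same field, as one sub-document.
  const priceRange = {};
  if (minPrice !== undefined) priceRange.$gte = minPrice;
  if (maxPrice !== undefined) priceRange.$lte = maxPrice;
  if (Object.keys(priceRange).length > 0) query.min_price = priceRange;

  if (brands !== undefined && brands.length > 0) query.brand = { $in: brands };
  return query;
}

const tvQuery = buildProductQuery({
  category: "tv",
  minPrice: 200,
  maxPrice: 800,
  brands: ["BrandA", "BrandB"],
});
console.log(JSON.stringify(tvQuery));
```

Because filters are category-specific, a real implementation would presumably derive the list of allowed filter fields from the category rather than hard-coding them as done here.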
  41. How is it going?
      ● Need to unlearn 10+ years of experience in relational design/development
        ○ Think "document", not relation
        ○ No magical (a.k.a. ORM) framework
          ● bye bye Hibernate ;)
        ○ Some surprises/confusion with the query syntax
          ■ No "$and" in versions <2.0, didn't manage some queries (though it worked in mongo shell)
            ● "min_price > a and min_price > b" with the Java driver
          ■ Function operators appear at varying positions
            ● { "$lt": { "some_field": some_value }}
            ● { "some_field": { "$in" : some_values }}
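On the point about operator positions: in MongoDB's query language, comparison operators such as `$gt`, `$lt` and `$in` are nested under the field name, while logical operators such as `$and` (available from 2.0) sit at the top level and take an array of sub-queries. The tiny matcher below is not part of any driver; it is a hypothetical helper that evaluates that subset against a plain object to make the two positions concrete.

```javascript
// Evaluates a small subset of MongoDB query syntax against a plain object:
// field-level $gt, $lt, $in and top-level $and. Illustration only.
function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$and") {
      // Logical operator: top level, takes an array of sub-queries.
      return condition.every((sub) => matches(doc, sub));
    }
    const value = doc[key];
    if (condition === null || typeof condition !== "object" || Array.isArray(condition)) {
      return value === condition; // plain equality (array values not handled)
    }
    // Comparison operators: nested under the field name.
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$gt") return value > operand;
      if (op === "$lt") return value < operand;
      if (op === "$in") return operand.includes(value);
      throw new Error("unsupported operator: " + op);
    });
  });
}

const sampleProduct = { min_price: 150, brand: "BrandA" };
// Two conditions on the same field, written as one sub-document...
const rangeQuery = { min_price: { $gt: 100, $lt: 200 } };
// ...or, from 2.0 on, as an explicit $and.
const andQuery = { $and: [{ min_price: { $gt: 100 } }, { min_price: { $lt: 200 } }] };
const inQuery = { brand: { $in: ["BrandA", "BrandB"] } };
```

Here `rangeQuery` and `andQuery` select the same documents; the single sub-document form is one way to express two bounds on one field without needing `$and`.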
  42. How is it still going?
      ● Good performance
        ○ Although with a relatively low number of documents (~5-10 000 documents)
      ● Fast development cycle
        ○ Only a few hours to have the first prototype running
        ○ With Google's help and a couple of hours, built a micro full-text indexing search feature
      ● Mongo Shell is my friend
        ○ as well as Google & MongoDB.org
        ○ at last, a developer-friendly (command-line) tool
          ● bye bye sqlplus ;)
  43. Thank You! ("borrowed" from Geek and Poke, http://geekandpoke.typepad.com/)
