MongoFr : MongoDB as a log Collector
MongoDB can serve as a simple log collector, for example by writing to a capped collection. Fotopedia runs such a system and uses it for quick introspection and realtime analysis.

Talk given on 23 March 2011 at the MongoFR days in Paris (La Cantine) by Pierre Baillet and Mathieu Poumeyrol.

Published in: Technology
Speaker notes
  • Pierre Baillet, server architect; Mathieu Poumeyrol, director of cloud engineering
  • Next slide is what we do
  • Next slide is about how we do it
  • Next slide is about key technologies
  • Last slide of the section
  • Last slide of the section
  • Next slide is why we should improve
  • Next slide shows the original logging layout
  • Last slide of the section
  • Last slide of the section
  • Details on next slide
  • Searching in exceptions on next slide
  • Next slide is about sampling and Ramplr
  • Next slide is about technologies used in Sapin
  • Next slide is about facility filtering
  • Describe Sapin facility: column selection, reloading, list of facilities. Next slide is about URL filtering
  • Next slide is about URL filtering
  • Next slide details an op-id session
  • Next slide shows a timeline
  • Next slide is about current issues
  • Last slide of the section
  • Last slide of the presentation and of the section, before the questions
  • Transcript

    1. MONGODB AS A LOG COLLECTOR. Pierre Baillet & Mathieu Poumeyrol, oct & kali @ fotopedia.com. Photo by Jean-Michel BAUD.
    2. DB.SLIDES.FIND({'TYPE':'TITLE'})
       • Fotopedia: who we are, what we do, how we do it
       • MongoDB at Fotopedia: current state of our art
       • Logging: the answer to life, the universe and everything
       • How we fulfilled this need
       • Log usage on a daily basis
       • Future work
    3. FOTOPEDIA «Family photos»
    4. FOTOPEDIA: WHO ARE WE?
       • Company created in 2006
       • Located in Paris, near the Opéra
       • 17 people, including 8 regular MongoDB users (aka developers)
       • We're hiring
    5. FOTOPEDIA: WHAT DO WE DO?
       • Images for Humanity
       • Open to anyone, amateur or professional
       • Creative Commons aware
       • Beautiful Wikipedia (http://www.fotopedia.com)
       • iPad tablebooks (iPhone too): Heritage, National Parks and Memory of Color
    6. INFRASTRUCTURE
       • Based on Amazon Web Services
       • Around 20 servers located in the US datacenters
       • Centralized deployment procedure (Chef)
       • Deploy at least once a week with no downtime
    7. KEY TECHNOLOGIES
       • Ruby on Rails (with REE)
       • Unicorn
       • Varnish
       • HAProxy
       • NGinx
       • Lackr (in-house Java proxy)
       • Sinatra
       • Redis and Resque
       • MySQL
       • MongoDB
    8. MONGODB AT FOTOPEDIA «C:\Utilisateurs\fotopedia\Mes Documents»
    9. CURRENT STATE OF OUR ART
       • Last year's talk was about our MongoDB-powered metacache
       • Stores complete Wikipedia data in more than 10 languages
       • Since spring 2010, all new database-centric features have been developed with MongoDB
       • Our goal: slowly migrate all DB features to MongoDB whenever possible
    10. MYSQL MIGRATIONS [Chart: "Alter table" per quarter, 08/Q3 through 2011; y-axis from 0 to 30]
    11. OUR SETUP
       • 4 clusters (business data, log and reporting, Wikipedia, and one more)
       • 3 EC2 XL virtual machines hosting 5 replica sets
       • At the current time, one machine is master on all replica sets
       • Each of the 5 replica sets is allocated to one of the clusters
       • Every instance holds the 4 mongos
    12. SOME FIGURES
       • In production since September 2009
       • Wikipedia data: wikipedia/en: 5GB, 8M documents (and about 10 other languages), batch load: 17k inserts/s
       • Webcache: 2GB, 11M records, avg 60 op/s, peak 300 op/s
       • Overall, average 250 op/s
    13. LOGGING «the eye of the lynx» (credit: jm3)
    14. ORIGINAL PHILOSOPHY
       • Log everything, don't delete
       • Collected by Scribe
       • Comprehensive daily log stored in AWS S3
       • Hadoop jobs to generate statistics
       • grep and his merry friends for investigating issues
       • Quite efficient, but cumbersome and slow
    15. WHY IMPROVE
       • Issue analysis in realtime (debugging)
       • Realtime activity analysis: traffic spikes, misbehaving crawlers and other suspicious activity
    16. ORIGINAL STACK LAYOUT
    17. HOW WE SOLVED THIS ISSUE «demons and wonders» (credit: Stefano Constanzo)
    18. NORMALIZED LOG FORMAT
        {
          "_id" : ObjectId("4d7e11cc7ea68d34fb01f2ac2"),
          "facility" : "varnish",
          "instance" : "a01",
          "date" : NumberLong("1300107724534"),
          "http_host" : "www.fotopedia.com",
          "method" : "GET",
          "http_version" : "HTTP/1.1",
          "path" : "/albums/fotopedia-fr-Cath%C3%A9drale_m%C3%A9tropolitaine_de_Buenos_Aires",
          "status" : "404",
          "size" : 13,
          "elapsed" : 0.00007748600182821974
        }
    19. LOG COLLECTING
       • File logging daemons (NGinx, HAProxy): Ruby tailer script
       • Memory logging daemons (Varnish): dedicated binary that streams the Varnish SHM into MongoDB
       • Other daemons (Lackr, Picor): extended logging system to store data in MongoDB
       • Also log Ruby exceptions into MongoDB
    20. MONGO SHARDING
       • All servers host the «logs» mongos on port 27002
       • All daemons push their logs to «localhost:27002»
       • The actual storage is a capped collection in a non-sharded database
    21. CURRENT STACK LAYOUT
    22. LOG USAGE ON A DAILY BASIS «the needle in the fir stack» (credit: Jesús García Ferrer)
    23. SAPIN: EXCEPTION LOGGING View Latest Errors
    24. SAPIN: EXCEPTION LOGGING
       Useful information:
       • Source URL and parameters
       • Date and time
       • Browser identifiers (IP, cookie values, User-Agent)
       • Full stack dump
       • Full headers dump
       • Full user model dump
    25. SAPIN: EXCEPTION LOGGING Searching in Exceptions
    26. RAMPLR: SAMPLING ANALYSIS Sample analysis
    27. SAPIN: REALTIME LOGGING
       • jQuery UI based interface
       • Sinatra-backed
       • Filter by facility
       • Searchable criteria: IP address, follow Operation-ID
       • Display HTTP execution timeline
    28. SAPIN: REALTIME LOGGING Facility Filtering
    29. SAPIN: REALTIME LOGGING Url Filtering
    30. SAPIN: REALTIME LOGGING IP Address Filtering
    31. SAPIN: REALTIME LOGGING Operation ID Filtering
    32. SAPIN: REALTIME LOGGING Timeline display
    33. ISSUE WITH MONGODB
       • Scalability of using a capped collection
       • Official doc says no indices
       • Size limit vs. index efficiency (400,000 lines for less than 2 hours of logs): our plan is to have 2 days' worth of logs
    34. FUTURE WORK «to infinity and beyond» (credit: The Library of Congress)
    35. FUTURE WORK
       • Leaner interface: the current one is ugly and jQuery UI based; should switch to the Sencha framework
       • Keep more logs: abandon capped collections; keep logs longer, one collection per day (?)
    36. QUESTIONS? «I bid you: farewell.» (credit: Great Beyond)
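
The normalized log format of slide 18 and the tailer idea of slide 19 can be illustrated with a small sketch. The talk does not show the tailer script (which was written in Ruby), so everything here is an assumption: the space-separated input line format, the function name `normalize`, and the reading of `date` as epoch milliseconds. Only the output field names come from the slide. A real collector would then insert the resulting document through a MongoDB driver connected to the local mongos on port 27002.

```python
import time


def normalize(facility, instance, line, now_ms=None):
    """Turn one access-log line into a document shaped like slide 18.

    The input format (host, method, path, version, status, size, elapsed,
    separated by spaces) is hypothetical, chosen only for illustration.
    """
    host, method, path, http_version, status, size, elapsed = line.split()
    return {
        "facility": facility,
        "instance": instance,
        # Assumed to be epoch milliseconds, as the slide's value suggests.
        "date": now_ms if now_ms is not None else int(time.time() * 1000),
        "http_host": host,
        "method": method,
        "http_version": http_version,
        "path": path,
        "status": status,  # kept as a string, as on the slide
        "size": int(size),
        "elapsed": float(elapsed),
    }


doc = normalize(
    "varnish",
    "a01",
    "www.fotopedia.com GET /albums/example HTTP/1.1 404 13 0.000077",
    now_ms=1300107724534,
)
print(doc["facility"], doc["status"], doc["size"])
```

Giving every facility the same field names is what lets a single interface such as Sapin filter Varnish, NGinx and HAProxy entries with the same queries.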
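
The sizing concern on slide 33 comes down to simple arithmetic, sketched below. The 400,000 lines and the 2-hour and 2-day figures come from the slide; because the slide says "less than 2 hours", the result is a lower bound. The average document size of 400 bytes is a hypothetical figure, not from the talk; it is included only because a capped collection is created with a size in bytes.

```python
LINES_IN_WINDOW = 400_000   # from slide 33
WINDOW_HOURS = 2            # upper bound from slide 33 ("< 2 hours")
TARGET_HOURS = 48           # "2 days worth of logs"
AVG_DOC_BYTES = 400         # hypothetical average document size

lines_per_hour = LINES_IN_WINDOW / WINDOW_HOURS
target_lines = int(lines_per_hour * TARGET_HOURS)
target_bytes = target_lines * AVG_DOC_BYTES

print(target_lines)  # 9600000
print(target_bytes)  # 3840000000
```

Under these assumptions, two days of logs means at least 24 times the current line count, which is the scale at which the index question raised on the slide starts to matter.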
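
Slide 35 floats "one collection per day" as a possible replacement for the capped collection. One way this could work is to derive the collection name from each document's `date` field. This is a sketch, not the authors' design: the `logs_` prefix, the function name `daily_collection`, and the use of UTC days are all assumptions, as is the reading of `date` as epoch milliseconds.

```python
from datetime import datetime, timezone


def daily_collection(date_ms, prefix="logs_"):
    """Return a per-day collection name for an epoch-millisecond timestamp."""
    day = datetime.fromtimestamp(date_ms / 1000, tz=timezone.utc)
    return prefix + day.strftime("%Y%m%d")


# The "date" value from the slide-18 sample document.
name = daily_collection(1300107724534)
print(name)  # logs_20110314
```

With such a scheme, expiring old logs becomes a matter of dropping whole collections, and each day's collection can carry ordinary indexes.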
