30 billion requests per day with a NoSQL architecture (2013)

Presentation done at the USI conference, Paris, 2013.
1. 30 billion requests per day with a NoSQL architecture
   USI 2013, Paris
   Julien Simon, Vice President, Engineering
   j.simon@criteo.com / @julsimon
2. POWERED BY PERFORMANCE DISPLAY
   [Retargeting diagram: a user sees products on an advertiser's site, then browses the web and sees them again on a banner; after clicking on the banner, the user goes back to the product page.]
   Copyright © 2013 Criteo. Confidential.
3. CRITEO
   PHASE 1 (2005-2008): CRITEO CREATION
   •  R&D EFFORT
   •  RETARGETING
   •  CPC
   PHASE 2 (2008-2012): GLOBAL LEADER, 700+ EMPLOYEES!
   •  MORE THAN 3000 CLIENTS
   •  35 COUNTRIES, 15 OFFICES
   •  R&D: MORE THAN 300 PEOPLE
   HEADCOUNT: 2005: 6, 2007: 15, 2008: 33, 2009: 84, 2010: 203, 2011: 395, 2012: 700+ so far
4. INFRASTRUCTURE
   DAILY TRAFFIC
   - HTTP REQUESTS: 30+ BILLION
   - BANNERS SERVED: 1+ BILLION
   PEAK TRAFFIC (PER SECOND)
   - HTTP REQUESTS: 500,000+
   - BANNERS: 25,000+
   7 DATA CENTERS, SET UP AND MANAGED IN-HOUSE
   AVAILABILITY > 99.95%
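A quick back-of-the-envelope check (my arithmetic, not from the slides) shows how the daily totals relate to the quoted per-second peaks:

```python
# Relate the daily totals above to the quoted per-second peaks.
DAY_SECONDS = 24 * 60 * 60  # 86,400 seconds per day

avg_http_rps = 30e9 / DAY_SECONDS    # ~347,000 HTTP requests/s on average
avg_banner_rps = 1e9 / DAY_SECONDS   # ~11,600 banners/s on average

peak_http_rps = 500_000
peak_banner_rps = 25_000

# Peak-to-average ratio indicates how bursty the traffic is.
print(f"avg HTTP rps:    {avg_http_rps:,.0f}")
print(f"avg banner rps:  {avg_banner_rps:,.0f}")
print(f"HTTP peak/avg:   {peak_http_rps / avg_http_rps:.2f}")
print(f"banner peak/avg: {peak_banner_rps / avg_banner_rps:.2f}")
```

So even the average load is within a factor of ~1.5 of the HTTP peak, i.e. the serving tier runs close to peak capacity around the clock.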
5. HIGH PERFORMANCE COMPUTING
   FETCH, STORE, CRUNCH, QUERY 20 ADDITIONAL TB EVERY DAY?
   … subtitled "How I learned to stop worrying and love HPC"
6. CASE STUDY #1: PRODUCT CATALOGUES
   •  Catalogue = product feed provided by advertisers (product id, description, category, price, URL, etc.)
   •  3000+ catalogues, ranging from a few MB to several tens of GB
   •  About 50% of products change every day
   •  Imported at least once a day by an in-house application
   •  Data replicated within a geographical zone
   •  Accessed through a cache layer by web servers
   •  Microsoft SQL Server used from day 1
   •  Running fine in Europe, but…
      –  Number of databases (1 per advertiser)… and servers
      –  Size of databases
      –  SQL Server issues hard to debug and understand
   •  Running kind of fine in the US, until a dead end in Q1 2011
      –  Transactional replication over high-latency links
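The daily import described above can be sketched as an id-keyed merge of the advertiser's feed into the stored catalogue; this is a minimal illustration of my own (names and structure are hypothetical, not Criteo's code), mirroring what an upsert per product would do:

```python
# Hypothetical sketch of a daily catalogue import: merge an advertiser's
# product feed into the stored catalogue, keyed by product id.

def merge_feed(catalogue: dict, feed: list) -> dict:
    """Merge a product feed into the catalogue; return change counts."""
    stats = {"inserted": 0, "updated": 0, "unchanged": 0}
    for product in feed:
        pid = product["product_id"]
        if pid not in catalogue:
            catalogue[pid] = product
            stats["inserted"] += 1
        elif catalogue[pid] != product:
            catalogue[pid] = product  # ~50% of products hit this path daily
            stats["updated"] += 1
        else:
            stats["unchanged"] += 1
    return stats

catalogue = {1: {"product_id": 1, "price": 10.0, "url": "http://example.com/1"}}
feed = [
    {"product_id": 1, "price": 9.0, "url": "http://example.com/1"},   # price change
    {"product_id": 2, "price": 20.0, "url": "http://example.com/2"},  # new product
]
print(merge_feed(catalogue, feed))  # {'inserted': 1, 'updated': 1, 'unchanged': 0}
```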
7. FROM SQL SERVER TO MONGODB
   •  Ah, database migrations… everyone loves them!
   •  1st step: solve the replication issue
      –  Import and replicate catalogues in MongoDB
      –  Push content to SQL Server, still queried by web servers
   •  2nd step: prove that MongoDB can survive our web traffic
      –  Modify web applications to query MongoDB
      –  C-a-r-e-f-u-l-l-y switch web queries to MongoDB for a small set of catalogues
      –  Observe, measure, A/B test… and generally make sure that the system still works
   •  3rd step: scale!
      –  Migrate thousands of catalogues away from SQL Server
      –  Monitor and tweak the MongoDB clusters
      –  Add more MongoDB servers… and more shards
      –  Update ops processes (monitoring, backups, etc.)
   •  About 150 MongoDB servers live today (EU/US/APAC)
      –  Europe: 800M products, 1 TB of data, 1 billion requests/day
      –  Started with 2.0 (+ Criteo patches) → 2.2 → 2.4.3
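The "careful switch" in step 2 can be done with a deterministic percentage rollout: hash each catalogue id into a stable bucket so the same catalogue always hits the same backend, and widen the percentage step by step. A minimal sketch of that idea (my own, not Criteo's code):

```python
# Deterministic percentage rollout: route a stable fraction of catalogues
# to the new backend, widening the fraction as confidence grows.
import zlib

def backend_for(catalogue_id: str, mongo_pct: int) -> str:
    """Stable bucket in [0, 100) from the catalogue id decides the backend."""
    bucket = zlib.crc32(catalogue_id.encode()) % 100
    return "mongodb" if bucket < mongo_pct else "sqlserver"

ids = [f"catalogue-{i}" for i in range(1000)]

# At 0% everything stays on SQL Server; at 100% everything moves.
assert all(backend_for(i, 0) == "sqlserver" for i in ids)
assert all(backend_for(i, 100) == "mongodb" for i in ids)

# At 5%, roughly 5% of catalogues go to MongoDB, and always the same ones,
# which makes observing, measuring, and A/B testing them meaningful.
on_mongo = [i for i in ids if backend_for(i, 5) == "mongodb"]
print(len(on_mongo))  # close to 50 out of 1000
```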
8. MONGODB, 2.5 YEARS LATER
   •  Stable (2.4.3 much better)
   •  Easy to (re)install and administer
   •  Great for small datasets (i.e. smaller than server RAM)
   •  Good performance if the read/write ratio is high
   •  Failover and inter-DC replication work (but shard early!)
   •  Performance suffers when:
      –  The dataset is much larger than RAM
      –  The read/write ratio is low
      –  Multiple applications coexist on the same cluster
   •  Some scalability issues remain (master-slave, connections)
   •  Criteo is very interested in the 10gen roadmap!
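"Shard early" and "dataset smaller than RAM" combine into a simple capacity heuristic: size the shard count so each server's slice of the hot data fits in memory. A rough sketch of that heuristic (my own numbers and formula, not from the talk):

```python
# Capacity heuristic: how many shards so each server's share of the hot
# working set fits in usable RAM (MongoDB performs best fully in memory).
import math

def min_shards(dataset_gb: float, hot_fraction: float, ram_gb_per_server: float,
               ram_usable: float = 0.8) -> int:
    """Shards needed so each server's slice of the hot data fits in usable RAM."""
    hot_gb = dataset_gb * hot_fraction
    return math.ceil(hot_gb / (ram_gb_per_server * ram_usable))

# E.g. the European figures above: ~1 TB of catalogue data. Assuming (my
# assumptions) half of it is hot and servers have 64 GB of RAM:
print(min_shards(1000, 0.5, 64))  # 10 shards
```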
9. CASE STUDY #2: HADOOP
   1st cluster live in June 2011 (2 petabytes)
   "Express" launch required by the brutal growth of traffic
   Traditional processing (in-house tools + SQL Server) completely replaced by Hadoop
   Dual use: production (prediction, recommendation, etc.) and business intelligence (reporting, traffic analysis)
   Visible ROI: increase of CTR and CR
   2nd cluster live in April 2013 (6 petabytes)
10. HADOOP IS AWESOME… BUT CAVEAT EMPTOR!
   •  Batch processing architecture, not real-time → hence our work on Storm
   •  Beware how data is organized and presented to jobs (LZO, RCFile, etc.) → hence our work on Parquet with Twitter & Cloudera
   •  Namenode = SPOF → backup + HA in CDH4
   •  Understand HDFS replication (under-replicated blocks)
   •  Have a stack of extra hard drives ready
   •  Lack of ops/prod tools: data import/export, monitoring, metrology, etc.
   •  Lots of work needed for an efficient multi-user setup: scheduling, quotas, etc.
   •  At scale, infrastructure skills are mandatory
      –  Server selection, CPU/storage ratio
      –  Linux & Java tuning
      –  LAN architecture: watch your switches!
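The "under-replicated blocks" caveat above is worth unpacking: each HDFS block should have a target number of replicas (3 by default) on distinct datanodes; blocks below that target need re-replication, and blocks with zero live replicas are lost. An illustrative sketch (plain Python, not Hadoop code; the function name is mine):

```python
# What "under-replicated" means in HDFS: each block should have `target`
# live replicas; fewer (but > 0) means re-replication needed, 0 means lost.

def replication_report(block_replicas: dict, target: int = 3) -> dict:
    """Summarize block health from a map of block id -> live replica count."""
    under = [b for b, n in block_replicas.items() if 0 < n < target]
    missing = [b for b, n in block_replicas.items() if n == 0]
    return {"blocks": len(block_replicas),
            "under_replicated": len(under),
            "missing": len(missing)}

# Example: one healthy block, one that lost a replica to a dead disk
# (hence the stack of extra hard drives), one whose every replica is gone.
report = replication_report({"blk_1": 3, "blk_2": 2, "blk_3": 0})
print(report)  # {'blocks': 3, 'under_replicated': 1, 'missing': 1}
```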
11. THANKS A LOT FOR YOUR ATTENTION!
   www.criteo.com
   engineering.criteo.com
