
Criteo Infrastructure (Platform) Meetup

Presentations from Criteo Labs’ Infrastructure team, with a guest speaker from Yandex.
• FastTrack: scaling customer integration
• Evolution of data structures in Yandex.Metrica
• Don't take your software for granted
• Evolution of analytics at Criteo

Published in: Technology

  1. 1. Criteo Infrastructure (Platform) Meetup 22nd February 2017 Diarmuid Gill, VP R&D - Platforms Introduction & welcome note
  2. 2. About Criteo 1
  3. 3. 3 | Copyright © 2017 Criteo Our mission TARGET THE RIGHT USER AT THE RIGHT TIME WITH THE RIGHT MESSAGE
  4. 4. Key Figures
      • 18 000 publishers
      • 14 000 advertisers
      • $1,799 million revenue (1)
      • 90% retention rate (2)
      • $21 billion (3)
      • 2500 employees; R&D represents 21% of the workforce
      • 31 offices, 130+ countries
      • Listed on the NASDAQ since October 2013
      Notes: (1) revenue in 2016; (2) annual rate, 2015; (3) turnover generated for our clients, post-click, worldwide, January to December 2015.
  5. 5. How does it work ? 2
  6. 6. General concept: retargeting principles
      1. Users visit an advertiser’s website.
      2. Criteo identifies the users (via cookies).
      3. Users leave the advertiser’s website and browse publisher sites on the Internet.
      4. Criteo identifies the users on these pages (via cookies).
      5. Criteo displays an advertising banner, personalized for each user.
      6. Users click through directly to the advertiser’s page.
  7. 7. Underlying infrastructure 3
  8. 8. Scale @ Criteo
      • 3.2B catalog items ingested/day, 6B items stored
      • 3.6B cookies/device IDs seen per month
      • 3.9B personalized banners/day
      • 49 RTBs @ 120B bid requests/day
      • 3M QPS at peak
      • 90 Gbps bandwidth
      • 20K servers
      • 27PB of data stored
      • 3.6PB of data read daily
      • 500B log lines processed/day
      • 363TB of RAM in memcached, 37M req/s
      • 300K Hadoop jobs/day
  9. 9. Hadoop @ Criteo
      Batch processing: Hadoop as a Service
      • 2 clusters: main + a backup one for degraded mode
      • Cloudera CDH5
      • 2300 servers total (1300 + 1000), 76K vcores
      • 50PiB storage capacity
      • Own job scheduler for improved data flow and coordination
      • 300k jobs per day
  10. 10. Infrastructure Key Figures
      • Sunnyvale: 2 PoP, 500 kVA, 2 006 servers
      • Ashburn: 2 PoP, 1.1 MVA, 1 170 servers
      • New York: 2 PoP, 930 kVA, 2 793 servers
      • Paris: 3 PoP, 1 800 kVA, 5 003 servers
      • Amsterdam: 2 PoP, +2 500 kVA, 3 874 servers
      • Hong Kong: 2 PoP, 472 kVA, 2 185 servers
      • Tokyo: 2 PoP, 455 kVA, 2 564 servers
      • Shanghai: 1 PoP, 200 kVA, 907 servers
      • Worldwide: 16 PoP, ~8 MVA contracted, 20 526 servers, up to 90 Gbps, 3M QPS
      Hosting global partners: (logos)
  11. 11. 11 | Copyright © 2017 Criteo Some of the many technologies used at Criteo
  12. 12. What does “Platforms” mean in Criteo? 4
  13. 13. 13 | Copyright © 2017 Criteo Top Level Applications Platforms Infrastructure SRE Advertiser Publisher WebScale Prediction Dynamic Creative Recommendation Engine • Catalog • User Events • Campaigns • Reporting • RTB • Direct • Campaigns • Reporting Systems Platforms Systems Engine
  14. 14. 14 | Copyright © 2017 Criteo Analytics Platforms Advertiser Publisher Analytics AX/BI Reporting / Billing Reporting / Payments
  15. 15. Tonight’s programme 4
  16. 16. Tonight’s menu (Bill of Fare)
      • 1st talk: FastTrack: scaling customer integration (Nicolas Laveau, Léo-Paul Goffic & Camille Coueslant)
      • 2nd talk: Evolution of data structures in Yandex.Metrica (Alexey Milovidov)
      • 3rd talk: Don't take your software for granted (Cedrick Montout)
      • 4th talk: Evolution of analytics at Criteo (Justin Coffey)
      • 21:05 - 22:00: Networking
  17. 17. Thank you!
  18. 18. Camille Coueslant, Léo-Paul Goffic, Nicolas Laveau 2017/02/22 Scaling customer integration FastTrack
  19. 19. 19 | Copyright © 2017 Criteo What do we do in Criteo? Deliver the right message to the right user at the right time
  20. 20. 20 | Copyright © 2017 Criteo Integration: Creatives settings • Banners need branding • Logo • Font • Color palette • Banners come in many formats
  21. 21. Integration: Tags
      • Banners are based on user intent
      • Tags on customer store
      • Different types of intent
        • Home page view
        • Product view
        • Listing view
        • Basket
        • Sales
      • Intent at product level
      Home:
          <script type="text/javascript" src="//static.criteo.net/js/ld/ld.js" async="true"></script>
          <script type="text/javascript">
          window.criteo_q = window.criteo_q || [];
          window.criteo_q.push(
            { event: "setAccount", account: 666 },
            { event: "setEmail", email: "harry.potter@hogwarts.org" },
            { event: "setSiteType", type: "g" },
            { event: "viewHome" }
          );
          </script>
      Sales:
          <script type="text/javascript" src="//static.criteo.net/js/ld/ld.js" async="true"></script>
          <script type="text/javascript">
          window.criteo_q = window.criteo_q || [];
          window.criteo_q.push(
            { event: "setAccount", account: 666 },
            { event: "setEmail", email: "harry.potter@hogwarts.org" },
            { event: "setSiteType", type: "g" },
            { event: "trackTransaction", id: "tr-56182-2123", item: [
              { id: "patronus", price: 12.54, quantity: 3 },
              { id: "avada-kedavra", price: 1099.99, quantity: 1 }
              /* add a line for each item in the user's basket */
            ]}
          );
          </script>
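The tag snippets on slide 21 push events onto `window.criteo_q` before the asynchronously loaded `ld.js` has necessarily arrived. The slides do not show how the loader consumes that queue, so the following is only a sketch of one common way to drain such a command queue. `installQueue` and `handleEvent` are hypothetical names, not part of Criteo's API.

```javascript
// Hypothetical sketch of the async command-queue pattern; not the real ld.js.
function installQueue(globalObj, handleEvent) {
  // Events pushed before the loader arrived sit in a plain array.
  const pending = Array.isArray(globalObj.criteo_q) ? globalObj.criteo_q : [];

  // Replace the array with an object whose push() dispatches immediately.
  globalObj.criteo_q = {
    push(...events) {
      for (const event of events) handleEvent(event);
      return events.length;
    },
  };

  // Replay what was queued while the script was still loading.
  for (const event of pending) handleEvent(event);
}

// Usage with a stand-in for `window`, so the sketch also runs outside a browser.
const fakeWindow = { criteo_q: [{ event: "setAccount", account: 666 }] };
const seenEvents = [];
installQueue(fakeWindow, (event) => seenEvents.push(event.event));
fakeWindow.criteo_q.push({ event: "viewHome" });
console.log(seenEvents);
```

Because the page only ever calls `push`, it does not need to know whether the loader has arrived yet; the same call works before and after.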
  22. 22. Integration: Product Feed
      • Banners contain products
      • Characteristics of products are used for recommendation
      • Name, description, image, price for display
      XML:
          <item>
            <g:id>0</g:id>
            <title>Abracadabra</title>
            <g:image_link>http://www.magic.com/assets/spells/abracadabra.png</g:image_link>
            <link>http://www.magic.com/spells/abracadabra</link>
            <description>Multi-purpose spell. Your companion for every occasion!</description>
            <g:price>625.99</g:price>
            <g:google_product_category>35</g:google_product_category>
          </item>
      CSV:
          id;title;image_link;link;description;price;google_product_category
          0;Abracadabra;http://www.magic.com/assets/spells/abracadabra.png;http://www.magic.com/spells/abracadabra;Multi-purpose spell. Your companion for every occasion!;625.99;Arts & Entertainment > Hobbies & Creative Arts > Magic & Novelties
  23. 23. 23 | Copyright © 2017 Criteo Back in 2014 When the customer was seeing what he had to implement
  24. 24. 24 | Copyright © 2017 Criteo Back in 2014 When the technical support was seeing the first implementation
  25. 25. 25 | Copyright © 2017 Criteo Back in 2014 When the customer was trying to debug his implementation
  26. 26. 26 | Copyright © 2017 Criteo Criteo grows… fast! This does not scale! « Performance is everything » BUT we need to onboard first Clients TS
  27. 27. 27 | Copyright © 2017 Criteo All is not lost! Technology & UX to the rescue!
  28. 28. Tags Part 1: Tag Validation Dashboard
  29. 29. Goal
      • Show near-real-time metrics on tracker format issues
      • Detect mismatches between the trackers and the product feed
      • Provide fine-grained data (max 24 hours)
      • Available for each of our clients (= worldwide)
  30. 30. 30 | Copyright © 2017 Criteo How Initial trackers architecture
  31. 31. How
      1. Audit the tracker events
      2. Send this audit event to Kafka
      3. Consume it from Druid
  32. 32. 32 | Copyright © 2017 Criteo Why Druid • Druid is an open-source column-oriented distributed data store • Advantages: • Fast aggregation queries on huge amount of metrics • Real-time streaming ingestion • Scalable • Highly available
  33. 33. How
      1. Audit the tracker events
      2. Send this audit event to Kafka
      3. Consume it from Druid
      4. Query Druid from Integrate
  34. 34. 34 | Copyright © 2017 Criteo Result
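The first step of the pipeline on slides 31 to 33, auditing a tracker event, could look roughly like the sketch below. It checks the two things the goal slide names: format issues and mismatches with the product feed. The function name, the error codes and the shape of the audit record are all assumptions; the slides do not show the real audit format, and publishing the record to Kafka is left out.

```javascript
// Hypothetical audit step: field names and error codes are illustrative only.
function auditTrackerEvent(trackerEvent, feedProductIds) {
  const errors = [];

  // Format issue: every event needs a non-empty type.
  if (typeof trackerEvent.event !== "string" || trackerEvent.event === "") {
    errors.push({ code: "missing_event_type" });
  }

  const items = Array.isArray(trackerEvent.item) ? trackerEvent.item : [];
  for (const item of items) {
    // Format issue: price must be a finite number, not a string.
    if (!Number.isFinite(item.price)) {
      errors.push({ code: "invalid_price", productId: item.id });
    }
    // Mismatch: the tag references a product the feed does not contain.
    if (!feedProductIds.has(String(item.id))) {
      errors.push({ code: "unknown_product", productId: item.id });
    }
  }

  // This record is what would be serialized and sent downstream.
  return { eventType: trackerEvent.event, itemCount: items.length, errors };
}

const feedIds = new Set(["patronus"]);
const auditRecord = auditTrackerEvent(
  {
    event: "trackTransaction",
    item: [
      { id: "patronus", price: 12.54, quantity: 3 },
      { id: "avada-kedavra", price: "1099.99", quantity: 1 },
    ],
  },
  feedIds
);
console.log(JSON.stringify(auditRecord, null, 2));
```

Keeping the audit record small and flat (event type, counts, error codes) suits aggregation queries of the kind the dashboard needs.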
  35. 35. Tags Part 2: Tag Debug Mode
  36. 36. 36 | Copyright © 2017 Criteo Tag Debug Mode How do I make sure I send Criteo the right information from my website? ? ? Fig 1: Criteo Hotline
  37. 37. 37 | Copyright © 2017 Criteo Tag Debug Mode How do I make sure I send Criteo the right information from my website? Fig 2: Happy customer
  38. 38. 38 | Copyright © 2017 Criteo How tags work https://www.mvmtwatches.com/
  39. 39. 39 | Copyright © 2017 Criteo How tags work https://www.mvmtwatches.com/ ld.js
  40. 40. 40 | Copyright © 2017 Criteo How tags work https://www.mvmtwatches.com/ ld.js GET /event?a=%5B30072%…
  41. 41. 41 | Copyright © 2017 Criteo How tags work https://www.mvmtwatches.com/ ld.js GET /event?a=%5B30072%… 200 OK
  42. 42. 42 | Copyright © 2017 Criteo Tag Debug Mode
  43. 43. 43 | Copyright © 2017 Criteo Tag Debug Mode https://www.mvmtwatches.com/#enable-tag-debug-mode
  44. 44. 44 | Copyright © 2017 Criteo Tag Debug Mode https://www.mvmtwatches.com/#enable-tag-debug-mode ld.js if (document.location.hash == debugHash) loadLdDebug();
  45. 45. 45 | Copyright © 2017 Criteo Tag Debug Mode https://www.mvmtwatches.com/#enable-tag-debug-mode ld.js ld-debug.js if (document.location.hash == debugHash) loadLdDebug(); addDebugIframe();
  46. 46. 46 | Copyright © 2017 Criteo Tag Debug Mode https://www.mvmtwatches.com/#enable-tag-debug-mode ld.js GET /event?a=%5B30072%…&debugMode=1 ld-debug.js if (document.location.hash == debugHash) loadLdDebug(); addDebugIframe();
  47. 47. Tag Debug Mode
      https://www.mvmtwatches.com/#enable-tag-debug-mode
      ld.js:
          if (document.location.hash == debugHash) loadLdDebug();
      ld-debug.js:
          addDebugIframe();
      Request:
          GET /event?a=%5B30072%…&debugMode=1
      Response:
          200 OK
          Content-Type: application/javascript
          sendDebugInformationToIframe({ audit: { product: { image: '…' }, errors: […] } });
  48. 48. Tag Debug Mode
      • Gives you fine-grained insights into the quality of the information sent
      • Requires no technical knowledge
      • Mirrors exactly what will be processed down the line
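The debug-mode switch shown on slides 43 to 47 can be sketched as follows: if the page URL carries the `#enable-tag-debug-mode` fragment, the event request gains a `debugMode=1` parameter. Only the fragment and the `debugMode=1` parameter come from the slides; `isDebugMode`, `buildEventUrl` and the example query value are assumptions made for illustration, and the real event payload (elided on the slides) is not reproduced.

```javascript
const DEBUG_HASH = "#enable-tag-debug-mode"; // fragment shown on the slides

function isDebugMode(locationHash) {
  return locationHash === DEBUG_HASH;
}

// `endpoint` and `query` are illustrative; the real event URL is elided on the slides.
function buildEventUrl(endpoint, query, locationHash) {
  const params = new URLSearchParams(query);
  if (isDebugMode(locationHash)) {
    // Ask the server to answer with debug information instead of a bare 200.
    params.set("debugMode", "1");
  }
  return `${endpoint}?${params.toString()}`;
}

// In a browser, the third argument would be document.location.hash.
console.log(buildEventUrl("/event", { a: "example" }, "#enable-tag-debug-mode"));
console.log(buildEventUrl("/event", { a: "example" }, ""));
```

Driving the switch from the URL fragment means a customer can turn debugging on without touching the tags on their site, since the fragment is never sent to their own server.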
  49. 49. Feed
  50. 50. Goal
      • Provide feedback ASAP on a subset of products
      • Provide feedback on the whole feed
      • Automatic format detection (Google specs)
      • User can validate the structure of the feed
      • User can review some products
      • As close as possible to the daily feed import
  51. 51. 51 | Copyright © 2017 Criteo Full import Daily import architecture
  52. 52. 52 | Copyright © 2017 Criteo Full import Update feed processing Hadoop job to compute errors and attributes statistics
  53. 53. 53 | Copyright © 2017 Criteo Full import Launch full import from Integrate, retrieve and display statistics
  54. 54. Test import
      Create a Marathon application that:
      • Streams the incoming feed
      • Detects the format
      • Reuses part of the Java code of the feed-processing Hadoop job
      • Saves the import and statistics in the DB
      • Provides an API to fetch statistics
  55. 55. 55 | Copyright © 2017 Criteo Result
  56. 56. 56 | Copyright © 2017 Criteo Result
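The "detect format" step of the test import is not detailed on the slides. As a rough illustration, a detector could sniff the first bytes of the streamed feed to tell XML from delimited text and guess the delimiter from the header line. Everything in this sketch (the function name, the candidate delimiters, the return shape) is an assumption, not the actual detector.

```javascript
// Hypothetical sniffing heuristic; the slides do not describe the actual detector.
function detectFeedFormat(sample) {
  const text = sample.trimStart();
  if (text.startsWith("<")) {
    return { format: "xml" };
  }

  // Otherwise treat it as delimited text and guess the delimiter from the header line.
  const header = text.split(/\r?\n/, 1)[0];
  const candidates = [";", ",", "\t", "|"];
  let best = null;
  let bestCount = 0;
  for (const delimiter of candidates) {
    const count = header.split(delimiter).length - 1;
    if (count > bestCount) {
      best = delimiter;
      bestCount = count;
    }
  }
  if (best === null) {
    return { format: "unknown" };
  }
  return { format: "csv", delimiter: best, columns: header.split(best) };
}

console.log(detectFeedFormat("<item><g:id>0</g:id></item>"));
console.log(detectFeedFormat("id;title;price\n0;Abracadabra;625.99"));
```

Working from a small sample rather than the whole file is what lets feedback on a subset of products arrive quickly, before the full import has finished.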
  57. 57. Creatives
  58. 58. 58 | Copyright © 2017 Criteo How banners work at Criteo • Actual humans pick predefined layouts, colors, CTAs • Then those are combined with product information and optimized on-the-fly Je découvre ! J’achète ! × × × =
  59. 59. 59 | Copyright © 2017 Criteo How banners work at Criteo “Can I have drop shadows on my products?” “I’m not sure about the pink” “Could it autoplay loud music?” As a result, clients worry “What will my banners look like?”
  60. 60. 60 | Copyright © 2017 Criteo How banners work at Criteo There is stuff we can’t do, and stuff we don’t necessarily want to do “What will my banners look like?” “Can I have drop shadows on my products?” “I’m not sure about the pink” “Could it autoplay loud music?”
  61. 61. Creatives to the rescue
      And it takes a lot of back and forth. Our goal:
      • Give advertisers a preview of what their banners will look like
      • Give advertisers customization options
      • Give feedback on the performance impact
      • 80% of advertisers validate their Creatives in < 2 minutes
      • 80% of advertisers don’t ask for a change
  62. 62. 62 | Copyright © 2017 Criteo Creatives Bring on UX, R&D, Product, Sales, Creatives & Technical Support
  63. 63. 63 | Copyright © 2017 Criteo Creatives Bring on UX, R&D, Product, Sales, Creatives & Technical Support
  64. 64. 64 | Copyright © 2017 Criteo Creatives 1 Education Preview Performance Customization 2 3 4 1 2 3 4
  65. 65. Going further! And mostly faster
  66. 66. 66 | Copyright © 2017 Criteo eCommerce Platforms Lots of our clients run on ready-to-use platforms that have APIs As a result, we can completely automate the integration workflow for them!
  67. 67. 67 | Copyright © 2017 Criteo Shopify integration Only 2 clicks needed! Reduced integration time from 14 days to 20 minutes
  68. 68. Integration today
  69. 69. 69 | Copyright © 2017 Criteo How customers / technical support / we feel
  70. 70. Integrate achievements
      • 75% of Tags implemented without help (only 25% in 2014)
      • 14 days median integration time (43 days in 2014)
      • 66% complete their Feed in < 1h
      • 92% validate Creatives in < 2 mn
      • 95% accept “as-is”, 4% accept with a performance downgrade, only 1% ask for a modification
      • 20 mn integration with the Shopify App
      • 2014: 600 integrations/quarter; 2016: 1800 integrations/quarter, 50% handled through Integrate
      “I’m in love with the Tag Debug Mode” (Nassim Aissat, Global TS)
  71. 71. Questions?
  72. 72. 72 | Copyright © 2017 Criteo
  73. 73. 73 | Copyright © 2017 Criteo What does Black Friday mean at Criteo?
  74. 74. 74 | Copyright © 2017 Criteo Release freeze: trying to guarantee the stability of the platform... ... with nasty side-effects Getting ready for Black Friday
  75. 75. Monitoring the datacenter: how do we evaluate the health of the datacenter at a glance? Enter Grafana.
  76. 76. 76 | Copyright © 2017 Criteo With specific filters, deviant machines can be spotted easily Monitoring the datacenter
  77. 77. 77 | Copyright © 2017 Criteo Drilling down... Monitoring the datacenter
  78. 78. 78 | Copyright © 2017 Criteo Until finding a likely culprit Monitoring the datacenter
  79. 79. 79 | Copyright © 2017 Criteo And switching to micro analysis to find the root cause • Process Explorer • Profiling • Windbg • ClrMD Monitoring the datacenter
  80. 80. 80 | Copyright © 2017 Criteo Load Balancing HA Proxy
  81. 81. Basics of Client-Side Load Balancing
  82. 82. Basics of Client-Side Load Balancing
  83. 83. 83 | Copyright © 2017 Criteo Mixed technical specifications
  84. 84. 84 | Copyright © 2017 Criteo Gen8 Load test
  85. 85. Gen8 vs Gen9 servers
  86. 86. 86 | Copyright © 2017 Criteo Observable result 2/3 1/3
  87. 87. 87 | Copyright © 2017 Criteo Conclusion Do not take your software for granted • Internal Infrastructure will change • External workload will change … be prepared
  88. 88. 88 | Copyright © 2017 Criteo The Analytics Stack at Criteo Yesterday, Today and Tomorrow with an assist from Bill Murray Justin Coffey, Team Lead
  89. 89. 89 | Copyright © 2017 Criteo The Ghost of Christmas Present What do we have now?
  90. 90. 90 | Copyright © 2017 Criteo Criteo: Scale of Data • 4 Billion ads served each day • 200+ Billion events logged each day • 50TBs of data ingested each day • 10 trillion records processed each day
  91. 91. 91 | Copyright © 2017 Criteo Criteo: Scale of the Analytics Stack 50+ TB ingested / day 2000+ jobs / day 7+PB Under Management 200+ Analysts 400+ Engineers 1000+ Sales and Ops
  92. 92. Criteo: Scaling Analysts [chart: analysts hired since 2010, y-axis from 0 to 180]
  93. 93. Criteo: Scaling Data [chart: growth of a single dataset since July 2014, y-axis from 0 to 1.4E+11]
  94. 94. Criteo: The Analytics Stack Today
      • Hadoop for primary storage and point of ingestion
      • Data transformation on top of Hadoop
      • Hive (7PB) and Vertica (100+ TB) data warehouses
      • Ad-hoc analysis: ad-hoc SQL on Hive and Vertica; reporting on Tableau and Vertica
      • Orchestration via Langoustine
  95. 95. 95 | Copyright © 2017 Criteo Our Stack is Simple • Few moving parts • Purposefully built with Shiny Thing blinders on • It's okay to not have the "latest and greatest" tech • Good enough is, actually, always good enough
  96. 96. 96 | Copyright © 2017 Criteo On Shiny Things: the universe is vast so be selective, and master what you select
  97. 97. 97 | Copyright © 2017 Criteo The Ghost of Christmas Past Before we continue, a quick history lesson of how we got here is in order...
  98. 98. 98 | Copyright © 2017 Criteo Everything starts somewhere and it's not always pretty.
  99. 99. 99 | Copyright © 2017 Criteo In early 2013, you could use SQL Server… AdServer_Db Publisher_Db LogStatus_Db BlogWidgetStat_Db BlogWidgetAdStat_dbTraffic_custom_db Extranet_DbTraffic_custom_db CATEGORY_DB Mail_MonitorDB Inventory_Db AdServerBo_Db AdServerStat_Db DashBoard_DB Dashboard_Security_DB WebServerStat_db ABTesting_DB AdvertiserFatigueStats_db ADVERTISING_DB StatPrediction_DB CAST_DB CriteoRefdb ImportDB RISK_DBGalacticaStats_DB MaxCpc_DB UserProfilingDB WorkflowPersistency_db CAST_DB_HOURLY StatEngine_Db Crawler_Db BICustom_DB Lookalike_DB Widget_db AOC_DB AOC_DB Build_Deploy_Fake_db publisher_stats_db TestFwk_Db LogMonitorDb ADMINLOGS_DB SqoopExport_db FraudDetection_db HPClink_DB DW_DB tsuissesbenl_stat_db Heyokr_Stat_db kiabiit_stat_db Ultaus_Stat_db Crutchfieldus_Stat_db Forzierijp_Stat_db Retailchoiceuk_Stat_db Ryanairhotelses_Stat_db Speakyplanetfr_Stat_db Autowayjp_Stat_db Sicilianobr_Stat_db Jukenhousingjp_Stat_db Cosyforyoufr_Stat_db Tripadvisorru_Stat_db Linasmatkassese_Stat_db Ellepassionsfr_Stat_db Skyde_Stat_db Swimdoctormallkr_Stat_db Sitescoutbr_Stat_db Travelzoousnewusers_Stat_db Platekompanietno_Stat_db Testaoc110413frcom_Stat_db Megapoolnl_Stat_db Elektrototaalmarktnl_Stat_db Intersportuk_Stat_db Usineadesignfr_Stat_db Lekmerno_Stat_db Vuelingit_Stat_db Valuedopinions_Stat_db Forzierino_Stat_db Artisantiuk_Stat_db Idbusit_Stat_db Cocostorykr_Stat_db Artnaturejp_Stat_db Byggmaxse_Stat_db Corporatecriteopmit_Stat_db Aramisauto_Stat_db Migoaes_Stat_db Degrotespeelgoedwinkelnl_Stat_db Diorcouturit_Stat_db Kaufuniquede_Stat_db Codigallerykr_Stat_db Mandarinaduckfr_Stat_db Comarketingorangenokiafr_Stat_db Sinbiangkr_Stat_db Cheapflightsuk_Stat_db Undergirlkr_Stat_db Agradinl_Stat_db Kofferprofide_Stat_db Domodipl_Stat_db Mandarinaduckat_Stat_db Mobilegermany_Stat_db Chlit_Stat_db Spreadshirtuk_Stat_db Casalrunningfr_Stat_db Bloomfm_Stat_db Hotelsbe_Stat_db Strumentimusicaliit_Stat_db Bathroomworlduk_Stat_db Verivoxde_Stat_db 
Mcmkr_Stat_db Viaggiedreamsit_Stat_db Brille24de_Stat_db Yjgakuseikaikan_Stat_db Stylepitnl_Stat_db Cvlibraryrecruiter_Stat_db Preis24de_Stat_db Tigershedsuk_Stat_db Duvetandpillowuk_Stat_db Noths_Stat_db Wizwidkr_Stat_db Ticketonlinede_Stat_db Lifestyleeuropeuk_Stat_db Shopeccose_Stat_db Swanhellenicuk_Stat_db Deguisementdiscountfr_Stat_db Freshcottonnl_Stat_db Tikamoonfr_Stat_db Testfp1_Stat_db warehouse_stat_db Hisjeans_Stat_db Mountfieldlawnmowers_Stat_db Sitescoutnl_Stat_db Lancomeus_Stat_db Brandelijp_Stat_db Mesdessousfr_Stat_db Beautyplanningjp_Stat_db Lgcobrandingpriceminister_Stat_db Stockngous_Stat_db Kickzde_Stat_db Rockymountaindecorus_Stat_db Cellbesse_Stat_db Yvesrocheres_Stat_db Toshibadirectjp_Stat_db Seneukr_Stat_db Waterfeaturesuk_Stat_db Cottagesforyouuk_Stat_db Camif_Stat_db Lojaskdbr_Stat_db Hipmunkhotels_Stat_db Sorteonline_Stat_db Ediets_Stat_db Bonsportru_Stat_db Jobjsenjp_Stat_db Redcoonit_Stat_db Hmuk_Stat_db Srtestcetelem2_Stat_db Iamprettykr_Stat_db Lebunnybleushopkr_Stat_db Condenastit_Stat_db Hotusaes_Stat_db Chilitvit_Stat_db Hellinefr_Stat_db Cobrasonfr_Stat_db madeindesign_stat_db Megagadgetsnl_Stat_db Todaofertabr_Stat_db bulbus_Stat_db Calcioshopit_Stat_db Edenlyes_Stat_db Recruiterucajp_Stat_db Engelhornde_Stat_db Spreadshirtno_Stat_db Dusparstde_Stat_db Tabletbr_Stat_db Ventesecretfr_Stat_db Venteunique_Stat_db Dellchde_Stat_db Dressforlessnl_Stat_db Multipopkr_Stat_db allheartus_Stat_db Trovitdejobs_Stat_db lesjeudisfr_stat_db Expediaukcrosssell_Stat_db Furniturebrituk_Stat_db Yooxbe_Stat_db Skyscannerno_Stat_db Bluetomatoat_Stat_db Mechakaitaijp_Stat_db Destinationlightingus_Stat_db and 10K+ more
  100. 100. 100 | Copyright © 2017 Criteo SQL Server was Production Infrastructure • Analyst access to data was an afterthought • Production databases were not designed for analytics • Reports and queries were tightly coupled to production • UX was low and Analysts occasionally broke production systems!
  101. 101. 101 | Copyright © 2017 Criteo Hive also made an early appearance… 2013-04-22 11:28:59,942 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:01,010 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:02,071 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:03,134 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:04,876 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:05,112 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:06,047 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec 2013-04-22 11:29:06,984 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 365222.27 sec ZZZZ…
  102. 102. But Hive was also an afterthought
      • Raw production data was batch loaded with no transformations
      • Query tools were non-existent
      • Queries were slow and only expert analysts could run them
      • UX and productivity were extremely low
  103. 103. 103 | Copyright © 2017 Criteo This just wasn't working! we needed a new approach
  104. 104. 104 | Copyright © 2017 Criteo First things first we need a database!
  105. 105. 105 | Copyright © 2017 Criteo Requirements for an Analytic Database • It must be extremely fast • It must be able to store our most actionable data sets • Dozens (at the time!) of TBs, now hundreds • It must be queryable with proper SQL • It must be deployable on hardware we specify
  106. 106. 106 | Copyright © 2017 Criteo Defining a Proof of Concept Evaluation • Work with Analysts to identify key data sets • Analyze query patterns • Define benchmark queries • Work with vendors to test closed source solutions • Test OSS in-house
  107. 107. 107 | Copyright © 2017 Criteo The results • Vertica struck the right balance between cost, performance and deployment options • PoC evaluation took ~3 months • Initial deployment took another ~3 months • Operations ramped up over the following ~6 months
  108. 108. 108 | Copyright © 2017 Criteo Working with Analysts during deployment • Analysts in the team helped define and document the data model • They also created training materials • Training was done in concert with engineers
  109. 109. But was it a success?
      • Within a year of the rollout we were able to decommission SQL Server for analytics
      • Today Vertica has over 100 unique ad-hoc users connected each day
      • It executes hundreds of thousands of queries each day
      • It is the most important piece of analytics infrastructure at Criteo
  110. 110. 110 | Copyright © 2017 Criteo A fresh deployment to mature infrastructure • Vertica at Criteo has scaled from ~12TB to ~120TB (going PB soon) • Ad-hoc users have grown from ~40 to ~200 • Reporting users have grown from ~300 to ~1500 • The number of tables has grown from ~50 to >500
  111. 111. 111 | Copyright © 2017 Criteo Wait, 500 tables in 3 years? That's a lot of data modelling!
  112. 112. 112 | Copyright © 2017 Criteo Analysts contribute to the data model • Engineers know how the DB works and know how to optimize a data model, but they don't always know what to put in it • With good tools Analysts contribute to the evolutions of the data model, including schema additions and modifications • Engineers in the team can help guide them in the finer details • Rinse and repeat
  113. 113. 113 | Copyright © 2017 Criteo Side bar: We also had dashboards with SSRS But we were told it was ugly and complicated. We traded ugly for slow, btw, and it's still complicated
  114. 114. 114 | Copyright © 2017 Criteo From SSRS to Tableau and SQL Server to Vertica • Actually, "slow" is just our current perception—we had SSRS dashboards with timeouts on the order of hours. • SSRS served as our de facto ETL between those 10K+ SQL Server DBs • Those SQL Server DBs were also production databases.
  115. 115. So to Summarize the Past
      • Analysts had to query across thousands of DBs
      • Dashboards were slow and complicated
      • Analytics work was strongly coupled to production
      Life was great back then, wasn't it?
  116. 116. 116 | Copyright © 2017 Criteo We're done then? Not quite. Things can go awry!
  117. 117. 117 | Copyright © 2017 Criteo The Ghost of Christmas Future ...here's hoping it's a near future...
  118. 118. 118 | Copyright © 2017 Criteo Criteo is World Wide We have hundreds of analysts spread across dozens of countries!
  119. 119. 119 | Copyright © 2017 Criteo Criteo has a Rich Product Offering • Banner Ads, Mobile, In-App, Email, Search • 10's of Thousands of Advertisers and Publishers • Some of them very big and very demanding
  120. 120. And (reminder!) our Scale Never Seems to Stop Growing [chart: growth of a single dataset since July 2014, y-axis from 0 to 1.4E+11]
  121. 121. (reminder #2) Number of analysts hired since 2010 [chart: y-axis from 0 to 180]
  122. 122. 122 | Copyright © 2017 Criteo What could go wrong?
  123. 123. New Challenges
      • With so many hungry analysts to feed and with so much volume and variety of data, Vertica's query planner is working overtime
      • We need to instrument and monitor more
      • We need to level up analysts' SQL skills
      • And yes, finally, we do need some data governance*
      *Oh, how I've resisted this day!
  124. 124. 124 | Copyright © 2017 Criteo 2 Analysts and 3 Engineers ain't gonna cut it • We have scaled up our PM team • We are moving from a proto-CoE team to an official CoE team • We are scaling engineering operations
  125. 125. 125 | Copyright © 2017 Criteo What's on the TODO list? • Documentation, and automating it as much as possible • Non-invasive, but very intimate query monitoring • Workload isolation • Query suggestions and preëmptive query blocking
  126. 126. 126 | Copyright © 2017 Criteo More about query inspection • No matter how wonderful a database may be its performance comes down to how much IO it has and how much contention there is for it • The difference between a poorly optimized query and a well optimized one for the IO subsystem can be orders of magnitude • Better queries means more concurrent, happier users
  127. 127. 127 | Copyright © 2017 Criteo More about query inspection • Vertica offers lots of ways to find out what is going on behind the scenes, but one of the best ways is to EXPLAIN your users' queries and identify those who need to be trained!
  128. 128. 128 | Copyright © 2017 Criteo Recalling our Current Challenges • Tableau Workbooks are Slow • Vertica is Overloaded • Reporting Data is Frequently Late
  129. 129. Patches and the Arc of History
      • Each of our current challenges can be addressed in the short term
      • But we need long-term solutions to avoid regressions
  130. 130. 130 | Copyright © 2017 Criteo Tableau Relief Program (TaRP) Short Term: • Double the cores on production server • Isolate critical workbooks Medium Term: • Require all production workbooks to go through gerrit/git review • Score workbook complexity pre-release • Monitor released workbooks for QoS Not So Long Term: • Work with Product and Central Ops to create Tableau Center of Excellence and level up BI
  131. 131. 131 | Copyright © 2017 Criteo TaRP: reporting alchemy Push to production Productive Analyst Angry Sales Person No SLA dataset Productive Analyst Happy Sales Person SLA dataset Push to review Automated deploy Knowledgeable Analyst
  132. 132. 132 | Copyright © 2017 Criteo Why impose a dev cycle on report building? not to be trite, but, well: that's good money!
  133. 133. More seriously
      • Tableau workbooks consume data
      • Data comes in all sorts of volumes and velocities (sorry)
      • Data query complexity is linked to workbook complexity and features
      • If you don't know what you're doing, your workbooks will be:
        • slow, because of internal workbook complexity
        • slow, because of complex database queries
        • out of date, if they don't query the proper data sources
      Tableau workbook developers are developers, full stop. Treat them like they are.
  134. 134. Vertica Roadmap [architecture diagram; components: Consul, RTIngester, HDFSIngester, JVMIngester, HLL, JDBC, VProxy, Admin, VIcO, DataDisco]
  135. 135. 135 | Copyright © 2017 Criteo Vertica as a Service Short Term: • Scale out as fast as reasonable • Split reporting and ad hoc workloads • Better hardware configuration • More monitoring Not So Long Term: • Better monitoring • Control Input: Trickle and Bulk Loading, Consistently, Durably and Efficiently • Control Output: Query inspection/prioritization, Workload management
  136. 136. 136 | Copyright © 2017 Criteo Fixing Your Latent Data Problem Short Term: • Migrate critical data workflows to Langoustine • Optimize DAG and long running queries Medium Term: • Migrate long-tail datasets to Langoustine • Better metrics, capacity planning Not So Long Term: • Refactor data model to cull useless data sets • Better complexity analysis of workflow modifications pre-release
  137. 137. 137 | Copyright © 2017 Criteo We're going to need better instrumentation Better Workflow Insights in Langoustine Better Hadoop Job Performance Metrics
  138. 138. 138 | Copyright © 2017 Criteo Let's spend less time making data workflows Langoustine IDE makes building Hive workflows trivial
  139. 139. 139 | Copyright © 2017 Criteo Langoustine IDE promotes best practices Workflows are source controlled: Reviews are built-in:
  140. 140. We'll need better dev tools (e.g. dev-cluster): build an AWS Hadoop cluster, connect to it via a local Docker container, and load it with data saved in S3 [screenshots]
  141. 141. 141 | Copyright © 2017 Criteo SLAB: SLA Boards That Say A Lot
  142. 142. 142 | Copyright © 2017 Criteo Wait, what about Opera and Vizatra? didn't you guys do a lot of work on that?
  143. 143. 143 | Copyright © 2017 Criteo A Quick Opera Recap Opera is the internal replacement for CPOP, built in two parts A scalding-langoustine data pipeline: And a vizatra-OLAP web app:
  144. 144. 144 | Copyright © 2017 Criteo We learned a lot from building Opera • How to use SQL to describe a dashboard • How to master SQL queries executed from an OLAP app • How to build big, fast databases • How to build optimal (or so we think) data processing pipelines • How to make a decent UI with decent UX
  145. 145. 145 | Copyright © 2017 Criteo Let's focus on the SQL stuff
  146. 146. Using SQL for dashboard meta-data
          SELECT time_id as hour,
                 country_code as country,
                 network_id as network,
                 SUM(clicks) as clicks,
                 SUM(displays) as displays,
                 SUM(clicks) / SUM(displays) as ctr
          FROM facts
          WHERE time_id BETWEEN ?start AND ?end
          GROUP BY time_id, country_code, network_id
      Callouts: time dimensions (time_id), dimensions (country_code, network_id), metrics (the SUM expressions), parameters (?start, ?end)
  147. 147. 147 | Copyright © 2017 Criteo Using SQL for dashboard meta-data Time dimension Dimensions Metrics Parameters
  148. 148. 148 | Copyright © 2017 Criteo Big-O(lap) SELECT time_id as hour, country_code as country, network_id as network, SUM(clicks) as clicks, SUM(displays) as displays, SUM(clicks) / SUM(displays) as ctr FROM facts WHERE time_id BETWEEN ?start AND ?end GROUP BY time_id, country_code, network_id PROJECTION Revenue by country SELECTION Last 7 days in EUR
  149. 149. 149 | Copyright © 2017 Criteo Big-O(lap) SELECT time_id as hour, country_code as country, network_id as network, SUM(clicks) as clicks, SUM(displays) as displays, SUM(clicks) / SUM(displays) as ctr FROM facts WHERE time_id BETWEEN ?start AND ?end GROUP BY time_id, country_code, network_id PROJECTION Revenue by country SELECTION Last 7 days in EUR
  150. 150. Big-O(lap)
          SELECT country_code as country,
                 SUM(clicks) as clicks,
                 SUM(displays) as displays
          FROM facts
          WHERE time_id BETWEEN '2016-03-01' AND '2016-03-07'
          GROUP BY country_code
      PROJECTION: Revenue by country
      SELECTION: Last 7 days in EUR
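Slides 146 to 150 show a full dashboard query being narrowed to only the columns a given view needs. A minimal sketch of that idea follows, assuming the dashboard is described as a map of dimensions and metrics. The `dashboard` object, `pick` and `buildQuery` are hypothetical and are not Vizatra's actual model; the `?start` and `?end` placeholders are kept so that values are bound as parameters rather than concatenated into the SQL.

```javascript
// Hypothetical dashboard definition mirroring the SQL on the slides.
const dashboard = {
  table: "facts",
  timeColumn: "time_id",
  dimensions: { hour: "time_id", country: "country_code", network: "network_id" },
  metrics: {
    clicks: "SUM(clicks)",
    displays: "SUM(displays)",
    ctr: "SUM(clicks) / SUM(displays)",
  },
};

// Resolve requested aliases against the definition, rejecting unknown names.
function pick(map, names, kind) {
  return names.map((name) => {
    if (!Object.prototype.hasOwnProperty.call(map, name)) {
      throw new Error(`unknown ${kind}: ${name}`);
    }
    return { alias: name, expr: map[name] };
  });
}

// Build a query containing only the projected dimensions and metrics.
function buildQuery(definition, projection) {
  const dims = pick(definition.dimensions, projection.dimensions, "dimension");
  const mets = pick(definition.metrics, projection.metrics, "metric");
  const select = [...dims, ...mets].map((c) => `${c.expr} as ${c.alias}`).join(", ");
  let sql =
    `SELECT ${select} FROM ${definition.table} ` +
    `WHERE ${definition.timeColumn} BETWEEN ?start AND ?end`;
  if (dims.length > 0) {
    sql += ` GROUP BY ${dims.map((d) => d.expr).join(", ")}`;
  }
  return sql;
}

console.log(buildQuery(dashboard, { dimensions: ["country"], metrics: ["clicks", "displays"] }));
```

Dropping unused dimensions from both the SELECT list and the GROUP BY is where the saving comes from: the database aggregates to far fewer groups than the full definition would produce.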
  151. 151. 151 | Copyright © 2017 Criteo Now that we've gotten intimate with SQL... Let's see what else we can build...
  152. 152. 152 | Copyright © 2017 Criteo Vizatra Client: One DB Client to Rule Them All
  153. 153. 153 | Copyright © 2017 Criteo Vizatra Client: One DB Client to Rule Them All • Parse every query and analyze complexity before executing it • Enforce best practices (e.g. predicates on partitions) • Degrade gracefully (e.g. don't submit queries to an overloaded DB) • Score users and queries, share with other users • Provide basic visualizations to increase analytic productivity • Support non-SQL datasources • And your feature?
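Two of the client behaviours listed on slide 153, enforcing a predicate on the partition column and not submitting queries to an overloaded database, can be illustrated with a deliberately naive pre-flight check. This sketch uses regular expressions where a real client would use a proper SQL parser, and `checkQuery`, its options and the load figures are all hypothetical.

```javascript
// Naive, regex-based sketch; a real client would use a proper SQL parser.
// `partitionColumn` is assumed to be a plain identifier (letters, digits, underscore).
function checkQuery(sql, options) {
  const { partitionColumn, currentLoad, maxLoad } = options;

  // Degrade gracefully: refuse early instead of piling onto a busy database.
  if (currentLoad >= maxLoad) {
    return { allowed: false, reason: "database overloaded, retry later" };
  }

  // Extract the text between WHERE and the next clause (or the end of the query).
  const whereMatch =
    /\bwhere\b([\s\S]*?)(\bgroup\s+by\b|\border\s+by\b|\blimit\b|$)/i.exec(sql);
  const whereClause = whereMatch ? whereMatch[1] : "";

  // Best practice: the query must filter on the partition column.
  const columnPattern = new RegExp(`\\b${partitionColumn}\\b`, "i");
  if (!columnPattern.test(whereClause)) {
    return {
      allowed: false,
      reason: `missing predicate on partition column ${partitionColumn}`,
    };
  }
  return { allowed: true, reason: "ok" };
}

const options = { partitionColumn: "time_id", currentLoad: 0.4, maxLoad: 0.9 };
console.log(checkQuery("SELECT country_code FROM facts GROUP BY country_code", options));
console.log(
  checkQuery(
    "SELECT country_code FROM facts WHERE time_id BETWEEN ?start AND ?end GROUP BY country_code",
    options
  )
);
```

Returning a reason alongside the verdict fits the slide's aim of training users: a rejected query can tell the analyst what to change.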
  154. 154. 154 | Copyright © 2017 Criteo The End. Thanks for listening. If any of this sounds fun, we're hiring!
