Hardcore Data Science - in Practice

My talk, given in the Hardcore Data Science track at O'Reilly's Strata + Hadoop World in London on June 1, 2016.

Published in: Software

  1. Hardcore Data Science - in Practice • Dr. Mikio L. Braun, Delivery Lead for Recommendation and Search • StrataConf 2016, London • mikio.braun@zalando.de • @mikiobraun • tech.zalando.com
  2. Mikio Braun, Hardcore Data Science in Practice, Strata+Hadoop World 2016, London • 15 countries, 3 warehouses, 16+ million customers, €3bn revenue in 2015, … • Heavily using data science for recommendation
  3. Recommendations
  4. Data-Driven Recommendations • Collaborative filtering • Content-based recommendation • Personalised recommendations • …
  5. For Example, One-Pass Ranking Models (Freno, Jenatton, Saveski, Archambeau, “One-Pass Ranking Models for Low-Latency Product Recommendations”, KDD 2015)
  6. Hardcore Data Science to Production • Usually a one-shot computation • Sometimes done in Python • Getting raw data is hard initially
  7. Production System • Real-time system • Usually Java/JVM-based • Events and article data continually updated
  8. Data Science vs. Production • A/B Test offline evaluation • Iterate on the data science part • Iterate on the whole system!
  9. Data Scientists and Developers
  10. DS&D: Coding • Very different approaches to coding… • ← developers, data scientists →
  11. DS&D: Collaboration • What is the most productive way? • Ideally, interface on code, not just documentation • Production logs often become data analysis input!
  12. Organization • Cross-functional teams • Communication! • Microservices, at Zalando: STUPS (Docker on AWS)
  13. Summary • “Static” data analysis vs. production: real-time, frequently updated and monitored • Facilitate fast iteration of both data analysis and the production system • Data scientists and developers: different approaches, find common ground • Organizations: cross-functional teams, microservices
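
Slide 4 lists collaborative filtering, and slide 11 notes that production logs often become data analysis input. The following is a minimal sketch of how those two ideas can meet: item-based collaborative filtering computed directly from (user, item) interaction events. It is not the method described in the talk or used at Zalando; the event format, the function names `item_similarities` and `recommend`, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(events):
    """Cosine similarity between items, based on the sets of users who touched them.

    `events` is an iterable of (user, item) pairs (a hypothetical log format).
    """
    users_by_item = defaultdict(set)
    for user, item in events:
        users_by_item[item].add(user)

    sims = defaultdict(dict)
    items = sorted(users_by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            shared = len(users_by_item[a] & users_by_item[b])
            if shared:
                score = shared / sqrt(len(users_by_item[a]) * len(users_by_item[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def recommend(events, user, top_n=3):
    """Rank unseen items by their summed similarity to the items the user has seen."""
    sims = item_similarities(events)
    seen = {item for u, item in events if u == user}
    scores = defaultdict(float)
    for item in seen:
        for other, score in sims[item].items():
            if other not in seen:
                scores[other] += score
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up sample events, standing in for parsed production logs.
events = [
    ("u1", "sneaker"), ("u1", "jeans"),
    ("u2", "sneaker"), ("u2", "jeans"), ("u2", "jacket"),
    ("u3", "jeans"), ("u3", "jacket"),
    ("u4", "sneaker"),
]

print(recommend(events, "u4"))  # ['jeans', 'jacket']
```

This is the "one-shot computation" style of slide 6: it recomputes every similarity on each call. A production system along the lines of slide 7 would presumably keep the similarities updated as events arrive instead of rebuilding them per request.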
