Hardcore Data Science - in Practice

My talk given at the Hardcore Data Science Track at O'Reilly's StrataHadoop in London, June 1, 2016.

Published in: Software

  1. Hardcore Data Science - in Practice
       Dr. Mikio L. Braun, Delivery Lead for Recommendation and Search
       StrataConf 2016, London
       mikio.braun@zalando.de, @mikiobraun, tech.zalando.com
  2. Mikio Braun, Hardcore Data Science in Practice, Strata+Hadoop World 2016, London
       • 15 countries, 3 warehouses, 16+ million customers, €3bn revenue in 2015, …
       • Heavily using data science for recommendation
  3. Recommendations
  4. Data Driven Recommendations
       • Collaborative filtering
       • Content based recommendation
       • Personalised recommendations
       • …
  5. For Example, One-pass Ranking Models
       (Freno, Jenatton, Saveski, Archambeau, “One-Pass Ranking Models for Low-Latency Product Recommendations”, KDD 2015)
  6. Hardcore Data Science to Production
       • Usually one shot computation
       • Sometimes done in Python
       • Getting raw data hard initially
  7. Production System
       • Realtime system
       • Usually done in Java/JVM based
       • Events and article data continually upgraded
  8. Data Science vs. Production
       • A/B Test offline evaluation
       • Iterate on data science part
       • Iterate on the whole system!
  9. Data Scientists and Developers
  10. DS&D: Coding
       Very different approaches to coding… (← developers | data scientists →)
  11. DS&D: Collaboration
       • What is the most productive way?
       • Ideally, interface on code, not just documentation
       • Production logs often become data analysis input!
  12. Organization
       • Cross-functional teams
       • Communication!
       • Microservices, at Zalando: STUPS (Docker on AWS)
  13. Summary
       • “Static” Data Analysis vs. Production: Real-time, frequently update & monitor.
       • Facilitate fast iteration of data analysis & production system.
       • Data Scientists and Developers: Different approaches, find a common ground
       • Organizations: Cross-functional teams, micro services
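
Slide 4 names collaborative filtering as one of the data-driven recommendation approaches, but the transcript does not show what it looks like in code. The following is a minimal sketch of item-based collaborative filtering with cosine similarity on binary interactions. It is a generic textbook variant, not the method used in the talk or at Zalando, and the function names (`item_similarities`, `recommend`) and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(interactions):
    """Cosine similarity between items, based on which users touched them.

    `interactions` maps a user id to the set of item ids that user
    interacted with (hypothetical toy format, not a production schema).
    """
    # Invert the mapping: item -> set of users who interacted with it.
    item_users = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            item_users[item].add(user)

    sims = defaultdict(dict)
    items = sorted(item_users)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            overlap = len(item_users[a] & item_users[b])
            if overlap:
                # Cosine similarity for binary vectors.
                s = overlap / sqrt(len(item_users[a]) * len(item_users[b]))
                sims[a][b] = s
                sims[b][a] = s
    return sims


def recommend(user, interactions, sims, k=3):
    """Rank unseen items by summed similarity to the user's seen items."""
    seen = interactions[user]
    scores = defaultdict(float)
    for item in seen:
        for other, s in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda it: (-scores[it], it))[:k]


# Toy data, invented for illustration only.
interactions = {
    "anna": {"sneakers", "jeans"},
    "ben": {"sneakers", "jeans", "jacket"},
    "cleo": {"jeans", "jacket"},
    "dan": {"scarf"},
}
sims = item_similarities(interactions)
print(recommend("anna", interactions, sims))
```

This is the "one shot computation" style from slide 6: the similarity table is computed once over a static snapshot. A production system as described on slide 7 would additionally have to keep these statistics current as new events and article data arrive.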
