Building a Data Science Capability: how to get started and how to succeed


Keynote talk given at Data Leaders Summit Europe in October 2018.

Published in: Data & Analytics

  1. Building a Data Science Capability: How to Get Started and How to Succeed. Enda Ridge, Chief Data Scientist, Sainsbury’s. #GuerrillaAnalytics @enda_ridge
  2. Understand what Applied Science is
  3. The Overarching Challenge: organisations cannot accommodate flexibility. PhD (‘Design of Experiments for Tuning Algorithms’); Data mining; Software pre-sales; Forensic Data Analytics; Senior Manager, Professional Services; Chief Data Scientist.
  4. About my organisation (and yours): 2nd largest grocery retailer in the UK (& General Merchandise & a Bank); almost 150 years old; employs almost 200,000 colleagues; ~1,500 stores, > 90,000 products; ~250,000 online orders per week. Large established enterprise; legacy systems; traditional approaches to operations; many functional areas; not all pristine digital data.
  5. Challenge: flexibility of scientific equipment. Pitfalls: scale, permission groups, proxy access, local admin rights, licensing, tech support, data feeds. Action: ‘Lab’, data store, app server, desktop tools.
  6. Challenge: flexibility to choose opportunities. Focus on low dependency, high value: smaller data, minimal UI, simple machine learning. Pitfalls: ‘marketing collateral’; delivery black hole; death march projects; not building support for funding in year 2.
  7. Challenge: flexibility with people. Pitfalls: wrong scientists (outputs not commercial, not understood); no engineers (cannot productionise results); no HR engagement (lose your people). Actions: hire scientists who can communicate; hire engineers AS WELL AS scientists; begin the HR conversations early.
  8. Challenge #5: flexibility of a scientific culture. Pitfalls: rejection of recommendations; lack of delivery engagement; you’re unable to quantify success. Actions: find customers willing to experiment; mix teams (science, engineering, business); train business for data literacy.
  9. The Core Roles in an Applied Data Science Team: Head of team; Product; Data Science; Engineering / Infrastructure. Data engineers? Machine Learning Engineers? Advanced analysts? Product managers? Big Data Developers? Data modellers?
  10. A maturity test for Applied Data Science capability (Operating model, Technology, Culture): Capture your objectives? Results reproducible? Analyse robustness of models? Translate model performance to commercial KPIs? Do you build measurable data products? Data pipelines that you can rebuild with one command? Do you use source control? Do you track bugs in your models and pipelines? Do you have scalable compute and storage? Do you manage delivery? Do scientists work alongside engineering? Can Data Scientists choose tools without IT intervention?
  11. Implications of a ‘Science’ capability. It’s not about ‘Big Data’: Kepler didn’t need a Hubble Telescope. It’s not about Machine Learning: that’s a last resort when maths & stats fail. Models that cannot be used are useless: your capability needs more than scientists. Models that won’t be used are useless: are you ready to be scientific?
  12. How to get started: recognise what Applied Science is; flexibility of equipment; flexibility of work you choose; flexibility of people; flexibility of culture.
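
The maturity test on slide 10 reads as a yes/no checklist, so one way a team might track it is sketched below. This is only an illustration and not something shown in the talk: the names `MATURITY_QUESTIONS`, `maturity_score` and `maturity_gaps` are hypothetical, and scoring by the fraction of "yes" answers is an assumption, since the slide does not say how to weigh the questions.

```python
# Hypothetical sketch: the twelve questions are taken from slide 10;
# the scoring rule (fraction of "yes" answers) is an assumption.
MATURITY_QUESTIONS = [
    "Capture your objectives?",
    "Results reproducible?",
    "Analyse robustness of models?",
    "Translate model performance to commercial KPIs?",
    "Do you build measurable data products?",
    "Data pipelines that you can rebuild with one command?",
    "Do you use source control?",
    "Do you track bugs in your models and pipelines?",
    "Do you have scalable compute and storage?",
    "Do you manage delivery?",
    "Do scientists work alongside engineering?",
    "Can Data Scientists choose tools without IT intervention?",
]


def _check_complete(answers: dict[str, bool]) -> None:
    """Raise if any question from the slide has not been answered."""
    missing = [q for q in MATURITY_QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"Unanswered questions: {missing}")


def maturity_score(answers: dict[str, bool]) -> float:
    """Return the fraction of questions answered 'yes' (0.0 to 1.0)."""
    _check_complete(answers)
    yes_count = sum(1 for q in MATURITY_QUESTIONS if answers[q])
    return yes_count / len(MATURITY_QUESTIONS)


def maturity_gaps(answers: dict[str, bool]) -> list[str]:
    """Return the questions answered 'no', in slide order."""
    _check_complete(answers)
    return [q for q in MATURITY_QUESTIONS if not answers[q]]
```

A team could fill in a dictionary of answers once a quarter and use `maturity_gaps` to decide where to focus next; the list of gaps is likely more useful than the single score.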