Agile Data Science

Is Agile Data Science just two buzzwords put together? I argue that agile is a practical, applicable methodology that works well in the real world for all sorts of analytics and data science workflows.

http://theinnovationenterprise.com/summits/digital-web-analytics-summit-london-2015/schedule

Published in: Data & Analytics
  • @Nick Mead That's Tableau software.
  • What is the tool you are using on slide 41? It looks like some kind of pivot table / olap query.
  • @samthemonad We use Git for all prod code, of course. Our dashboards, ETL, calculators and the library for interactive analytics are all in Git. Dropbox is for notebooks only.

  1. Agile Data Science. Volodymyr (Vlad) Kazantsev (volodymyrk), Head of Data Science at Product Madness, 2015. © All rights reserved.
  2. What are we trying to solve?
  3. What do we want to achieve? Happy Data Team; Transparency over delivery and priorities; Minimize Waste; Deliver lots of Value.
  4. What do we do?
  5. Heart of Vegas in (public) Numbers. Top Grossing Games rank (US / AU): iPhone 29 (+1) / 1 (+1); iPad 8 (+2) / 1 (-); Android 16 (+2) / 1 (-); Facebook 5 (+1). Source: App Annie, 2nd of March.
  6. Data Team (8 people; 4 in London). Analytics: ad-hoc analytics and daily fires; dashboards. Data Science: deep dive analysis; predictive analytics. Data Engineering: ETL, data viz tools, R&D, DBA.
  7. Technology Stack: ETL orchestration; Transformation & Aggregation; SQL; Data Products; Reports; Dashboards.
  8. Technology Stack: ETL orchestration; Transformation & Aggregation; SQL; Data Products; Reports; Dashboards.
  9. A few examples: A/B Tests (A vs. B); Customer Lifetime Value ($ value over days); Segmentation (groups 1 to 4).
  10. Data Scientist: Coding; Maths and Stats; Business and Marketing expert.
  11. Lesson 1: Agile Philosophy for Data Science
  12. Agile Manifesto: Individuals and interactions over processes and tools; Working software over comprehensive documentation; Customer collaboration over contract negotiation; Responding to change over following a plan. (Source: agilemanifesto.org)
  13. Agile Data Science Manifesto: Individuals and interactions over processes and tools; Actionable insights over comprehensive reports; Customer collaboration over project negotiation; Responding to change over following a plan.
  14. Individuals and interactions over processes and tools: “If a building doesn’t encourage [collaboration], you’ll lose a lot of innovation and the magic that’s sparked by serendipity” - Steve Jobs
  15. Individuals and interactions over processes and tools: Standing Desks + Easily Available Whiteboard
  16. Agile Principles: Iterative, incremental and evolutionary (iterations, timeboxed estimates); Efficient and face-to-face communication (no to tasks by email with no face-to-face); Very short feedback loop and adaptation cycle (daily standups, pair analysis); Quality focus (verifiable, reproducible findings).
  17. Scrum-Ban in Data Science @ProductMadness ● Weekly cycle ● Daily standup meeting @10am ● ToDo/WIP/Waiting buckets are kept small ● Disruptions to weekly plan are expected ● On-demand planning
  18. Data Science Board
  19. Lesson 1: Agile methods in Data Science. (1) Co-location matters; keep a whiteboard next to your desk. (2) Work with the decision maker; share preliminary findings. (3) Make a research plan; pivot early. (4) Book a “Findings” meeting before the project starts. (5) MVP for Data Products. (6) Do Daily Stand-ups!
  20. Lesson 2: Agile Velocity vs. Acceleration
  21. What is Agile Acceleration? Waterfall vs. Scrum. Velocity = Units of Work / Time Interval; ΔVelocity = Acceleration × ΔTime.
  22. a = F / m. “I run SQL, copy-paste data to Excel and send it by email”; “I created a deep neural network to predict high spenders”.
  23. Case Study: to Git or not to Git. Scripts (ruby, bash, python); Python Apps; Python Modules; IPython Notebooks; Research Documents (word); Presentations (powerpoint); Spreadsheets (excel).
  24. Case Study: Git or not to Git. Scripts (ruby, bash, python); Python Apps; Python Modules; IPython Notebooks ?; Research Documents (word); Slides (powerpoint); Spreadsheets (excel).
  25. Case Study: Git or not to Git. Scripts (ruby, bash, python); Python Apps; Python Modules; IPython Notebooks; Research Documents (word); Slides (powerpoint); Spreadsheets (excel).
  26. Remove unnecessary weight
  27. Friction
  28. Friction: Mini Case Studies. re.dash for self-service analytics; cloud-hosted Jupyter notebooks.
  29. Lesson 2: find the lightest suitable tool. (1) IPython notebooks: Dropbox over Git. (2) Google Slides over PowerPoint; Google Slides over email with images. (3) Google Spreadsheets over Excel. (4) Podio over Jira. (5) Data transformations in the DWH in SQL over Hadoop. (6) re.dash over SQL Workbench + CSV export + Excel. (7) Hosted Jupyter over local Python.
  30. Lesson 3: Focus on Closing the Loop
  31. Iterative development: 7-30 days
  32. Scrum for Data Science? Assumptions: ● Motivated ninjas ● Isolated and co-located team ● Clear direction ● You can estimate work. Reality: ● Unicorns are rare ● Constant interruption; 3 locations ● Lots of unknown-unknowns ● You can estimate very little
  33. Analytics Loop (Data Science @work): Spot Opportunity; Ask the Right Question; Make Decision; Improve the Business.
  34. Analytics Spiral: Ideas & Questions; Data Analysis; Insights; Impact.
  35. Limit the number of Open Loops. Always prefer to have 90% of tasks 100% complete over 100% of tasks 90% complete. [Figure: two sets of task-completion percentages compared, one with every task partly done (60-90%), one with most tasks at 100% and the rest at 0%.]
  36. Lesson 3: Focus on Closing the Loop. (1) Don’t build predictive models that you can’t act upon, and don’t analyse things that cannot help to make a decision. (2) The best way to deal with the Analytics Spiral is to avoid the spiral: practise the Crack a Case and “what if” methods. (3) Limit the number of “open loops”.
  37. Lesson 4: Reproducibility Matters
  38. To the [image] and back!
  39. Why? Because: Boss: “Great! Can you run this for all monthly cohorts?”
  40. Why? Because: Boss: “Sam is on holiday. Can you re-run his analysis?”
  41. A Few IPython Tips
  42. Import all commonly used tools in one line. All access and security is abstracted away. Focus on SQL, not data access. Formatting and publishing a .png takes one line of code. PyCharm has a great SQL editor.
  43. Lesson 4: Reproducibility ● Get rid of Windows and you get rid of Excel ● ipynb files are always shared and versioned; prefer simple cloud sharing to VCS ● Streamline data access functions ● Cache long-running code and queries ● Develop a common library
  44. In Summary...
  45. Summary ● Agile approach works well for Data Science ● Find the lightest suitable tool for a task ● Reproducibility is not negotiable ● Focus on closing the loop(s)
  46. Thank You! We Are Hiring! jobs.productmadness.com. volodymyr.kazantsev@productmadness.com, volodymyrk. 2015 © All rights reserved.
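
Lesson 4 (slide 43) recommends streamlining data access functions and caching long-running code and queries, but the transcript does not show how the team did it. The following is a minimal sketch of one way to do this in Python, not the author's implementation. The names `cached_query` and `make_demo_runner`, the `query_cache` directory and the `purchases` table are all hypothetical; an in-memory SQLite database stands in for a real data warehouse.

```python
import hashlib
import pickle
import sqlite3
from pathlib import Path


def cached_query(sql, run_query, cache_dir="query_cache"):
    """Return rows for `sql`, reusing a pickled result from disk when one exists.

    `run_query` is any callable that takes SQL text and returns rows; this is
    an assumption made for illustration, not an API from the talk.
    """
    cache_path = Path(cache_dir)
    cache_path.mkdir(parents=True, exist_ok=True)
    # Key the cache file on the exact SQL text, so editing the query forces a re-run.
    key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
    cache_file = cache_path / f"{key}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as fh:
            return pickle.load(fh)
    rows = run_query(sql)
    with cache_file.open("wb") as fh:
        pickle.dump(rows, fh)
    return rows


def make_demo_runner():
    """Hypothetical stand-in for a warehouse connection: an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (cohort TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?)",
        [("2015-01", 10.0), ("2015-01", 5.0), ("2015-02", 7.5)],
    )
    calls = []  # records every query that actually reaches the database

    def run_query(sql):
        calls.append(sql)
        return conn.execute(sql).fetchall()

    return run_query, calls


if __name__ == "__main__":
    import tempfile

    run_query, calls = make_demo_runner()
    sql = "SELECT cohort, SUM(amount) FROM purchases GROUP BY cohort ORDER BY cohort"
    with tempfile.TemporaryDirectory() as tmp:
        print(cached_query(sql, run_query, cache_dir=tmp))  # hits the database
        print(cached_query(sql, run_query, cache_dir=tmp))  # served from the cache
    print("database calls:", len(calls))
```

Keying the cache on the query text keeps re-runs of a notebook cheap, but a cached result goes stale when the underlying data changes, so the cache directory has to be cleared to refresh it.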
