Open Data Science Conference Agile Data

1,235 views

To rephrase an old saying: ‘It takes a village to raise an Analyst.’ Data Analysts and Scientists are working in teams delivering insight and analysis on an ongoing basis. So how do you get the team to support experimentation and insight delivery without ending up in an IT Engineer vs Analyst vs Data Governance war? We present 5 shocking steps to get these teams of people working together with practical, doable steps that can help you achieve data agility.

Published in: Data & Analytics

  1. 1. AGILE DATA Christopher Bergh Head Chef, DataKitchen OPEN DATA SCIENCE CONFERENCE BOSTON 2015 @opendatasci
  2. 2. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  3. 3. WHO AM I Algorithm Nerd: Columbia, MIT, NASA-Ames; ATC Automation. Into in 1990: Fuzzy Logic, Neural Networks, Constraint Satisfaction; Unix/C. Software Nerd: CTO, Dir Engineering, VP Product Management. Into in 2000: Management of Software Teams & Startups; PowerPoint. Data Nerd: COO: ETL Engineers, Analysts & Analytic Tool. Into in 2010: W. Edwards Deming, Data, Bootstrapping; Excel Hacking.
  4. 4. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  5. 5. SO WHAT IS THE PROBLEM? In one word ….
  6. 6. LOTSA Technologies in Analytics
  7. 7. LOTSA People In Analytic Teams DATA SCIENTIST, REPORTING ANALYST, ETL ENGINEER, DATABASE ARCHITECT, DEV OPS ENGINEER, DATA GOVERNANCE
  8. 8. LOTSA Data & Analysis ONE-OFF, RE-USE
  9. 9. LOTSA Missed Expectations [Figure: Business Customer Expectation compared with Analyst Reality, with segments Prepare Data, Analyze, Communicate] The business does not think that Analysts are preparing data. Analysts don’t want to prepare data.
  10. 10. Complexity Another Field, Software Development, Ran into the Same Problems With Complexity ... … They Used Something Called ‘Agile’ To Solve The Problem
  11. 11. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  12. 12. AGILEMANIFESTO.ORG 5/31/2015
  13. 13. AGILEMANIFESTO.ORG analytics
  14. 14. s/software/analytics/
  15. 15. PRACTICES THAT ARE EASY TO APPLY: Development Sprints; User Stories; Daily Meetings; Defined Roles; Retrospectives; Pair Programming; Burn Down Charts
  16. 16. SOME PRACTICES HAVE BEEN DIFFICULT TO APPLY: Test Driven Development; Branching And Merging; Refactoring; Small Releases; Frequent Or Continuous Integration; Experimentation For Learning; Individual Development Environments
  17. 17. AGILE – WHAT IS UNIQUE TO ANALYTICS? PUT THE ANALYST AT THE CENTER
  18. 18. AGILE – WHAT IS UNIQUE TO ANALYTICS? ANALYTICS PERCEIVED VALUE DECAY CURVE
  19. 19. AGENDA Who Am I? What Is The Problem? A Look At Agile Through Data Lens How To Do Agile Data In Five Shocking Steps
  20. 20. 1. MANAGE YOUR WORK LIKE CODE Why? Your work is just code: models, transforms, etc. Use a source code control system (like Git) to enable: Branching, Merging, Diff
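To make the "Diff" benefit in step 1 concrete, here is a minimal sketch. It does not call Git; it uses Python's standard `difflib` as a stand-in to show the kind of line-level comparison a source control system gives you once a transform lives in a text file. The SQL, the file name `revenue.sql`, and the branch labels are hypothetical.

```python
import difflib

# Two hypothetical versions of the same SQL transform, as they might
# appear on a production branch and on a feature branch.
old_transform = """\
SELECT region, SUM(amount) AS revenue
FROM sales
GROUP BY region
"""

new_transform = """\
SELECT region, SUM(amount) AS revenue
FROM sales
WHERE amount > 0
GROUP BY region
"""


def diff_transform(old, new):
    """Return the unified diff between two versions of a transform as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="revenue.sql (production)",
            tofile="revenue.sql (feature branch)",
            lineterm="",
        )
    )


changes = diff_transform(old_transform, new_transform)
print("\n".join(changes))
```

The only added line in the diff is the new `WHERE` clause, which is exactly what a reviewer needs to see before a merge.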
  21. 21. 2. TEST AND CONTAIN Test: 1. Create and monitor tests 2. Test on separate data from production 3. Run tests early and often 4. Target 20% of code for tests. Unit Tests & Systems Test … Keep Adding & Improving. Contain: 1. Break up your work into components 2. Manage the environment for each component (e.g. Docker, AMI) 3. Practice Environment Version Control
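As an illustration of the "Test" half of step 2, the sketch below unit-tests a small transform against hand-made data kept apart from production. The transform `revenue_by_region`, the rule that non-positive amounts are skipped, and the test rows are all assumptions made for the example, not part of the talk.

```python
def revenue_by_region(rows):
    """Hypothetical transform under test: total the amount per region,
    skipping rows whose amount is zero or negative."""
    totals = {}
    for region, amount in rows:
        if amount > 0:
            totals[region] = totals.get(region, 0.0) + amount
    return totals


# Small hand-made test data set, kept separate from production data.
TEST_ROWS = [
    ("east", 10.0),
    ("east", 5.0),
    ("west", 7.5),
    ("west", -2.0),  # negative row that the transform should skip
]


def test_totals_are_correct():
    assert revenue_by_region(TEST_ROWS) == {"east": 15.0, "west": 7.5}


def test_no_negative_revenue():
    assert all(value > 0 for value in revenue_by_region(TEST_ROWS).values())


def test_empty_input():
    assert revenue_by_region([]) == {}


def run_tests():
    """Run every test and return how many passed; any failure raises AssertionError."""
    tests = [test_totals_are_correct, test_no_negative_revenue, test_empty_input]
    for test in tests:
        test()
    return len(tests)


print(run_tests(), "tests passed")
```

Tests written this way are cheap to run on every change, which is what makes "early and often" practical.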
  22. 22. 3. PROVIDE SEPARATE ENVIRONMENTS FOR ANALYSTS Why? Analysts need their own data to iterate, develop & explore.
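One way to picture step 3 is a private copy of the data per analyst. The sketch below is an assumption about how that might look at toy scale, using in-memory SQLite databases and the standard library's `Connection.backup` (Python 3.7+); the `sales` table and the helper `make_sandbox` are hypothetical, and a real warehouse would need its own cloning mechanism.

```python
import sqlite3


def make_sandbox(production):
    """Copy every table and row from `production` into a new in-memory database."""
    sandbox = sqlite3.connect(":memory:")
    production.backup(sandbox)  # standard-library API, Python 3.7+
    return sandbox


# Stand-in for the production database.
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE sales (region TEXT, amount REAL)")
production.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("east", 10.0), ("west", 5.0)]
)
production.commit()

# The analyst experiments freely in a private copy.
analyst_db = make_sandbox(production)
analyst_db.execute("DELETE FROM sales WHERE region = 'west'")
analyst_db.commit()

prod_count = production.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
sandbox_count = analyst_db.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(prod_count, sandbox_count)
```

The destructive delete lands only in the sandbox; the production row count is unchanged.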
  23. 23. 4. SUPPORT THREE TYPES OF WORKFLOWS Small Team: work directly on production. Feature Branch: merge back to production branch. Data Governance: 3rd party verification before production merge. Review, Test, Approve
  24. 24. 5. GIVE ANALYSTS ABILITY TO EDIT DATABASE SAFELY “Best-in-class companies take 12 days to integrate new data sources into their analytical systems; industry average companies take 60 days; and laggards average 143 days.” Source: Aberdeen Group, “Data Management for BI: Fueling the analytical engine with high-octane information.” Figure out how to do this in minutes.
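The slide does not say how to make database edits safe, so the following is only one possible reading of step 5: run each change inside a transaction and keep it only if a validation query still passes. The helper `apply_change_safely`, the `sales` table, and the row-count check are hypothetical, and SQLite stands in for whatever database the team actually uses.

```python
import sqlite3


def apply_change_safely(conn, change_sql, check_sql):
    """Run change_sql in a transaction; commit only if check_sql returns a truthy value."""
    conn.execute("BEGIN")
    try:
        conn.execute(change_sql)
        ok = bool(conn.execute(check_sql).fetchone()[0])
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT" if ok else "ROLLBACK")
    return ok


# isolation_level=None turns off implicit transactions so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("east", 10.0), ("west", 5.0)])

# Validation rule assumed for this example: no rows may be lost.
row_check = "SELECT COUNT(*) = 2 FROM sales"

accepted = apply_change_safely(
    conn, "ALTER TABLE sales ADD COLUMN channel TEXT DEFAULT 'unknown'", row_check
)
rejected = apply_change_safely(
    conn, "DELETE FROM sales WHERE region = 'west'", row_check
)
rows_left = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(accepted, rejected, rows_left)
```

The schema change that preserves all rows is committed, while the delete that violates the check is rolled back, so the table still holds both rows.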
  25. 25. CONCLUSION
  26. 26. CONCLUSION
  27. 27. AGILE DATA Christopher Bergh cbergh@datakitchen.io Questions? Comments? BOSTON 2015 @opendatasci
