Do Agile Data in Just 5 Shocking Steps!

For over 10 years, we have been doing agile for software development, yet people struggle to do agile for data, BI, and analytics. After a quick review of the Agile Manifesto and its principles, this talk looks at which agile practices have worked for data and which are still hard. Then, with analyst requirements in mind, it reveals the 5 shocking steps to actually do agile with data.

Published in: Data & Analytics

  1. DataKitchen: Do agile data in just 5 shocking steps! Copyright © 2015 by DataKitchen, Inc. All Rights Reserved. By Gil Benghiat (gil@datakitchen.io, @benghiat, @datakitchen_io). Tuesday, May 19, CIC (Cambridge Innovation Center), 1 Broadway, Cambridge, MA.
  2. Agenda
     • Gil & DataKitchen
     • A look at Agile through Data lenses
     • How to do Agile Data
  3. Gil Benghiat – decades working with data
     • Network Management Data
     • Database Management
     • Clinical Trial Data
     • Pharmaceutical Sales Data
     • Data Liberation
     • Data Preparation
     gil@datakitchen.io, @benghiat. Solid Oak Consulting. 6/2/2015.
  4. Data analysts and their teams are spending 60-80% of their time on data preparation and production.
  5. This creates an expectation gap
     [Diagram: "Business Customer Expectation" versus "Analyst Reality", with segments labeled Analyze, Prepare Data, and Communicate]
     • The business does not think that analysts are preparing data
     • Analysts don’t want to prepare data
  6. DataKitchen is on a mission to integrate and organize data to make analysts super-powered.
     • Offering: set-up service, software subscription, UI to integrate data
     • Benefits: data warehouse, eliminate drudgery of repeated integrations
  7. agilemanifesto.org: Switch the word “software” to “analytics”
  8. agilemanifesto.org: s/software/analytics/ (slide annotation: "and excel files"). The switch works for the 12 principles too. Iterate to improve the analytics. Iterate to improve the process.
  9. Agile methodologies contain a number of practices that can apply to data
     • Sprints
     • Stories
     • Prioritization
     • Daily Meeting
     • Defined roles
     • Retrospectives
     • Pair Programming
     • Burn down charts
     • etc.
     The Data Analyst has the central role as the bridge between business and data.
  10. What do analysts and data scientists want? Flexibility & Speed. You need to be fast and produce trustworthy data.
  11. Some practices have been difficult to apply to data
      • Test Driven Development
      • Branching and merging
      • Refactoring
      • Small Releases
      • Frequent or Continuous Integration
      • Experimentation for learning
  12. Do agile data in just 5 shocking steps!
  13. ❶ Add tests
      Types
        1. Error – stop the line
        2. Warning – investigate later
        3. Info – list of changes
      Examples
        1. Input file row count way below a critical threshold
        2. Input file row count a little below a threshold
        3. These customers changed territories
      And keep adding them with each feature developed!
  14. ❷ Manage your transforms like code
      Use a source code control system (like Git) to enable:
      • Branching
      • Merging
      • Diff
  15. ❸ Provide a data environment for each branch. The underlying data is needed to develop and test the code/transformations.
  16. ❹ Support three types of workflows
      • Small Team: promote directly to production
      • Feature Branch: merge back to production branch
      • Data Governance: 3rd party verification before production merge
      [Diagram labels: Review, Test, Approve]
  17. ❺ Give your analysts and data scientists the ability to edit the data warehouse (DW) safely
      Best-in-class companies take 12 days to integrate new data sources into their analytical systems; industry average companies take 60 days; and, laggards average 143 days. Source: Aberdeen Group, "Data Management for BI: Fueling the analytical engine with high-octane information".
      Figure out how to do this in minutes.
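
Step ❶ names three test types (error, warning, info) and gives one example of each. The slides do not show any implementation, so the following is only a minimal sketch of how those three severities might be wired together. Every name here (`Severity`, `CheckResult`, `StopTheLine`, `check_row_count`, `territory_changes`, `run_tests`) and every threshold value is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # stop the line
    WARNING = "warning"  # investigate later
    INFO = "info"        # list of changes


@dataclass
class CheckResult:
    severity: Severity
    message: str


class StopTheLine(Exception):
    """Raised when an error-level test fails and the load must halt."""


def check_row_count(row_count, warn_below, error_below):
    """Return a CheckResult if the input file looks too small, else None."""
    if row_count < error_below:
        return CheckResult(
            Severity.ERROR,
            f"row count {row_count} is below the critical threshold {error_below}",
        )
    if row_count < warn_below:
        return CheckResult(
            Severity.WARNING,
            f"row count {row_count} is below the expected threshold {warn_below}",
        )
    return None


def territory_changes(previous, current):
    """Info-level test: list customers whose territory changed between loads."""
    changed = sorted(
        customer
        for customer, territory in current.items()
        if customer in previous and previous[customer] != territory
    )
    if not changed:
        return None
    return CheckResult(
        Severity.INFO, "customers changed territories: " + ", ".join(changed)
    )


def run_tests(results):
    """Keep warnings and info for the report; raise on the first error."""
    report = []
    for result in results:
        if result is None:
            continue
        if result.severity is Severity.ERROR:
            raise StopTheLine(result.message)
        report.append(result)
    return report
```

In this sketch an error aborts the run through an exception, while warnings and info messages are collected so they can be reviewed later. Adding a test with each new feature then means adding one more function that returns a `CheckResult` or `None`.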
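
Step ❸ asks for a data environment per branch but does not say how to provide one. One possible convention, shown here purely as an assumption, is to derive an isolated database schema name from the source-control branch name so that transforms developed on a branch never write to production tables. The branch name `master`, the schema name `analytics`, and the function `schema_for_branch` are all hypothetical.

```python
import re

# Assumed names for illustration; the slides do not specify any of these.
PRODUCTION_BRANCH = "master"
PRODUCTION_SCHEMA = "analytics"


def schema_for_branch(branch):
    """Map a source-control branch name to the schema its transforms should use."""
    if branch == PRODUCTION_BRANCH:
        return PRODUCTION_SCHEMA
    # Reduce the branch name to characters that are safe in a schema identifier.
    slug = re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_")
    if not slug:
        raise ValueError(f"cannot derive a schema name from branch {branch!r}")
    return f"{PRODUCTION_SCHEMA}_dev_{slug}"
```

With this convention, a branch such as `feature/new-territories` would work against its own schema, and only a merge to the production branch would touch the production schema.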
