Scaling Analysis Responsibly

Delivered by Hilary Parker at the 2016 New York R Conference on April 8th and 9th at Work-Bench.

Published in: Data & Analytics

  1. Scaling Analysis Responsibly. Hilary Parker, @hspter
  2. #rcatladies, Not So Standard Deviations, @keegsdur
  3. “We just don’t have enough analysts!”
  4. “Let’s scale by building the perfect BI tool!”
  5. PRODUCT TEAM: “We should automate some of the things that are slowing you down.” DATA: “That sounds great!” (http://xkcd.com/)
  6. “That seems perfectly reasonable! Let’s just enlist some folks from engineering to help you with it.” (Characters: DATA, PRODUCT TEAM)
  7. DATA: “...and finally, can it add this last graph?” ENG: “Sure thing!”
  8. Several months pass…
  9. PRODUCT TEAM: “Can we add these 132 extra metrics to the testing?” ENG: “Sure! File a ticket!”
  10. DATA: “You can’t do that, your family-wise error rate will tend to 1!!” (Characters: ENG, PRODUCT TEAM, DATA)
  11. PRODUCT TEAM: “I’d really like this tool to be more stable.” ENG: “That’s a reasonable expectation for an internal product. I’m on it!”
  12. DATA: “Our test violates a subtle statistical assumption for this new application, and we need to gut this stable product!” (Characters: ENG, PRODUCT TEAM, DATA)
  13. Almost impossible to avoid 2-against-1 dysfunction as product teams become “self-service” with engineering support. Invariably becomes a race to the bottom as internal competition for the simplest tool emerges. Stability is prioritized over flexibility.
  14. (In tech) Building = Owning
  15. Analysis Developer!
  16. “Analysis Developer”: someone on the analyst team who develops reproducible, flexible analyses in R and helps all analysts scale their work.
  17. PRODUCT TEAM: “We should automate some of the things that are slowing you down.” DATA: “I’ll work with the analysis developer on my team!”
  18. Avoids common types of dysfunction. Allows for flexible, accurate analysis. Analysts acquire marketable skills!
  19. Instead of creating dashboards or using static BI tools... (http://dilbert.com/strip/2007-05-16)
  20. A series of R packages highly specified for the business case, with “mix and match” elements to rapidly create common reports. library("internal_package")
  21. Instead of “assembly line” data processing…
  22. Close 2-way partnership with data engineers to optimize the creation of datasets for certain common analyses. “The assembly line handoff from scientist to engineer creates [an uncreative] environment. The trick is to create an environment that allows for autonomy, ownership, and focus for everyone involved.” (Jeff Magnusson, http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/)
  23. Instead of PM anxiously watching dashboards… (https://www.youtube.com/watch?v=CCbWyYr82BM)
  24. Analysts can create shorter-lived, reproducible reports.
  25. Manage expectations about the report’s shorter lifespan, but note that the report will require less work from teams once created. Productionize in the short term with cron jobs. Can add in more stats this way! Y/Y comparisons turn into semiparametric models, etc.
  26. “The Problem with Dashboards (And A Solution)” by Stephanie Evergreen, http://stephanieevergreen.com/problem-with-dashboards/
  27. Instead of promotion based on deliverables… (http://dilbert.com/strip/2004-04-05)
  28. Consider skill acquisition for analyst promotion. Promote analysis developers based on whether they were able to help other analysts become more efficient. Support for skill acquisition!
  29. Education support so that all analysts can learn better analysis development methods. Internally created resources.
  30. Instead of PMs self-teaching analysis based on what’s presented in dashboarding tools... (https://xkcd.com/605/)
  31. PMs can use the tools for educating analysts if they want to “ramp up” on analytical skills like R. This way you can bake in statistical education as well.
  32. “Isn’t this just package development?”
  33. “Isn’t this just package development?” No!
  34. Ad-hoc spreadsheet work
  35. Ad-hoc spreadsheet work → (+ scripting)
  36. Ad-hoc spreadsheet work → (+ scripting) → R workflows
  37. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”)
  38. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”) → Reproducible R analyses
  39. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”) → Reproducible R analyses → (+ workplace-wide audience, documentation, testing; - problem-specific writeups and functions)
  40. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”) → Reproducible R analyses → (+ workplace-wide audience, documentation, testing; - problem-specific writeups and functions) → Internal package development
  41. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”) → Reproducible R analyses → (+ workplace-wide audience, documentation, testing; - problem-specific writeups and functions) → Internal package development → (+ industry-wide audience; - company-specific code and functions)
  42. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”) → Reproducible R analyses → (+ workplace-wide audience, documentation, testing; - problem-specific writeups and functions) → Internal package development → (+ industry-wide audience; - company-specific code and functions) → External package development
  43. Ad-hoc spreadsheet work → (+ scripting) → R workflows → (+ reproducibility, some functions, “analysis testing”) → Reproducible R analyses → (+ workplace-wide audience, documentation, testing; - problem-specific writeups and functions) → Internal package development → (+ industry-wide audience; - company-specific code and functions) → External package development. Roles: Analysis Developer, Open-Source Developer
  44. Analysis Developer: Stop trying to scale with static BI tools; this will (almost) always lead to dysfunction. Instead, scale by increasing analyst efficiency using R and education! Hire Analysis Developers to help with all this!
  45. Thanks! Hilary Parker, @hspter
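
Note on slide 10: the claim that the family-wise error rate tends to 1 can be checked with a short calculation. As a rough sketch, assume that every null hypothesis is true, that the m metrics are tested independently, and that each test uses a significance level of α = 0.05. None of these assumptions is stated in the talk; they are chosen here only for illustration.

```latex
\[
\mathrm{FWER} = 1 - (1 - \alpha)^{m},
\qquad
1 - (1 - 0.05)^{132} = 1 - 0.95^{132} \approx 0.999 .
\]
```

Under those assumptions, at least one false positive among the 132 extra metrics is almost certain, and the rate approaches 1 as m grows.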
