
Practical Considerations of Data Science Consulting in Large Organizations - Oct 12 2017

Below this section, the "notes" tab contains the approximate speaker text as delivered at the 2017 Rice Data Science Conference. Unlike the typical data science talk on math, models, or frameworks, this talk discusses the need to manage relationships with people when doing data science consulting and prototyping in a large organization. It covers common traps to avoid, key questions to answer early, how organizational procurement patterns influence tool selection, and the importance of having a good local partner close to the data. The in-person presenter of this talk at Rice Data Science Day was Yulan Lin (https://www.linkedin.com/in/yulanlin/); Justin's slides were recorded in advance.

Published in: Data & Analytics

  1. Practical Considerations for Data Science Consulting and Innovation in a Large Organization. Yulan Lin @y3l2n, Justin Gosses @JustinGosses. Data Science & Software Engineering, Valador Inc., Supporting NASA OCIO. Rice Data Science Conference, Oct. 2017
  2. Why practical considerations?
  3. This talk is NOT from the perspective of... ● Startup ● Built around a software product ● Small companies
  4. This talk IS from the perspective of ● Established organization (> 10 years old) ● Large organization (> 10,000 people) ● Core function is not software/technical. Clarification: not just NASA! Oil & Gas, Banking, Health, etc.
  5. What is Data Science? A grab bag at the intersection of math + code that includes ● Machine Learning ● Deep Learning ● Statistics ● Data Visualization
  6. Roadmap of our talk ● Data access ● How data science adds value ● Influence of procurement constraints ● Communication & narratives ● How data scientists are distributed
  7. Data Access
  8. Does the data exist? ● Is it actually useful for your problem? ● If you’re going to collect it or make it: welcome to data engineering
  9. Is the data programmatically accessible?
  10. How “clean” is the data? (or: how much data translation services do you require from subject matter experts?)
  11. Compliance: all. of. the. rules. ● Data Access ● Data Transfer ● Data Storage ● Data Anonymization ● Data Sharing
  12. Data is currency: ● Power & Politics (at some level) ● Empathy is useful
  13. A good partner helps you navigate: ● If data exists ● Data access ● Data oddities ● Compliance processes ● Data politics
  14. How Data Scientists Add Value
  15. Products
  16. Awareness: Spreading knowledge of what is possible
  17. Capability: Building skills & Bringing in new tools
  18. Procurement Constraints
  19. Does the proposed product fit the organization now and in the future? Consider: ● Skill development ● Workflow ● Tech stack
  20. What is the official process?
  21. What is the culture?
  22. Open-source vs. proprietary
  23. Communication and Design
  24. Data science: What does that even mean? (or why managing expectations is important) Credit: https://xkcd.com
  25. Effective Narratives: Don’t let the Buzzwords + math + programming get in the way of the Business value + project schedule + uncertainty story
  26. Understand as early as possible ● What’s the real problem? ● Does the data exist? ● Can you access the data? ● How clean is the data? ● What is the business value? ● What is the organizational context?
  27. When delivering something that will be used by people: Consider user-centered design
  28. Data Visualization: You’re likely undervaluing it
  29. The distribution of data scientists in an organization
  30. Distribution of Data Scientists [org chart: Executives; Org 1, Org 2, Org 3, Org 4; Team A, Team B; Data Science Team]
  31. Distribution of Data Scientists [same org chart, with an organizational fence]
  32. Distribution of Data Scientists [same org chart, with an organizational fence; labels: Data Problems, Finished Product, Training, Best Practices]
  33. Innovation Lab [same org chart; labels: Data Problems, Training, Best Practices]
  34. Innovation Lab [same org chart; labels: Data Problems, Training, Best Practices, Finished Product]
  35. Embedded + Rotations [same org chart; labels: Training, Best Practices]
  36. Embedded + Rotations [same org chart; labels: Training (twice), Best Practices (twice), Projects, Learning]
  37. Centralized Consultancy [same org chart, plus a Project Working Group; labels: Data Problems, Training (twice), Best Practices]
  38. Centralized Consultancy [same org chart, plus a Project Working Group; labels: Data Problems, Training (twice), Best Practices]
  39. Centralized Consultancy [same org chart, plus a Project Working Group; labels: Data Problems, Training (twice), Best Practices, Finished Product]
  40. How to grow data science in an org? Top-down vs. grassroots. Data / Systems. Skills / Culture.
  41. Wrap-up
  42. Data scientists need to manage “outward” into many parts of an organization
  43. All of these can make or break a project ● Data access: Will you need to navigate legacy systems and/or data owners? ● Value of data science: Is the project’s business value well defined? ● Procurement constraints: Can a project operationalize/grow within the org? ● Communication & design: Is the right information flowing effectively? ● Organizational structure: What are the pros/cons of your structures/workflow?
  44. Thanks, and keep in touch! Justin Gosses @JustinGosses, Yulan Lin @y3l2n
