Big Data and Data Science W's

HiPPO and Flipism are no longer the only ways to make decisions. In the Big Data / Data Science era, one can dream of a data-driven organization. If data were "oil", Big Data technologies would extract, transport, and store it, while Data Science methods would provide a way to "refine the crude oil". This presentation elaborates on the Ws (What, Why, When, Who and How) of Big Data and Data Science.

Published in: Data & Analytics

  1. Big Data & Data Science W's. Emanuele Della Valle @manudellavalle, Prof. @polimi & Founder @fluxedo_
  2. W's 18/06/2018 @manudellavalle - http://emanueledellavalle.org
  3. Why? • In many organizations decisions are made by "questionable" methodologies such as – Highest Paid Person Opinion (HiPPO) – Flipism (all decisions are made by flipping a coin)
  4. Why? Highest Paid Person Opinion (HiPPO)
  5. Why? Flipism (all decisions are made by flipping a coin)
  6. Why? • In many organizations decisions are made by "questionable" methodologies such as – Highest Paid Person Opinion (HiPPO) – Flipism (all decisions are made by flipping a coin) • This could have been the right approach in the '70s … – See the "Theory of Bounded Rationality" by Herbert Simon
  7. Why? [source: http://www.azquotes.com/quote/139996]
  8. Why? • In many organizations decisions are made by "questionable" methodologies such as – Highest Paid Person Opinion (HiPPO) – Flipism (all decisions are made by flipping a coin) • This could have been the right approach in the '70s … – See the "Theory of Bounded Rationality" by Herbert Simon • … but in the Big Data era one can dream of a data-driven organization
  9. Why? • Data-Driven Organization
  10. Why? "Decisions no longer have to be made in the dark or based on gut instinct; they can be based on evidence, experiments and more accurate forecasts." -- McKinsey
  11. Why? • Data-driven organizations – perform better • The data shows where they can streamline their processes – are operationally more predictable • Data insights fuel current and future decision making – are more profitable • Constant improvements and better predictions help to outsmart the competition and improve innovation.
  12. Why? • Moneyball: data + analysis to win games [source: https://www.imdb.com/title/tt1210166/]
  13. What's Big Data? [source: IBM, 2012]
  14. What's Big Data? [source: IBM, 2012]
  15. What's Big Data? [source: IBM, 2012]
  16. What's Big Data? [source: IBM, 2012]
  17. What's Big Data? [source: IBM, 2012]
  18. What's Big Data? • Big Data is "crude oil" … that we have to – Extract – Transport in mega-tankers – Ship through pipelines – Store in massive silos – …
  19. What's Data Science? • Data Science is "refining crude oil" [source: http://allabtinstru.blogspot.com/2016/09/ProcessofRefiningCrudeOil.html]
  20. What's Data Science? • The Science [and Art] of… – Discovering what we don’t know from data – Obtaining predictive, actionable insight from data – Creating Data Products that have business impact now – Communicating relevant business stories from data – Building confidence in decisions that drive business value
  21. Who's a Data Scientist? • Drew Conway, 2010
  22. How? • Statistics starts with data • Two goals of analyzing data – Descriptions: how nature associates responses to inputs – Predictions: response for future input variables [figure: x (independent variable) -> nature -> y (response variable)] [source: Statistical Modeling: The Two Cultures. Leo Breiman, 2001]
  23. How? Leverage more of the data being captured [source: Marc Andrews, 2014]
  24. How? Leverage more of the data being captured [source: Marc Andrews, 2014]
  25. How? Leverage more of the data being captured [source: Marc Andrews, 2014]
  26. How? Reduce effort required to leverage data [source: Marc Andrews, 2014]
  27. How? Reduce effort required to leverage data [source: Marc Andrews, 2014]
  28. What? Reduce effort required to leverage data [source: Marc Andrews, 2014]
  29. How? Data-driven exploration looking for correlation [source: Marc Andrews, 2014]
  30. How? Data-driven exploration looking for correlation [source: Marc Andrews, 2014]
  31. Your butcher …
  32. … at scale!
  33. How? Leverage data as it is captured [source: Marc Andrews, 2014]
  34. How? Leverage data as it is captured [source: Marc Andrews, 2014]
  35. How? Leverage data as it is captured [source: Marc Andrews, 2014]
  36. How? [source: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/]
  37. How? Overall picture by Gartner
  38. Where? Improve public safety and reduce violent crime through data analytics: -41% murders | -27% crimes [source: https://www.ted.com/talks/anne_milgram_why_smart_statistics_are_the_key_to_fighting_crime]
  39. Where?
  40. Where?
  41. What about cybersec?
  42. Credits • Big Data [sorry] & Data Science: What Does a Data Scientist Do? Carlos Somohano, 2013 – https://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do-world • Becoming a data-driven organization: The what, why and how. SAS, 2018 – https://www.sas.com/en_us/whitepapers/becoming-data-driven-organization-109150.html • Never trust summary statistics alone; always visualize your data. Alberto Cairo, 2016 – http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html • 2017 Planning Guide for Data and Analytics. John Hagerty (Gartner), 2016 – https://www.gartner.com/binaries/content/assets/events/keywords/catalyst/catus8/2017_planning_guide_for_data_analytics.pdf
  43. Thank you! Any questions? Emanuele Della Valle @manudellavalle, Prof. @polimi & Founder @fluxedo_
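
Slide 22 names two goals of analyzing data: describing how nature associates responses to inputs, and predicting the response for future inputs. The following is a minimal sketch of both goals, assuming a straight-line model fitted by ordinary least squares. The model choice, the toy data, and the helper names `fit_line` and `predict` are illustrative assumptions and do not come from the slides.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Hypothetical helper for illustration; the slides do not prescribe a model.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Return the fitted response for a new input x."""
    slope, intercept = model
    return slope * x + intercept


# Toy data (made up for this sketch): the response follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

# Description: the fitted coefficients summarize how y responds to x.
model = fit_line(xs, ys)
slope, intercept = model
print(f"description: y changes by {slope} per unit of x (intercept {intercept})")

# Prediction: the same model estimates the response for an unseen input.
new_x = 6.0
print(f"prediction: y({new_x}) = {predict(model, new_x)}")
```

The same fitted object serves both goals: reading the coefficients is the descriptive use, and calling `predict` on an unseen input is the predictive use. Real data would be noisy, so the fit would only approximate the underlying relationship.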
