2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data


Published in: Technology, Economy & Finance

  1. 1. The growing revolution in Data: A presentation to the Social Business Summit. Nicholas Gruen (Chairman, Kaggle; Chairman, Government 2.0 Taskforce). E [email_address] T @nicholasgruen. Singapore, 6th April, 2011
  2. 2. Outline
     • The changing landscape
         • and what’s behind it
     • The ecology of data
     • Finding the people to find the value in your data
         • Kaggle
     • From data inside to data outside
     • Data and gamification
         • The Gruen Tender
  3. 3. Data can turn things upside down
     • Insurance
     • Retail
     • Banking
     • Telecommunications
     • Accommodation
     • Aviation and transport
         • From stand by to advance purchase
         • load optimisation, price discrimination and risk sharing
     • Medicine
  4. 5. All That Data…
     • 10 retailers to monitor: 10 data points
     • 750 stores per retailer to monitor: 10 x 750 = 7,500 data points
     • 50 products per store to monitor: 10 x 750 x 50 = 375,000 data points
     • 52 weeks of data per year for trend analysis: 10 x 750 x 50 x 52 = 19,500,000 data points
     • 3 years of historical data for comparison: 10 x 750 x 50 x 52 x 3 = 58,500,000 data points
     • 7 types of data to monitor (POS, Inventory, Marketing, Syndicated, etc): 10 x 750 x 50 x 52 x 3 x 7 = 409,500,000 data points
     • 4 regions to segregate the data: 10 x 750 x 50 x 52 x 3 x 7 x 4 = 1,638,000,000 data points
     • 50 states to segregate the data: 10 x 750 x 50 x 52 x 3 x 7 x 4 x 50 = 81,900,000,000 data points
     • 8 categories to aggregate the data: 10 x 750 x 50 x 52 x 3 x 7 x 4 x 50 x 8 = 655,200,000,000 data points
     655 billion+ data points involved with managing the retail sales channel
     Source: Marilyn and Terence Craig @ Strataconf
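     The running tally on slide 5 is a cumulative product: each new dimension multiplies the count reached so far. A minimal Python sketch of that arithmetic follows. The dimension labels and factors are taken from the slide; the names `DIMENSIONS` and `running_totals` are illustrative and not part of the original presentation.

```python
from itertools import accumulate
from operator import mul

# Dimensions and multipliers as listed on the slide, in order.
DIMENSIONS = [
    ("retailers to monitor", 10),
    ("stores per retailer", 750),
    ("products per store", 50),
    ("weeks of data per year", 52),
    ("years of historical data", 3),
    ("types of data to monitor", 7),
    ("regions to segregate the data", 4),
    ("states to segregate the data", 50),
    ("categories to aggregate the data", 8),
]


def running_totals(dimensions):
    """Return (label, factor, cumulative product) for each dimension."""
    factors = [factor for _, factor in dimensions]
    totals = accumulate(factors, mul)  # running product of the factors
    return [
        (label, factor, total)
        for (label, factor), total in zip(dimensions, totals)
    ]


if __name__ == "__main__":
    for label, factor, total in running_totals(DIMENSIONS):
        print(f"{factor:>4} {label:<34} {total:>18,} data points")
```

     Changing any single factor (say, 5 years of history instead of 3) rescales every later total, which is why the count grows so quickly.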
  5. 6.
  6. 7. Jefferson’s enlightenment dream
     “He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation.”
     Thomas Jefferson to Isaac McPherson, August, 1813
  7. 8. Public goods
     Public goods – goods that no-one will supply if the government doesn’t
     “Public goods . . . present serious problems in human organisation.” Vincent and Elinor Ostrom, 1977
  8. 9. Adam Smith
     • The Wealth of Nations (1776)
         • Private Goods
     • The Theory of Moral Sentiments (1759)
         • The social preconditions of markets (Public Goods)
     Language
  9. 10. Public Goods Private Goods [The public good of] Justice . . . is the main pillar that upholds the whole edifice. If it is removed, the great, the immense fabric of human society . . . must in a moment crumble into atoms. Adam Smith
  10. 11. From potential to actual public good
  11. 12. Web 2.0: explosion of emergent public goods
      • Web 2.0 platforms are public goods:
          • Google (1998)
          • Wikipedia (2001)
          • Blogs (early 2000s)
          • Facebook (2004)
          • Twitter (2006)
      • Government didn’t build any of them
      • These platforms generate data
          • By creating a context in which it means something
          • And so inducing us to produce it
  12. 13. The economics of abundance: a new birth of ‘free’dom Public goods . . . present serious problems in human organisation. Vincent and Elinor Ostrom - 1977 The freedom of ideas is the liberation of our species Public goods as a problem Public goods as an opportunity
  13. 14. The ecology of data Data Schema or Context Information An example from Web 2.0 …
  14. 15. Private goods => Public Goods: Software
      Private Goods
      • Meeting private needs
      Public Goods
      • Many eyeballs
      • Free code
  15. 16. Making sense of data
      • Release and the sense makers will come
      • Make sense of the data for your community
          • And you may be able to monetise it
          • Whole businesses being built on data exhaust
      • Find the people to analyse your data
          • Kaggle
  16. 26. FlightCaster
      • Predicts flight delays.
      • We use an advanced algorithm that scours data on every domestic flight for the past 10 years and matches it to real-time conditions. We help you evaluate alternative options and help connect you to the right person to make the change.
      • FlightCaster uses data from:
          • Bureau of Transportation Statistics
          • FAA Air Traffic Control System Command Center
          • FlightStats
          • National Weather Service
  17. 27. Private goods => Public Goods: Data
      Private Goods
      • Meeting private needs
      • Linking to other websites
      Public Goods
      • Google uses this information to rank sites
      • Everyone benefits
      Google monetises with ads
  18. 28.
      Private Goods
      • Platform for recording data
      Public Goods
      • PLM aggregates data and shares it back as public and private goods
      Sales of data
  19. 29. Data exhaust
  20. 30. Where’s Wally?
  21. 31. Global Competitions
      Predicting HIV viral load, accuracy of prediction (1 – 100%): state of the art 70%; 1½ weeks 70.8%; competition closes 77%
      • Revenue or sales forecasts
      • Traffic forecasting
      • Energy demand
      • Predicting crime
      • Tax/social security fraud
      • Hospital casualty demand
      • Identifying great
          • Teachers
          • Schools
          • Hospitals
      • and their best practices
      US$500
  22. 32. “We could not be happier with the result. The Kaggle approach has set a new benchmark in Government for the development of successful predictive models, delivered quickly and very cost effectively. In particular, the flexibility of the winning predictive model will enable its application to other major transport routes to the CBD and allow for the addition of other factors such as weather and incident.” Susan Calvert, Director, Strategy and Project Delivery Unit
  23. 34. Forecast Eurovision Voting Dr. Derek Gatherer, UK Take on the Quants 1 & 2 John Blatz Baltimore Edmund & Adrian London & USA Jason Trigg Pennsylvania Chih-Li Sung & Roy Tseng Penghu & Taipei Jure Zbontar Ljubljana Thomas Mahony Canberra Emir Delic Australia Glen Maher Canberra Predict HIV Chris Raimondi Baltimore Claudio Perlich USA Gzegorz Swiszcz Gera Edmund & Adrian London & USA Tourism Forecasting Part 1 Rajstennaj Barrabas USA Jason Trigg Pennsylvania Felipe Maia Uppsala University Lee Baker Las Cruces, New Mexico INFORMS Cole Harris Texas Nan Zhou Pittsburgh Chess Ratings Uri Blass Tel-Aviv Giuseppe Ragusa Rome Robert Warsaw Tourism Forecasting Part 2 R Package Recommendation Engine Ivan Russian Federation The top 3 competitors for: Philipp Emanuel Widmann Heidelberg, DE Dr. Christopher Hefele, New York Chris DuBois Portland Where’s Wally? Where’s Jeremy? Chris Raimondi Baltimore Felipe Maia Uppsala University Jeremy Howard
  24. 35. Where’s Wally from?
  25. 36. What are Wally’s Qualifications?
  26. 37. Rebuilding an info-structure
      • Global CrisisCommons
      • Within 2 hours of #CCearthquake
      • Global volunteers parse 300,000 tweets.
      • “Shell 58 Barrack Rd out of petrol – only diesel”.
      • Agencies fussed, helped and obstructed.
      • Kaggle comp to triage tweets
  27. 39. New routines to generate data Real estate or other sales Indicated Service provider
  28. 40. Medical procedure Indicated Service provider
  29. 41. Litigation Indicated Service provider
  30. 42. Gruen Tenders
      • Forward looking data
      • Tailored to the specific case at hand
      • Enables innovation and data capture
      • Generate a mass of new data
      • Compares like with like
      • Minimises perverse incentives
  31. 43. E [email_address] T @nicholasgruen
  32. 44. The Public Goods of Web 2.0 SEO Google Page Rank
  33. 45. The Public Goods of Web 3.0 Ontologies followed - with tagging, for same reasons as SEO Ontologies created