There are countless models that can be applied to solve any one predictive analytics problem. It is impossible to know at the outset which technique will be most effective.
Many are academics who want access to real-world data and problems.
The growing revolution in Data: A presentation to the Social Business Summit
Nicholas Gruen
Chairman, Kaggle
Chairman, Government 2.0 Taskforce
E [email_address] T @nicholasgruen
Singapore, 6th April, 2011
All That Data…
10 Retailers to monitor: 10 data points
750 Stores per retailer to monitor: 10 x 750 = 7,500 data points
50 products per store to monitor: 10 x 750 x 50 = 375,000 data points
52 weeks of data per year for trend analysis: 10 x 750 x 50 x 52 = 19,500,000 data points
3 years of historical data for comparison: 10 x 750 x 50 x 52 x 3 = 58,500,000 data points
7 types of data to monitor (POS, Inventory, Marketing, Syndicated, etc): 10 x 750 x 50 x 52 x 3 x 7 = 409,500,000 data points
4 regions to segregate the data: 10 x 750 x 50 x 52 x 3 x 7 x 4 = 1,638,000,000 data points
50 states to segregate the data: 10 x 750 x 50 x 52 x 3 x 7 x 4 x 50 = 81,900,000,000 data points
8 categories to aggregate the data: 10 x 750 x 50 x 52 x 3 x 7 x 4 x 50 x 8 = 655,200,000,000 data points
655 Billion+ data points involved with managing the retail sales channel
Source: Marilyn and Terence Craig @ Strataconf
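The running multiplication on this slide can be checked with a few lines of Python. This is only a sketch: the factor labels and counts are taken from the slide, while the names `FACTORS` and `running_totals` are made up for illustration.

```python
from itertools import accumulate
from operator import mul

# Multipliers as listed on the slide, in the order they are applied.
# (FACTORS and running_totals are illustrative names, not from the talk.)
FACTORS = [
    ("retailers", 10),
    ("stores per retailer", 750),
    ("products per store", 50),
    ("weeks per year", 52),
    ("years of history", 3),
    ("types of data", 7),
    ("regions", 4),
    ("states", 50),
    ("categories", 8),
]


def running_totals(factors):
    """Return (label, cumulative product) pairs, one per factor."""
    totals = accumulate((count for _, count in factors), mul)
    return [(label, total) for (label, _), total in zip(factors, totals)]


if __name__ == "__main__":
    for label, total in running_totals(FACTORS):
        print(f"{label:>20}: {total:,}")
```

Each printed row corresponds to one line of the slide, and the final row is the "655 Billion+" figure.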
He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation.
Thomas Jefferson to Isaac McPherson, August, 1813
Jefferson’s enlightenment dream
Public goods
Public goods: goods that no-one will supply if the government doesn’t
Public goods . . . present serious problems in human organisation. Vincent and Elinor Ostrom, 1977
The social preconditions of markets (Public Goods)
Language Adam Smith
Public Goods
Private Goods
[The public good of] Justice . . . is the main pillar that upholds the whole edifice. If it is removed, the great, the immense fabric of human society . . . must in a moment crumble into atoms. Adam Smith
The economics of abundance: a new birth of ‘free’dom
Public goods . . . present serious problems in human organisation. Vincent and Elinor Ostrom, 1977
The freedom of ideas is the liberation of our species
Public goods as a problem
Public goods as an opportunity
The ecology of data Data Schema or Context Information An example from Web 2.0 …
We use an advanced algorithm that scours data on every domestic flight for the past 10 years and matches it to real-time conditions. We help you evaluate alternative options and help connect you to the right person to make the change.
Global Competitions
[Chart: Predicting HIV viral load. Accuracy of Prediction (1 – 100%): state of the art 70%; 1½ weeks 70.8%; competition closes 77%]
Revenue or sales forecasts
Tax/social security fraud
Hospital casualty demand
and their best practices
We could not be happier with the result. The Kaggle approach has set a new benchmark in Government for the development of successful predictive models, delivered quickly and very cost effectively. In particular, the flexibility of the winning predictive model will enable its application to other major transport routes to the CBD and allow for the addition of other factors such as weather and incident. Susan Calvert Director, Strategy and Project Delivery Unit
Forecast Eurovision Voting Dr. Derek Gatherer, UK Take on the Quants 1 & 2 John Blatz Baltimore Edmund & Adrian London & USA Jason Trigg Pennsylvania Chih-Li Sung & Roy Tseng Penghu & Taipei Jure Zbontar Ljubljana Thomas Mahony Canberra Emir Delic Australia Glen Maher Canberra Predict HIV Chris Raimondi Baltimore Claudio Perlich USA Gzegorz Swiszcz Gera Edmund & Adrian London & USA Tourism Forecasting Part 1 Rajstennaj Barrabas USA Jason Trigg Pennsylvania Felipe Maia Uppsala University Lee Baker Las Cruces, New Mexico INFORMS Cole Harris Texas Nan Zhou Pittsburgh Chess Ratings Uri Blass Tel-Aviv Giuseppe Ragusa Rome Robert Warsaw Tourism Forecasting Part 2 R Package Recommendation Engine Ivan Russian Federation The top 3 competitors for: Philipp Emanuel Widmann Heidelberg, DE Dr. Christopher Hefele, New York Chris DuBois Portland Where’s Wally? Where’s Jeremy? Chris Raimondi Baltimore Felipe Maia Uppsala University Jeremy Howard