Open Data talk at the World Bank


Published in: Technology

  1. Making data science a sport. Anthony Goldbloom, Kaggle
  2. Competition Mechanics: Competitions are judged on objective criteria
  3. Kaggle’s Dark Matter Competition on the White House blog: “The world’s brightest physicists have been working for decades on solving one of the great unifying problems of our universe” “In less than a week, Martin O’Leary, a PhD student in glaciology, outperformed the state-of-the-art algorithms”
  4. User base: 60,000 data scientists
  5. Our User Base
  6. Users apply different techniques: neural networks, genetic algorithms, logistic regression, random forest, support vector machine, Monte Carlo methods, decision trees, principal component analysis, ensemble methods, Kalman filter, AdaBoost, evolutionary fuzzy modeling, Bayesian networks
  7. EXAMPLE ESSAY QUESTION: We all understand the benefits of laughter. For example, someone once said, “Laughter is the shortest distance between two people.” Many other people believe that laughter is an important part of any relationship. Tell a true story in which laughter was one element or part.
  8. “Have you ever experienced a time with your friends or family where you laughed so hard your stomach hurt, and your eyes were filled with tears? Laughing is something every person needs. A great laugh can make a persons day and put a smile on their face. If no one laughed the world would be a terribly sad place. My friends and I are always laughing, to the point where were rolling on the ground, clutching our stomachs laughing.” Automated results by the winning algorithm are as reliable as manual assessment by teachers.
  9. Diabetes & Obesity & Hypertension & High Cholesterol: Probability of going to hospital in the next six months
  10. RTA Competition: Travel Time Prediction
  11. Boehringer Ingelheim Competition: Data +1700 fields. Mutates: Molecule True, Molecule2 False, Molecule3 True, Molecule4 True, Molecule 5 … True 0
  12. Is it a lemon?
  13. What could the world’s best analysts find in your data? E-mail: a@kaggle.com, phone: +1 650 283 9781