Lessons learned from Running Hundreds of Kaggle Competitions: At Kaggle, we've run hundreds of machine learning competitions and seen over 80,000 data scientists make submissions. One thing is clear: winning competitions isn't random. We've learned that certain tools and methodologies work consistently well on different types of problems. Many participants make common mistakes (such as overfitting) that should be actively avoided. Similarly, competition hosts have their own set of pitfalls (such as data leakage).
In this talk, I'll share what goes into a winning competition toolkit along with some war stories on what to avoid. Additionally, I’ll share what we’re seeing on the collaborative side of competitions. Our community is showing an increasing amount of collaboration in developing machine learning models and analytic solutions. I'll showcase examples of this and discuss how these types of collaboration will improve how data science is learned and applied.