The Model and the Train Wreck - A Training Data How-To -- @mrogati's talk at Strata 2012

Getting training data for a recommender system is easy: if users clicked it, it’s a positive – if they didn’t, it’s a negative.

… Or is it? A model trained this way has probably just learned to reproduce your existing algorithm, now and every time you re-train. And what do you do when the data product you’re building doesn’t have any users yet? Do you really launch with random results, hand-label 50K examples, or ask a Turker to pretend they’re User #1337?
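To make the click-based labeling rule concrete, here is a minimal Python sketch. It is not taken from the talk: the `Impression` record, its field names, and both functions are hypothetical, and it assumes one ranked result list per user. `naive_labels` applies the "clicked is positive, not clicked is negative" rule as stated. `skip_above_labels` shows one common alternative, a "skip-above" style heuristic that counts a non-click as a negative only when the item was shown above something the user did click, so it was likely seen.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One item shown to one user (hypothetical record layout)."""
    user_id: str
    item_id: str
    position: int  # 1-based rank at which the existing system showed the item
    clicked: bool


def naive_labels(impressions):
    """Label every shown item: clicked -> 1, not clicked -> 0."""
    return [(imp.user_id, imp.item_id, 1 if imp.clicked else 0)
            for imp in impressions]


def skip_above_labels(impressions):
    """Keep clicks as positives; keep a non-click as a negative only if it
    was ranked above the lowest-ranked click for the same user."""
    # Deepest clicked position per user (0 means the user clicked nothing).
    lowest_click = {}
    for imp in impressions:
        if imp.clicked:
            lowest_click[imp.user_id] = max(
                lowest_click.get(imp.user_id, 0), imp.position
            )

    labels = []
    for imp in impressions:
        if imp.clicked:
            labels.append((imp.user_id, imp.item_id, 1))
        elif imp.position < lowest_click.get(imp.user_id, 0):
            # Shown above a clicked item, so probably seen and skipped.
            labels.append((imp.user_id, imp.item_id, 0))
        # Items below the last click are dropped: we cannot tell whether
        # the user ever looked at them.
    return labels
```

With three items shown to one user and a click only on the second, the naive rule marks the first and third items as negatives, while the skip-above variant keeps only the first as a negative and leaves the third unlabeled. Either way, the labels cover only what the existing ranker chose to show.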

Unlike having a better algorithm, having better training data can improve your results by orders of magnitude. Yet training data generation is often an afterthought—a footnote in a formula-filled publication.

In this talk, we use examples from production recommender systems to bring training data to the forefront: from overcoming presentation bias to the art of crowdsourcing subjective judgments to creative data exhaust exploitation and feature creation.

Presentation Transcript