
Kddcup2011


  1. The Art of Lemon’s Solution: KDD Cup 2011 Track 2. Siwei Lai, Rui Diao, Liang Xiang
  2. Outline
     • Problem Introduction
     • Data Analytics
     • Algorithms
        • Main Models: Content (11.2175%), Item CF (3.8222%), BSVD+ (3.5362%), NBSVD+ (3.8146%)
        • Model Ensemble (2.5033%)
        • Post Process (2.4808%)
     • Conclusion
     • Future Work
  3. Problem Introduction
     • Two tracks; this solution addresses Track 2
     • Classification problem
        • Positive samples: tracks the user voted higher than 80
        • Negative samples: popular tracks the user has not voted on
     • Data set
        • User voting data
        • Taxonomy data
     • Comments
        • Similar to the Top-N recommendation problem
        • Negative samples are used to prevent the Harry Potter problem (very popular items appearing to match every user)
  4. Data Analytics
     • User vote data may be ordered by time.
        • Anchoring effect
        • Users vote on artists and then vote on their tracks
     • This is the main reason we got 2nd position.
     • http://justaguyinagarage.blogspot.com/2011/06/recommendation-system-competitions.html
  5. Data Analytics: if a user has voted on an artist or album, she is likely to vote on the tracks of that artist or album.
     • Artist ⇒ Artist’s tracks: 45%, 58%
     • Album ⇒ Album’s tracks: 75%, 75%
     • Item ⇒ Items with the same Artist: 45%, 56%
     • Item ⇒ Items with the same Album: 51%, 52%
  6. Data Analytics
     • User vote data may be ordered by time.
        • Anchoring effect
        • Users vote on artists and then vote on their tracks
     • If a user has voted on an artist, she is likely to vote on the tracks of that artist.
  7. Algorithm: Main Models
     • Content-based Model
     • Item-based Collaborative Filtering Model
     • Binary Latent Factor Model
     • Neighborhood-based Binary SVD Model
  8. Content-based Model: if a user has voted on an artist or album, she is likely to vote on the tracks of that artist or album.
     • Version 1 (error rate ≈ 17%): the user will vote on a track if she has already voted on an item by the same artist. P(u, i) = 1 if user u has voted on tracks with the same artist/album as track i.
     • Version 2 (error rate ≈ 11%): use the average score for the artist/album. P(u, i) = the average score user u assigned to the artist/album of track i, or to tracks with the same artist/album.
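The two versions above can be sketched roughly as follows. This is a minimal illustration, not the team's code: the `TAXONOMY` table, the item names, and the choice to return 0 when nothing related was voted on are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical taxonomy: track -> (artist, album). Not taken from the slides.
TAXONOMY = {
    "track_a": ("artist_1", "album_1"),
    "track_b": ("artist_1", "album_2"),
    "track_c": ("artist_2", "album_3"),
}


def related_scores(user_votes, track):
    """Scores the user gave to the track's artist/album or to sibling tracks."""
    artist, album = TAXONOMY[track]
    scores = []
    for item, score in user_votes.items():
        if item == track:
            continue
        if item in (artist, album):
            # The user voted on the artist or album entity itself.
            scores.append(score)
        elif item in TAXONOMY and (
            TAXONOMY[item][0] == artist or TAXONOMY[item][1] == album
        ):
            # The user voted on another track sharing the artist or album.
            scores.append(score)
    return scores


def predict_v1(user_votes, track):
    """Version 1: 1 if the user voted on anything related to the track."""
    return 1 if related_scores(user_votes, track) else 0


def predict_v2(user_votes, track):
    """Version 2: average related score (0.0 if none, an assumption)."""
    scores = related_scores(user_votes, track)
    return mean(scores) if scores else 0.0


votes = {"artist_1": 90, "track_b": 70, "track_c": 30}
print(predict_v1(votes, "track_a"), predict_v2(votes, "track_a"))
```

Version 2 keeps the magnitude of the user's past votes, which is the only difference between the two functions.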
  9. Item-based Collaborative Filtering: Jaccard Index. Error rate ≈ 9%
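A baseline of this kind (Jaccard index plus nearest neighbours) might look like the sketch below. The slide's formula did not survive extraction, so the scoring rule here (sum of the top-k similarities between the candidate and the user's items) is an assumption, as are the names `item_users`, `knn_score`, and the toy data.

```python
def jaccard(users_i, users_j):
    """Jaccard index of two sets of users: |intersection| / |union|."""
    union = len(users_i | users_j)
    return len(users_i & users_j) / union if union else 0.0


def knn_score(user_items, candidate, item_users, k=2):
    """Score a candidate item by its k most similar items in the user's history.

    Summing the top-k similarities is an assumed scoring rule for illustration.
    """
    sims = sorted(
        (
            jaccard(item_users[candidate], item_users[j])
            for j in user_items
            if j != candidate
        ),
        reverse=True,
    )
    return sum(sims[:k])


# Hypothetical data: item -> set of users who voted on it.
item_users = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {5}}
print(jaccard(item_users["a"], item_users["b"]))
print(knn_score({"b", "c"}, "a", item_users))
```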
  10. Item-based Collaborative Filtering: Our Similarity
  11. Item-based Collaborative Filtering Model + Temporal information [Figure: per-user rating lists (headers such as 141|8573, 862|1455, 2033|5396, each followed by lines of item ID and score)]
  12. Item-based Collaborative Filtering + Vote information [Figure: per-user rating lists (headers such as 141|8573, 862|1455, 2033|5396, each followed by lines of item ID and score)]
  13. Item-based Collaborative Filtering: Prediction
  14. Item-based Collaborative Filtering + Removing popular bias
  15. Item-based Collaborative Filtering

      Factors                                 Error Rate (%)
      initial model (Jaccard Index + KNN)     8.9992
      + removing popular bias                 5.2953
      + using temporal information            3.9283
      + using vote information                3.8222
      + using taxonomy information            3.6578
  16. Binary Latent Factor Model: prediction. Error rate ≈ 6%
     • Sampling
        • Positive samples: items in the train data.
        • Negative samples: sampled in nearly the same way as the test data.
        • Each user has the same number of positive and negative samples.
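The sampling scheme above could be sketched as follows. The slides do not give the exact procedure; drawing negatives in proportion to item popularity is an assumption based on the Track 2 description of negatives as popular tracks the user has not voted on. The function name and the popularity counts are hypothetical.

```python
import random


def sample_training_pairs(user_items, item_popularity, rng):
    """Return (item, label) pairs for one user.

    Positives are the user's own items. Negatives are distinct unvoted items
    drawn with probability proportional to popularity (an assumption), with as
    many negatives as positives when enough candidates exist.
    """
    positives = sorted(user_items)
    candidates = [
        item
        for item, count in item_popularity.items()
        if item not in user_items and count > 0
    ]
    weights = [item_popularity[item] for item in candidates]
    target = min(len(positives), len(candidates))
    negatives = set()
    while len(negatives) < target:
        # Redraw until enough distinct negatives have been collected.
        negatives.add(rng.choices(candidates, weights=weights, k=1)[0])
    return [(i, 1) for i in positives] + [(i, 0) for i in sorted(negatives)]


popularity = {"a": 10, "b": 5, "c": 1, "d": 1}
pairs = sample_training_pairs({"a", "b"}, popularity, random.Random(0))
print(pairs)
```

Balancing the two classes per user keeps heavy voters from dominating the negative side of the training set.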
  17. Binary Latent Factor Model+: prediction. Error rate ≈ 3.5%
  18. Neighborhood-based Binary SVD Model: prediction
  19. Features used

      Features                  Content   Item CF   BSVD+   NBSVD+
      Collaborative filtering   ×         √         √       √
      Neighborhood info         ×         √         ×       √
      Ratings                   √         √         ○       ○
      Time ordering             ×         √         ×       ×
      Artist/album              √         ○         √       √
      Genre structure           ×         ×         ×       ×
  20. Model Ensemble
     • Linear combination of the models, with weights found by simulated annealing
     • The train set is split into a local train set and a local test set; the test set is kept separate
     • 8-fold cross validation

      Model     Error Rate (%)   Weight
      Content   11.2175          0.002
      Item CF   3.8222           0.438
      BSVD+     3.5362           0.006
      NBSVD+    3.8146           0.025
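A linear blend tuned by simulated annealing might be sketched as below. Everything specific here is an assumption for illustration: the error measure (top half of each user's candidates predicted positive), the linear cooling schedule, the step size, and the toy data. The 8-fold cross validation mentioned on the slide is not implemented.

```python
import math
import random


def blend(weights, scores):
    """Linear combination of per-model scores."""
    return sum(w * s for w, s in zip(weights, scores))


def error_rate(weights, groups):
    """Fraction of misclassified items.

    groups holds one list per user of (model_scores, label) pairs. The top
    half of each user's items by blended score is predicted positive.
    """
    wrong = total = 0
    for items in groups:
        ranked = sorted(items, key=lambda it: blend(weights, it[0]), reverse=True)
        half = len(ranked) // 2
        for position, (_, label) in enumerate(ranked):
            predicted = 1 if position < half else 0
            wrong += predicted != label
            total += 1
    return wrong / total


def anneal_weights(groups, n_models, rng, steps=2000, t0=0.1, step_size=0.1):
    """Search for blend weights by simulated annealing on the local test set."""
    weights = [1.0 / n_models] * n_models
    current = error_rate(weights, groups)
    best, best_err = list(weights), current
    for step in range(steps):
        temperature = t0 * (1 - step / steps) + 1e-9  # linear cooling
        candidate = list(weights)
        candidate[rng.randrange(n_models)] += rng.uniform(-step_size, step_size)
        err = error_rate(candidate, groups)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if err <= current or rng.random() < math.exp((current - err) / temperature):
            weights, current = candidate, err
            if current < best_err:
                best, best_err = list(weights), current
    return best, best_err


# Hypothetical local test set: model 0 is misleading, model 1 is informative.
groups = [
    [((0.9, 0.1), 0), ((0.2, 0.8), 1)],
    [((0.7, 0.3), 0), ((0.1, 0.9), 1)],
]
baseline = error_rate([0.5, 0.5], groups)
best_weights, best_error = anneal_weights(groups, 2, random.Random(0))
print(baseline, best_error)
```

Annealing is a natural fit here because the error rate is a step function of the weights, so gradient-based fitting of the blend has nothing to follow.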
  21. Post Process
     • Some special features cannot be modeled well.
     • Find special user-item pairs:
        • The most popular items.
        • The user votes high on the track’s album but low on its artist.
        • …
     • Multiply the prediction by a factor.
  22. Algorithms
     • Main models (error rate, ensemble weight): Content (11.2175%, 0.002), Item CF (3.8222%, 0.483), BSVD+ (3.5362%, 0.006), NBSVD+ (3.8146%, 0.025), …
     • Model Ensemble: 2.5033%
     • Post Process: 2.4808%
  23. Model Similarities
  24. Conclusion
     • Data analysis is very important.
        • User behavior data is ordered by time.
        • Artist/album data can improve accuracy a lot.
     • The number of team members and the number of models are very important.
     • Useful algorithms:
        • Content-based
        • Neighborhood-based
        • Matrix Factorization
  25. Future Work
     • How can temporal information be added to the Binary SVD Model?
     • Apply Binary SVD in real production:
        • How to explain recommendations
        • How to make real-time online recommendations
  26. Q&A. Thanks! xlvector@gmail.com
