2. Rating Datasets
What are ratings? Explicit user preference information
Why ratings? Recommender systems
Outline: Intro - Social Sharing - Results - Cross-Domain - Conclusion
Apr. 08, 2014 Simon Dooms - Ghent University - MSM 2014 2
4. Ratings Scarcity in Research
Ratings = private data
Public datasets to the rescue?
– MovieLens 100K (1998)
– MovieLens 1M (2000)
– MovieLens 10M (2008)
– More on recsyswiki.com
Old, Synthetic Datasets
6. Social Sharing = Ratings Goldmine
Previous research: MovieTweetings
– Movie rating dataset built from IMDb ratings shared on Twitter
– https://github.com/sidooms/MovieTweetings
What about other domains? Websites?
Well, let’s try it out!
7. Target Websites - Goodreads
Fields per tweet: Twitter user, rating, book title, book author, Goodreads URL, time
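A minimal sketch of how the fields above might be pulled out of a shared Goodreads tweet. The tweet shape ("N of 5 stars to TITLE by AUTHOR URL"), the `BookRating` type, and the `parse_goodreads_tweet` helper are assumptions made for illustration; they are not taken from the slides or from the published code.

```python
import re
from typing import NamedTuple, Optional

# Assumed (hypothetical) tweet shape:
#   "<N> of 5 stars to <title> by <author> <Goodreads URL>"
GOODREADS_PATTERN = re.compile(
    r"^(?P<rating>[1-5]) of 5 stars to (?P<title>.+) by (?P<author>.+?) "
    r"(?P<url>https?://\S*goodreads\.com/\S+)$"
)


class BookRating(NamedTuple):
    """One extracted rating, mirroring the six fields on the slide."""
    user: str
    rating: int
    title: str
    author: str
    url: str
    time: str


def parse_goodreads_tweet(user: str, text: str, time: str) -> Optional[BookRating]:
    """Return the extracted fields, or None if the tweet does not fit the assumed shape."""
    match = GOODREADS_PATTERN.match(text.strip())
    if match is None:
        return None
    return BookRating(
        user=user,
        rating=int(match["rating"]),
        title=match["title"],
        author=match["author"],
        url=match["url"],
        time=time,
    )
```

The title group is greedy, so a title that itself contains " by " is split at the last " by " before the URL. Tweets that do not fit the pattern are skipped rather than guessed at.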
8. Target Websites - Pandora
Fields per tweet: Twitter user, song, Pandora URL, time
9. Target Websites - YouTube
Fields per tweet: Twitter user, (video uploader), YouTube URL, time
10. Mining Experiment
But words are wind…
– 2-week experiment
– 4 online platforms
15. Applications
Collect ratings for recsys research / input
Cross-domain recsys research
Trend detection, analytics, ...
Applicable to all social sharing websites
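For the cross-domain use case, one starting point is to find Twitter users who shared items on more than one platform. The sketch below assumes the mined records have been reduced to `(user, domain)` pairs. The function name and the domain labels are hypothetical, not part of the published dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def cross_domain_users(records: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each user seen in two or more domains to the set of those domains.

    `records` is an iterable of (user, domain) pairs; the domain labels
    (e.g. "books", "music") are illustrative assumptions.
    """
    domains_per_user: Dict[str, Set[str]] = defaultdict(set)
    for user, domain in records:
        domains_per_user[user].add(domain)
    # Keep only the users whose activity spans more than one domain.
    return {user: ds for user, ds in domains_per_user.items() if len(ds) > 1}
```

Users returned by this function are the ones whose ratings can link preferences across domains, which is what a cross-domain recommender needs.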
16. Conclusions
Ratings scarcity in research
Public datasets are old and synthetic
Social sharing = ratings goldmine
2-week experiment, 4 major websites
Python code & datasets on GitHub
True cross-domain ratings dataset