Mining Cross-Domain Rating Datasets
from Structured Data on Twitter
@sidooms
Simon Dooms
Rating Datasets
• What are ratings? Explicit user preference information
• Why ratings? Recommender systems
Apr. 08, 2014 - Simon Dooms - Ghent University - MSM 2014
Ratings Scarcity in Research
• Ratings = private data
• Public datasets to the rescue?
– MovieLens 100K (1998)
– MovieLens 1M (2000)
– MovieLens 10M (2008)
– More on recsyswiki.com
Old, Synthetic Datasets
Social Sharing = Ratings Goldmine
• Previous research: MovieTweetings
– Movie rating dataset built from IMDb ratings shared on Twitter
– https://github.com/sidooms/MovieTweetings
• What about other domains? Websites?
Well, let's try it out!
Target Websites - Goodreads
Twitter user - Rating - Book title
Book author - Goodreads URL - Time
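The fields listed above could be pulled out of a structured tweet with a regular expression. The following is only a minimal sketch: the tweet layout ("N of 5 stars to TITLE by AUTHOR URL"), the `BookRating` type, and the `parse_goodreads_tweet` function are assumptions made for illustration and are not taken from the slides or the published code.

```python
import re
from typing import NamedTuple, Optional


class BookRating(NamedTuple):
    """One rating record with the six fields named on the slide."""
    user: str
    rating: int
    title: str
    author: str
    url: str
    time: str


# Assumed (hypothetical) tweet layout: "<n> of 5 stars to <title> by <author> <url>".
# The title is matched greedily so that a title containing " by " stays intact.
PATTERN = re.compile(
    r"^(?P<rating>[1-5]) of 5 stars to (?P<title>.+) by (?P<author>.+?) "
    r"(?P<url>https?://\S*goodreads\.com/\S+)$"
)


def parse_goodreads_tweet(user: str, text: str, time: str) -> Optional[BookRating]:
    """Return a BookRating if the tweet follows the assumed layout, else None."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    return BookRating(
        user=user,
        rating=int(match["rating"]),
        title=match["title"],
        author=match["author"],
        url=match["url"],
        time=time,
    )
```

Tweets that do not follow the assumed layout return `None`, so free-form tweets that merely mention Goodreads would be skipped rather than misparsed.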
Target Websites - Pandora
Twitter user - Song
Pandora URL - Time
Target Websites - YouTube
Twitter user - (Video uploader)
YouTube URL - Time
Mining Experiment
• But words are wind…
– 2-week experiment
– 4 online platforms
Python code + Task Scheduler = Dataset files
https://github.com/sidooms/Twitter-ratings
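A script that is run repeatedly by a scheduler has to avoid writing the same rating twice. The sketch below shows one way to do that with an append-only CSV file. The record layout `(user, item_url, rating, time)`, the CSV format, and the name `append_new_records` are hypothetical and do not describe the files in the repository above.

```python
import csv
import os
from typing import Iterable, Tuple

# Hypothetical record layout; every field is kept as a string.
Record = Tuple[str, str, str, str]  # (user, item_url, rating, time)


def append_new_records(path: str, records: Iterable[Record]) -> int:
    """Append records not yet present in the file; return how many were added."""
    seen = set()
    if os.path.exists(path):
        # Load what earlier scheduled runs already wrote.
        with open(path, newline="", encoding="utf-8") as handle:
            seen = {tuple(row) for row in csv.reader(handle)}

    added = 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for record in records:
            if record not in seen:
                writer.writerow(record)
                seen.add(record)
                added += 1
    return added
```

Each scheduled run can then pass everything it has just collected, and only unseen rows end up in the dataset file.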
The Numbers
One more thing …
Cross-Domain Rating Dataset
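One way to read "cross-domain" is that the same Twitter user appears in the datasets of more than one website. Under that assumption, the overlapping users could be found as sketched below. The `(user, domain)` pair layout and the `cross_domain_users` function are illustrative only and are not part of the slides.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def cross_domain_users(
    ratings: Iterable[Tuple[str, str]], min_domains: int = 2
) -> Dict[str, Set[str]]:
    """Map each user seen in at least `min_domains` domains to those domains.

    `ratings` is an iterable of (user, domain) pairs, e.g. ("alice", "goodreads").
    """
    domains_by_user = defaultdict(set)
    for user, domain in ratings:
        domains_by_user[user].add(domain)
    return {
        user: domains
        for user, domains in domains_by_user.items()
        if len(domains) >= min_domains
    }
```

Because all ratings are keyed on the same Twitter account, no extra identity matching between websites is needed in this sketch.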
Applications
• Collect ratings for recsys research / input
• Cross-domain recsys research
• Trend detection, analytics, ...
• Applicable to all social sharing websites
Conclusions
• Ratings scarcity in research
• Public datasets are old and synthetic
• Social sharing = ratings goldmine
• 2-week experiment, 4 major websites
• Python code & datasets on GitHub
• True cross-domain ratings dataset
