ViSTA-TV Workpackage 6: External Data Service for Metadata Enrichment & Novel TV Recommendations


ViSTA-TV project:
Video Stream Analytics for Viewers in the TV Industry

Published in: Technology, Education


  1. Video Stream Analytics for Viewers and the TV Industry. WP6: External Data Service
  2. Objectives
  3. WP6 Objectives
     • O.6.1: External data service design
        • Analysis of candidate sources
        • Analysis of data extracted
     • O.6.2: External data service employed
        • Enrich the EPG data
        • Enrich feature extraction data
        • Discover links between programs for novel recommendations
     • O.6.3: Publish data to the Linked Open Data cloud
     The external data service aims to support the recommendation process by improving the connectivity of TV programs, which does not surface in the standard EPG metadata.
  4. ViSTA-TV External Data Service (diagram labels: load, enrich, publish, load)
  5. External Data Service (diagram)
     • Source: EPG DWH fields Title, Synopsis and Credits
     • po:long_synopsis: "In this episode, Larry meets two veterans who each lost a limb in World War 2 to ask how differently we treat today's injured soldiers. Plus a look back at the iconic Green Cross Code films. With Stuart Hall and Miriam Stoppard"
     • po:credit: "Larry Lamb", "Miriam Stoppard", "Stuart Hall"
     • Processing steps: Language Detection, Concept tagging
     • Concepts tagged in the synopsis: "World War II", "Television Program", "Green Cross Code", "Tom Stoppard", "David Prowse"
     • Linking: dc:subject DBpedia:<concept>; rdfs:label LABEL for DBpedia:<LABEL>
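As a rough illustration of the linking step on slide 5, the sketch below turns concept labels into DBpedia-style resource URIs and emits one `dc:subject` statement per concept. The program URI, the `dbpedia_uri` helper and the label-to-URI rule (spaces replaced by underscores) are assumptions made for illustration; the slide describes a concept tagger, not string rewriting.

```python
from urllib.parse import quote

DBPEDIA = "http://dbpedia.org/resource/"
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"


def dbpedia_uri(label):
    """Hypothetical helper: map a concept label to a DBpedia-style resource URI."""
    return DBPEDIA + quote(label.strip().replace(" ", "_"))


def subject_triples(program_uri, concepts):
    """Return one N-Triples line per concept, linking the program to the concept."""
    return [
        f"<{program_uri}> <{DC_SUBJECT}> <{dbpedia_uri(c)}> ."
        for c in concepts
    ]


# Concept labels are taken from the slide; the program URI is made up.
concepts = ["World War II", "Green Cross Code", "Tom Stoppard", "David Prowse"]
triples = subject_triples("http://example.org/program/1", concepts)
for line in triples:
    print(line)
```

A real tagger would also need to disambiguate labels; the slide's own output tags "Tom Stoppard" although the synopsis mentions Miriam Stoppard.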
  6. Zattoo Data Service: RDF
     • rdf:type po:episode
     • po:pid "9966901"
     • dc:title "Die allerbeste Sebastian Winkler Show"
     • zattoo:episode_title "mit Motsi Mabuse, Lady Bitch Ray und Sarah Brendel"
     • po:long_synopsis "(Premiere in Einsfestival)"
     • po:masterbrand and po:category (values not shown)
     • po:credit, three entries, each with po:role "guest" and a po:alias: "Sarah Brendel", "Motsi Mabuse", "Lady Bitch Ray"
     • Namespace prefixes: po="", zattoo="", dc="", rdf="rdf syntax ns#"
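The Zattoo record on slide 6 can be written down as plain subject-predicate-object triples. The sketch below does this without an RDF library and serializes the result as Turtle. The `po`, `dc` and `rdf` prefixes use their public namespace URIs; the `zattoo` and `ex` namespaces and the episode and credit identifiers are placeholders, since the prefix declarations on the slide are empty.

```python
# po, dc and rdf use their public namespace URIs;
# the zattoo and ex namespaces are placeholders (assumptions).
PREFIXES = {
    "po": "http://purl.org/ontology/po/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "zattoo": "http://example.org/zattoo/",
    "ex": "http://example.org/id/",
}


def lit(text):
    """Format a plain string literal for Turtle, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def episode_triples():
    """Build (subject, predicate, object) triples for the episode on the slide."""
    ep = "ex:episode-9966901"  # hypothetical identifier derived from the pid
    triples = [
        (ep, "rdf:type", "po:episode"),
        (ep, "po:pid", lit("9966901")),
        (ep, "dc:title", lit("Die allerbeste Sebastian Winkler Show")),
        (ep, "zattoo:episode_title",
         lit("mit Motsi Mabuse, Lady Bitch Ray und Sarah Brendel")),
    ]
    guests = ["Sarah Brendel", "Motsi Mabuse", "Lady Bitch Ray"]
    for i, name in enumerate(guests, start=1):
        credit = f"ex:credit-{i}"  # hypothetical node for one credit entry
        triples += [
            (ep, "po:credit", credit),
            (credit, "po:role", lit("guest")),
            (credit, "po:alias", lit(name)),
        ]
    return triples


def to_turtle(triples):
    """Serialize triples as Turtle: prefix declarations, then one statement per line."""
    header = [f"@prefix {p}: <{uri}> ." for p, uri in PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(header + [""] + body)


turtle = to_turtle(episode_triples())
print(turtle)
```

Modelling each credit as its own node keeps the role and the alias of one guest together, which matches the repeated po:credit, po:role, po:alias groups on the slide.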
  8. Enrichments Service
  15. LOD Linking Service (diagram label: WP5)
  16. Recommendations
  17. LOD for recommendations
      • LOD datasets provide additional information that can be used to produce novel TV recommendations
      • The challenge is to identify the links that are most useful for the recommendation process
      • We started analyzing the datasets to identify features that help select the right links
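One simple way to surface candidate links between programs, in the spirit of slide 17, is to compare the sets of DBpedia concepts attached to each program. The sketch below scores program pairs by the Jaccard overlap of their concept sets. The program names and concept sets are invented, and Jaccard overlap is a stand-in chosen for illustration, not the feature analysis the slide refers to.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def linked_pairs(programs, threshold=0.2):
    """Return (score, prog_a, prog_b) for pairs sharing enough concepts, best first."""
    scored = [
        (jaccard(programs[p], programs[q]), p, q)
        for p, q in combinations(sorted(programs), 2)
    ]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Invented toy data: program -> DBpedia concepts found in its synopsis.
programs = {
    "war_documentary": {"dbpedia:World_War_II", "dbpedia:Veteran"},
    "road_safety_history": {"dbpedia:Green_Cross_Code", "dbpedia:David_Prowse"},
    "veterans_today": {"dbpedia:Veteran", "dbpedia:World_War_II", "dbpedia:Prosthesis"},
}

for score, a, b in linked_pairs(programs):
    print(f"{a} <-> {b}: {score:.2f}")
```

Direct concept overlap only finds obvious links; following further LOD properties from each concept would be the natural next step, and which of those paths are useful is exactly the open question the slide raises.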
  18. Current & Future Work
  19. Current & Future Work
      1. Continuously adding new sources
      2. Continuous improvement of EPG enrichment quality
         • complementary services
         • crowdsourcing
      3. Defining an LOD-based notion of serendipity
      4. Further studies on the LOD patterns and their suitability for recommendations
      5. Applying the approach in other domains, e.g. books
  20. 1. Adding new sources

      Dataset         Objects     Triples     Links to ...
      DBpedia         3.77 mil    400 mil     27.2 mil
      Freebase        23 mil      337 mil     3.9 mil
      BBC             -           60 mil      43,237
      BBC music       -           20 mil      23,000
      NYT             10,467      345,889     23,400
      MusicBrainz     -           178 mil     855,754
      Flickr          1.95 mil    5.61 mil    3,400,000
      LinkedMDB       503,242     6 mil       162,756
      GeoNames        8 mil       94 mil      0
      LinkedGeoData   1 bil       20 bil      53,204
  21. 2. Data cleaning
      Example synopsis: "Following the grandeur of Baroque, Rococo art is often dismissed as frivolous and unserious, but Waldemar Januszczak disagrees. […] The first episode is about travel in the 18th century and how it impacted greatly on some of the finest art ever made. The world was getting smaller and took on new influences shown in the glorious Bavarian pilgrimage architecture, Canaletto's romantic Venice and the blossoming of exotic designs and tastes all over Europe. The Rococo was art expressing itself in new, exciting ways."
      Errors in the enrichment:
      • Type mis-classification: “Canaletto” typed as ontology:Location
      • URI mis-annotation: “Rococo” linked to dbpedia:Rococo_(band)
      Approach:
      • Integration of different text annotators' results
      • Validation through crowdsourcing tasks
      Collaboration with: Silvia Giannini
  22. 2. Data cleaning
      Output of each text annotator:
      • Label, NERD ontology class, sameAs link
      • Label, DBpedia ontology class, DBpedia URI
      • Label, DBpedia category, Wikipedia page
      • Label, DBpedia ontology class, DBpedia URI
      Type & URI alignment:

      extractor   label       DBpedia ontology class                 DBpedia URI
      -           Canaletto   ontology:Location                      dbpedia:Canaletto
      TextRazor   Canaletto   dbpedia-owl:[Artist, Agent, Person]    dbpedia:Canaletto
      -           Canaletto   dbpedia-owl:[Artist, Agent, Person]    dbpedia:Canaletto
      -           Canaletto   dbpedia-owl:[Artist, Agent, Person]    dbpedia:Canaletto

      Voting system: <Canaletto, dbpedia-owl:[Artist, Agent, Person], dbpedia:Canaletto> 3/4
  23. Automatic integration of text annotators for enrichment
      • Flow: program synopsis, results integration, aggregated enrichment (based on majority vote)
      • Validate: label relevance; types of the relevant labels
      • Analysis of collected data for: voting system validation (also URIs); parameter tuning (e.g., complementarity handling)
      • What if:
         • there is a tie?
         • the majority of annotators are wrong?
         • more granular alignment ontologies are adopted to avoid a missing type (or the type owl:Thing)?
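The voting system from the data cleaning slides can be sketched as a count over aligned annotations. The version below picks the most frequent (label, types, URI) tuple and returns no winner when first place is tied, which is one possible answer to the "what if there is a tie" question. The function name, the tuple layout and the tie policy are assumptions; the annotations reproduce the Canaletto example, with type names shortened.

```python
from collections import Counter


def majority_vote(annotations):
    """Return (winner, votes, total), with winner None on a tie for first place.

    Each annotation is a hashable (label, types, uri) tuple, assumed to be
    already aligned to the same type and URI vocabulary.
    """
    counts = Counter(annotations).most_common()
    total = len(annotations)
    if not counts:
        return None, 0, 0
    best, votes = counts[0]
    tied = len(counts) > 1 and counts[1][1] == votes
    return (None if tied else best), votes, total


ARTIST = ("Artist", "Agent", "Person")
annotations = [
    ("Canaletto", ("Location",), "dbpedia:Canaletto"),
    ("Canaletto", ARTIST, "dbpedia:Canaletto"),
    ("Canaletto", ARTIST, "dbpedia:Canaletto"),
    ("Canaletto", ARTIST, "dbpedia:Canaletto"),
]

winner, votes, total = majority_vote(annotations)
print(winner, f"{votes}/{total}")
```

Strictly speaking this is a plurality vote: the top tuple wins even below 50% of the votes as long as it is not tied. Requiring `votes * 2 > total` would enforce a true majority, and neither variant helps when most annotators agree on a wrong answer, which is where the crowdsourced validation comes in.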
  24. LOD & Serendipity
  25. 3. LOD-based Serendipity
  26. LOD-based Serendipity
  27. Diversity
  28. 4. LOD-based Patterns for Diversity
      LOD-based method for increasing diversity in recommendations:
      • extracts all the patterns from an RDF dataset → clusters generated & measured for diversity
      • fed into two statistical models to determine which semantic patterns can extract subsets of Linked Data that improve diversity in recommendations
      • data characterization step to choose the model
      • diversity measures, e.g. entropy & semantic similarity
      • IMDB & DBpedia: noisiness, size & sparsity of LOD
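Entropy, one of the diversity measures named on slide 28, can be computed over the categories of a recommendation list. The sketch below uses Shannon entropy in bits; the genre labels are invented, and treating genre as the category is an assumption, since the slide does not say which attribute the measure is applied to.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the category distribution in a list."""
    total = len(items)
    if total == 0:
        return 0.0
    counts = Counter(items)
    # sum of p * log2(1/p), written with total/n so the result is never -0.0
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Invented genre labels for two recommendation lists of equal length.
narrow = ["drama", "drama", "drama", "drama"]
mixed = ["drama", "documentary", "comedy", "news"]

print(shannon_entropy(narrow))
print(shannon_entropy(mixed))
```

A list drawn from a single category scores zero, and the score grows as recommendations spread evenly over more categories. Entropy ignores how related the categories are, which is presumably why the slide pairs it with semantic similarity.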
  29. Applied to ‘Books’ Domain
  30. References
      • Valentina Maccatrozzo, Lora Aroyo and Willem Robert van Hage, Crowdsourced Evaluation of Semantic Patterns for Recommendations, User Modeling, Adaptation, and Personalization, Rome, Italy, July 10-14, 2013.
      • Valentina Maccatrozzo, Davide Ceolin and Lora Aroyo, LOD Enrichment of TV Programs, in W3C Italy Event: Linked Open Data: where are we?, Rome, Italy, February 20-21, 2014.
      • Valentina Maccatrozzo, Davide Ceolin, Lora Aroyo and Paul Groth, Semantic Pattern-based Recommender, Extended Semantic Web Conference (ESWC2014), Heraclion, Greece, May 25-29, 2014.
      • Davide Ceolin, Luc Moreau, Kieron O'Hara, Wan Fokkink, Willem Robert van Hage, Valentina Maccatrozzo, Alistair Sackley, Guus Schreiber and Nigel Shadbolt, Two procedures for analyzing the reliability of open government data, Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU'2014), Montpellier, France, July 15, 2014.