Semantics at the multimedia fragment level or how enabling the remixing of online media

Presentation given at the 6th tele-TASK symposium at HPI, October 2012, Potsdam, Germany

Transcript

  • 1. Semantics at the multimedia fragment level or how enabling the remixing of online media. Raphaël Troncy <raphael.troncy@eurecom.fr>
  • 2. Once upon a time … 09/10/2012 - Semantics at the multimedia fragment level - tele-Task Symposium, Potsdam, Octobre 2012 -2
  • 3. … leading to sharing Media Fragments. Publishing a status message containing a Media Fragment URI: use a ‘#’! Highlight a video sequence; highlight a region to pay attention to.
  • 4. What are Media Fragments? [Figure: a timeline t starting at 0 with a temporal media fragment from 20 to 35, a spatial media fragment, and a track media fragment]
  • 5. Media Fragments (temporal). [Figure labels: original resource length, fragment beginning, playback progress, fragment end]
  • 6. Media Fragments (spatial) + Demo. [Figure labels: highlighted fragment, semi-opaque overlay]
  • 7. Media Fragments URIs: bookmark / share parts (fragments) of audio/video content; annotate media fragments; search for media fragments; mash-ups; conserve bandwidth. http://www.w3.org/TR/media-frags-reqs/ http://www.w3.org/TR/media-frags/
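The temporal, spatial and track fragments described on slides 3 to 7 are addressed through the part of the URI after the ‘#’. The following is a minimal sketch, not a complete implementation of the W3C specification: it assumes plain-second NPT times only (no clock or SMPTE formats), integer spatial values, and a hypothetical `example.com` video URL. The function name `parse_media_fragment` is an illustration and does not come from the slides.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_media_fragment(uri):
    """Return the temporal, spatial and track dimensions found after '#'."""
    fragment = urlsplit(uri).fragment
    dims = {}
    for name, value in parse_qsl(fragment, keep_blank_values=True):
        if name == "t":
            # Temporal dimension: "start,end", optionally prefixed by "npt:".
            if value.startswith("npt:"):
                value = value[len("npt:"):]
            start, _, end = value.partition(",")
            dims["t"] = (float(start) if start else 0.0,
                         float(end) if end else None)
        elif name == "xywh":
            # Spatial dimension: "x,y,w,h", optionally prefixed by a unit.
            unit = "pixel"
            if ":" in value:
                unit, value = value.split(":", 1)
            x, y, w, h = (int(v) for v in value.split(","))
            dims["xywh"] = (unit, x, y, w, h)
        elif name == "track":
            # Track dimension: may be repeated to select several tracks.
            dims.setdefault("track", []).append(value)
    return dims


if __name__ == "__main__":
    # Hypothetical URI: seconds 20 to 35 and a 320x240 region at (160, 120).
    print(parse_media_fragment(
        "http://example.com/video.mp4#t=20,35&xywh=160,120,320,240"))
```

A player that understands these dimensions can seek to the start time and draw an overlay around the region, which is what the highlighted demos on the slides illustrate.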
  • 8. Video annotation
  • 9. Video interactivity. [Figure labels: concept in player (Cubism, Fauvism, Expressionism); facets / properties of concept; content enrichment]
  • 10. LinkedTV EU Project. Vision: ubiquitously online cloud of Networked Audio-Visual Content, decoupled from place, device or source. Aim: provide interactive multimedia service for non-professional end-users; focus on television broadcast content as seed videos. 12 excellent partners, including Fraunhofer, Eurecom, STI GMBH, Condat, CERTH, BEELD EN GELUID, UEP, Noterik, UMONS, U. ST GALLEN, CWI, RBB. Web: http://www.linkedtv.eu
  • 11. Video Accessibility. What is required to make video accessible on the Web? Technologies: annotating: automatic (speech transcription) and manual (social collaborative annotation tool); addressing: pointing to, retrieving, transmitting only parts of media; rendering: video visualization for the impaired, Braille output. Benchmarking: Sphinx, HTK, Julius
  • 12. Speech Processing
  • 13. Demo: http://semantics.eurecom.fr/acav/
  • 15. Semantic indexing at the fragment level. Benchmarking: Sphinx, HTK, Julius; NER + full text index with the transcription; interlinking with the Linked Data Cloud to enable semantic search
  • 16. NERD: Named Entity Recognition and Disambiguation. Compare performances of Named Entity Recognition tools; understand strengths and weaknesses of different Web APIs; adapt NER processing to different contexts; (learn how to) combine NER tools; participate in the ANR ETAPE benchmark
  • 17. What is a Named Entity recognition task? A task that aims to locate and classify the name of a person or an organization, a location, a brand, a product, a numeric expression including time, date, money and percent in a textual document
  • 18. NER Tools and Web APIs. Standalone software: GATE, Stanford CoreNLP, Temis. Web APIs. http://nerd.eurecom.fr/
  • 19. What is NERD? Ontology [1], REST API [2], UI [3]. The NERD ontology has been integrated in the NIF project, in the context of the EU FP7 project LOD2: Creating Knowledge out of Interlinked Data. [1] http://nerd.eurecom.fr/ontology [2] http://nerd.eurecom.fr/api/application.wadl [3] http://nerd.eurecom.fr
  • 20. Factual comparison of 10 Web NER tools, in column order: AlchemyAPI, DBpedia Spotlight, Evri, Extractiv, Lupedia, OpenCalais, Saplo, Wikimeta, Yahoo!, Zemanta. Granularity: OEN, OEN, OED, OEN, OEN, OEN, OED, OEN, OEN, OED. Entity position: N/A, char offset, N/A, word offset, range of chars, char offset, N/A, POS offset, range of chars, N/A. Classification schema (first listed per tool): Alchemy, DBpedia, Evri, DBpedia, DBpedia, OpenCalais, N/A, ESTER, Yahoo, FreeBase; FreeBase, LinkedMDB and Schema.org also appear as additional schemas. Number of classes: 324, 320, 5, 34, 319, 95, 5, 7, 13, 81. Quota (calls/day): 30000, unl, 300, 3000, unl, 50000, 1333, unl, 5000, 10000. [Language and response format rows are not recoverable per tool; languages listed include EN, FR, GR, IT, PT, RU, SP, SW, and formats listed include JSON, HTML, XML, RDF, RDFa, MicroFormat]
  • 21. NERD Ontology: aligned the taxonomies used by the extractors
  • 22. Building the NERD Ontology. NERD type (occurrence): Person (10), Organization (10), Country (6), Company (6), Location (6), Continent (5), City (5), RadioStation (5), Album (5), Product (5), ...
  • 23. NERD REST API. Resources: /document, /user, /annotation/{extractor}, /extraction, /evaluation. Methods: GET, POST, PUT, DELETE. Formats: RDF, JSON. ... "entities" : [{ "entity": "Tim Berners-Lee", "type": "Person", "uri": "http://dbpedia.org/resource/Tim_berners_lee", "nerdType": "http://nerd.eurecom.fr/ontology#Person", "startChar": 30, "endChar": 45, "confidence": 1, "relevance": 0.5 }] Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
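To make the JSON shape on slide 23 concrete, here is a small sketch that reads such a response into typed records. It only uses the field names shown on the slide; the enclosing top-level object, the `Entity` class and the `load_entities` helper are assumptions introduced for illustration, and no call to the actual NERD service is made.

```python
import json
from dataclasses import dataclass

# Sample payload copied from the slide; wrapping it in a top-level object
# is an assumption, since the slide only shows the "entities" member.
SAMPLE = """
{"entities": [{
    "entity": "Tim Berners-Lee",
    "type": "Person",
    "uri": "http://dbpedia.org/resource/Tim_berners_lee",
    "nerdType": "http://nerd.eurecom.fr/ontology#Person",
    "startChar": 30,
    "endChar": 45,
    "confidence": 1,
    "relevance": 0.5
}]}
"""


@dataclass
class Entity:
    label: str        # surface form found in the text
    nerd_type: str    # class URI in the NERD ontology
    uri: str          # disambiguation URI (here a DBpedia resource)
    start: int        # character offset where the mention starts
    end: int          # character offset where the mention ends
    confidence: float


def load_entities(payload):
    """Turn a JSON string shaped like the slide example into Entity records."""
    data = json.loads(payload)
    return [
        Entity(
            label=item["entity"],
            nerd_type=item["nerdType"],
            uri=item["uri"],
            start=item["startChar"],
            end=item["endChar"],
            confidence=float(item["confidence"]),
        )
        for item in data.get("entities", [])
    ]


if __name__ == "__main__":
    for entity in load_entities(SAMPLE):
        print(entity.label, entity.nerd_type, entity.start, entity.end)
```

The `startChar` and `endChar` offsets are what allow an entity to be tied back to a position in a transcript, and from there to a temporal media fragment.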
  • 24. NERD meets NIF. Model documents through a set of strings dereferenceable on the Web: :offset_23107_23110 a str:String ; str:referenceContext :offset_0_26546 . Map string to entity: :offset_23107_23110 sso:oen dbpedia:W3C . Classification: dbpedia:W3C rdf:type nerd:Organization . Rizzo G., Troncy R., Hellmann S. and Bruemmer M. (2012), NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud. In: (LDOW'12) Linked Data on the Web (WWW'12), Lyon, France.
  • 25. NERD User Interface
  • 26. NERD Dashboard
  • 27. History of NER benchmarks. CoNLL 2003 and CoNLL 2005: schema (4 types): person, organization, location and miscellaneous; language independent task. ACE 2004, ACE 2005 and ACE 2007: schema (7 types): person, organization, location, facility, weapon, vehicle and geo-political entity; entity recognition, not just name (e.g. description, pronoun); find relationships among entities extracted. TAC 2009 (Knowledge Base Track): schema (3 types): person, organization and location; create a knowledge base from the named entities extracted. ETAPE 2012 (Named Entity Task): schema: Quaero (7 main types, 32 sub-types)
  • 28. ETAPE 2012 challenge. By genre (train / dev / test, sources): TV news: 7h 40m / 1h 40m / 1h 40m, BFM Story, Top Questions (LCP); TV debates: 10h 30m / 5h 10m / 5h 10m, Pile et Face, Ca vous regarde, Entre les lignes (LCP); TV amusements: - / 1h 05m / 1h 05m, La place du village (TV8). By item (Train / Dev / Eval): length: 26h / 10h 55m / 10h 55m; Nb files: 44 / 15 / 15; Nb words: 290517 / 91656 / 115511; Nb Named Entities: 46763 / 14398 / 13055; Nb unique categories: 33 / 33 / 33
  • 29. Participation at ETAPE (combined strategy). [Diagram labels: extraction, cleaning, fusion; each extractor A ... N outputs tuples (e, t, URI, si, ei), e.g. (eA1, tA1, URIA1, siA1, eiA1), (eA2, tA2, URIA2, siA2, eiA2), (eA3, tA3, URIA3, siA3, eiA3), ..., (eN1, tN1, URIN1, siN1, eiN1), (eN2, tN2, URIN2, siN2, eiN2)] When at least 2 extractors classify the same entity with a different type, we apply a preferred selection order (empirically defined): Wikimeta, AlchemyAPI, OpenCalais, Lupedia
  • 30. Participation at ETAPE (combined+ strategy). [Diagram labels: ETAPE Train & Dev, learned model, POS tagger, created static rules, apply rules, fusion; extractor outputs (eA1, tA1, URIA1, siA1, eiA1), (eA2, tA2, URIA2, siA2, eiA2), ..., (eN1, tN1, URIN1, siN1, eiN1), (eN2, tN2, URIN2, siN2, eiN2); fused output (e1, t1, URI1, si1, ei1)] Conflicts handled by priority selection: own, Wikimeta, AlchemyAPI, OpenCalais, Lupedia
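The conflict-resolution step of the combined and combined+ strategies on slides 29 and 30 can be sketched as a priority pick over extractor outputs. This is an illustration under stated assumptions rather than the NERD implementation: two annotations are treated as "the same entity" when their start and end offsets match, extractor names are lowercase labels chosen here, and `Annotation` and `fuse` are hypothetical names.

```python
from typing import NamedTuple

# Preferred selection order taken from slide 29 (combined strategy).
PRIORITY = ("wikimeta", "alchemyapi", "opencalais", "lupedia")


class Annotation(NamedTuple):
    extractor: str  # which tool produced the annotation
    entity: str     # e:  surface form
    type: str       # t:  assigned type
    uri: str        # URI: disambiguation link
    start: int      # si: start index
    end: int        # ei: end index


def fuse(annotations, priority=PRIORITY):
    """Keep one annotation per span, preferring higher-priority extractors."""
    rank = {name: position for position, name in enumerate(priority)}
    lowest = len(priority)  # extractors not in the list rank last
    best = {}
    for ann in annotations:
        key = (ann.start, ann.end)  # assumption: same span means same entity
        current = best.get(key)
        if current is None or (rank.get(ann.extractor, lowest)
                               < rank.get(current.extractor, lowest)):
            best[key] = ann
    return sorted(best.values(), key=lambda a: (a.start, a.end))


if __name__ == "__main__":
    sample = [
        Annotation("lupedia", "Paris", "City", "http://example.org/a", 10, 15),
        Annotation("wikimeta", "Paris", "Location", "http://example.org/b", 10, 15),
    ]
    print(fuse(sample))
```

For the combined+ variant, the same function can be called with `priority=("own",) + PRIORITY`, so that annotations produced by the in-house rules win over every Web API.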
  • 31. NERD Global results (SLR / Precision / Recall / F-measure / %correct): combined: 86.85% / 35.31% / 17.69% / 23.44% / 17.69%; combined+: 188.81% / 15.13% / 28.40% / 19.45% / 28.40%. Combined+: the Eval corpus differs substantially from the Train & Dev corpora. The static rules do not fit the Eval corpus well and they introduce classification noise.
  • 32. Per-extractor results (SLR / Precision / Recall / F-measure / %correct): alchemyapi: 37.71% / 47.95% / 5.45% / 9.68% / 5.45%; lupedia: 39.49% / 22.87% / 1.56% / 2.91% / 1.56%; opencalais: 37.47% / 41.69% / 3.53% / 6.49% / 3.53%; wikimeta: 36.67% / 19.40% / 4.25% / 6.95% / 4.25%; combined (nerd): 86.85% / 35.31% / 17.69% / 23.44% / 17.69%; combined+ (nerd+): 188.81% / 15.13% / 28.40% / 19.45% / 28.40%
  • 33. NERD + Synote: http://linkeddata.synote.org
  • 34. WoLE Workshop: WoLE2012 Workshop in conjunction with the ISWC2012 conference. http://wole2012.eurecom.fr
  • 37. Building the data.eurecom.fr
  • 38. Zenaminer. Publish SCORM content in the Web of Data: separating the content from the layout. Introduce the use of media element / fragments. Automatic annotation of user comments using NER tools: hypertext link navigation to key terms and entities; satisfy better the information needs of the learner. See also: http://zenaminer.sourceforge.net/
  • 39. Example application: Link OpenLearn to relevant course/podcasts. Credit: Mathieu D’Aquin. See also: Zablith et al., LinkedLearning 2011
  • 40. Integrating Open Educational Material in course descriptions. Credit: Mathieu D’Aquin. See also: Zablith et al., COLD 2011
  • 41. Take Home Message. Video is a first class citizen on the Web: annotations: Ontology and API for Media Resources; access: Media Fragments URI. NERD platform for extracting key information from learning resources including videos. Linked Universities movement for federating initiatives in exposing educational data as linked data
  • 42. Media Mixer. Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling. EU FP7 CSA: November 2012 - November 2014
  • 43. Credits: Giuseppe Rizzo (Zenaminer, NERD); Anne Elisabeth Gazet (data.eurecom.fr); Mathieu D’Aquin (LinkedUniversities, Lucero)
  • 44. http://www.slideshare.net/troncy