Raphaël Troncy

Deep-linking into Media Assets at
the Fragment Level: Specification,
Model and Applications
Raphaël Troncy <raphael.troncy@eurecom.fr>

TimBL Vision back in 1994

17/12/2013 - 7ème Entretiens du Nouveau Monde Industriel (ENMI 2013)

A typical HTML web page


What it looks like to a machine


Okay, so HTML is not helpful
Maybe we can tell the machine what the different parts of the text represent?

title, speaker, time, location, abstract, biosketch, host

XML to the rescue?

XML fans propose creating an XML tag set for each application. For talks, we can choose <title>, <speaker>, <time>, <location>, <abstract>, <biosketch>, <host>, etc.

XML ≠ machine-accessible meaning

But to your machine, the tags still look like this: <title>, <speaker>, <time>, <location>, <abstract>, <biosketch>, <host>.
The tag names carry no meaning.
XML DTDs and Schemas have little or no semantics.
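The point that tag names carry no meaning for a program can be illustrated with a minimal Python sketch using the standard-library `xml.etree.ElementTree` parser. The talk content and the opaque tag names (`x0`, `x1`, `x2`) are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical talk announcement marked up with human-readable tag names.
readable = "<talk><title>Linked Media</title><speaker>R. Troncy</speaker></talk>"
# The same structure with opaque tag names: roughly how the first document
# "looks" to a program that has no definition of what the tags mean.
opaque = "<x0><x1>Linked Media</x1><x2>R. Troncy</x2></x0>"


def shape(xml_text):
    """Return the text of each child element in order, ignoring tag names."""
    root = ET.fromstring(xml_text)
    return [child.text for child in root]


# A generic parser recovers the same tree from both documents; nothing in the
# markup itself says that one element denotes a person and the other a title.
print(shape(readable) == shape(opaque))
```

Both documents are equally well-formed and equally opaque to the parser; any meaning has to come from an agreement outside the markup.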

Do not read the following sign

Why is it so difficult to find
appropriate multimedia content, to
reuse and repurpose content
previously published and to present
this content in interfaces that vary
with user needs?

Image/Video indexing
• Techniques used by mainstream search engines
  - the search term occurs in the filename, in the caption or in user tags
  - no semantics
• Image indexing: main problem
  - an image is not alphabetic: there are no countable discrete units that, in combination, provide the meaning of the image
  - image descriptors are not given with the image: one needs to extract or interpret them
• Video indexing: additional problem
  - a video additionally has a temporal dimension to take into account
  - a video has a priori no discrete units either (i.e. frames, shots and sequences cannot be absolutely defined)
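A minimal sketch of the keyword-matching technique described for mainstream search engines, assuming a tiny made-up catalogue (the filenames, captions and tags are hypothetical). It only checks whether the search term occurs in the text attached to an image, which is exactly why it has no semantics.

```python
# Hypothetical catalogue: each item carries only the text a search engine can see.
CATALOGUE = [
    {"filename": "jaguar_01.jpg", "caption": "Jaguar resting on a branch", "tags": ["animal", "zoo"]},
    {"filename": "IMG_2041.jpg", "caption": "My new car", "tags": ["jaguar", "e-type"]},
    {"filename": "IMG_2042.jpg", "caption": "Big cat in the rainforest", "tags": ["wildlife"]},
]


def keyword_search(term, catalogue):
    """Return filenames whose filename, caption or tags contain the term."""
    term = term.lower()
    hits = []
    for item in catalogue:
        text = " ".join([item["filename"], item["caption"], *item["tags"]]).lower()
        if term in text:
            hits.append(item["filename"])
    return hits


print(keyword_search("jaguar", CATALOGUE))
```

The query mixes the animal and the car, and misses the third picture because the word never appears in its text, whatever the image actually shows.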


Sounds Familiar?
• [Arnold Smeulders, PAMI, 2000]
The semantic gap is the lack of coincidence between the information that one can extract from the sensory data and the interpretation that the same data has for a user in a given situation


The science of labeling
• Automatically detecting the presence of a concept in a video stream

airplane
• Naming visual information


A Simple Concept Detector

[Cees Snoek and Marcel Worring, SSMS, 2007]

Support Vector Machine

[Cees Snoek and Marcel Worring, SSMS, 2007]

The Computer Vision Approach
• Building detectors one at a time
a face detector for frontal faces

3 years later
a face detector for non-frontal faces
One (or more) PhD for every new concept


A little drop of semantics goes a long way

Jim Hendler [1997]


Once upon a time …


… leading to sharing Media Fragments
• Publishing a status message containing a Media Fragment URI
  - Use a '#'!
  - Highlight a video sequence
  - Highlight a region to pay attention to
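A minimal sketch of how such a shareable link could be built, following the `#t=start,end` and `#xywh=x,y,w,h` syntax of the Media Fragments URI 1.0 specification. The helper names and the `example.org` URL are hypothetical.

```python
def temporal_fragment(uri, start, end):
    """Append a temporal fragment (in seconds) in the form #t=start,end."""
    return f"{uri}#t={start},{end}"


def spatial_fragment(uri, x, y, w, h):
    """Append a rectangular region (in pixels) in the form #xywh=x,y,w,h."""
    return f"{uri}#xywh={x},{y},{w},{h}"


video = "http://example.org/video.mp4"  # hypothetical media URL

# Highlight a video sequence from second 20 to second 35.
print(temporal_fragment(video, 20, 35))
# Highlight a 320x240 region whose top-left corner is at (160, 120).
print(spatial_fragment(video, 160, 120, 320, 240))
```

Because the fragment lives after the `#`, the resulting string is still an ordinary URI that fits in a status message or a bookmark.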


W3C Video on the Web Workshop - 2007


Key topics
• Addressing: having global identifiers for identifying spatial and temporal clips (for deep linking, bookmarking, caching and indexing)
• Metadata: searching and discovering video is difficult with the volume of online video
• Video codec: recommending a baseline (open) video codec for the World Wide Web
• Content protection: managing digital rights associated with the media is key: W3C should look into metadata for digital rights

Making video a "first class citizen"


Flickr Notes

http://www.flickr.com/photos/mhausenblas/2883727293/

YouTube Temporal Addressing (Sept 2008)


Media Fragments Use Cases
• Bookmark / Share parts (fragments) of audio/video content
• Annotate media fragments
• Search for media fragments
• Develop Mash-ups/Collage
• Conserve bandwidth

http://www.w3.org/TR/media-frags-reqs/


What are Media Fragments?

[Figure: timeline t with marks at 0, 20 and 35 and a fragment named “Scared Scene”; labels: temporal media fragment, named media fragment, spatial media fragment, track media fragment]


Media Fragments Dimensions
• r01: Temporal fragments:
  - a clipping along the time dimension from a start to an end time that are within the duration of the media resource
• r02: Spatial fragments:
  - a clipping of an image region; only rectangular regions are considered
• r03: Track fragments:
  - a track as exposed by a container format of the media resource
• r04: Named fragments:
  - a temporal media fragment that has been given a name through some sort of annotation mechanism
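The four dimensions map onto the `t`, `xywh`, `track` and `id` keys of the Media Fragments URI 1.0 syntax. The following is a simplified sketch of a parser, not a complete implementation of the specification: the function name is hypothetical, and only plain seconds are handled for the time dimension (no clock-time or SMPTE formats).

```python
from urllib.parse import parse_qsl, urlsplit


def parse_media_fragment(uri):
    """Split a Media Fragment URI into its dimensions (t, xywh, track, id)."""
    fragment = urlsplit(uri).fragment
    dims = {}
    for name, value in parse_qsl(fragment, keep_blank_values=True):
        if name == "t":
            # Accept "t=10,20", "t=10" and "t=,20", with an optional "npt:" prefix.
            value = value[4:] if value.startswith("npt:") else value
            start, _, end = value.partition(",")
            dims["t"] = (float(start) if start else 0.0, float(end) if end else None)
        elif name == "xywh":
            # An optional "pixel:" or "percent:" prefix gives the unit.
            unit, _, numbers = value.rpartition(":")
            x, y, w, h = (int(n) for n in numbers.split(","))
            dims["xywh"] = (unit or "pixel", x, y, w, h)
        elif name in ("track", "id"):
            dims[name] = value
    return dims


print(parse_media_fragment("http://example.org/video.mp4#t=20,35"))
print(parse_media_fragment("http://example.org/video.mp4#xywh=percent:25,25,50,50&track=audio"))
```

An end time of `None` stands for "until the end of the resource"; checking that the interval lies within the actual duration is left to the caller, who knows the media length.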


Media Fragments (temporal)

[Figure: player timeline showing the original resource length, the fragment beginning, the playback progress and the fragment end]

Media Fragments (spatial)

[Figure: video frame showing the highlighted fragment and a semi-opaque overlay]

http://ninsuna.elis.ugent.be/MFPlayer/html5
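To draw such a highlight, a player has to turn the spatial fragment into pixel coordinates of the displayed frame. The sketch below shows one way this could be done, assuming a region tuple of the form `(unit, x, y, w, h)`; the function name and the tuple layout are assumptions made for this example and do not describe how the player at the URL above is implemented.

```python
def region_in_pixels(xywh, frame_width, frame_height):
    """Convert a ("pixel" | "percent", x, y, w, h) region to clipped pixel coordinates."""
    unit, x, y, w, h = xywh
    if unit == "percent":
        # Percentages are relative to the frame width (x, w) and height (y, h).
        x, w = x * frame_width // 100, w * frame_width // 100
        y, h = y * frame_height // 100, h * frame_height // 100
    # Clip the rectangle so that it stays inside the frame.
    x, y = min(x, frame_width), min(y, frame_height)
    w, h = min(w, frame_width - x), min(h, frame_height - y)
    return x, y, w, h


print(region_in_pixels(("percent", 25, 25, 50, 50), 640, 360))
print(region_in_pixels(("pixel", 600, 300, 100, 100), 640, 360))
```

Everything outside the returned rectangle would then be covered by the semi-opaque overlay.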

Media Fragment (Semantic) Annotation
Reg1
The "Big Three" at the Yalta Conference (Wikipedia)

• Media Fragment creation: localize a region (person)
• Media Fragment annotation (tagging) = interpretation
Winston Churchill, UK Prime Minister, Allied Forces, WWII

• Media Fragment semantic annotation
:Reg1 foaf:depicts dbpedia:Winston_Churchill .
dbpedia:Winston_Churchill rdfs:label "Winston Churchill" ;
    rdf:type foaf:Person ;
    dbprop:order dbpedia:Prime_Minister_(UK) .

Media Fragment (Semantic) Annotation
A history of G8 violence (video)
(© Reuters)

Seq4
Seq1

• Media Fragment creation: localize a temporal sequence
• Media Fragment annotation (tagging) = interpretation
G8 Summit, EU Summit, Heiligendamm, 2007, Gothenburg, 2001

• Media Fragment semantic annotation
:Seq1 foaf:depicts dbpedia:33rd_G8_Summit .
:Seq4 foaf:depicts dbpedia:EU_Summit .
dbpedia:33rd_G8_Summit rdfs:label "33rd G8 summit"@en ;
    grs:point "54.143055555555556 11.841666666666667" .

Media Fragment Semantic Annotation
• Things, not strings!
http://googleblog.blogspot.fr/2012/05/introducing-knowledgegraph-things-not.html
  - Use knowledge bases (LOD)
  - Use common vocabularies (LOV)
  - Follow the 4 Linked Data principles
• Refine the 4 Linked Media principles


Open Annotation Data Model
• Specification developed in the W3C Open Annotation Community Group
  http://www.openannotation.org/spec/core/
• Core model
  - OWL vocabulary for representing and sharing annotations of digital resources (and their fragments) … in RDF
  - A body is related to a target
  - The nature of the annotation changes according to intention (motivation)
• How to annotate this image?
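The core model (a body related to a target, with a motivation) can be sketched as a handful of RDF triples. The example below builds them as plain Python tuples and prints them in an N-Triples-like form; it uses the `oa:Annotation`, `oa:hasBody`, `oa:hasTarget` and `oa:motivatedBy` terms of the Open Annotation vocabulary, while the annotation URI, the image URL and the helper function are hypothetical.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotate(annotation_uri, body_uri, target_uri, motivation="tagging"):
    """Return the triples that relate a body to a target in one annotation."""
    return [
        (annotation_uri, RDF_TYPE, OA + "Annotation"),
        (annotation_uri, OA + "hasBody", body_uri),
        (annotation_uri, OA + "hasTarget", target_uri),
        (annotation_uri, OA + "motivatedBy", OA + motivation),
    ]


triples = annotate(
    "http://example.org/anno/1",                         # hypothetical annotation URI
    "http://dbpedia.org/resource/Winston_Churchill",     # body: the entity depicted
    "http://example.org/yalta.jpg#xywh=40,60,120,200",   # hypothetical image region
)
for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Here the target is addressed directly with a spatial media fragment; the specification also describes richer ways to select part of a resource, which this sketch leaves out.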


Semantic Annotation of an Image

http://www.w3.org/community/openannotation/wiki/SE_Semantically_Tagging_an_Image

Maphub: http://maphub.github.io/


Open Video Annotation Project

http://openvideoannotation.org/

LinkedTV: automatic annotations ...


... and enrichment for hypervideos

[Figure: hypervideo player interface with panels "CONCEPT IN PLAYER", "FACETS / PROPERTIES OF CONCEPT" and "CONTENT ENRICHMENT"; example concepts shown: Cubism, Expressionism, Fauvism]

Media Fragments and Annotations

http://data.linkedtv.eu/media/e2899e7f#t=840,900

nerd:Location Casablanca
nerd:Location Cafe Rick
nerd:Person H. Bogart
nerd:Person I. Bergman

• Media Fragment URI 1.0
  - Chapters
  - Scenes
  - Shots
  - etc…

Enrichment and Hypervideos

nerd:Location Casablanca
nerd:Location Cafe Rick
nerd:Person H. Bogart
nerd:Person E. Tierney
nerd:Person I. Bergman
nerd:Location China


NERD: Named Entity Recognition and Disambiguation
• Compare performances of NER and NEL tools
  - Understand strengths and weaknesses of different Web APIs
  - Adapt NER processing to different contexts
• (Learn how to) Combine NER (/ NEL) tools

What is NERD?
• ontology: http://nerd.eurecom.fr/ontology
• REST API: http://nerd.eurecom.fr/api/application.wadl
• UI: http://nerd.eurecom.fr
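One simple way to combine the output of several NER tools is a majority vote on the type assigned to each mention. The sketch below illustrates that idea only; it is not presented as the combination strategy NERD actually uses, and the extractor outputs are made-up data.

```python
from collections import Counter


def combine_by_vote(outputs):
    """Keep, for each mention, the type proposed by the most extractors."""
    votes = {}
    for extractor_output in outputs:
        for mention, entity_type in extractor_output.items():
            votes.setdefault(mention, Counter())[entity_type] += 1
    return {mention: counts.most_common(1)[0][0] for mention, counts in votes.items()}


# Hypothetical results of three extractors run on the same subtitle text.
outputs = [
    {"Casablanca": "Location", "H. Bogart": "Person"},
    {"Casablanca": "Location", "H. Bogart": "Person"},
    {"Casablanca": "Person", "I. Bergman": "Person"},
]
print(combine_by_vote(outputs))
```

A plain vote treats all tools as equally reliable; comparing their performance first, as the slide suggests, would allow weighting each tool by how well it does in a given context.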

NERD User Interface


Media Fragment + Open Annotation + NERD
Locator

MediaResource

Annotation

MediaFragment

Entity
Type

URL (hyperlink)
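The boxes on this slide can be read as a small data model. The following sketch expresses one possible reading with Python dataclasses; the class and field names mirror the labels on the slide, but how they relate to each other here is an assumption, and the DBpedia link is only an illustrative value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaResource:
    locator: str  # URL where the media file can be retrieved


@dataclass(frozen=True)
class MediaFragment:
    resource: MediaResource
    start: float  # seconds
    end: float    # seconds

    @property
    def uri(self):
        """Media Fragment URI of this temporal fragment."""
        return f"{self.resource.locator}#t={self.start:g},{self.end:g}"


@dataclass(frozen=True)
class Entity:
    label: str
    entity_type: str  # e.g. a class of the NERD ontology
    url: str          # hyperlink to a description of the entity


@dataclass(frozen=True)
class Annotation:
    target: MediaFragment  # what is annotated
    body: Entity           # what the fragment is annotated with


resource = MediaResource("http://data.linkedtv.eu/media/e2899e7f")
fragment = MediaFragment(resource, 840, 900)
bogart = Entity("H. Bogart", "nerd:Person", "http://dbpedia.org/resource/Humphrey_Bogart")
annotation = Annotation(target=fragment, body=bogart)
print(annotation.target.uri, annotation.body.entity_type, annotation.body.label)
```

Each annotation ties one entity to one fragment, so a scene mentioning several people or places simply carries several annotations on the same fragment URI.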


Media Fragment Enricher:
http://mfe.synote.org/mfe/


Linking pieces of knowledge



http://linkedtv.project.cwi.nl/news/


Take Away Summary
• Video is a first class citizen on the Web
  - Annotations: Ontology and API for Media Resources, Open Annotation Data Model
  - Access: Media Fragments URI
  - NERD platform for extracting key information from textual resources including video subtitles and microposts
• Embrace the Linked Media vision
  - Publish, re-use, re-purpose and remix media descriptions
  - Develop links between (parts of) media items via their descriptions


Credits
• Giuseppe Rizzo, Vuk Milicic, José Luis Redondo Garcia (EURECOM)
• Thomas Steiner (Google Inc.), Yunjia Li (University of Southampton)
• Marieke van Erp (Free University of Amsterdam)
• Erik Mannens, Davy Van Deursen (iMinds, Uni. Ghent)
• Paolo Ciccarese, Robert Sanderson, Herbert Van de Sompel and all the members of the W3C Open Annotation Community Group
• … and many other students
