This document summarizes the MediaEval Search and Hyperlinking task, developed between 2012 and 2014. The task involved developing technologies for video search and hyperlinking based on user needs expressed as queries. Users tested the system on a BBC video collection and provided relevance judgments, which were used to evaluate performance with metrics such as MAP and P@5/10. Results showed that ASR transcripts, prosodic features, and metadata improved results, while shot-segmentation-based units performed worst. Lessons learned included how devices and features affected user behaviour and system performance.
4. Users

| Main group | User | Target |
|---|---|---|
| Media Professionals | Broadcast Journalists | Research |
| Researchers & Educators | Academic researchers & students | Investigate |
| | Academic educators | Educate |
| Public users | Citizens | Entertainment, Infotainment |
| | Professionals | Reuse |
| | Media Archivists | Annotate |
7. Data over time (1998, 2002, 2008, 2010, 2013, 2015): from DATA to BIG DATA? – from not representative to representative.
8. Search & Hyperlinking task
• User oriented: aim to explore the needs of real users expressed as queries.
  – How: UK citizens and crowdsourcing for retrieval assessment.
• Temporal aspect: seek to direct users to the relevant parts of retrieved video ("jump-in point").
  – How: segmentation, segment overlap, transcripts, prosodic and visual features (low-level, high-level; keyframes).
• Multimodal: investigate technologies for addressing variety in user needs and expectations.
  – Varied visual and audio contributions; intentional gap between the query and the multimodal descriptors in the content.
9. MediaEval Search & Hyperlinking task in development: 2012 – 2014

| | Search 2012 | Search 2013 | Search 2014 | Hyperlinking 2012 | Hyperlinking 2013 | Hyperlinking 2014 |
|---|---|---|---|---|---|---|
| Dataset | BlipTV | BBC | BBC | BlipTV | BBC | BBC |
| Transcripts | | 2 ASR | 3 ASR | | 2 ASR | 3 ASR |
| Prosodic features | | no | yes | | no | yes |
| Visual clues for queries | yes | no | no | | | |
| Concept detection | | yes | yes | | | |
| Type of the task | Known-item | Ad-hoc | Ad-hoc | | | |
| Query/anchor creation | PC | iPad | iPad | PC | iPad | iPad |
| Number of queries/anchors | 30/30 | 4/50 | 50/30 | 30/30 | 11/ | 98/30 |
| Relevance assessment | MTurk | users (BBC) | MTurk | MTurk | | |
| Numbers of assessed cases | 30 | 50 | 9 900 | 3 517 | 9 975 | 13 141 |
| Evaluation metrics | MRR, MASP, MASDWP | MAP(-bin/tol), P@5/10 | MAP(-bin/tol), P@5/10 | MAP | MAP(-bin/tol), P@5/10 | MAP(-bin/tol), P@5/10 |
10. Dataset: Video collection
• BBC copyright-cleared broadcast material:
  – Videos:
    • Development set: 6 weeks between 01.04.2008 and 11.05.2008 (1335 hours / 2323 videos)
    • Test set: 11 weeks between 12.05.2008 and 31.07.2008 (2686 hours / 3528 videos)
  – Manually transcribed subtitles
  – Metadata
• Additional data:
  – ASR: LIMSI/Vocapia, LIUM, NST-Sheffield
  – Shot boundaries, keyframes
  – Output of visual concept detectors by the University of Leuven and the University of Oxford
11. Dataset: Query
• 28 users: policeman, hairdresser, bouncer, sales manager, student, self-employed
• Two-hour session on iPads:
  – Search the archive (document level)
  – Define clips (segment level)
  – Define anchors (anchor level)
• Workflow: Statement of Information Need → Search → Refine → Relevant Clips → Define Anchors
12. User study @ BBC: 1.) Statement of Information Need
13. User study @ BBC: 2.) Search – relevant clips (go to 1.) or go to 3.))
16. Data cleaning: Usable Information Need
• The description clearly specifies what is relevant
• A query with a suitable title exists
• Sufficient relevant segments exist (try the query)
17. Data cleaning: Process
• For each information need in the batch:
  1. Check if usable
  2. If in doubt, use search to look for relevant data
  3. Reword & spellcheck the description
  4. Select the first suitable query
  5. Save
18. Data cleaning: Usable Anchor
• Longer than 5 seconds
• The destination description clearly identifies the material the user wants to see when activating the anchor described by the label
• It is likely that there are some relevant items in the collection
19. Data cleaning: Process
• For each information need in the assigned batch:
  – Go through the anchors:
    • Check if usable
    • Reword & spellcheck the description
    • Assess whether links are likely to be found in the collection (possibly using search)
  – Save
22. Ground truth creation
• Queries/Anchors: user studies at the BBC – 28 users with the following profile:
  – Age: 18–30 years old
  – Use of search engines and services on iPads on a daily basis
• Relevance assessment: via crowdsourcing on the Amazon MTurk platform:
  – Top 10 results from 58 search and 62 hyperlinking submissions
  – 1 judgment per query or anchor, accepted/rejected by an automated algorithm; special cases of user typos checked manually
  – Number of evaluated HITs: 9 900 for search and 13 141 for hyperlinking
23. Evaluation metrics
• P@5/10/20
• MAP-based:
  – MAP: taking into account any overlapping segment
  – MAP-bin: relevant segments are binned for relevance
  – MAP-tol: only start times of the segments are considered
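The metrics above can be sketched in a few lines. This is a minimal illustration, not the official MediaEval evaluation code: the tolerance value, function names, and data layout are assumptions, and MAP-tol is shown only in its core idea of matching a retrieved segment to a relevant one by start time alone.

```python
# Hedged sketch of the slide's metrics: P@k, plus an average-precision
# variant in the spirit of MAP-tol, where a result counts as relevant if
# its start time lies within `tol` seconds of a ground-truth start time.
# All names and the tolerance value are illustrative assumptions.

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def average_precision_tol(ranked_starts, relevant_starts, tol=30.0):
    """AP over start times only (the MAP-tol idea); each ground-truth
    start can be matched by at most one result."""
    hits, ap, used = 0, 0.0, set()
    for rank, start in enumerate(ranked_starts, 1):
        match = next((g for g in relevant_starts
                      if g not in used and abs(start - g) <= tol), None)
        if match is not None:
            used.add(match)
            hits += 1
            ap += hits / rank
    return ap / len(relevant_starts) if relevant_starts else 0.0
```

MAP is then the mean of these per-query average precisions over all queries or anchors.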
31. Lessons learned
1. iPad vs PC: different user behaviour and expectations from the system.
2. Prosodic features broaden the scope of the search sub-task.
3. Using shot-segmentation-based units achieves the worst scores for both sub-tasks.
4. Using metadata improves results for both sub-tasks.
32. The Search and Hyperlinking task was supported by
We are grateful to Jana Eggink and Andy O'Dwyer from the BBC for preparing the collection and hosting the user trials.
... and of course Martha for advice & crowdsourcing access.
33. JRS at Search and Hyperlinking of Television Content Task
Werner Bailer, Harald Stiegler
MediaEval Workshop, Barcelona, Oct. 2014
34. Linking sub-task
• Matching terms from textual resources
• Reranking based on visual similarity (VLAT)
• Using visual concepts (only / in addition)
• Results:
  – Differences between different text resources
  – Context helped in only a few of the cases
  – Visual reranking provides a small improvement
  – Visual concepts did not provide improvements
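The slide's visual reranking step can be sketched as a score fusion: text-retrieval results are rescored by mixing the text score with the visual similarity of each candidate to the source anchor. The slide only names VLAT descriptors; the cosine similarity, the linear fusion, the weight, and all names below are illustrative assumptions, not the JRS implementation.

```python
# Hedged sketch of reranking text-search results by visual similarity.
# `results` holds (doc_id, text_score, visual_vec); the final score is a
# weighted mix of the text score and the cosine similarity to the anchor's
# visual descriptor. The weight `alpha` is an assumed parameter.
import math

def cosine(a, b):
    """Cosine similarity between two dense feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rerank(results, anchor_vec, alpha=0.8):
    """Return (doc_id, fused_score) pairs, best first."""
    scored = [(doc, alpha * ts + (1 - alpha) * cosine(vec, anchor_vec))
              for doc, ts, vec in results]
    return sorted(scored, key=lambda p: p[1], reverse=True)
```

With a high `alpha` the text ranking dominates and the visual term only nudges near-ties, which is consistent with the slide's finding that visual reranking gives a small improvement.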
35. Zsombor Paróczi, Bálint Fodor, Gábor Szűcs: Solution with concept enrichment
• Concept enrichment: the set of words is extended with their synonyms or other conceptually connected words.
• Top 10 vs. top 50 conceptually connected words for each word.
• Conclusion: the results show that concept enrichment with fewer words gives better precision, because in the opposite case the noise is greater.
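The enrichment idea above can be sketched as follows. The tiny `related` map stands in for a real resource of conceptually connected words (e.g. WordNet synonyms or embedding neighbours, which the slide does not specify); all names and data here are illustrative assumptions, not the authors' system.

```python
# Hedged sketch of concept enrichment: each query word is extended with up
# to `top_n` conceptually connected words. The toy `related` dictionary is
# a placeholder for a real lexical or embedding resource.

related = {
    "car":  ["automobile", "vehicle", "motor", "transport", "wheel"],
    "news": ["report", "bulletin", "broadcast", "headline", "press"],
}

def enrich(terms, top_n=2):
    """Return the original terms plus up to `top_n` connected words each.
    Per the slide's conclusion, a smaller top_n tends to give better
    precision, since a larger expansion introduces more noise."""
    enriched = list(terms)
    for term in terms:
        enriched.extend(related.get(term, [])[:top_n])
    return enriched
```

The top-10 vs. top-50 comparison on the slide corresponds to varying `top_n` over a full resource.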
36. Television Linked To The Web
H.A. Le1, Q.M. Bui1, B. Huet1, B. Cervenková2, J. Bouchner2, E. Apostolidis3, F. Markatopoulou3, A. Pournaras3, V. Mezaris3, D. Stein4, S. Eickeler4, and M. Stadtschnitzer4
1 - Eurecom, Sophia Antipolis, France.
2 - University of Economics, Prague, Czech Republic.
3 - Information Technologies Institute, CERTH, Thessaloniki, Greece.
4 - Fraunhofer IAIS, Sankt Augustin, Germany.
16-17 Oct 2014
www.linkedtv.eu
37. Reasons to visit the LinkedTV poster: LinkedTV @ MediaEval 2014 Search and Hyperlinking Task