9. Problems
•Multimedia data very dense manual
annotation not feasable
•Multimedia content analysis is
difficult and rarely good enough to
create reliable products.
9
10. My Research...
Network Knowledge
Semantic Web
Context Understanding
Semantic Computing
Machine Learning Recognition
Artificial Intelligence
Filtering Features
Signal/Text Processing
Images Audio Video Text
11. My Research...
Hypotheses:
• Multimedia content analysis works
better when every cue is taken into
account (eg. video AND audio).
• Semantic is enabled through
context. Converts AI research into
products.
12. Context
Sources of Context:
• Inclusion of prior knowledge
• Combination of algorithms
• Multimodality:
– audio+video+...
– extra hardware
• Human interaction
• ...
12
13. Context as Key:
Example 1
→ Cut Horse →
Paste ^V Meadow
Visual Object Extraction
13
14. Simple Interactive
Object Extraction (SIOX)
→ →
Image User Input Output
Context delivered by human interaction
14
15. SIOX: Algorithm Idea
Color Signatures from image retrieval:
Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image
Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000.
Idea: Instead of searching and image database, use Color
Signatures to search inside an image.
15
16. SIOX in GIMP
SIOX
Button
G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in
Images and Videos”, International Journal of Semantic Computing Vol 1,
No 2, pp. 221-247, World Scientific, USA, June 2007. 16
30. Speaker Diarization: Who
Spoke When?
Audiotrack:
Segmentation:
Clustering:
G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term
Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and
Language Processing, Vol 17, No 5, pp 985--993, July 2009.
30
34. Narrative Theme Navigation
G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by
Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.
34
36. Connecting Multimedia
and Semantic Technologies
GStreamer
Appscio
User
Device Component 1
Driver
User
Component 2
Source Recorder
.
.
.
File User
Component n
36
37. Semantic Media
Framework
Pipeline Framework
Integrated
C/C++/Java Development
Interface Environment
Events Code
Custom Event
Source 1
Video Application Server
Web Technology
Custom Event
Interface
Source 2
. Scripting & Logic Engine
.
.
Custom Event Services Connector
Source n
http://www.appscio.com
37
38. Semantic Analysis of
Multimedia Data
• enables automatic logical
inference on perceptually
encoded data
• enables more “natural”
interaction with the computer:
“do what the user means”
• Interfaces nicely with Semantic
Web technologies
38
43. Thank You!
Questions?
Contact:
Dr. Gerald Friedland
International Computer Science Institute
Berkeley, CA
http://www.gerald-friedland.org
friedland@icsi.berkeley.edu 43