Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Authors: Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto and Ferran Marqués

Details: https://imatge.upc.edu/web/publications/rich-internet-application-semi-automatic-annotation-semantic-shots-keyframes

This paper describes a system developed for the semi-automatic annotation of keyframes in a broadcasting company. The tool aims to assist archivists, who traditionally label every keyframe manually, by suggesting an automatic annotation that they can intuitively edit and validate. The system is valid for any domain, as it uses generic MPEG-7 visual descriptors and binary SVM classifiers. The classification engine has been tested on the multiclass problem of semantic shot detection, a type of metadata used in the company to index new content ingested into the system. The detection performance has been tested in two different domains: soccer and parliament. The core engine is accessed by a Rich Internet Application via a web service. The graphical user interface allows the suggested labels to be edited with an intuitive drag-and-drop mechanism between rows of thumbnails, each row representing a different semantic shot class. The system has been described as complete and easy to use by the professional archivists at the company.


  1. 1. Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes. Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto, Ferran Marqués. Wednesday, December 14, 2011.
  2. 2. Outline: Motivation; Requirements; Semantic Shot Classifier (Design, Evaluation); Graphical User Interface (Design, Video-demo); Future work; Conclusions.
  3. 3. Motivation. Digital content: multimedia content must be annotated to allow its future retrieval.
      Workflow: AV Production (video recording in digital format) -> Ingest (storage and video indexing in a database system) -> Retrieval (search through the metadata to retrieve video content).
  4. 4. Motivation. DIGITION Media Asset Management. Asset 02_14_20.
  5. 5. Motivation: Ingest and Retrieval.
      [Chart: video hours per year, 2010-2012. Ingest: 25,000 h each year. Stored: 135,000 h (2010), 160,000 h (2011), 185,000 h (2012).]
      [Chart: daily retrieval in 2010. Retrieved: 75 h (0.056%); non-retrieved content: 99.944%.]
  6. 6. Motivation.
      Current solution: metadata by strata
      • Start time code
      • End time code
      • Content description
        - Action
        - Semantic shots
        - People
      Proposed new approach: non-annotated assets -> GUI for semi-automatic annotation -> annotated assets.
  8. 8. Requirements: Semantic Shot Classifier.
      • Annotation of the extracted keyframes
      Semantic shot = shot type + semantic concept. Examples: Medium Shot + Player; Medium Shot + Box.
  9. 9. Requirements: semantic shot classes per domain.
      Parliament: • President close-up • Close-up • Medium • General bureau • General chamber
      Football: • Player close-up • Player medium • Stadium overview • Audience • Banner • Box
  10. 10. Requirements: Graphical User Interface.
      • Embeddable in existing system (Digition)
      • User friendly
      • Domain-generic interface
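  12. 12. Design: Supervised Machine Learning.
      [Diagram: training and detection stages. In detection, the confidence obtained for an unlabelled keyframe (example values: 0.23, 0.98) is compared with a threshold (> Thr) to reach a Yes/No decision.]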
  13. 13. Design: System Architecture.
      Training: one SVM Trainer per class produces Model class 1, Model class 2, ..., Model class N; together they form the domain model.
      Test: Classifier 1, Classifier 2, ..., Classifier N output Confidence 1, Confidence 2, ..., Confidence N. The MAX confidence is compared with a threshold (> thr): if Yes, the keyframe is assigned the MAX class; if No, it is labelled Unknown.
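The test-time decision described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `decide_label`, the dictionary of per-class confidences and the `UNKNOWN` constant are hypothetical; the slide only states that the maximum confidence is compared with a threshold.

```python
from typing import Dict

UNKNOWN = "Unknown"  # assumed name for the rejection label


def decide_label(confidences: Dict[str, float], threshold: float) -> str:
    """Return the class with the highest confidence, or UNKNOWN.

    `confidences` maps each class name to the score produced by that
    class's binary classifier (hypothetical data structure).
    """
    if not confidences:
        return UNKNOWN
    best_class = max(confidences, key=confidences.get)
    # Accept the best class only if its confidence exceeds the threshold.
    if confidences[best_class] > threshold:
        return best_class
    return UNKNOWN


# Illustrative scores only; 0.98 and 0.23 are the example values on slide 12.
print(decide_label({"Close-up": 0.98, "Medium": 0.23}, threshold=0.9))
print(decide_label({"Close-up": 0.60, "Medium": 0.23}, threshold=0.9))
```

Raising the threshold sends more keyframes to "Unknown", which trades missed semantic shots for fewer wrong suggestions.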
  14. 14. Design: Annotation. GAT annotation tool: http://upseek.upc.edu/gat
      Football dataset
        Class 1  Medium        116
        Class 2  Close-up       87
        Class 3  Stadium         4
        Class 4  Audience       19
        Class 5  Banner          4
        Class 6  Box             6
        Class 7  Unknown       278
        Total                  514
      Parliament dataset
        Class 1  President CU   16
        Class 2  Close-up       63
        Class 3  Medium         68
        Class 4  Bureau         13
        Class 5  Chamber        23
        Class 6  Unknown        84
        Total                  276
  15. 15. Design: parameters.
      Trainer parameters: unsupervised clustering to support multi-view
        - MIN # elements / cluster
        - MAX cluster radius
      Detector parameters: confidence threshold. If Confidence > Thr (Yes), the decision is accepted; otherwise (No) the keyframe is labelled Unknown.
  17. 17. Evaluation. Quality measure: F1-score. Example confusion matrix (rows: manual labels; columns: automatic labels):
                 Class 1  Class 2  Class 3
        Class 1     4        1        0
        Class 2     0        3        0
        Class 3     2        0        3
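As a worked example, the per-class F1-score (the harmonic mean of precision and recall) can be computed from the 3x3 example matrix on the slide, taking rows as manual labels and columns as automatic labels. The function name `per_class_f1` is hypothetical, and the slides do not say how per-class scores are aggregated into a single figure, so this sketch stops at per-class values.

```python
from typing import List


def per_class_f1(matrix: List[List[int]]) -> List[float]:
    """Per-class F1 from a square confusion matrix.

    Rows are manual (ground-truth) labels, columns are automatic labels.
    """
    n = len(matrix)
    scores = []
    for k in range(n):
        true_positives = matrix[k][k]
        manual_total = sum(matrix[k])                          # TP + FN
        automatic_total = sum(matrix[r][k] for r in range(n))  # TP + FP
        denominator = manual_total + automatic_total
        # F1 = 2*TP / (2*TP + FP + FN); 0.0 when the class never appears.
        scores.append(2 * true_positives / denominator if denominator else 0.0)
    return scores


example = [
    [4, 1, 0],
    [0, 3, 0],
    [2, 0, 3],
]
print([round(f, 3) for f in per_class_f1(example)])  # [0.727, 0.857, 0.75]
```

For Class 1, precision is 4/6 and recall is 4/5, which gives F1 = 8/11, about 0.727.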
  19. 19. Evaluation: Cross Validation with Random Partition.
      • Training
        - 80% of positive instances (+)
        - Positive instances from other classes are considered negative (-)
      • Detection
        - 20% of the positive instances (+)
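The random partition above could be sketched as follows. This is a hedged illustration: the helper names `random_partition` and `one_vs_rest_targets`, the fixed seed and the choice to split each class separately are assumptions; the slide only gives the 80%/20% proportions and the rule that positives of other classes act as negatives.

```python
import random
from typing import Dict, List, Sequence, Tuple


def random_partition(
    labels: Sequence[str], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly split instance indices, class by class, into train and test."""
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


def one_vs_rest_targets(
    labels: Sequence[str], indices: Sequence[int], positive_class: str
) -> List[int]:
    """+1 for the positive class, -1 for instances of every other class."""
    return [1 if labels[i] == positive_class else -1 for i in indices]


labels = ["Medium"] * 10 + ["Box"] * 5
train_idx, test_idx = random_partition(labels)
medium_targets = one_vs_rest_targets(labels, train_idx, "Medium")
print(len(train_idx), len(test_idx), medium_targets.count(1))
```

Repeating the call with different seeds gives the independent iterations that a cross-validation loop would average over.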
  20. 20. Evaluation. 3 parameters to tune the classifier:
      • MAX radius for a multi-view cluster (0.1 - 1.0)
      • MIN elements for a multi-view cluster (0 - 5)
      • Confidence threshold (0.0 - 1.0)
      Cross-validation iterations: 3. Quality measure: F1.
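A grid search over these three parameters might look like the sketch below. It is an assumption-laden illustration: `evaluate_f1` is a hypothetical callback standing in for training and evaluating the classifier in one cross-validation iteration, the 0.1 step for the threshold and radius is assumed, and averaging F1 over the three iterations is one plausible reading of the slide.

```python
from itertools import product
from typing import Callable, Optional, Tuple

Params = Tuple[float, int, float]  # (threshold, min_elements, max_radius)


def grid_search(
    evaluate_f1: Callable[[float, int, float, int], float],
    iterations: int = 3,
) -> Tuple[Optional[Params], float]:
    """Return the parameter triple with the highest mean F1."""
    thresholds = [i / 10 for i in range(0, 11)]  # 0.0 - 1.0 (assumed step 0.1)
    min_elements = range(0, 6)                   # 0 - 5
    max_radii = [i / 10 for i in range(1, 11)]   # 0.1 - 1.0 (assumed step 0.1)
    best_params: Optional[Params] = None
    best_score = float("-inf")
    for thr, min_el, radius in product(thresholds, min_elements, max_radii):
        runs = [evaluate_f1(thr, min_el, radius, it) for it in range(iterations)]
        mean_f1 = sum(runs) / iterations
        if mean_f1 > best_score:
            best_params, best_score = (thr, min_el, radius), mean_f1
    return best_params, best_score


def toy_f1(thr: float, min_el: int, radius: float, iteration: int) -> float:
    """Toy stand-in for a real evaluation; it peaks at (0.9, 3, 0.8)."""
    return 1.0 - (abs(thr - 0.9) + abs(min_el - 3) / 10 + abs(radius - 0.8))


best, score = grid_search(toy_f1)
print(best, round(score, 3))
```

With 11 thresholds, 6 cluster sizes and 10 radii, the search evaluates 660 combinations, each over 3 iterations.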
  21. 21. Evaluation: Results for Parliament domain.
      • Optimal parameters
        - Threshold: 0.9
        - MIN # elements: 3
        - MAX radius: 0.8
      Figure 7.4: Catalan Parliament F1 measures for 0.9 minimum score (rows: MIN elements; columns: MAX radius)
        Min elem.   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
        1          0.718  0.962  0.952  0.949  0.958  0.970  0.957  0.957  0.949  0.946
        2          0.715  0.962  0.955  0.952  0.952  0.962  0.951  0.951  0.966  0.965
        3          0.769  0.948  0.968  0.960  0.960  0.957  0.959  0.972  0.944  0.958
        4          0.763  0.955  0.965  0.962  0.956  0.963  0.968  0.966  0.954  0.964
        5          0.719  0.970  0.966  0.947  0.958  0.959  0.958  0.961  0.953  0.967
  22. 22. Evaluation: Parliament domain, errors and per-class results.
      • "Unknown" labelled as "Close up": 7.73% error
      • Semantic shots labelled as "Unknown": 1.27% error
      F1 Measure
      • Close Up president: 0.98
      • Close Up: 0.95
      • Medium: 0.99
      • General bureau: 0.98
      • General chamber: 0.99
      • Unknown: 0.94
      Confusion matrix (rows: manual labels; columns: automatic labels)
                   CU Presi.   CU   Medium  Bureau  Chamber  Unknown
        CU Presi.      93       0      0       0       0        3
        CU              0     376      0       0       0        2
        Medium          0       0    405       0       0        3
        Bureau          0       0      0      75       0        3
        Chamber         0       0      0       0     135        3
        Unknown         0      39      0       0       0      465
  23. 23. Evaluation: Results for Football domain.
      • Optimal parameters
        - Threshold: 0.7
        - MIN # elements: 1
        - MAX radius: 0.8
      Figure 7.1: Soccer match F1 measures for 0.7 minimum score (rows: MIN elements; columns: MAX radius)
        Min elem.   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
        1          0.385  0.616  0.691  0.703  0.822  0.734  0.771  0.860  0.829  0.714
        2          0      0      0.838  0.546  0.858  0.827  0.828  0.816  0.832  0.659
        3          0      0      0.853  0.784  0.764  0.785  0.770  0.818  0.833  0.727
        4          0      0      0      0      0      0      0      0      0      0
        5          0      0      0      0      0      0      0      0      0      0
  24. 24. Evaluation: Football domain, confusion matrix and errors.
      Class names as labelled on the slide graphic (Catalan) and in the F1 list: Class1 = Cromo (Close-up), Class2 = Cromo PP (Medium), Class3 = Cromo Beauty (C.Beauty), Class4 = Public (Audience), Class5 = Pancarta (Banner), Class6 = Llotja (Box), Class7 = Unknown.
      Table 7.3: Best soccer match confusion matrix instances (rows: manual labels; columns: automatic labels)
                 Class1  Class2  Class3  Class4  Class5  Class6  Class7
        Class1     652      11       0       0       0       0      33
        Class2       4     482       0       0       0       0      36
        Class3       1       0      20       0       0       0       3
        Class4       0       0       0      94       0       0      20
        Class5       0       0       0       0      18       0       6
        Class6       0       1       0       0       0      21      14
        Class7     461       3       0       0       0       0    1204
      The slide graphic of the same matrix shows 629 and 1 in place of 652 and 11 in the first row.
      Per-class values from a row whose label is truncated to "n" in the source (they match the per-class precision computed from Table 7.3): Class1 0.58, Class2 0.97, Class3 1.0, Class4 1.0, Class5 1.0, Class6 1.0, Class7 0.91.
      F1 Measure
      • Close-up: 0.72
      • Medium: 0.95
      • C.Beauty: 0.91
      • Public: 0.90
      • Banner: 0.86
      • Box: 0.74
      • Unknown: 0.81
      [...] class that causes that problem and the percentage error of the confusion.
      Figure 7.3: Soccer match detection errors ("Cap pla" = Unknown; "Totes les classes" = all classes)
        Manual             Automatic   % Error
        Cromo PP           Cromo        0.76%
        Cromo              Cromo PP     1.58%
        Cromo Beauty       Cromo        4.16%
        Llotja             Cromo PP     2.77%
        Cap pla            Cromo       27.63%
        Cap pla            Cromo PP     0.17%
        Totes les classes  Cap pla      7.9%
  26. 26. Design
  27. 27. Design: System architecture.
      Web services
      • List of keyframes given an asset ID
      • Keyframe classifier
      Data exchange: JSON format.
      Example: Asset 1257 returns keyframes 00_00_09_24, 00_00_39_25, 00_00_54_48 and 00_00_59_41. Each keyframe is sent to the classifier together with the domain (Football), MIN elems and MAX radius. The example responses shown are Class 3 (score 0.83), Class 3 (score 0.75), Class 5 (score 0.46) and Class 2 (score 0.98).
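To make the data exchange more concrete, the sketch below builds and parses JSON messages of the kind the slide suggests. Every field name (`asset_id`, `domain`, `min_elements`, `max_radius`, `keyframe`, `class`, `score`) is hypothetical, since the slide does not show the actual schema. The pairing of keyframe 00_00_09_24 with Class 3 and score 0.83 is also an assumption, and the parameter values are borrowed from the football optimum on slide 23 purely as placeholders.

```python
import json
from typing import Dict, List

# Hypothetical request to the keyframe classifier web service.
request = {
    "keyframe": "00_00_09_24",
    "domain": "Football",
    "min_elements": 1,   # placeholder value
    "max_radius": 0.8,   # placeholder value
}

# Hypothetical response body as it might arrive over HTTP.
response_text = json.dumps(
    {"asset_id": 1257, "keyframe": "00_00_09_24", "class": 3, "score": 0.83}
)


def needs_review(responses: List[Dict], threshold: float) -> List[str]:
    """Return the keyframes whose score does not exceed the threshold."""
    return [r["keyframe"] for r in responses if r["score"] <= threshold]


request_text = json.dumps(request)    # serialise before sending
response = json.loads(response_text)  # parse what the service returned
print(request_text)
print(response["class"], response["score"])
print(needs_review([response], threshold=0.9))
```

A client-side filter such as `needs_review` is one way a user interface could sort or highlight low-confidence suggestions for the archivist.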
  29. 29. Video-demo
  30. 30. Video-demo. Parliament domain: http://youtu.be/R7o0sCoe-Gs
  31. 31. Video-demo. Football domain: http://youtu.be/uURx9GoRJBg
  33. 33. Future work.
      • Web service to use new annotations to re-train
      • Use semantic shot information to launch additional analysis
        - Synthetic and natural signs -> Text recognition
        - Close-ups -> Face recognition
        - Medium Shot Parliament -> Speech recognition
  42. 42. Conclusions.
      Semantic shot classifier
      • Multiclass classification combining binary classifiers
        - Positive labels in a class become negative in others
        - Grid search for classifier parameters
      • Generic framework for multiple domains
      • Results good enough for exploitation
      Graphical User Interface
      • Uses classifier through web services
      • Flexible and user friendly:
        - Sorting by score
        - Threshold can be manually changed
        - Drag and drop
        - Validation by page or shot type
        - Keyframe highlight
      • Annotated keyframes are valid for:
        - Training
        - Retrieval
  43. 43. Thank you for your attention. Follow our research blog: http://bitsearch.blogspot.com
