Multimedia Content Understanding: Bringing Context to Content

A digital revolution is happening right before our eyes: the way we communicate is changing rapidly due to technological advances. Pencil-and-paper communication is declining sharply and is being replaced by newer communication media, ranging from email to SMS/MMS and other instant messaging services. Information and news used to be broadcast only through official, dedicated channels such as television, radio or newspapers. The technology available today allows every one of us to be an individual information broadcaster, whether through text, image or video, using a personal connected mobile device. Indeed, the current trend suggests that video will soon become the most important medium on the Internet. While the amount of multimedia content continuously increases, progress is still needed in automatically understanding multimedia documents in order to index, search and browse them more effectively. The objectives of this chapter are threefold. First, we motivate multimedia content modeling research in the current technological context. Second, a broad state of the art gives the reader a brief overview of the methodological trends of the field. Third, a bird's-eye view of the research themes I have supervised and/or conducted is presented, showing how contextual information has become an important additional source of information for multimedia content understanding.

Published in: Technology, Education

    1. 1. Multimedia Content Understanding: Bringing Context to Content. Dr. Benoit Huet, HDR Presentation, Université de Nice Sophia-Antipolis, October 3rd 2012
    2. 2. Welcome  Jury Members  Tat-Seng CHUA (NUS, Singapore) [reviewer]  Patrick GROS (INRIA, France) [reviewer]  Alan SMEATON (DCU, Ireland) [reviewer]  Edwin HANCOCK (University of York, UK) [member]  Bernard MERIALDO (EURECOM, France) [member]  Nicu SEBE (University of Trento, Italy) [member] 03/10/2012 B. HUET - HDR Presentation -2
    3. 3. Talk Outline: Curriculum Vitae; Research Activities; A Technical Presentation (Event Media Modeling based on User Generated Content); Conclusions and Research Perspectives; Questions/Discussion
    4. 4. Curriculum Vitae: Ecole Superieure de Technologie Electrique, Bachelor of Science in Electrical Engineering and Computing, 1992; University of Westminster, MSc Artificial Intelligence (with distinction), 1993, thesis: Recurrent neural networks for temporal sequence recognition; University of York, PhD Computer Science (Computer Vision), 1999, thesis: Object recognition from large libraries of line-patterns; University of Westminster, Part-time Lecturer, 1994-95; University of York, Research Associate, Learning 2D and 3D object models from 2D scenes, 1998; Eurecom, Assistant/MdC, Multimedia content analysis, indexing and retrieval, since 1999
    5. 5. Curriculum Vitae, Teaching: Multimedia Technologies (course leader), rated 3.25/4.00 by students (33) in Spring 2012; Multimedia Advanced Topics (course leader), rated 3.25/4.00 by students (25) in Fall 2012; Intelligent Systems (course leader: B. Merialdo); Multimedia Information Retrieval (course leader: B. Merialdo); Artificial Neural Networks (94/95)
    6. 6. Curriculum Vitae, Mentoring. PhD Advisor: Mathilde Sahuguet, "Multimedia Mining on the Web", since March 2012; Xueliang Liu, "Event-based Social Media Data Mining", since 2009; Stephane Turlier, PhD from Telecom ParisTech in 2011, "Personalisation and Aggregation of Infotainment for Mobile Platforms"; Marco Paleari, PhD from Telecom ParisTech in 2009, "Affective Computing; Display, Recognition and Artificial Intelligence"; Rachid Benmokhtar, PhD from Telecom ParisTech in 2009, "Fusion multi-niveau pour l'indexation et la recherche multimedia par le contenu semantique"; Eric Galmar, PhD from Telecom ParisTech in 2008, "Representation and Analysis of Video Content for Automatic Object Extraction". PhD Co-advisor (with B. Merialdo): Joakim Jiten, PhD from Telecom ParisTech in 2007, "Multidimensional hidden Markov model applied to image and video analysis"; Fabrice Souvannavong, PhD from Telecom Paris in June 2005, "Semantic video content indexing and retrieval"; Ithery Yahiaoui, PhD from Telecom Paris in October 2003, "Automated Video Summary Construction"
    7. 7. Curriculum Vitae, Scientific Visibility. Editorial Boards: Multimedia Tools and Applications (Springer); Multimedia Systems (Springer); Guest Editor for EURASIP Journal on Image and Video Processing, selected papers from MultiMedia Modeling 2009; Guest Editor for IEEE Multimedia special issue on Large Scale Multimedia Retrieval and Mining (July 2011); Guest Editor for IEEE Multimedia special issue on Large Scale Multimedia Data Collections (July 2012); Guest Editor for the Journal of Media Technology and Applications special issue on Multimedia Content Analysis; Guest Editor for Multimedia Systems special issue on Social Media Mining and Knowledge Discovery
    8. 8. Curriculum Vitae, Scientific Visibility. Reviewing: most journals of the field (ACM Multimedia Systems, IEEE PAMI, IEEE Multimedia, …); most conferences of the field (ACM Multimedia, IEEE ICME, ACM SIGIR, MMM, …). Conference Organisation: MMM 2009 General Chair; ACM MM 2012 Area Chair (Content Processing Track); ACM ICMR 2012 Tutorial Chair; … Technical Committees: Chair of the IEEE Multimedia Communication Technical Committee (VAIG) Visual Analysis and Content Management for Communications; Vice Chair of the IAPR Technical Committee 14 Signal Analysis for Machine Intelligence
    9. 9. Curriculum Vitae, Publications: Books and book chapters: 5; Journals: 11; Conferences: 99 (90 International + 9 National); Technical reports: 6; Invited Talks/Seminars: 16; Panels: 3 (2 panelist + 1 moderator); Patent: 1
    10. 10. Curriculum Vitae, Current Projects. ALIAS (EU AAL/ANR), Adaptable Ambient LIving ASsistant: a mobile robot system that interacts with elderly users and promotes social inclusion by creating connections to people and events in the wider world. EventMap (EIT ICT Labs): demonstrates the use of explicit representations of events to organize the provision and exchange of information and media. LinkedTV (EU FP7), TeleVision Linked to the Web: a novel practical approach to Future Networked Media, based on four phases: annotation, interlinking, search, and usage (including personalization, filtering, etc.). MediaMixer (EU FP7): re-purposing and re-using media fragments for the media production, library, TV archive, news production, e-learning and UGC portal industries. Past Projects: 8 (2 direct contracts, 2 National, 4 European)
    11. 11. Research Activities: Multimedia Content Analysis/Understanding. Information/Data Overload: 72 hours of video are uploaded every minute; 4 billion hours of video are watched each month; more than 20% of views come from mobile devices; 3 hours of video are uploaded per minute from mobile devices; 3M photos/day; 85M photos/day; video is over 40% of today's Internet traffic. Need for efficient and scalable content-based indexing/search tools
    12. 12. Research Activities: Multimedia Content Analysis/Understanding; Information/Data Overload; Content and Context: ubiquitous media capturing devices, sensor data complements media (clock / GPS / gyroscope / accelerometer). Using Context to better analyse Content
    13. 13. Research Activities: Multimedia Content Analysis/Understanding; Information/Data Overload; Content and Context; Social Networks: User Generated Content, additional data complements media (comments, tags, +1/Like, etc.), reliability issues, few UGC feature tags. Using Context to better analyse Content
    14. 14. Research Activities: Multimedia Content Analysis/Understanding; Information/Data Overload; Content and Context; Social Networks. Long term research objective: how CONTEXTual information can help analyse CONTENT better. "Content without context is meaningless" [Ramesh Jain, 2008]
    15. 15. Research Activities, Bringing Context to Content. Internal Context: knowledge/data extracted from within the document. External Context: knowledge/data associated or found in conjunction with the document
    16. 16. Research Activities, Bringing Context to Content. Internal Context: Spatio-temporal Video Segmentation, context at the pixel level [CIVR'07]. Initialization: define the coarseness of the segmentation. Spatial Segmentation: create atomic homogeneous color regions, constrained by a contour map. Temporal Grouping with Feature Points: establish temporal edges between regions; group regions that are strongly linked. Space-Time Merging: refine linkage of static regions using local and global constraints. ARG Matching: compare ARGs between frame pairs to validate creation of new regions
    17. 17. Research Activities, Bringing Context to Content. Internal Context: Structural Representation of Video Objects; Region co-occurrence [CIVR'05/IEE VISP'05]
    18. 18. Research Activities, Bringing Context to Content. Internal Context: Spatio-Temporal Semantic Segmentation; Concept co-occurrence [MMSP'08]
    19. 19. Research Activities, Bringing Context to Content. Internal Context: High-Level Fusion; Concept Co-Detection [MIR'08]
    20. 20. Research Activities, Bringing Context to Content. Internal Context: Multimodal Emotion Recognition [CBMI'08, CIVR'10]
    21. 21. Research Activities, Bringing Context to Content. Internal Context: knowledge/data extracted from within the document. External Context: knowledge/data associated or found in conjunction with the document
    22. 22. Research Activities, Bringing Context to Content. Internal Context; External Context: High-Level Fusion; Concept co-occurrence [MTAP'11]
    23. 23. Research Activities, Bringing Context to Content. Internal Context; External Context: Large Scale MM Annotation using tags, categories and comments [VLSMCMR'10]
    24. 24. Research Activities, Bringing Context to Content. Internal Context; External Context: Mining Events from the Web using machine tags, geolocation and timestamps [ACM WSM'11]
    25. 25. Event Media Modeling based on User Generated Content Xueliang Liu and Benoit Huet
    26. 26. What is an Event?
    27. 27. What is an Event? VIGTA 2012, Capri, Italy
    28. 28. Big Data! Media Sharing, Event Directory, Social Apps/Networks, Search Engines
    29. 29. Search for media
    30. 30. Searching for an event
    31. 31. Media explicitly associated with the event
    32. 32. REST API for query
    33. 33. Objective: automatically and explicitly associate media with their originating event; build event visual appearance models; model training requires both positive and negative samples. Can we mine the training set automatically online?
    34. 34. Related Works, Event Based. EventBurn.com: create summaries about given events (searching Twitter, Facebook, and Flickr). Firan et al. [CIKM'10]: event categorization from social media data. Gao et al. [WWW'11]: employing Twitter data to enrich event information. Mattivi et al. [ACM workshop on Modeling and Representing Events'11]: event and sub-event clustering (visual features and time)
    35. 35. Related Works, Concept Based. Li et al. [CVPR'07]: OPTIMOL, automatic Online Picture collecTion via Incremental MOdel Learning. Schroff et al. [IEEE PAMI'11]: harvesting image databases from the Web. Li et al. [ICMR'11]: social negative bootstrapping to model concepts. These works automatically collect samples for specified concepts; here, extension to events using contextual information
    36. 36. Automated Event Modeling Framework [diagram: Event → Positive Samples and Negative Samples → Event Model]
    37. 37. Event Machine Tags: a way to explicitly link Media and Events
    38. 38. Positive Samples Collection. Machine Tag: abbreviation of the event's name + Geo-Tag. For example, "ACMMM12" is the tag to query photos from "ACM Multimedia 2012"
    39. 39. Automated Event Modeling Framework [diagram: Event → Positive Samples and Negative Samples → Event Model]
    40. 40. Mining Photos from Sharing Platforms [diagram: Location (Melkweg, Amsterdam) and Date as inputs to the Flickr API]
    41. 41. Negative Samples Collection. Assumption 1: photos with recurrent tags captured near the location of the event describe the location/region, not the event. Assumption 2: photos taken near the location of the event and in the same period offer better discriminating power than random photos. Collecting approach: collect all the media captured near the event's location and time; extract tags from the collection and rank them by frequency of appearance; keep the top tags as common tags and use them to rank photos by similarity
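The collecting approach on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say which similarity measure is used to rank photos against the common tags, so Jaccard overlap is assumed here, and the photo records, field names and function names are hypothetical.

```python
from collections import Counter

def top_common_tags(photos, n):
    """Rank tags by the number of photos they appear in and keep the top n."""
    counts = Counter(tag for p in photos for tag in set(p["tags"]))
    # Sort by descending frequency, then alphabetically for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:n]]

def rank_by_common_tags(photos, common_tags, m):
    """Return the m photos whose tags overlap most with the common tags.

    Jaccard overlap is an assumption; the slide only says "similarity".
    """
    common = set(common_tags)

    def jaccard(photo):
        tags = set(photo["tags"])
        union = tags | common
        return len(tags & common) / len(union) if union else 0.0

    return sorted(photos, key=jaccard, reverse=True)[:m]

# Hypothetical photos captured near the event's location and time.
nearby = [
    {"id": "a", "tags": ["amsterdam", "canal", "boat"]},
    {"id": "b", "tags": ["amsterdam", "canal"]},
    {"id": "c", "tags": ["concert", "band", "live"]},
    {"id": "d", "tags": ["amsterdam", "bike"]},
]
common = top_common_tags(nearby, n=2)
negatives = rank_by_common_tags(nearby, common, m=2)
```

Photos that mostly carry location-describing tags end up at the top of the ranking and serve as negative candidates, which is the intent of Assumption 1.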
    42. 42. Automated Event Modeling Framework [diagram: Event → Positive Samples; tags → rank tags by frequency → Top N tags (tag1 … tagN) → rank photos by distance to tags → Top M photos (Pic1 … PicM) → Negative Samples; Positive and Negative Samples → Event Model]
    43. 43. The DataSet: 10 LastFM concerts, 3 international conferences and 1 popular carnival
        | EventID | Positive Samples | Negative Candidates | Testing Pos | Testing Neg |
        |---|---|---|---|---|
        | lastfm:804783 | 441 | 1063 | 466 | 64 |
        | lastfm:1830095 | 716 | 748 | 398 | 134 |
        | lastfm:1858887 | 408 | 745 | 431 | 266 |
        | lastfm:1499065 | 348 | 712 | 16 | 153 |
        | lastfm:1787326 | 446 | 913 | 0 | 313 |
        | lastfm:1351984 | 307 | 584 | 498 | 19 |
        | lastfm:1842684 | 602 | 1125 | 535 | 78 |
        | lastfm:2020655 | 538 | 745 | 750 | 6 |
        | lastfm:1301748 | 944 | 541 | 1157 | 80 |
        | lastfm:1370837 | 592 | 1025 | 592 | 115 |
        | ACMMM10 | 100 | 557 | 178 | 23 |
        | SIGIR2010 | 30 | 525 | 0 | 201 |
        | ACMMM07 | 118 | 64 | 15 | 44 |
        | NICECarnival2011 | 52 | 848 | 60 | 209 |
        | Total | 5642 | 10195 | 5096 | 1705 |
    44. 44. DataSet Examples [figure panels: Positive Samples, Negative Samples, Test Positive, Test Negative]
    45. 45. Event Model Training. Feature: 400D Bag of Words from SIFT features. Model: SVM implemented with libSVM, RBF kernel; cross validation is used to optimize the parameters
    46. 46. The (Negative Samples) Model Parameters. R: the distance between the location where a photo was taken and the event venue. D: the time span between when a photo was taken and when the event took place. [plot: an example on event lastfm:804783] R and D should be large enough to pool a diverse set of photos
    47. 47. Visual Event Modeling Results (accuracy, %)
        | EventID | Query | Our Algorithm | k-NN Pruning | Random Sample | Uniform Negative |
        |---|---|---|---|---|---|
        | lastfm:804783 | 87.92 | 88.68 | 46.98 | 50.00 | 75.85 |
        | lastfm:1830095 | 74.81 | 78.38 | 80.26 | 96.62 | 84.96 |
        | lastfm:1858887 | 61.84 | 63.41 | 63.56 | 76.47 | 73.89 |
        | lastfm:1499065 | 9.47 | 90.53 | 89.94 | 92.90 | 89.35 |
        | lastfm:1787326 | 0.00 | 98.40 | 92.65 | 97.12 | 42.49 |
        | lastfm:1351984 | 96.32 | 96.32 | 55.32 | 86.65 | 93.81 |
        | lastfm:1842684 | 87.28 | 87.93 | 67.86 | 79.28 | 87.11 |
        | lastfm:2020655 | 99.21 | 91.80 | 71.69 | 75.00 | 94.58 |
        | lastfm:1301748 | 93.53 | 93.53 | 73.73 | 64.83 | 93.21 |
        | lastfm:1370837 | 83.73 | 85.15 | 73.83 | 60.25 | 80.62 |
        | ACMMM10 | 88.56 | 91.04 | 87.56 | 86.57 | 89.05 |
        | SIGIR2010 | 0.00 | 60.19 | 42.28 | 16.41 | 22.38 |
        | ACMMM07 | 25.42 | 57.62 | 46.61 | 28.81 | 27.18 |
        | NICECarnival2011 | 22.30 | 76.58 | 59.10 | 55.39 | 56.51 |
        | Average Accuracy | 69.41 | 83.31 | 68.64 | 70.07 | 73.42 |
    48. 48. Conclusions: visual modeling of events makes it possible to attach media to their corresponding event; device and user metadata provide interesting and valuable clues for automatically constructing a ground truth; visual event models can be created in an unsupervised way; detecting events from social media activity
    49. 49. Conclusions and Future Work: combine multiple information sources (Tweets, Social Graph, etc.) to detect events and enrich them with media. Meta-objective: social event analysis based on connections between events, media and participants. CONTEXTual information contributes significantly to CONTENT understanding
    50. 50. Research Directions. Accenture Technology Vision [2012]: context-based services, social-driven IT. Cisco forecasts 65% of Internet traffic will be video by 2015. Need for efficient and effective Multimedia Content Understanding: Are these multimedia contents related? What event is depicted in this document? What is this video about?
    51. 51. Research Directions: Are these multimedia contents related? Future Digital Television. Interactive TV: no commercial success. 2nd Screen: 70% of mobile device users use them while watching TV. Need for relevant additional content. Web Media Mining for objects, people and events based on A/V Content and Contextual Information
    52. 52. Research Directions: What event is depicted in this document? Events are a natural structuring element for humans. Public events (show, …) vs private events (birthday, …). Initial promising results on public events; extension to private events; Social Media Graph; 2 ACM MM Grand Challenges in 2012
    53. 53. Research Directions: What is this video about? Users are becoming broadcasters. User Generated Content: YouTube tutorials, product tests, etc. Business Intelligence: harvest the social web for media documents related to products and understand their content (visual detection of products, emotion recognition)
    54. 54. Questions? Thank you for your attention.
    55. 55. Visual Data in the 90's: Huet & Hancock [WACV'96] [figure: digital map ground truth and corresponding aerial images taken at different aircraft altitudes]
    56. 56. Large Scale in the 90's: Huet & Hancock [IEEE PAMI'99]. Cartographic Database: 22 original images; aerial scenes; main features: roads; 100-1000 lines per image. Trademarks and Logos Database [Flickner et al. '95]: over 1000 original images; scanned data; B&W, various resolutions
    57. 57. The TRECVID years (2001 to date). 2001: 11 hrs from BBC & OpenVideo Project. 2003: first collaborative ground truth annotation. 2005-2006: 170 hrs (Nov.'04 news in Arabic, Chinese, and English); high-level feature extraction (10). 2007-2009: 100 hrs from the Netherlands Institute for Sound and Vision (news magazine, science news, news reports, documentaries, educational programming, and archival video). 2010-2011: 600 hrs of MPEG-4 Creative Commons videos; high-level feature extraction (light=50, full=364)
    58. 58. The Trend: datasets are going Large-Scale (Web-Scale) ...slowly... Multimedia / Computer Vision researchers are tackling and experimenting with Large-Scale data: MIRflickr / NUSWide / ImageNet / MCG-WEBV. Issue: 1 research objective <-> 1 data corpus
    59. 59. Talk Outline: The scene / motivation; Social Events and Big Data (using social platforms for creating a corpus automatically); Social Event Detection (using social media for detecting events); Social Event Media Mining (enriching events' illustrations through Web mining); Conclusions
    60. 60. Event Detection by Temporal Analysis X. Liu, R. Troncy and B. Huet
    61. 61. Event Detection, Related Work. EventBurn.com: create summaries about given events (searching Twitter, Facebook, and Flickr). Firan et al. (CIKM'10): event categorization from social media data. Gao et al. (WWW'11): employing Twitter data to enrich event information
    62. 62. How to mine events from PhotoSet… Events ??
    63. 63. Observation: media are captured during events and shared
    64. 64. How fast are media uploaded?
    65. 65. Experiment Data: 9 attractive venues worldwide. Event ground truth obtained from the official agendas
        | Venue Name | NbEvents | NbUsers | NbPhotos |
        |---|---|---|---|
        | Melkweg | 266 | 352 | 6912 |
        | Koko | 155 | 151 | 3546 |
        | HMV Forum | 130 | 106 | 2650 |
        | 111 Minna Gallery | 105 | 24 | 1369 |
        | HMV Hammersmith Apollo | 96 | 79 | 2124 |
        | Circolo degli Artisti | 86 | 148 | 2571 |
        | Circolo Magnolia | 76 | 79 | 2190 |
        | Ancienne Belgique | 56 | 212 | 7831 |
        | Rotown | 49 | 204 | 3623 |
    66. 66. Detecting and Identifying Events. Our solution consists of 3 steps. Location Monitoring: finding the bounding box of venues. Temporal Analysis: detecting events by analyzing the uploading behavior over time. Event Topic Identification: identifying detected events' topics through tag analysis. [pipeline diagram: Location Monitoring → Temporal Analysis → Event Topic Identification → Results]
    67. 67. Event Detections: Region Monitoring
    68. 68. Venue Bounding Box Estimation
        INPUT: VenueName
        OUTPUT: BoundingBox
        PhotoSet ← []
        Center ← GetInfo(VenueName)
        EventSet ← GetPastEvents(VenueName)
        foreach event in EventSet do
            photos ← GetFlickrPhoto(event)
            PhotoSet.append(photos)
        end
        GeoSet ← GetGeoInfo(PhotoSet)
        Filter(GeoSet, Center, threshold = 1km)
        RETURN MinRect(GeoSet)
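The last two steps of the pseudocode (filtering geo-points within 1 km of the venue centre, then taking the minimal rectangle) might look as follows in Python. This is a sketch under assumptions: the slide does not say how distance is measured, so the haversine great-circle distance is assumed, and the coordinates below are made up for illustration rather than taken from the study.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def venue_bounding_box(center, geo_points, threshold_km=1.0):
    """Drop points farther than threshold_km from center, then return the
    minimal axis-aligned rectangle (min_lat, min_lon, max_lat, max_lon)."""
    kept = [
        (lat, lon)
        for lat, lon in geo_points
        if haversine_km(center[0], center[1], lat, lon) <= threshold_km
    ]
    if not kept:
        return None
    lats = [lat for lat, _ in kept]
    lons = [lon for _, lon in kept]
    return (min(lats), min(lons), max(lats), max(lons))

# Illustrative coordinates only (hypothetical venue centre and photo locations).
center = (52.3648, 4.8812)
points = [(52.3650, 4.8810), (52.3645, 4.8820), (52.3700, 4.9000)]
box = venue_bounding_box(center, points)
```

The third point lies roughly 1.4 km from the centre, so it is filtered out before the rectangle is computed.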
    69. 69. Venue Bounding Boxes (a selection) [map panels: Paradiso, Melkweg, HMV Hammersmith Apollo, Koko]
    70. 70. Analyzing the number of Photos [diagram: Location (Melkweg) and Date as inputs to a REST query]
    71. 71. Our Media DataSet: Flickr photos taken in May 2010, located in or named after one of the 9 selected locations. Number of photos:
        | Venue | Geo-tagged | Venue Name tagged | Overlap | Total |
        |---|---|---|---|---|
        | Koko | 372 | 2040 | 3 | 2409 |
        | Rotown | 90 | 273 | 1 | 362 |
        | Melkweg | 363 | 700 | 8 | 1055 |
        | HMV Forum | 184 | 412 | 0 | 596 |
        | 111 Minna Gallery | 937 | 3 | 0 | 940 |
        | Ancienne Belgique | 2206 | 288 | 2 | 2492 |
        | Circolo degli Artisti | 70 | 553 | 1 | 622 |
        | Circolo Magnolia | 95 | 236 | 0 | 331 |
        | Hammersmith Apollo | 287 | 84 | 0 | 371 |
        | Total | 4604 | 4589 | 15 | 9178 |
    72. 72. Analyzing the number of Photos [plot: number of photos taken in Melkweg (NL) in May 2010, per day from 10/05/01 to 10/05/31; peaks annotated "Events ??"]
    73. 73. Analyzing the number of Photo Owners [plot: number of photo owners in Melkweg in May 2010, per day from 10/05/01 to 10/05/31; peaks annotated "Events ??"]
    74. 74. Event Detection Approach: based on media upload activity at a given time and at a given location. Events can be detected by $\text{Events} = \arg_i\,(t_i > T)$, where $t_i = N_{photos} \times N_{owners}$ and $T$ is the threshold
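A minimal Python sketch of this thresholding step, assuming one activity score per day and a threshold taken as the mean or median of the scores (the two threshold choices reported in the results table later in the deck). The daily counts and the function name are hypothetical.

```python
from statistics import mean, median

def detect_events(daily_activity, threshold="median"):
    """Return the days whose score N_photos * N_owners exceeds the threshold.

    daily_activity maps a day to a (n_photos, n_owners) pair. Computing the
    threshold over the same period is an assumption of this sketch.
    """
    scores = {day: n_photos * n_owners for day, (n_photos, n_owners) in daily_activity.items()}
    values = list(scores.values())
    t = median(values) if threshold == "median" else mean(values)
    return sorted(day for day, score in scores.items() if score > t)

# Hypothetical upload activity for one venue.
activity = {
    "2010-05-01": (2, 1),   # score 2
    "2010-05-02": (40, 5),  # score 200
    "2010-05-03": (3, 2),   # score 6
    "2010-05-04": (30, 4),  # score 120
    "2010-05-05": (1, 1),   # score 1
}
event_days = detect_events(activity, threshold="median")
```

Multiplying the two counts favours days on which many distinct owners upload many photos, which damps the effect of a single prolific uploader.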
    75. 75. Event Topics Mining: keep the top N most frequent tags. Result: melkweg anouk amsterdam jemaine 2010 european flight flightoftheconchords conchords fotc mckenzie clement tour bret evelyn
    76. 76. Event Detection Example: Melkweg in May 2010 [plot: Number of photos * Number of photo owners]
    77. 77. Event Detection Example: 111 Minna Gallery in May 2010 [plot: Number of photos * Number of photo owners]
    78. 78. Event Detection Results: detection results under different conditions
        | Source | Threshold | True Predict | False Predict | F1 |
        |---|---|---|---|---|
        | Image | mean | 43 | 21 | 0.211 |
        | Image | median | 64 | 51 | 0.279 |
        | Owner | mean | 56 | 56 | 0.246 |
        | Owner | median | 58 | 62 | 0.251 |
        | Image*Owner | mean | 34 | 18 | 0.172 |
        | Image*Owner | median | 67 | 53 | 0.289 |
    79. 79. Event Detection Results: event detection statistics
        | Venue | Ground Truth | Detect | Matched | Precision | Recall | LastFM |
        |---|---|---|---|---|---|---|
        | Melkweg | 69 | 15 | 12 | 0.800 | 0.174 | 44 |
        | Koko | 20 | 15 | 8 | 0.533 | 0.400 | 0 |
        | HMV Forum | 14 | 12 | 9 | 0.750 | 0.643 | 14 |
        | 111 Minna Gallery | 23 | 15 | 2 | 0.133 | 0.087 | 0 |
        | Ancienne Belgique | 38 | 15 | 9 | 0.600 | 0.237 | 28 |
        | Rotown | 16 | 15 | 8 | 0.533 | 0.500 | 13 |
        | Circolo degli Artisti | 22 | 15 | 8 | 0.533 | 0.364 | 12 |
        | Circolo Magnolia | 25 | 3 | 1 | 0.333 | 0.040 | 11 |
        | Hammersmith Apollo | 15 | 15 | 10 | 0.667 | 0.667 | 14 |
        | In total | 242 | 120 | 67 | 0.558 | 0.277 | 136 |
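For reference, the precision and recall figures on this slide follow from the standard definitions: matched / detected and matched / ground truth. The short sketch below recomputes them for the Melkweg row (12 matched, 15 detected, 69 in the ground truth). The function name is hypothetical, and the F1 value it returns is the usual harmonic mean, which the slide itself does not report per venue.

```python
def precision_recall_f1(matched, detected, ground_truth):
    """Standard detection metrics from raw counts."""
    precision = matched / detected if detected else 0.0
    recall = matched / ground_truth if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Melkweg row: 12 matched events out of 15 detections and 69 ground-truth events.
p, r, f1 = precision_recall_f1(matched=12, detected=15, ground_truth=69)
```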
    80. 80. Events Detection at Melkweg
        | Venue | Detection Date | Detection Tags | Ground Truth Date | Ground Truth Title | LastFM ID | LastFM Title |
        |---|---|---|---|---|---|---|
        | melkweg | 03/05/2010 | parkwaydrive drive parkway | 03/05/2010 | Parkway Drive / Despised Icon / Winds Of Plague / The Warriors / 50 Lions | 1336473 | Parkway Drive |
        | melkweg | 02/05/2010 | flight flightoftheconchords conchords | 02/05/2010 | Flight Of The Conchords - UITVERKOCHT | 1439320 | Flight of the Conchords |
        | melkweg | 04/05/2010 | flightoftheconchords | 04/05/2010 | Flight Of The Conchords - UITVERKOCHT | 1439407 | Flight of the Conchords |
        | melkweg | 05/05/2010 | mayerhawtorne mayer hawthorne | 05/05/2010 | Mayer Hawthorne & The County | 1416229 | Mayer Hawthorne & The County |
        | melkweg | 11/05/2010 | bonobo | 11/05/2010 | Bonobo - UITVERKOCHT | 1398102 | Bonobo |
        | melkweg | 14/05/2010 | paulweller paul | 14/05/2010 | Paul Weller - UITVERKOCHT | 1406677 | Paul Weller |
        | melkweg | 18/05/2010 | brokensocialscene | 18/05/2010 | Broken Social Scene - UITVERKOCHT | 1334429 | Broken Social Scene |
        | melkweg | 19/05/2010 | mikestern richardbona | 19/05/2010 | Mike Stern band with special guest Richard Bona featuring Dave Weckl & Bob Malach | | |
        | melkweg | 25/05/2010 | beattimemelkweg | 24/05/2010 | Beattime - The Kika Edition | | |
        | melkweg | 26/05/2010 | beattime | 24/05/2010 | Beattime - The Kika Edition | | |
        | melkweg | 28/05/2010 | offcentre | 28/05/2010 | Off Centre - day 3 - night met Kode 9 / Falty DL / Gold Panda / Kelpe | | |
        | melkweg | 30/05/2010 | joannanewsom | 30/05/2010 | Joanna Newsom | 1425481 | Joanna Newsom |
    81. 81. Collage for illustration [figure: She & Him in Koko, 07/05/2010]
    82. 82. Conclusions on Event Detection: a novel approach for automatically detecting social events is presented; the key idea is to temporally monitor media shared on social web sites at a specific location (geo-localized photos); automatic, efficient social event detection and identification can be achieved
    83. 83. Enriching Events with Social Media X. Liu, R. Troncy and B. Huet
    84. 84. Searching for media about an event
    85. 85. Finding more media that illustrate an event: A. Compute the bounding box area of a venue; B. Retrieve all media geo-tagged in this area; C. Retrieve all media with a similar title; D. Prune the results with visual analysis; E. Extend the result set with all media from the same uploader
    86. 86. A. Bounding box of Nouveau Casino?
    87. 87. B. 74 photos taken in this area this day
    88. 88. C. 85 additional photos with a similar title
    89. 89. D. 6 photos after visual pruning
    90. 90. How is the visual pruning performed? Model dataset: photo event id + photo geo. Testing dataset: similar title. Low-level features used: color moments, Gabor texture, edge histogram. L1 distance on the k-Nearest Neighbors (k-NN). Threshold: min L1 distance between two model image pairs. Conservative approach
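One possible reading of this pruning rule, sketched in Python. Several points are assumptions because the slide is terse: the threshold is taken as the smallest L1 distance between any two model images, a candidate is kept when its nearest model image (k = 1) is at most that far away, and the feature vectors are tiny made-up examples rather than real color, texture or edge descriptors.

```python
from itertools import combinations

def l1(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def prune(model_features, candidate_features):
    """Return indices of candidates that are at least as close to a model
    image as the two closest model images are to each other."""
    # Conservative threshold: the minimum pairwise distance inside the model set.
    threshold = min(l1(a, b) for a, b in combinations(model_features, 2))
    kept = []
    for index, candidate in enumerate(candidate_features):
        nearest = min(l1(candidate, m) for m in model_features)  # k = 1 (assumed)
        if nearest <= threshold:
            kept.append(index)
    return kept

# Hypothetical 2-D features; real descriptors would be much longer.
model = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
candidates = [[0.5, 0.0], [5.0, 5.0], [0.0, 1.5]]
kept = prune(model, candidates)
```

Such a tight threshold rejects most candidates, which is consistent with the "conservative approach" noted on the slide.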
    91. 91. E. 66 photos after uploader heuristics [figure: photos by uploaders hellerpop, DustGraph / Stefan, cartoixa; labels: 13 photos, 46 photos]
    92. 92. Same process for videos: 1 video (id), 3 videos (geo), 26 videos (title). Visual pruning performed on key frames; Nb positive > 50%
    93. 93. How illustrated are the events? Query By ID Photos Videos (title) Videos (title+venue) Query By Geo Query By Title Visual Pruning Heuristic 5 74 (74) 85 (85) 6 (6) 66 (66) 1 3 (0) 23 (0) 13 (0) - 10 (10)  20 events  Model dataset: 785 photos  Testing dataset: 1766 photos (1573 positive, 193 negative)  Results: 439 photos (99% precision, 28% recall) 03/10/2012 B. HUET - HDR Presentation - 95
    94. 94. Conclusions: method for finding media illustrating scheduled events; search media with machine and geo tags; search media with title and normal tags; prune visually and retrieve all media from confirmed users. Challenge: do not necessarily trust the geo-coordinates
    95. 95. Event detection with latent topic model
    96. 96. Framework [diagram labels: Priori knowledge, Validating Data, Learning, Decision, Inference, Mass of data, Semantic Space (axes Ti, Tj), Cluster the documents by concepts, events distribution on semantic space]
    97. 97. Infer topics with LDA model. Parameters: α and β are the Dirichlet priors on the per-document topic distribution and the per-topic word distribution, respectively. Inputs: W, the observed words. Goals: learn the topics from large-scale data; estimate the topic distribution on new data
    98. 98. Estimate Event Distribution: estimate the validating data distribution (D); learn the event distribution; Dist: KL divergence; Decision rule: [formula not preserved in the transcript]
    99. 99. Threshold K: inference on the validating set [plot: #Docs versus K]. Decision: k = 0.3
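The decision rule itself did not survive in the transcript, so the following Python sketch is only one plausible reading: a document is attributed to an event when the KL divergence between its topic distribution and the learned event distribution is below the threshold k = 0.3. The direction of the divergence, the comparison sign, and the six-topic distributions are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics.

    eps guards against log(0) when q has zero-probability topics.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def is_event_document(doc_topics, event_topics, k=0.3):
    """Assumed rule: accept when the document is close to the event distribution."""
    return kl_divergence(doc_topics, event_topics) < k

# Hypothetical distributions over six LDA topics (topic 4 being concert-like).
event_topics = [0.05, 0.05, 0.05, 0.05, 0.75, 0.05]
concert_doc = [0.02, 0.03, 0.05, 0.05, 0.80, 0.05]
canal_doc = [0.05, 0.05, 0.05, 0.75, 0.05, 0.05]

concert_is_event = is_event_document(concert_doc, event_topics)
canal_is_event = is_event_document(canal_doc, event_topics)
```

With these made-up numbers the concert-like document falls well under the threshold while the canal-like document does not.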
    100. 100. Examples on Melkweg, Amsterdam: LDA topics in Amsterdam
        topic 0: bus 0.032322, berkhof 0.012423, autobus 0.011886, chair 0.011348, man 0.010810, gvb 0.010810
        topic 1: park 0.042738, green 0.030343, nature 0.027130, museum 0.024835, tree 0.018408, bird 0.017031
        topic 2: dutch 0.044336, building 0.037352, architecture 0.0359, centrum 0.028923, cityscape 0.024829, urban 0.017123
        topic 3: canal 0.059344, boat 0.029258, bridge 0.021845, river 0.015741, water 0.012253, …
        topic 4: event 0.069694, paradiso 0.067306, concert 0.039046, music 0.033872, live 0.032280, …
        topic 5: …
    101. 101. Decision on Photos on Melkweg
        - tags="holland netherlands amsterdam bike canal" title="201005037961-2"
        - tags="portrait music netherlands face rock concert live band may " title="Imelda May"
        - tags="amsterdam snoekbaars nikond90 thepowerofplots pjotrp" title="Amsterdam"
        - tags="longexposure canon lights nightshot tram leidseplein lijn2" title="Carnival Ride"
        - tags="twitter" title="Gewoon omdat we niet mogen fotograferen van She & Him @ Melkweg"
        - tags="tweeted" title="#eskimojoe soundcheck @melkweg sounds good!"
    102. 102. Detected Events on Melkweg
        Date       Event Title
        2010/5/3   Flight Of The Conchords - UITVERKOCHT
        2010/5/2   Parkway Drive / Despised Icon / Winds Of Plague / The Warriors / 50 Lions
        2010/5/4   Flight Of The Conchords - UITVERKOCHT
        2010/5/5   Mayer Hawthorne & The County
        2010/5/6   She & Him
        2010/5/7   Gert Vlok Nel
        2010/5/11  Bonobo - UITVERKOCHT
        2010/5/12  Gentleman & The Evolution - UITVERKOCHT
        2010/5/14  Paul Weller - UITVERKOCHT
        2010/5/18  Broken Social Scene - UITVERKOCHT
        2010/5/19  Mike Stern band with special guest Richard Bona featuring Dave Weckl & Bob Malach
        2010/5/24  Beattime - The Kika Edition
        2010/5/28  Off Centre - day 3 - night met Kode 9 / Falty DL / Gold Panda / Kelpe
        2010/5/30  Joanna Newsom
    103. 103. Results Summary (1): Social Media Data statistics over Event Detection
        Venue                 Total Post  Detection  Positive  Precision
        Melkweg                      355         42        32       0.76
        Koko                         724         95        44       0.46
        111 Minna Gallery            313         26        10       0.38
        Ancienne Belgique            496         32        19       0.59
        Rotown                       118          6         4       0.67
        Circolo degli Artisti        167         46        36       0.78
        HMV Forum                     97         18        15       0.83
        Total                       2270        265       160       0.60
    104. 104. Results Summary (2): Social Event Detection Performance
        Venue                 GroundTruth  Detection  Recall
        Melkweg                        27         14    0.52
        Koko                           15         12    0.80
        111 Minna Gallery               4          4    1.00
        Ancienne Belgique              19         10    0.53
        Rotown                          7          2    0.29
        Circolo degli Artisti          17         15    0.88
        HMV Forum                      10          6    0.60
        Total                          99         63    0.64
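The precision and recall columns of the two results summaries can be recomputed from the counts on the slides. The sketch below reads precision as Positive / Detection (posts) and recall as Detection / GroundTruth (events), an interpretation that is consistent with every reported value; the variable names are illustrative only.

```python
# (detected posts, positive posts) per venue, from Results Summary (1).
posts = {
    "Melkweg": (42, 32),
    "Koko": (95, 44),
    "111 Minna Gallery": (26, 10),
    "Ancienne Belgique": (32, 19),
    "Rotown": (6, 4),
    "Circolo degli Artisti": (46, 36),
    "HMV Forum": (18, 15),
}
# (ground-truth events, detected events) per venue, from Results Summary (2).
events = {
    "Melkweg": (27, 14),
    "Koko": (15, 12),
    "111 Minna Gallery": (4, 4),
    "Ancienne Belgique": (19, 10),
    "Rotown": (7, 2),
    "Circolo degli Artisti": (17, 15),
    "HMV Forum": (10, 6),
}


def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


precision = {v: ratio(pos, det) for v, (det, pos) in posts.items()}
recall = {v: ratio(found, truth) for v, (truth, found) in events.items()}

# Totals are micro-averages over all venues, not means of the per-venue values.
overall_precision = ratio(sum(pos for _, pos in posts.values()),
                          sum(det for det, _ in posts.values()))
overall_recall = ratio(sum(found for _, found in events.values()),
                       sum(truth for truth, _ in events.values()))

for venue in posts:
    print(f"{venue:<22} P={precision[venue]:.2f} R={recall[venue]:.2f}")
print(f"{'Total':<22} P={overall_precision:.2f} R={overall_recall:.2f}")
```

Micro-averaging explains why the total precision (0.60) sits closer to the value for Koko than to the simple mean of the seven venues: Koko contributes the most detected posts.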
    105. 105. Conclusion
        - A novel approach for detecting social events is presented
        - The idea is to mine the event distribution over concepts learned from large-scale data
        - Future work:
          - Exploring the multimodality of data (visual features, EXIF data…) for event detection
          - Modeling the topics efficiently (varying over time)
    106. 106. Multimedia Challenges
        - Gartner Group “Twelve Technologies for 2000 to 2010”
          - content-based retrieval and object recognition.
        - Ever increasing volume of multimedia data (internet + p2p, set-top-box, pda, mobile phone, etc…)
        - Cross-device access… (wireless or wired)
        - Access to data remains mostly text based
        - Data indexes remain mostly text based (file name and possibly a few user-supplied metadata)
        - Multimedia content analysis for automatic semantic metadata creation
    107. 107. Current Research Themes
        - Recognition and Retrieval
          - Retrieval in technical drawings [E.P.O.]
          - Video Object Analysis
          - Graph representation
          - EC projects: GMF for ITV (PT, IRT, JRS, ...)
        - Video Summarisation
          - Automated summary construction
          - Evaluation of summary's performance
          - EC projects: SPATION (Philips, Uni. of Brescia, ...)
    108. 108. 3W3S
        - World Wide Web Safe Surfing Services
        - The 3W3S Filter is composed of several components:
          - HTTP proxy / ICAP Server [WebWasher],
          - a URL filter [WebWasher],
          - a PICS filter [Thales],
          - Content Word filter [Eurecom].
        - Implementation of the content word filter as an
    109. 109. SPATION
        - Services Platforms and Applications for Transparent Information management in an in-hOme Network
        - Analysis of Audio-Visual Descriptors for Summary Construction [Eurecom]
        - 1 PhD on Automatic Video Summarisation
        - Image similarity measure
        - Patent: Using textual summaries for video summarisation [Philips/Eurecom co-inventor]
        [Diagram labels: TV, Photo, HiFi, Browser, Set-Top-Box, PDA (Remote Control)]
    110. 110. GMF4iTV
        - Generic Media Framework for Interactive Television
        - Semantic Video Object/Shot Classification
        - Interactive Personalization
        - 2 PhD students (Co-advisor)
        [Diagram labels: User Profile, Semantic Ontology, video, Annotations, Extra Annotations, content, MPEG7, Set-Top-Box, User, Extra Content]
    111. 111. Attributed Relational Graph for Semantic Video Object Retrieval
        - Salient object characteristics are obtained through Latent Semantic Analysis
        [Diagram labels: Video, Frame, Segmentation, Regions Adjacency Graph, Query Video, Frame, Object Graph Matching]
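The Regions Adjacency Graph step named in this diagram can be sketched in a few lines. The example assumes a segmented frame given as a 2D grid of integer region labels, 4-connectivity between pixels, and region area as the only node attribute; these choices, and the function name, are illustrative assumptions. The Latent Semantic Analysis and graph matching stages are not shown.

```python
def region_adjacency_graph(labels):
    """Build a region adjacency graph from a 2D grid of region labels.

    Returns (areas, edges): areas maps each region label to its pixel count,
    edges is a set of (smaller_label, larger_label) pairs for touching regions.
    """
    rows, cols = len(labels), len(labels[0])
    areas = {}
    edges = set()
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            areas[a] = areas.get(a, 0) + 1
            # Looking only right and down visits every 4-connected pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    return areas, edges


# Toy 3x3 segmentation with four regions.
frame_labels = [
    [1, 2, 2],
    [1, 2, 2],
    [3, 3, 4],
]
areas, edges = region_adjacency_graph(frame_labels)
```

In an attributed relational graph the nodes would carry richer descriptors (colour, texture, shape) and the edges spatial relations; area is used here only to keep the example short.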
    112. 112. Perspective Research Themes
        - Multimedia Data Analysis and Mining
          - Object segmentation (MPEG4),
          - Object representation and recognition (MPEG7).
        - Multimedia Information Organisation
          - Increasing amount of material,
          - Tools for easy and rapid navigation and search,
          - Internet (WWW, Peer2Peer), Home network.
        - Multimedia Document Understanding
    113. 113. Multimedia Data Analysis and Mining
        - Object segmentation
          - Static vs Dynamic… combined strategy
        - Object representation and recognition:
          - Objects vs Images
          - Multimedia vs "mono"media
          - Extensive use of MPEG7
        - Major objectives:
          - Use of structural constraints (Attributed Graph) to improve robustness to feature detection
        [Diagram labels: Image, Motion, Text, Speech, Sounds + Music]
    114. 114. Multimedia Information Organisation
        - Organisation for efficient retrieval:
          - Matching/recognition is slow; efficient data organisation and indexing is crucial
          - Distributed Index: Peer2Peer and home network
          - P2P scenario under investigation with PF.
        - Organisation for visualisation: Summarisation
          - Reduced in time or space, not in semantic content
          - Some "Semantic" information required
          - Combination of multiple cues in selection mechanism
        - Allow MPEG7 descriptors to be used as indexing/search terms.
    115. 115. Multimedia Document Understanding
        - Making sense of multimedia data:
          - Manual annotation of documents is time consuming
          - Automatic association of "low-level" descriptors with corresponding key-words (ontology)
        - Extension of the work on object recognition and retrieval
        - Bridging the
        [Diagram labels: Multimedia archives, MPEG7 descriptors, Outdoor/Indoor? Studio? M/F?]
    116. 116. Research Objective and Strategy
        - "Habilitation à diriger des recherches"
        - Funding for research
          - EC RTD Sixth Framework IST Key Action III Multimedia Content and Tools
          - RIAM: Numerisation, Indexation des contenus et gestion des flux audiovisuel.
          - Bourse CIFRE: Bouygues Telecom.
          - ACI Masse des données: Pidot [CNRS].
        - Prospective Industrial Partners
    117. 117. Conclusion
        - Research Themes
          - Address challenges of multimedia content analysis
          - Build upon existing expertise
          - Complementary with Eurecom's current research projects and themes
        - Research Strategy
          - Strengthen industrial contacts
          - Participate in project proposals
        - Strong Points
    118. 118. Questions?
    119. 119. Publications
        - Thesis:
          - Huet B., "Multimedia Content Understanding: Bringing Context to Content", HDR, University Nice Sophia-Antipolis, France, Oct 2012.
          - Huet B., "Object Recognition from Large Libraries of Line-Patterns", PhD Thesis, University of York, May 1999.
