Dense-captioning events in videos



Published in: Technology
  1. Dense-Captioning Events in Videos
  2. Dense-Captioning
  3. Highlight
     • Task: dense-captioning events.
     • Dataset: ActivityNet Captions.
     • Events range across multiple time scales and can even overlap.
     • The proposal module extends work on generating action proposals to multi-scale detection of events, and processes each video in a single forward pass to detect events as they occur.
     • Events in a given video are usually related to one another.
     • A captioning module uses the context from all the events found by the proposal module to generate each sentence.
  4. DenseCap: Fully Convolutional Localization Networks for Dense Captioning
  5. DenseCap: Fully Convolutional Localization Networks for Dense Captioning
  6. Method
     • V. Escorcia, F. C. Heilbron, J. C. Niebles, and B. Ghanem. DAPs: Deep action proposals for action understanding. ECCV, 2016.
     • J. Johnson, A. Karpathy, and L. Fei-Fei. DenseCap: Fully convolutional localization networks for dense captioning.
     • A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese. Social LSTM: Human trajectory prediction in crowded spaces.
     • Object-centric in images; action-centric in videos.
  7. Performance
  8. Discussion: Jointly Localizing and Describing Events for Dense Video Captioning
  9. Discussion: Joint Event Detection and Description in Continuous Video Streams
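
The Highlight slide notes that events range across multiple time scales and can overlap. One common way to quantify how much two event segments overlap is temporal intersection-over-union (tIoU). The sketch below is illustrative only and is not taken from the slides; the function name `temporal_iou` and the (start, end) representation in seconds are assumptions.

```python
def temporal_iou(a, b):
    """Temporal IoU of two (start, end) segments, given in seconds.

    Hypothetical helper for illustration; not part of the slides.
    """
    a_start, a_end = a
    b_start, b_end = b
    # Length of the shared interval, clamped at zero for disjoint segments.
    inter = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    # Union = sum of both lengths minus the shared part.
    union = (a_end - a_start) + (b_end - b_start) - inter
    return inter / union if union > 0 else 0.0


# A minute-long event that fully contains a ten-second event.
long_event = (0.0, 60.0)
short_event = (10.0, 20.0)
print(temporal_iou(long_event, short_event))
```

A short event nested inside a long one gets a low tIoU even though it is fully contained, which is one reason events at different time scales are treated as distinct rather than merged.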