Serena is a Ph.D. student in the Stanford Vision Lab, advised by Prof. Fei-Fei Li. Her research interests are in computer vision, machine learning, and deep learning. She is particularly interested in the areas of video understanding, human action recognition, and healthcare applications. She interned at Facebook AI Research in Summer 2016.
Before starting her Ph.D., she received a B.S. in Electrical Engineering in 2010 and an M.S. in Electrical Engineering in 2013, both from Stanford. She also worked as a software engineer at Rockmelt (acquired by Yahoo) from 2009 to 2011.
Towards Scaling Video Understanding:
The quantity of video data is vast, yet our capabilities for visual recognition and understanding in videos lag significantly behind those for images. In this talk, I will first discuss some of the challenges of scale in labeling, modeling, and inference behind this gap. I will then present some of our recent work towards addressing these challenges, in particular using reinforcement learning-based formulations to tackle efficient inference in videos and learning classifiers from noisy web search results. Finally, I will conclude with a discussion of promising future directions towards scaling video understanding.