Deep learning features and
similarity of movies based on their
video content
Summer Camp - Show Max - Lukáš Lopatovský
Assignment
● Deep learning allows extracting useful features from video
frames. Your task is to apply new deep learning frameworks
to extract features from video frames of selected movies
available on the ShowMax streaming platform.
● Goals:
- Extract deep features from video frames.
- Explore similar movies in the space of latent features and
adjust the extraction process in order to create clusters of
video assets (e.g. TV episodes).
Residual Networks
● Make it possible to build deeper (convolutional)
neural networks. (State-of-the-art method for image
recognition.)
● To make deeper networks possible, residual nets
use a simple trick: each block adds its input
back to its output, so the layers only have to
learn the residual and the previously known
information is not lost.
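
The residual trick above can be illustrated with a minimal sketch in plain Python. This is not the Torch code used in the project: the element-wise `layer` function and its `weight` and `bias` parameters are hypothetical stand-ins for a learned transformation F, chosen only to show the y = F(x) + x structure.

```python
# Minimal sketch of a residual (skip) connection in plain Python.
# `layer` is a hypothetical stand-in for a learned transformation F.

def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]

def layer(v, weight, bias):
    """Toy element-wise layer: F(x) = relu(weight * x + bias)."""
    return relu([weight * x + bias for x in v])

def residual_block(v, weight, bias):
    """Return F(x) + x: the block's input is added back to its output."""
    fx = layer(v, weight, bias)
    return [f + x for f, x in zip(fx, v)]

x = [1.0, -2.0, 3.0]
# With zero weights the learned part F(x) is zero, so the block
# passes its input through unchanged (identity mapping).
identity_out = residual_block(x, weight=0.0, bias=0.0)
```

Because the input is always carried forward, a block that has learned nothing still behaves as the identity, which is one way to see why stacking many such blocks does not discard earlier information.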
Torch
- Efficient Tensor library (like NumPy) with an
efficient CUDA backend
- Neural Networks package: build arbitrary
acyclic computation graphs with automatic
differentiation
- Fast CUDA and CPU backends
- Good community and industry support: several
hundred community-built and maintained
packages
● Torch example of ‘nn’ library
What has been done
● The movies were classified using an arbitrary
number of frames.
● To classify movies, we used the FB-ResNet network
pre-trained on ImageNet, as well as networks trained
and fine-tuned on our own datasets.
● To detect objects in an image, we classified
the whole image and also various crops of it
to get more accurate predictions. (Cropping
gave better results.)
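
The crop-based classification above can be sketched as follows. The slides do not say which crops were taken or how their predictions were combined, so this sketch assumes five crops (four corners plus center) and a plain average of the class scores. The `classify` callable is hypothetical and stands in for the network.

```python
# Sketch of multi-crop classification, assuming five crops and score averaging.
# `classify` is a hypothetical callable: image -> list of class scores.

def five_crops(image, size):
    """Return the four corner crops and the center crop of a 2-D image."""
    h, w = len(image), len(image[0])
    top_c, left_c = (h - size) // 2, (w - size) // 2
    origins = [(0, 0), (0, w - size), (h - size, 0),
               (h - size, w - size), (top_c, left_c)]
    return [[row[l:l + size] for row in image[t:t + size]]
            for t, l in origins]

def average_scores(score_lists):
    """Average several score vectors class by class."""
    n = len(score_lists)
    return [sum(col) / n for col in zip(*score_lists)]

def classify_with_crops(image, size, classify):
    """Classify the whole image and its crops, then average the scores."""
    views = [image] + five_crops(image, size)
    return average_scores([classify(v) for v in views])

def toy_classify(view):
    """Hypothetical two-class scorer based on mean pixel value."""
    pixels = [p for row in view for p in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]

image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
scores = classify_with_crops(image, 2, toy_classify)
```

A real network would need every view resized to its fixed input size. The toy scorer here accepts any size so that the example stays self-contained.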
Classification output
● Classifying the frames of a movie produces a
special file (.res) that contains all the
important data. It can later be post-processed
according to the user's needs:
- To create an object detection .srt file.
- To get various cumulative classification results.
- To trace the appearance of an object on the
timeline.
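
As an illustration of the .srt post-processing step, here is a minimal sketch. The layout of the .res file is not described in the slides, so the input is assumed to be already parsed into a time-sorted list of `(time_in_seconds, label)` pairs, one per classified frame. The function names and the `frame_step` parameter are hypothetical.

```python
# Sketch: turn per-frame labels into SubRip (.srt) subtitle text.
# The (time, label) input format is an assumption, not the real .res layout.

def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(frames, frame_step):
    """Merge consecutive frames with the same label into one subtitle cue."""
    cues = []  # each cue is [start, end, label]
    for t, label in frames:
        last = cues[-1] if cues else None
        if last and last[2] == label and abs(last[1] - t) < 1e-9:
            last[1] = t + frame_step  # extend the running cue
        else:
            cues.append([t, t + frame_step, label])
    blocks = [f"{i}\n{srt_time(a)} --> {srt_time(b)}\n{label}\n"
              for i, (a, b, label) in enumerate(cues, 1)]
    return "\n".join(blocks)

frames = [(0.0, "car"), (1.0, "car"), (2.0, "dog")]
srt_text = to_srt(frames, frame_step=1.0)
```

Merging runs of identical labels keeps the subtitle file short and also gives the start and end of each appearance, which is the information needed to trace an object on the timeline.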
Object classification example
False positive
Own datasets
● The network was successfully trained and fine-tuned
from the ResNet network.
● However, it showed some problems caused by an
improper dataset:
- Some categories contain many irrelevant pictures in the second half
of the search results. (Special case: “The doctor House”)
- The style of the images in the search results is often very different
from the style found in the movies. (kitchen, car)
- Movies mostly contain images full of people, so the categories
containing people produce false positive predictions. (cinema, theater)
Classification example
Object detection
Next step
● After discussion in the company, the
programs were transformed into an easily usable
form.
● The feature vectors from the classification will be
used to find similarities among movies.
The results will be compared to the existing
algorithms and, if successful, incorporated into
the current recommendation system.
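
One possible way to compare movies by their feature vectors is sketched below. The slides do not specify the similarity measure or how frame features are aggregated, so this sketch assumes that per-frame features are averaged into one vector per movie and that movies are compared by cosine similarity. All function names and movie titles are hypothetical.

```python
# Sketch: movie similarity from averaged frame features (assumed approach).
import math

def mean_vector(frame_features):
    """Average per-frame feature vectors into one vector per movie."""
    n = len(frame_features)
    return [sum(col) / n for col in zip(*frame_features)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(query, movies):
    """Return the other titles, most similar to `query` first."""
    q = movies[query]
    scored = [(cosine_similarity(q, v), title)
              for title, v in movies.items() if title != query]
    return [title for _, title in sorted(scored, reverse=True)]

# Hypothetical titles and two-dimensional features, for illustration only.
movies = {
    "movie_a": mean_vector([[1.0, 0.0], [1.0, 0.0]]),
    "movie_b": mean_vector([[0.8, 0.2], [1.0, 0.0]]),
    "movie_c": mean_vector([[0.0, 1.0], [0.0, 1.0]]),
}
ranking = most_similar("movie_a", movies)
```

Cosine similarity ignores vector length, so movies with different numbers of sampled frames remain comparable after averaging.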
