
Controllable image-to-video translation


Jiaxu Miao

Published in: Technology


  1. Controllable image-to-video translation: A case study on facial expression generation
  2. Introduction
     ■ Task description
       – Generate video clips of rich facial expressions from a single profile photo with a neutral expression
     ■ Difficulties
       – Image-to-video translation may seem ill-posed because the output has many more unknowns to fill in than the input provides
       – Humans are familiar with and sensitive to facial expressions
       – The face identity must be preserved in the generated video clips
  3. Introduction
     ■ Different people express emotions in similar manners
     ■ Expressions are often "unimodal" for a fixed type of emotion
     ■ The face in a profile photo draws most of a viewer's attention, so the quality of the generated background is less important
  4. Method
     ■ Problem formulation
       – Given an input image I ∈ ℝ^(H×W×3), where H and W are the height and width of the image, the goal is to generate a sequence of video frames {V(a) := f(I, a); a ∈ [0, 1]}, where f(I, a) denotes the model to be learned
     ■ Properties
       – Identity at a = 0: f(I, 0) = I
       – Smoothness: f(I, a) and f(I, a + Δa) should be visually similar when Δa is small
       – Peak: V(1) is the peak state of the expression
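The formulation above reduces video generation to sweeping a single intensity scalar a from 0 to 1 through a learned model f(I, a). A minimal sketch of that interface, with a toy stand-in for f (the real model is a trained network; `toy_model` and `sample_frames` are illustrative names, not from the slides):

```python
import numpy as np

def sample_frames(f, image, num_frames):
    """Render a clip by sweeping the expression intensity a over [0, 1]."""
    return [f(image, a) for a in np.linspace(0.0, 1.0, num_frames)]

# Toy stand-in for the learned model f(I, a): fade the neutral image
# toward an all-ones "peak" frame as a grows (illustration only).
def toy_model(image, a):
    return (1.0 - a) * image + a * np.ones_like(image)

I = np.zeros((4, 4, 3))                 # tiny stand-in "neutral" image
clip = sample_frames(toy_model, I, num_frames=5)
```

Note how the three listed properties fall out of the interface: the first frame (a = 0) reproduces the input, nearby a values give similar frames, and a = 1 is the peak state.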
  5. Method
  6. Method
     ■ Training loss
       – Adversarial loss
       – Temporal continuity loss
       – Facial landmark prediction loss Lₖ
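The slide names three loss terms but not their exact forms or weights. A hedged sketch of how they might combine, assuming L2 forms for the continuity and landmark terms and illustrative weights (`w_temp`, `w_lm` are assumptions, not values from the slides):

```python
import numpy as np

def temporal_continuity_loss(frames):
    # Penalize large changes between consecutive frames (assumed L2 form).
    return float(np.mean([np.mean((frames[t + 1] - frames[t]) ** 2)
                          for t in range(len(frames) - 1)]))

def landmark_loss(pred_landmarks, true_landmarks):
    # L_k: assumed mean squared error on predicted facial landmarks.
    return float(np.mean((pred_landmarks - true_landmarks) ** 2))

def total_loss(adv, frames, pred_lm, true_lm, w_temp=1.0, w_lm=1.0):
    # Weighted sum of the three terms; the weights are illustrative.
    return (adv
            + w_temp * temporal_continuity_loss(frames)
            + w_lm * landmark_loss(pred_lm, true_lm))
```

The adversarial term would come from a discriminator during GAN training; here it is passed in as a precomputed scalar to keep the sketch self-contained.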
  7. Method
     ■ Jointly learning the models of different types of facial expressions
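A common way to train one network jointly over several expression types is to condition it on an expression label. A minimal sketch, assuming one-hot conditioning; the expression names below are hypothetical, since the slides do not list the actual types:

```python
import numpy as np

# Hypothetical expression vocabulary (not taken from the slides).
EXPRESSIONS = ("happy", "sad", "angry", "surprised")

def expression_code(expr):
    # One-hot label that can be concatenated to the model input so a
    # single network is trained jointly across expression types.
    code = np.zeros(len(EXPRESSIONS))
    code[EXPRESSIONS.index(expr)] = 1.0
    return code
```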
  8. Experiments
     ■ Visualization
  9. Experiments
     ■ Analysis of temporal continuity
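One simple way to analyze temporal continuity in a generated clip is to plot the mean change between consecutive frames; a small, flat curve suggests a smooth clip. The slides do not specify their metric, so this is only one plausible choice:

```python
import numpy as np

def frame_differences(frames):
    # Mean absolute change between consecutive frames; the metric the
    # authors used is not specified, this is an illustrative stand-in.
    return [float(np.mean(np.abs(frames[t + 1] - frames[t])))
            for t in range(len(frames) - 1)]

# Example: a linear fade produces a perfectly flat difference curve.
fade = [a * np.ones((2, 2)) for a in np.linspace(0.0, 1.0, 5)]
diffs = frame_differences(fade)
```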
  10. Experiments