Controllable image-to-video translation: A case study on facial expression generation
Introduction
■ Task Description
– How to generate video clips of rich facial expressions from a single profile photo with a neutral expression
■ Difficulties
– Image-to-video translation might seem like an ill-posed problem because the output has many more unknowns to fill in than the input provides
– Humans are familiar with and sensitive to facial expressions, so even small artifacts are easily noticed
– The face identity must be preserved in the generated video clips
Introduction
■ Different people express emotions in similar manners
■ The expressions are often “unimodal” for a fixed type of emotion
■ The human face in a profile photo draws most of a viewer’s attention, making the quality of the generated background less important
Method
■ Problem formulation
■ Given an input image I ∈ R^{H×W×3}, where H and W are respectively the height and width of the image, our goal is to generate a sequence of video frames {V(a) := f(I, a); a ∈ [0, 1]}, where f(I, a) denotes the model to be learned and the scalar a controls the progression of the expression from neutral (a = 0) to its peak (a = 1).
■ Properties:
– When a = 0, f(I, 0) = I, i.e., the first frame reproduces the input image
– Smoothness: f(I, a) and f(I, a + Δa) should be visually similar when Δa is small
– V(1) should be the peak state of the expression
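For concreteness, here is a minimal sketch of this interface in PyTorch; the generator f is hypothetical (standing in for the learned model) and the frame count is an arbitrary choice, not something the formulation specifies.

    import torch

    # Minimal sketch of the formulation above, assuming a learned
    # generator `f` (hypothetical) that maps (image, a) to a frame V(a).
    def generate_clip(f, image, num_frames=16):
        """Sample V(a) := f(I, a) at evenly spaced a in [0, 1]."""
        frames = [f(image, a) for a in torch.linspace(0.0, 1.0, num_frames)]
        # By the properties above: a = 0 reproduces the input image,
        # nearby values of a yield similar frames, and a = 1 is the peak.
        return torch.stack(frames)  # shape: (num_frames, 3, H, W)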
Method
■ Training loss
– Adversarial loss
– Temporal continuity loss
– Facial landmark prediction loss L_k
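A hedged sketch of how these three terms could combine into one generator objective is shown below; the networks G, D, and landmark_net, the pair of control values a and a_next, and the weights lambda_tc and lambda_lm are illustrative assumptions, not the slide's exact formulation.

    import torch
    import torch.nn.functional as F

    # Sketch of a combined objective with the three terms named above.
    # G (generator), D (discriminator), landmark_net (landmark
    # regressor), and the lambda_* weights are assumptions.
    def generator_loss(G, D, landmark_net, image, a, a_next,
                       target_landmarks, lambda_tc=1.0, lambda_lm=1.0):
        frame = G(image, a)
        frame_next = G(image, a_next)  # a_next = a + a small step Δa

        # Adversarial loss: the generated frame should fool D.
        logits = D(frame)
        adv = F.binary_cross_entropy_with_logits(
            logits, torch.ones_like(logits))

        # Temporal continuity: nearby control values -> similar frames.
        tc = F.l1_loss(frame, frame_next)

        # Facial landmark prediction L_k: landmarks detected on the
        # generated frame should match the targets for intensity a.
        lm = F.mse_loss(landmark_net(frame), target_landmarks)

        return adv + lambda_tc * tc + lambda_lm * lm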
Method
■ Jointly learning the models of different types of facial expressions
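One plausible way to share a single model across expression types, sketched below, is to condition the generator on a one-hot expression label; the slide does not specify the sharing mechanism, so this architecture is an assumption.

    import torch
    import torch.nn as nn

    # Sketch of joint learning over several expression types by
    # conditioning one generator on a one-hot label e (an assumption).
    class ConditionalGenerator(nn.Module):
        def __init__(self, num_expressions=6, feat=64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, feat, 4, stride=2, padding=1), nn.ReLU())
            # Fuse image features with the control scalar a and label e.
            self.fuse = nn.Conv2d(feat + 1 + num_expressions, feat,
                                  3, padding=1)
            self.decoder = nn.Sequential(
                nn.ReLU(),
                nn.ConvTranspose2d(feat, 3, 4, stride=2, padding=1),
                nn.Tanh())

        def forward(self, image, a, expr_onehot):
            # image: (B, 3, H, W); a: (B,); expr_onehot: (B, num_expressions)
            h = self.encoder(image)
            b, _, hh, ww = h.shape
            a_map = a.view(b, 1, 1, 1).expand(b, 1, hh, ww)
            e_map = expr_onehot.view(b, -1, 1, 1).expand(
                b, expr_onehot.size(1), hh, ww)
            h = self.fuse(torch.cat([h, a_map, e_map], dim=1))
            return self.decoder(h)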
Experiments
■ Visualization
Experiments
■ Analysis on temporal continuity
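The slide does not state which metric this analysis uses; one simple, commonly used proxy for temporal continuity is the mean absolute difference between consecutive frames (smaller means smoother), so the sketch below is an assumption along those lines.

    import torch

    # Mean absolute difference between consecutive frames of a clip,
    # a simple smoothness proxy (an assumption; the slide's exact
    # analysis is not given). frames: tensor of shape (T, 3, H, W).
    def mean_frame_difference(frames):
        return (frames[1:] - frames[:-1]).abs().mean().item()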