Temporal Generative Adversarial Nets
with Singular Value Clipping
Small Paper Reading Group 2018-2-2nd
Image and Video Pattern Recognition Lab, Hyeongmin Lee
Research Areas
• Video Generation
 Frame Interpolation
 Frame Extrapolation(Future Frame Prediction)
 Image Animation
• Video-Sound Fusion
 Sound of Pixel
 Cocktail Party Effect
• Simple Video Processing
 Image Processing → Video
Common thread: Video, Time Axis
GAN
Generator Discriminator
GAN with Temporal Data?
Image
Video
The spatial axes and the temporal axis are all treated identically!!
Temporal GAN
• 3D Convolution
• Tensor layout: (channel, time, height, width)
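To make the (channel, time, height, width) layout concrete, here is a minimal pure-Python sketch of what a 3D convolution does to such a tensor. The helper names `conv3d_output_shape` and `conv3d_single`, the clip size, and the kernel, stride, and padding values are all assumptions chosen for illustration; they are not taken from the slides or from the paper.

```python
def conv3d_output_shape(in_shape, out_channels, kernel,
                        stride=(1, 1, 1), padding=(0, 0, 0)):
    """Output shape of a 3D convolution for a (channel, time, height, width) input."""
    _, t, h, w = in_shape
    dims = []
    # Time, height and width all follow the same size formula.
    for size, k, s, p in zip((t, h, w), kernel, stride, padding):
        dims.append((size + 2 * p - k) // s + 1)
    return (out_channels, *dims)


def conv3d_single(volume, kernel):
    """Naive valid-mode 3D cross-correlation of one (time, height, width)
    volume with one kernel, both given as nested lists."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel slides over time exactly as it slides over space.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical 16-frame RGB clip at 64x64 pixels.
video_shape = (3, 16, 64, 64)
out_shape = conv3d_output_shape(video_shape, out_channels=64,
                                kernel=(4, 4, 4), stride=(2, 2, 2),
                                padding=(1, 1, 1))
print(out_shape)  # (64, 8, 32, 32)
```

With these assumed settings the time axis is halved along with height and width, which shows that a plain 3D convolution handles the temporal axis in the same way as the spatial ones.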
Future Research Ideas
• Frame Interpolation
I have read many papers, but have not yet been able to turn them into a concrete idea.
• Text-Guided Image Animation
I became interested in approaches that constrain the model's creativity and degrees of freedom while steering it in the direction a human wants.
• Motion Deblurring using Frame Interpolation
Video is a natural dataset, labeled by nature itself.
The idea: raise the video frame rate, artificially synthesize motion-blurred images from the frames, and then train a model on the inverse mapping.
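The motion-deblurring idea above could be sketched as follows. The slides do not say how the blurred image is formed; this sketch assumes it is the average of a window of consecutive high-frame-rate frames, with the middle frame kept as the sharp target. The function name `synthesize_blur_pairs`, the `window` parameter, and the toy clip are hypothetical.

```python
def synthesize_blur_pairs(frames, window=5):
    """Build (blurred, sharp) training pairs from a high-frame-rate clip.

    frames: list of grayscale frames, each a 2D list of pixel intensities.
    window: number of consecutive frames averaged into one blurred image
            (assumed odd, so that a middle frame exists).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    pairs = []
    # Step through the clip in non-overlapping windows.
    for start in range(0, len(frames) - window + 1, window):
        chunk = frames[start:start + window]
        height, width = len(chunk[0]), len(chunk[0][0])
        # Averaging over time approximates a long exposure (assumption).
        blurred = [[sum(f[y][x] for f in chunk) / window for x in range(width)]
                   for y in range(height)]
        sharp = chunk[window // 2]  # middle frame serves as the target
        pairs.append((blurred, sharp))
    return pairs


# Toy clip: a bright dot moving one pixel per frame along a 1x5 row.
frames = [[[1.0 if x == i else 0.0 for x in range(5)]] for i in range(5)]
pairs = synthesize_blur_pairs(frames, window=5)
blurred, sharp = pairs[0]
print(blurred)  # [[0.2, 0.2, 0.2, 0.2, 0.2]]
print(sharp)    # [[0.0, 0.0, 1.0, 0.0, 0.0]]
```

A deblurring model would then be trained to map each `blurred` image back to its `sharp` counterpart, so the supervision comes from the video itself, with no manual labeling.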
