MINOR PROJECT REPORT
By: Druv Gera
DENOISING DIFFUSION PROBABILISTIC MODEL
AGENDA
Introduction
Methodology
Diffusion Models
Training Results
Results and Discussions
References
INTRODUCTION
• Evolution of Generative Models:
  • Rapid progress in recent years.
  • Capable of generating human-like language, synthetic images, and diverse audio.
• Challenges with Current Techniques:
  • Existing computer vision techniques are trained to predict a fixed set of predetermined object categories.
  • This restricted form of supervision limits their generality and usability.
• Alternative Approach - Learning from Raw Text:
  • Learn directly from raw text describing the image.
  • Overcomes the limitations of predefined labels.
INTRODUCTION (CONT.)
• Introduction to CLIP:
  • Contrastive models like CLIP are a key inspiration.
  • CLIP learns robust image representations that capture both semantics and style.
• Project Objectives:
  • A two-stage model is proposed:
    • A prior that generates a CLIP image embedding from a given text caption.
    • A decoder that generates an image conditioned on that CLIP image embedding.
METHODOLOGY
• CLIP as a Representation Learner:
  • CLIP (Contrastive Language-Image Pre-training) is recognized for robust image representations.
  • Desirable properties include robustness and applicability to a variety of tasks.
• Two-Stage Model: Prior and Decoder:
  • The proposed model generates images in two stages.
  • Prior: generates a CLIP image embedding from a given text caption.
  • Decoder: generates an image conditioned on the CLIP image embedding.
• Leveraging CLIP Image Embeddings:
  • CLIP image embeddings act as a bridge between textual descriptions and image generation.
  • This improves generality and adaptability in the image generation process.
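The two-stage structure described on this slide can be sketched as a simple composition of two functions. This is only an illustration of the data flow: `generate`, `toy_prior`, and `toy_decoder` are hypothetical names, and the toy stages are hand-written stand-ins for what would be trained networks.

```python
from typing import Callable, List

Embedding = List[float]
Image = List[List[float]]  # toy grayscale image: rows of pixel values


def generate(caption: str,
             prior: Callable[[str], Embedding],
             decoder: Callable[[Embedding], Image]) -> Image:
    """Stage 1: caption -> image embedding. Stage 2: embedding -> image."""
    image_embedding = prior(caption)
    return decoder(image_embedding)


# Hypothetical stand-ins so the sketch runs; real stages would be trained models.
def toy_prior(caption: str) -> Embedding:
    # Maps a caption to a fixed-size vector of crude character statistics.
    codes = [ord(ch) for ch in caption] or [0]
    return [len(codes) / 100.0,
            min(codes) / 255.0,
            max(codes) / 255.0,
            sum(codes) / (255.0 * len(codes))]


def toy_decoder(embedding: Embedding, size: int = 2) -> Image:
    # Tiles the embedding values into a size x size "image".
    return [[embedding[(r * size + c) % len(embedding)] for c in range(size)]
            for r in range(size)]


image = generate("a red sports car", toy_prior, toy_decoder)
```

The point of the split is that the decoder never sees the caption directly; everything it needs must pass through the image embedding.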
METHODOLOGY
• Combining CLIP and Diffusion Models:
  • CLIP and diffusion models are combined for improved image synthesis.
  • CLIP's joint text-image embedding space makes language-guided image manipulation possible.
• Language-Guided Image Manipulations:
  • CLIP is applied to make language-guided modifications to generated images.
  • The joint embedding space is explored for finer control over these manipulations.
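One way to realize such language-guided edits, following the "text diff" idea in Ramesh et al. (arXiv:2204.06125), is to rotate an image embedding toward the normalized difference of two text embeddings using spherical interpolation. The sketch below assumes embeddings are plain lists of floats; the function names and the toy 3-dimensional vectors are illustrative, not taken from the slides or from any CLIP library.

```python
import math
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def slerp(a: Vector, b: Vector, theta: float) -> Vector:
    """Spherical interpolation between unit vectors a and b, theta in [0, 1].

    Assumes a and b are not antiparallel (that case is not handled here).
    """
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)
    if omega < 1e-8:  # nearly parallel: nothing to rotate toward
        return list(a)
    wa = math.sin((1.0 - theta) * omega) / math.sin(omega)
    wb = math.sin(theta * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]


def text_guided_edit(image_emb: Vector, source_text_emb: Vector,
                     target_text_emb: Vector, theta: float) -> Vector:
    # Direction in embedding space that moves "source caption" to "target caption".
    direction = normalize([t - s for t, s in zip(target_text_emb, source_text_emb)])
    return slerp(normalize(image_emb), direction, theta)


# Toy 3-d embeddings (made up for illustration).
image_emb = [1.0, 0.0, 0.0]
source_text = [0.2, 0.1, 0.0]
target_text = [0.2, 0.9, 0.0]
edited = text_guided_edit(image_emb, source_text, target_text, theta=0.5)
```

With `theta=0` the image embedding is returned unchanged; larger values move it further along the text direction while keeping it on the unit sphere. A decoder would then turn the edited embedding back into an image.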
DIFFUSION MODELS
• Introduction to Diffusion Models:
  • Diffusion models are a state-of-the-art family of deep generative models.
  • They have broken the long-standing dominance of GANs in challenging image synthesis tasks.
• Role of Denoising Diffusion Probabilistic Models (DDPMs):
  • DDPMs use two Markov chains: forward and reverse.
  • The forward chain perturbs data to noise, while the reverse chain converts noise back to data.
• Three Predominant Formulations:
  • DDPMs, score-based generative models (SGMs), and stochastic differential equations (Score SDEs).
  • Each formulation offers its own features and benefits.
DIFFUSION MODELS
• Training the Diffusion Model on the Stanford Cars Dataset:
  • Dataset: Stanford Cars, with approximately 8000 images in the training set.
  • Training involves progressively destroying structure in the data by injecting noise.
• Forward and Reverse Markov Chains in Diffusion Models:
  • The forward process injects noise into the data until all structure is lost.
  • The reverse process gradually removes noise by running a learnable Markov chain.
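The forward process described above can be sketched in a few lines. The slides do not state the noise schedule, so this assumes the linear schedule from the original DDPM paper (1000 steps, beta from 1e-4 to 0.02) and uses a flat list of floats in place of an image; all function names are illustrative.

```python
import math
import random
from typing import List


def linear_beta_schedule(T: int, beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> List[float]:
    # Assumed schedule; the slides do not say which one was used.
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + step * t for t in range(T)]


def forward_chain(x0: List[float], betas: List[float],
                  rng: random.Random) -> List[List[float]]:
    """Apply x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise at every step."""
    trajectory = [list(x0)]
    x = list(x0)
    for beta in betas:
        x = [math.sqrt(1.0 - beta) * xi + math.sqrt(beta) * rng.gauss(0.0, 1.0)
             for xi in x]
        trajectory.append(x)
    return trajectory


rng = random.Random(0)
betas = linear_beta_schedule(T=1000)
x0 = [1.0] * 64  # stand-in for a flattened image
trajectory = forward_chain(x0, betas, rng)

# Fraction of the original signal that survives all T steps.
signal_kept = math.prod(math.sqrt(1.0 - b) for b in betas)
```

Under this schedule `signal_kept` is well below 1%, which is what "all structure is lost" means in practice: the final state is essentially a sample of standard Gaussian noise.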
DENOISING DIFFUSION PROBABILISTIC MODELS (DDPM)
A denoising diffusion probabilistic model (DDPM) makes use of two
Markov chains: a forward chain that perturbs data to noise, and a
reverse chain that converts noise back to data.
New data points are subsequently generated by first sampling a
random vector from the prior distribution, followed by ancestral
sampling through the reverse Markov chain.
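Ancestral sampling can be sketched as a loop that starts from Gaussian noise and applies one reverse step per timestep. The sketch below uses the reverse-step update from the original DDPM paper with variance beta_t (one common choice, assumed here). In place of a trained network it uses a hypothetical closed-form noise predictor that is exact when the "dataset" is a single known point, which makes the behaviour easy to check.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def cumulative_alpha_bars(betas: Sequence[float]) -> List[float]:
    """alpha_bar_t = product of (1 - beta_s) for all s up to t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def ancestral_sample(eps_model: Callable[[Vector, int], Vector],
                     betas: Sequence[float], dim: int,
                     rng: random.Random) -> Vector:
    alpha_bars = cumulative_alpha_bars(betas)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # sample from the prior N(0, I)
    for t in reversed(range(len(betas))):  # t is a 0-based step index
        eps = eps_model(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(1.0 - betas[t])
        mean = [scale * (xi - coef * ei) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # assumed variance choice
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added on the final step
    return x


def make_point_mass_eps_model(target: Vector, betas: Sequence[float]):
    """Exact noise predictor when every training example equals `target` (toy case)."""
    alpha_bars = cumulative_alpha_bars(betas)

    def eps_model(x: Vector, t: int) -> Vector:
        s = math.sqrt(alpha_bars[t])
        n = math.sqrt(1.0 - alpha_bars[t])
        return [(xi - s * ci) / n for xi, ci in zip(x, target)]

    return eps_model


T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
target = [0.5, -1.0, 2.0]
sample = ancestral_sample(make_point_mass_eps_model(target, betas),
                          betas, dim=3, rng=random.Random(0))
```

In this toy setting the chain returns `target` up to floating-point error, because the final reverse step with an exact noise estimate reconstructs the data point. With a learned network, `eps_model` would be the trained denoiser evaluated on the current sample and timestep.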
Using the chain rule of probability and the Markov property, the joint
distribution of x_1, x_2, ..., x_T conditioned on x_0, denoted
q(x_1, ..., x_T | x_0), factorizes into

$$q(x_1, \dots, x_T \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}).$$
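For concreteness, the usual DDPM choice (from the original DDPM paper by Ho et al.; the slide itself does not spell it out) is a Gaussian transition kernel with a variance schedule \beta_t in (0, 1). That choice gives a closed-form marginal, so x_t can be drawn from x_0 in one step:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\right),
\qquad
\bar\alpha_t = \prod_{s=1}^{t} (1-\beta_s).
```

As \bar\alpha_T approaches 0, q(x_T | x_0) approaches a standard Gaussian, which is why the reverse chain can start from pure noise.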
TRAINING
We trained the diffusion model to generate images of cars, following the
theory described above. The dataset was Stanford Cars, whose training set
contains around 8000 images.
TRAINING RESULTS
• Training the Diffusion Model on the Stanford Cars Dataset:
  • Used the Stanford Cars dataset, with around 8000 images in the training set.
  • Goal: generate images of cars, in line with the project's objectives.
• Results from the Forward Markov Chain:
  • The forward process injects noise into the data progressively.
• Reverse Markov Chain Using a Simple UNet Architecture:
  • A simple UNet architecture parameterizes the reverse Markov chain.
  • Trained on the dataset for 100 epochs to optimize the learnable transition kernel.
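The slides do not state the training loss. A common choice for DDPMs, assumed here, is the simplified noise-prediction objective: sample a timestep, noise the image in closed form, and penalize the squared error between the true and predicted noise. The sketch below computes that loss for one example; `zero_model` and `oracle_model` are hypothetical stand-ins for the UNet, and no optimizer is included.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def cumulative_alpha_bars(betas: Sequence[float]) -> List[float]:
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def q_sample(x0: Vector, t: int, alpha_bars: Sequence[float], eps: Vector) -> Vector:
    """Closed-form forward sample: sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    s = math.sqrt(alpha_bars[t])
    n = math.sqrt(1.0 - alpha_bars[t])
    return [s * xi + n * ei for xi, ei in zip(x0, eps)]


def simple_loss(eps_model: Callable[[Vector, int], Vector], x0: Vector,
                alpha_bars: Sequence[float], rng: random.Random) -> float:
    t = rng.randrange(len(alpha_bars))         # uniformly sampled timestep
    eps = [rng.gauss(0.0, 1.0) for _ in x0]    # the noise the model must predict
    x_t = q_sample(x0, t, alpha_bars, eps)
    pred = eps_model(x_t, t)
    return sum((p - e) ** 2 for p, e in zip(pred, eps)) / len(x0)


T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]  # assumed schedule
alpha_bars = cumulative_alpha_bars(betas)
x0 = [0.3, -0.7, 1.2, 0.0]  # stand-in for one flattened training image


def zero_model(x_t: Vector, t: int) -> Vector:
    # "Untrained" baseline: always predicts zero noise.
    return [0.0] * len(x_t)


def oracle_model(x_t: Vector, t: int) -> Vector:
    # Cheats by knowing x0; shows the loss a perfect predictor would reach.
    s = math.sqrt(alpha_bars[t])
    n = math.sqrt(1.0 - alpha_bars[t])
    return [(xi - s * ci) / n for xi, ci in zip(x_t, x0)]


untrained = simple_loss(zero_model, x0, alpha_bars, random.Random(0))
oracle = simple_loss(oracle_model, x0, alpha_bars, random.Random(0))
```

In an actual training run, the UNet would take the place of `eps_model`, and each epoch would average this loss over mini-batches of training images before a gradient step.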
RESULTS
• Explored diffusion models comprehensively.
• Showed diffusion models, as likelihood-based models, outperform state-of-the-art GANs.
• Highlighted the stationary training objective's role in achieving superior sample quality.
• Introduced an improved architecture, successfully applied to unconditional image generation.
• Presented a classifier guidance technique for extending quality to class-conditional tasks.
REFERENCES
[1] Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen. Hierarchical Text-Conditional Image Generation with CLIP Latents. arXiv:2204.06125
[2] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever. Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020
THANK YOU
DRUV GERA
COMPUTER SCIENCE
DELHI TECHNICAL
UNIVERSITY
