MASKED RELATION
LEARNING FOR DEEPFAKE
DETECTION
Team – 20
1. Shaik Neha Sulthana (Y20CS165)
2. Shaik Majida (Y20CS163)
3. Pavaluri Poojitha (Y20CS138)
Guide : Mr. M. Naveen
ABSTRACT
DeepFake detection aims to differentiate falsified faces from
real ones. The proposed approach aims to improve DeepFake detection
by modeling the relationships between different parts of the face,
using a graph-like structure to represent the face and its
regions. We use a technique called "masked modeling" to reduce the
amount of redundant information: some of the relationships between
facial regions are masked, or ignored, so that the model focuses
on the most informative ones.
INTRODUCTION
 DeepFake videos are computer-generated videos that make it appear
as though someone is doing or saying something that they did not
actually do or say. DeepFake detection aims to differentiate falsified
faces from real ones.
 The majority of methods treat it as a binary classification problem
by focusing just on the regional differences in face forgery and local
artifacts, omitting the relationships between local regions. The
proposed approach aims to improve DeepFake detection by
modeling the relationships between different parts of the face, using
a graph-like structure to represent the face and its different
regions.
 However, too much redundant relational information can make the
method less effective, so we use a technique called "masked modeling"
to reduce the
amount of redundant information. This involves masking or ignoring
some of the relationships between different regions of the face to
focus on the most informative relationships.
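The idea of masking relations between facial regions can be sketched as follows. This is only an illustration under stated assumptions, not the proposed model: the region names, the uniform relation weights, the random choice of which relations to drop, and the `mask_relations` helper are all hypothetical.

```python
import random

def mask_relations(edges, mask_ratio, seed=0):
    """Randomly drop a fraction of edges from a region graph.

    edges: dict mapping (region_a, region_b) -> relation weight.
    Returns a new dict with int(len(edges) * mask_ratio) edges removed.
    """
    rng = random.Random(seed)
    keys = sorted(edges)
    n_mask = int(len(keys) * mask_ratio)
    masked = set(rng.sample(keys, n_mask))
    return {k: w for k, w in edges.items() if k not in masked}

# Hypothetical facial regions; every pair of regions starts out related.
regions = ["left_eye", "right_eye", "nose", "mouth"]
edges = {(a, b): 1.0 for i, a in enumerate(regions) for b in regions[i + 1:]}

kept = mask_relations(edges, mask_ratio=0.5)
print(len(edges), "relations before masking,", len(kept), "after")
```

With four regions there are six pairwise relations, so a ratio of 0.5 leaves three of them. A real system would presumably choose which relations to mask in a more informed way than uniform sampling.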
EXISTING TECHNIQUES
Previous techniques fall into three groups: DeepFake detection,
relation learning, and masked graph modeling.
DeepFake Detection
 Image forensic patterns and physiological signals.
 Deep Learning with Convolutional Neural Networks (CNNs).
 Frequency domain features.
 Temporal artifacts.
Relation Learning
 Handcrafted Features
 Deep Learning with Convolutional Neural Networks (CNNs)
 Adversarial Learning
 Feature Fusion
Masked graph modeling
 Masked Language Modeling (MLM)
 Masked Autoencoders (MAEs)
 Masked Graph Modeling
 Graph Autoencoders (GAEs)
 Pretext Tasks
DEEPFAKE DETECTION
 Image forensic patterns and physiological signals:
Authors: Lugstein, S. Baier, Y. Li
These early DeepFake detectors attempt to expose fake faces through
image forensic patterns and physiological signals.
Limitation: They fail to detect realistic face forgeries.
 Deep Learning with Convolutional Neural Networks(CNNs):
Authors: A. Rossler, S. Ren, M. Tan
Deep learning approaches are effective at learning the discriminative
characteristics of specific face manipulation algorithms.
Limitation: These methods are not robust to manipulations that they
have not seen during training, and their performance suffers in new
and challenging scenarios.
DEEPFAKE DETECTION(CONTD)
 Frequency domain features :
Authors: Q. Gu, Y. Rao, L. M. Binh, J. Li
These help classifiers capture fine-grained clues of face forgery.
Limitation: They mainly focus on local features and do not explore the
relationships between facial regions.
 Temporal artifacts:
Authors: S. Lyu, M. Pantic, Z. Sun
These include discontinuities in eye blinking, lip motion, and facial
landmarks, which serve as effective clues for DeepFake detection.
Limitation: They are limited in detecting more sophisticated
DeepFake manipulations.
RELATION LEARNING
 Handcrafted Features:
Authors: J. Li et al., A. Goel, B. Fernando, F. Keller
Handcrafted features such as LBP and HOG were used in the past for face
recognition and image classification.
Limitation: These features are limited in their ability to capture
complex patterns and relationships within the data.
 Deep Learning with Convolutional Neural Networks :
Authors: Z. Zha, L. Yu, Y. Li, and Y. Zhang
CNNs have been widely adopted in recent years for various visual
tasks, including DeepFake detection.
Limitation: CNNs are limited in their ability to model non-Euclidean
structured data such as 3D shapes or point clouds. Additionally, CNNs
require a large amount of labeled data for training and are
computationally expensive.
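As a concrete picture of the handcrafted LBP descriptor mentioned above, the sketch below computes the local binary pattern code of a single 3x3 grayscale patch. The clockwise bit ordering starting at the top-left pixel and the `>=` comparison are one common convention and are assumptions here; the slides do not specify a variant.

```python
def lbp_code(patch):
    """8-bit local binary pattern of a 3x3 grayscale patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        # Set the bit when the neighbour is at least as bright as the centre.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code

patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
print(lbp_code(patch))  # 26: only the neighbours 9, 7 and 8 reach the centre value 6
```

A full LBP feature would slide this over the image and histogram the codes; that step is omitted here.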
RELATION LEARNING(CONTD)
 Adversarial Learning:
Authors: Y. Rao, J. Ni, and H. Xie
Adversarial learning is a technique where a model is trained to
distinguish between real and fake samples, while another model is
trained to generate realistic fake samples.
Limitation: It is vulnerable to adversarial attacks and requires careful
tuning of hyperparameters.
 Feature Fusion:
Authors: S. Chen, T. Yao
Feature fusion techniques aim to combine multiple modalities of
information, such as RGB and frequency domains, to improve
DeepFake detection.
Limitation: Feature fusion is computationally expensive and requires
careful selection of fusion methods.
MASKED GRAPH MODELING
 Masked Language Modeling (MLM):
Authors: A. Wettig, J. Devlin
This method involves masking a portion of the tokens to improve the
performance of language models. Recent studies indicate that MLM is
also effective for computer vision tasks.
Limitation: This method is limited to language and computer vision
tasks and may not be applicable to other domains.
 Masked Autoencoders (MAEs):
Authors: K. He, X. Chen, S. Xie
MAEs are pioneering methods that learn visual representations by
reconstructing masked image patches. The rationale behind masked
modeling is information redundancy.
Limitation: This method is limited to learning visual representations
and may not be suitable for other tasks.
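The masking step shared by MLM and MAEs can be sketched as a random split of token or patch indices into a visible set and a masked set. This is a minimal illustration: the `random_mask` helper, the 16-item input, and the 75% ratio are assumptions chosen for the example, and the reconstruction model itself is not shown.

```python
import random

def random_mask(num_items, mask_ratio, seed=0):
    """Split item indices into (visible, masked) lists."""
    rng = random.Random(seed)
    idx = list(range(num_items))
    rng.shuffle(idx)
    n_mask = int(num_items * mask_ratio)
    return sorted(idx[n_mask:]), sorted(idx[:n_mask])

# A 4x4 grid of image patches (or 16 tokens) with a hypothetical 75% ratio.
visible, masked = random_mask(num_items=16, mask_ratio=0.75)
print("visible:", visible)
print("masked:", masked)
```

Only the visible items would be fed to the encoder, and the training target would be to reconstruct the masked ones.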
MASKED GRAPH MODELING(CONTD)
 Masked Graph Modeling:
Authors: F. Manessi and A. Rozza
This method involves masking vertices and edges of a graph during
training to improve performance. Most works on masked graph
modeling adopt self-supervised learning to predict the masked vertices
and edges of a graph.
Limitation: This method may require a large amount of training data to
achieve good performance and may be computationally expensive.
 Graph Autoencoders (GAEs):
Authors: A. Salehi, G. Cui, J. Zhou
GAEs are used to learn the structure of a graph by reconstructing it
from its latent representation.
Limitation: This method may not be suitable for tasks that require
more complex relationships between nodes in a graph.
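To illustrate how a graph can be reconstructed from a latent representation, the sketch below uses an inner-product decoder, a common choice in the graph autoencoder literature. Whether the cited works use exactly this form is not stated in the slides, and the latent vectors here are made-up values rather than the output of a trained encoder.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_adjacency(z):
    """Inner-product decoder: A_hat[i][j] = sigmoid(z_i . z_j)."""
    n = len(z)
    return [
        [sigmoid(sum(a * b for a, b in zip(z[i], z[j]))) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 2-D latent vectors for three graph nodes.
z = [[2.0, 0.0], [1.5, 0.5], [-2.0, 0.0]]
a_hat = decode_adjacency(z)
for row in a_hat:
    print([round(p, 2) for p in row])
```

Nodes 0 and 1 have similar latent vectors, so the decoder assigns their edge a probability above 0.5, while nodes 0 and 2 point in opposite directions and get a probability below 0.5.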
MASKED GRAPH MODELING(CONTD)
 Pretext Tasks:
Authors: W. Jin et al.
Pretext tasks are self-supervised learning tasks used to train neural
networks.
Limitation: They yield limited performance improvement and may not
generalize well to other tasks. They may also require a large amount
of training data and be computationally expensive.
PROPOSED TECHNIQUE – MASKED RELATION LEARNING
It consists of two main components:
 SpatioTemporal Attention (STA) module
 Masked Relation Learner (MRL).
SpatioTemporal Attention
 SpatioTemporal Attention is a technique used in machine learning
and computer vision to selectively focus on specific regions or
frames of a video sequence. It involves allocating more
computational resources to the relevant parts of the video and
ignoring the irrelevant parts. In the context of video analysis, spatio-
temporal attention is used to recognize and classify actions, gestures,
or events in a video sequence.
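The selective focus described above can be sketched as softmax attention over every (frame, region) position of a clip. This is a generic attention-pooling illustration, not the STA module itself: the feature values, the fixed `query` vector (which would be learned in practice), and the `attention_pool` helper are assumptions.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """features: dict (frame, region) -> feature vector; query: vector.

    Returns (weights, pooled): weights sum to 1 over all frame/region
    positions, and pooled is the attention-weighted feature average.
    """
    keys = sorted(features)
    scores = [sum(f * q for f, q in zip(features[k], query)) for k in keys]
    weights = softmax(scores)
    pooled = [
        sum(w * features[k][d] for w, k in zip(weights, keys))
        for d in range(len(query))
    ]
    return dict(zip(keys, weights)), pooled

# Hypothetical 2-D features for 2 frames x 2 regions.
features = {
    (0, "eyes"): [0.1, 0.0], (0, "mouth"): [2.0, 1.0],
    (1, "eyes"): [0.0, 0.2], (1, "mouth"): [1.8, 1.1],
}
weights, pooled = attention_pool(features, query=[1.0, 1.0])
print(max(weights, key=weights.get))  # (0, 'mouth') has the highest score
```

Positions with low scores still contribute a little to the pooled feature; they are down-weighted rather than skipped.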
SPATIOTEMPORAL ATTENTION(CONTD)
 How SpatioTemporal Attention is used in masked relation
learning for deepfake detection
 In the context of deepfake detection, spatio-temporal attention is
used to identify subtle changes in the visual appearance of the
face or the body that are indicative of manipulation. By focusing
on the most relevant regions of the image or video, the model
can better capture the underlying relationships between these
regions, which can help to distinguish between real and fake
content.
 Overall, spatio-temporal attention is a powerful tool for
improving the accuracy of deep learning models in detecting
deepfakes, as it allows the model to selectively focus on the most
relevant information in the video while ignoring irrelevant
distractions.
SPATIOTEMPORAL ATTENTION(CONTD)
 Advantages of SpatioTemporal Attention
 Robustness: SpatioTemporal Attention can make the model more
robust to occlusions and distortions in the video, by allowing it
to focus on the most informative regions of the image or video
and ignore the irrelevant parts.
 Selective focusing: SpatioTemporal Attention allows the model
to selectively focus on specific regions or frames of a video
sequence, which can improve the accuracy of the model's
predictions.
 Computational efficiency: SpatioTemporal Attention can help to
reduce the computational cost of processing large video datasets,
by allowing the model to focus on the most informative parts of
the video and avoid processing irrelevant parts.
MASKED RELATION LEARNER
 Masked relation learning is a technique used in deep learning to
learn the relationships between different parts of an image or video,
while being robust to occlusions and manipulations. The technique
involves masking different parts of the input and forcing the model
to learn the relationships between the remaining unmasked parts.
 Role of MRL in DeepFake detection:
 In the context of deepfake detection, masked relation learning
can help to identify subtle changes in the visual appearance of
the face or the body that are indicative of manipulation.
 The technique involves dividing the input into different regions,
such as the eyes, nose, and mouth, and then masking some of
these regions while leaving others unmasked. The model is then
trained to predict the relationships between the unmasked
regions, such as the relationship between the movement of the
eyes and the mouth.
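The region-masking step described above can be sketched as follows: some regions are hidden, and relations are computed only between the regions that remain. Cosine similarity stands in for the learned relation, and the feature vectors and the `relations_between_unmasked` helper are hypothetical; the actual learner is not specified at this level of detail in the slides.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def relations_between_unmasked(features, masked):
    """Pairwise cosine similarity between regions that are not masked."""
    visible = sorted(r for r in features if r not in masked)
    return {
        (a, b): cosine(features[a], features[b])
        for a, b in combinations(visible, 2)
    }

# Hypothetical per-region feature vectors for one frame.
features = {
    "eyes": [1.0, 0.0],
    "nose": [0.0, 1.0],
    "mouth": [1.0, 1.0],
}
relations = relations_between_unmasked(features, masked={"nose"})
print(relations)
```

With the nose masked, only the eyes-mouth relation survives, so the model has to rely on that pair alone.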
MASKED RELATION LEARNER(CONTD)
 Overall, masked relation learning is a powerful technique for
learning the relationships between different parts of an image or
video, and it has many practical applications in fields such as
computer vision, robotics, and natural language processing.
 Advantages of MRL:
 Robustness: Masked relation learning can help the model to be
more robust to manipulations and distortions in the input by
focusing on the relationships between the unmasked parts and
ignoring the masked parts.
 Scalability: Masked relation learning can help to reduce the
computational cost of processing large datasets, by focusing only
on the most informative parts of the input.
MASKED RELATION LEARNER(CONTD)
 Transferability: Masked relation learning can be applied to
different tasks and domains, making it a versatile technique for
machine learning and computer vision.
 Improved performance: By learning the relationships between
different parts of the input, masked relation learning can help to
improve the performance of deep learning models, particularly in
tasks such as object detection, segmentation, and action
recognition.
DATASETS
 FaceForensics++ (FF++) : a standardized dataset for DeepFake
detection. It consists of 1,000 pristine videos and 4,000 fake videos.
Four manipulation techniques are used to generate fake videos,
namely DeepFakes (DF), Face2Face (F2F), FaceSwap (FS), and
NeuralTextures (NT). To simulate the setting of social networks,
FF++ has high-quality (HQ) and low-quality (LQ) copies created by
light compression and heavy compression, respectively.
FACEFORENSICS++ (FF++)
DATASETS(CONTD)
 Celeb-DF : a large-scale deepfakes dataset. It contains 590 real
videos and 5,639 fake videos of celebrities. An undisclosed
improved synthesis algorithm is used to produce the face forgeries.
The realistic forgeries make DeepFake detection difficult.
DATASETS(CONTD)
 DeepFake Detection Challenge (DFDC) [46]: a public faceswap
video dataset. It contains 1,131 real videos and 4,119 fake videos.
Six advanced faceswap algorithms are used to craft fake videos. The
real videos are filmed in a variety of real-world scenes. Distractors
such as dark lighting, extreme poses, and occlusion make forgery
detection challenging.
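The video counts quoted for the three datasets imply a strong class imbalance, which the short calculation below makes explicit. The numbers are taken from the slides; the `fake_share` helper is just for illustration.

```python
def fake_share(real, fake):
    """Fraction of videos in a dataset that are fake."""
    return fake / (real + fake)

# (real, fake) video counts as quoted in the slides.
datasets = {
    "FF++": (1000, 4000),
    "Celeb-DF": (590, 5639),
    "DFDC": (1131, 4119),
}
for name, (real, fake) in datasets.items():
    print(f"{name}: {real + fake} videos, {fake_share(real, fake):.1%} fake")
```

Fake videos make up roughly 80% of FF++, about 90% of Celeb-DF, and about 78% of the DFDC subset, so plain accuracy can look good even for a detector that labels everything as fake.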
DEEPFAKE DETECTION CHALLENGE (DFDC)