SRI VENKATESWARA COLLEGE OF ENGINEERING AND TECHNOLOGY
CHITTOOR (AUTONOMOUS)
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
Title of the project
ADVANCED MACHINE LEARNING SOLUTIONS FOR DETECTING
FAKE ACCOUNT IN DIGITAL MEDIA
BATCH NO: B13
20781A04D1: P. LOKESH
20781A04G1: S. IRFAN
20781A04D5: P. VIJAY BHASKAR REDDY
20781A04G2: S. IRFAN
20781A04C7: P. MAHENDRA
Under the Guidance of
Mrs.M.SUMATHI,
ASSOCIATE PROFESSOR
CONTENTS
• OBJECTIVE/ABSTRACT
• INTRODUCTION
• LITERATURE SURVEY
• BASE PAPER
• PROPOSED WORK
ABSTRACT/OBJECTIVE
The proliferation of fake profiles and manipulated images on social media platforms poses a significant
challenge to online authenticity and trustworthiness. In this project, we propose a machine learning-
based approach to detect fake profiles and fake images with high accuracy. Our methodology involves
collecting a large dataset of both real and fake profiles/images, extracting relevant features, and training
a robust classification model. We leverage advanced techniques such as deep learning for image
analysis and natural language processing for profile text analysis. Additionally, we explore ensemble
learning methods to further improve detection performance. Through extensive experimentation and
evaluation on diverse datasets, we demonstrate the effectiveness of our approach in accurately
identifying fake profiles and images, thereby contributing to the mitigation of online misinformation
and deception.
LITERATURE SURVEY
Author | Paper Title | Methodology | Techniques and Tools Used | Drawbacks
Cheng et al. (2018) | “You Are How You Click: Clickstream Analysis for Sybil Detection” | Clickstream analysis | Graph-based techniques, machine learning classifiers (e.g., Random Forest) | Requires access to user click behavior data
Stringhini et al. (2010) | “Detecting Spammers on Social Networks” | Analysis of network structure | Graph-based techniques, machine learning classifiers (e.g., SVM) | Limited scalability for real-time detection
Lee et al. (2016) | “Detecting Suspicious Accounts in Online Social Networks” | Behavioral analysis | User behavioral features, machine learning classifiers (e.g., Random Forest, XGBoost) | Limited scalability for real-time detection
Wang et al. (2019) | “Detecting Fake Accounts in Online Social Networks at Scale” | Network analysis | Graph-based techniques, machine learning (e.g., CNN, LSTM), behavioral analysis | Limited to specific social network platforms; performance overhead for large-scale analysis
Author | Paper Title | Methodology | Techniques and Tools Used | Drawbacks
Kumar et al. (2017) | “Temporal Patterns of User Behavior on Twitter and Diurnal Variation of Social Spam” | Time-based analysis | Time-series analysis, machine learning classifiers (e.g., Decision Trees, Logistic Regression) | Limited to detecting temporal patterns of spamming behavior; may not generalize to all types of fake accounts
Cresci et al. (2015) | “Fame for Sale: Efficient Detection of Fake Twitter Followers” | Network analysis | Graph-based techniques, bot detection algorithms, statistical analysis | Requires access to ground-truth data for training; limited to detecting fake followers rather than fake accounts in general
Shao et al. (2018) | “FakeNewsNet: A Data Repository with News Content, Social Context and Dynamic Information for Studying Fake News on Social Media” | Content analysis | Natural language processing (NLP) techniques, deep learning (e.g., CNN, LSTM), social network analysis | Relies on labeled datasets; limited to fake news detection rather than fake accounts
EXISTING SYSTEM
Overview
The existing system is a machine learning-based approach to detect fake Twitter accounts.
Key Features:
1. Data Features: Utilizes a range of features like the number of abuse reports, rejected
friend requests, unaccepted friend requests, number of friends and followers, likes to
unknown accounts, and comments per day.
2. Machine Learning Models: Employs classifiers such as Naive Bayes, SVC (Support
Vector Classifier), and K-Nearest Neighbors.
3. Evaluation Metrics: Focuses on calculating accuracy and error rate of the classifiers.
4. Result Visualization: Provides a bar graph comparison of the performance of different
classifiers.
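The classification and evaluation steps of the existing system can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's implementation: the feature rows, the labels, and the helper names knn_predict and accuracy_and_error are hypothetical, and a hand-written K-Nearest Neighbors stands in for a library classifier.

```python
import math
from collections import Counter

# Hypothetical toy data: [abuse_reports, rejected_friend_requests, comments_per_day].
# Labels: 1 = fake account, 0 = normal account. All values are made up for illustration.
TRAIN_X = [[9, 40, 0.1], [7, 35, 0.3], [8, 50, 0.2], [0, 2, 4.0], [1, 1, 3.5], [0, 3, 5.0]]
TRAIN_Y = [1, 1, 1, 0, 0, 0]
TEST_X = [[8, 45, 0.2], [0, 2, 4.5], [6, 30, 0.4], [1, 4, 3.0]]
TEST_Y = [1, 0, 1, 0]


def knn_predict(train_x, train_y, row, k=3):
    """Label a row by majority vote among its k nearest training rows."""
    dists = sorted((math.dist(row, x), y) for x, y in zip(train_x, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy_and_error(y_true, y_pred):
    """Return (accuracy, error rate); the two always sum to 1."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    return accuracy, 1.0 - accuracy


predictions = [knn_predict(TRAIN_X, TRAIN_Y, row) for row in TEST_X]
accuracy, error_rate = accuracy_and_error(TEST_Y, predictions)
print(f"accuracy={accuracy:.2f} error_rate={error_rate:.2f}")
```

The same accuracy_and_error helper could be applied to the predictions of each classifier in turn, which gives the per-classifier numbers that a bar graph comparison would plot.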
Limitations
• Data Imbalance
• Feature Engineering Challenges
• Adversarial Attacks
• Generalization Issues
• Privacy Concerns
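The data imbalance limitation can be made concrete with a small sketch. It assumes a hypothetical test set of 95 normal and 5 fake accounts (these counts are invented for illustration) and shows that accuracy alone can look good while no fake account is caught, which is why precision, recall, and F1 are worth reporting alongside it.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class, 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical imbalanced test set: 95 normal accounts (0) and 5 fake accounts (1).
y_true = [0] * 95 + [1] * 5
# A trivial classifier that labels every account as normal.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here the trivial classifier scores 95% accuracy but its recall on fake accounts is zero.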
PROPOSED SYSTEM
Overview
An advanced system integrating more dynamic features and real-time analysis to enhance
the detection of fake Twitter accounts.
Enhancements
1. Dynamic Feature Analysis
2. Advanced Machine Learning Models
3. Real-Time Detection
4. Adaptive Learning
5. Explainability and Transparency
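One possible reading of the Real-Time Detection and Adaptive Learning enhancements is an online model that scores each account as it arrives and then updates itself once the true label is known. The sketch below is an assumption about how that could look, not the proposed system's design: the class name OnlineLogisticRegression, the two scaled features, and the event stream are all hypothetical.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression updated one example at a time with gradient steps."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Return the estimated probability that the account is fake."""
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """Take one gradient step on the log-loss for a single labelled example."""
        error = self.predict_proba(x) - y
        self.weights = [w - self.learning_rate * error * v
                        for w, v in zip(self.weights, x)]
        self.bias -= self.learning_rate * error


# Hypothetical labelled event stream; features are assumed to be scaled to [0, 1]:
# [abuse_report_rate, rejected_request_ratio]. Label 1 = fake, 0 = normal.
STREAM = [([0.9, 0.8], 1), ([0.1, 0.2], 0), ([0.8, 0.9], 1), ([0.2, 0.1], 0)]

model = OnlineLogisticRegression(n_features=2)
for _ in range(200):  # replay the small stream to mimic a longer one
    for features, label in STREAM:
        score = model.predict_proba(features)  # real-time score, before the label is known
        model.update(features, label)          # adapt once the label arrives

fake_like = model.predict_proba([0.85, 0.85])
normal_like = model.predict_proba([0.15, 0.15])
print(f"fake-like score={fake_like:.2f} normal-like score={normal_like:.2f}")
```

Because the model only keeps a weight vector and a bias, each update is cheap, which is the property that makes this style of learner a candidate for real-time use.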
BLOCK DIAGRAM
[Figure: block diagram. Blocks: DataSet, Preprocessing, Reduction, Training Phase (SVC, Naïve Bayes, K-Nearest Neighbour), Classification result (Fake ID / Normal ID)]
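The stages named in the block diagram (preprocessing, reduction, training, classification result) can be sketched end to end as below. This is a hedged, standard-library-only illustration: the order of the stages, the min-max scaling, the variance-based reduction, the hand-written Gaussian Naive Bayes, the feature columns, and all helper names are assumptions made for the example, not details taken from the project.

```python
import math
from statistics import mean, pvariance


def fit_min_max(rows):
    """Preprocessing: record the (min, max) of each feature column."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, ranges):
    """Scale each column to [0, 1] using the training ranges."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, ranges)] for row in rows]


def select_by_variance(rows, threshold=0.01):
    """Reduction: keep the indices of columns whose variance exceeds the threshold."""
    return [i for i, col in enumerate(zip(*rows)) if pvariance(col) > threshold]


def keep_columns(rows, keep):
    return [[row[i] for i in keep] for row in rows]


def fit_gaussian_nb(rows, labels):
    """Training phase: per-class prior, feature means, and smoothed variances."""
    model = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        cols = list(zip(*members))
        model[label] = {
            "prior": len(members) / len(rows),
            "means": [mean(c) for c in cols],
            "vars": [pvariance(c) + 1e-6 for c in cols],  # avoid zero variance
        }
    return model


def predict_gaussian_nb(model, row):
    """Classification result: the label with the highest log-posterior."""
    def log_posterior(stats):
        total = math.log(stats["prior"])
        for v, m, var in zip(row, stats["means"], stats["vars"]):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical data: [friends, followers, comments_per_day, constant_flag].
train_x = [[5, 3, 0.1, 1], [8, 2, 0.2, 1], [6, 4, 0.0, 1],
           [200, 180, 4.0, 1], [150, 220, 3.0, 1], [300, 250, 5.0, 1]]
train_y = ["Fake ID"] * 3 + ["Normal ID"] * 3
new_accounts = [[7, 3, 0.1, 1], [250, 200, 4.5, 1]]

ranges = fit_min_max(train_x)
scaled = apply_min_max(train_x, ranges)
keep = select_by_variance(scaled)  # the constant column is dropped here
model = fit_gaussian_nb(keep_columns(scaled, keep), train_y)
prepared = keep_columns(apply_min_max(new_accounts, ranges), keep)
results = [predict_gaussian_nb(model, row) for row in prepared]
print(keep, results)
```

The scaling ranges and the kept columns are fitted on the training data only and then reused for new accounts, so the classifier sees new rows in the same form it was trained on.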
Expected Improvements
• Enhanced accuracy and adaptability to new types of fake account behaviors.
• Real-time detection capability leading to quicker response and mitigation.
• Greater transparency and trust in the detection system.
The proposed system aims to address the limitations of the existing one by incorporating
advanced technologies and methodologies. The focus is on creating a more robust,
adaptable, and user-friendly system that keeps pace with the rapidly evolving landscape
of social media and online behaviors.
70% COMPLETION OUTPUT
ADVANTAGES
• Improved Online Trustworthiness
• Mitigation of Misinformation
• Enhanced User Security
• Protection of Brand Reputation
• Cost and Resource Savings
• Empowerment of Users
• Insights into Online Behavior
APPLICATIONS
• Social Media Platforms
• Online Marketplaces
• Cybersecurity
• Digital Forensics
• Brand Protection
• Content Moderation
• Journalism and Media
• Academic Research
Base paper:
Hichem Felouat, Junichi Yamagishi, Huy H. Nguyen (Member, IEEE), Trung-Nghia Le (Senior Member, IEEE), and Isao Echizen (Senior Member, IEEE).
Accepted 15 February 2024, date of publication 23 February 2024, date of current version 1 March 2024.
THANK YOU
