Detecting Fake Profiles On Online Matrimony
Vaibhav Garg Dr. Ponnurangam Kumaraguru (Chair)
linkedin.com/in/vaibhav-garg-0a708899
facebook.com/in/vaibhav.garg.104203
@rk_check
Thesis Committee
◆ Dr. Arun Balaji Buduru, IIIT Delhi
◆ Dr. Siddhartha Asthana, United Health Group (Optum)
◆ Dr. Ponnurangam Kumaraguru, IIIT Delhi
Core Thesis Question
How can fake profiles on online matrimony platforms be detected automatically?
Demo
* Due to the company's privacy policy, we cannot give a demo on the company's actual portal.
Outline
◆ About Online Matrimony
◆ About the Data
◆ Characteristics of a fake profile
◆ Using only Behaviour Trends
◆ Using Behaviour, Edit and Profile Information
◆ Incorporating Community features
◆ Feature Engineering: Proposed Full-length feature vector
◆ Final Results
◆ Conclusion
[Figure: user flow on the portal: Register, Suggested, View Profile, Start Conversation]
About the Data
◆ To study the problem, we took India's leading matrimony website as our use case.
◆ Ground truth: 5,40,737 genuine profiles and a comparatively very small number of fake profiles.
◆ Categorical attributes in the data: age, body type, caste, city, country, education, height, income, manglik, marital status, mother tongue, occupation, religion.
Categorical Data
Attribute        Number of Categories   Example Categories
Caste            470                    Hindu: Arora, Hindu: Aggarwal, Hindu: Brahmin, etc.
Height           37                     5'0, 5'1, 5'2, 5'3, etc.
Income           25                     Rs. 0-1 Lakh, Rs. 1-2 Lakh, etc.
Mother Tongue    42                     Telugu, Bengali, Hindi-Delhi, etc.
Occupation       69                     Doctor, Analyst, IT-Engineer, etc.
Religion         10                     Hindu, Muslim, Christian, etc.
Body Type        4                      Slim, Average, Athletic, Heavy
Country          214                    India, Afghanistan, Australia, etc.
City             3683                   Delhi, UP, Ahmedabad, etc.
Manglik          2                      Manglik, Non-Manglik
Marital Status   4                      Never Married, Divorcee, Separated, Widowed
Education        53                     B.A, B.Com, B.Tech, etc.
Behaviour Heterogeneity
[Figure: a genuine profile and a fake profile, each shown with its connected category nodes (C1 to C8 and C1 to C3)]
Inconsistent Edits
Edit Done After 4 Days of Registration
Profile Inconsistency
Behavioural Trend for Caste Attribute
Experiment on 100 fake and 100 genuine profiles belonging to the Aggarwal community
Behavioural Trend for Marital Status Attribute
Experiment on 100 fake and 100 genuine profiles belonging to the Non-Married community
Static Windows
The user's first 8 days of activity (Day 0 to Day 7) are split into 12-hour windows, two per day, 16 windows in total:
0th window: first 12 hours of Day 0
1st window: last 12 hours of Day 0
...
14th window: first 12 hours of Day 7
15th window: last 12 hours of Day 7
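The static 12-hour windows can be sketched in a few lines. This is only an illustration, assuming that Day 0 starts at the moment of registration and that windows are numbered from 0; the function names `static_window_index` and `activity_per_window` are hypothetical and do not come from the deck.

```python
from datetime import datetime, timedelta

WINDOW_HOURS = 12          # window length taken from the slide
OBSERVATION_DAYS = 8       # Day 0 to Day 7
NUM_WINDOWS = OBSERVATION_DAYS * 24 // WINDOW_HOURS  # 16 windows, indexed 0..15


def static_window_index(registered_at, event_at):
    """Return the 0-based 12-hour window of an event, or None if the event
    falls outside the first 8 days.

    Assumption (not stated in the deck): Day 0 starts at registration time.
    """
    elapsed = (event_at - registered_at).total_seconds()
    if elapsed < 0:
        return None
    index = int(elapsed // (WINDOW_HOURS * 3600))
    return index if index < NUM_WINDOWS else None


def activity_per_window(registered_at, event_times):
    """Count how many events fall into each static window."""
    counts = [0] * NUM_WINDOWS
    for event_at in event_times:
        index = static_window_index(registered_at, event_at)
        if index is not None:
            counts[index] += 1
    return counts
```

For example, an event one hour after registration lands in window 0, and an event 13 hours after registration lands in window 1.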
Static Windows and Feature Generation
Which Model to Choose?
Model Architecture
[Figure: model architecture, from input features to output]
Offline Results on Behaviour Features
Confusion Matrix   Predicted Fake   Predicted Clean
Actual Fake        2953             852
Actual Clean       168              17799
The above results were obtained on 3805 fake profiles and 17967 clean profiles.
Drawback: a user must have been on the portal for 8 days before this approach can scrutinize them.
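Precision, recall and accuracy are not reported for this confusion matrix, but they follow directly from its four cells. The sketch below treats "fake" as the positive class, which is an assumption consistent with the rest of the deck; the function name is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Derive precision, recall and accuracy from confusion-matrix counts.

    tp: fake profiles predicted fake      fn: fake profiles predicted clean
    fp: clean profiles predicted fake     tn: clean profiles predicted clean
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, accuracy


# Counts copied from the confusion matrix above.
precision, recall, accuracy = classification_metrics(tp=2953, fn=852, fp=168, tn=17799)
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```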
Live Results: True Positives
Live Results: False Negatives
Edit and profile features need to be incorporated!
Edit Summary for Mother Tongue Attribute
Experiment on 100 fake and 100 genuine profiles that registered with the Hindi-UP category
Edit Summary for Income Attribute
Experiment on 100 fake and 100 genuine profiles that registered with the Rs 5-7.5 Lakh category
Concept of Dynamic Windows
User's active lifetime on the portal = T seconds
User's total initiates = N
Number of windows selected = W
0th window: time period of the first N/W initiates
1st window: time period of the next N/W initiates
...
Last window: time period of the last N/W initiates
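A minimal sketch of the dynamic-window idea: the initiates are split into W consecutive groups of about N/W each, and each window's time period is the span covered by its group. The deck does not say how a remainder is handled when N is not divisible by W; giving the extra initiates to the earliest windows is an assumption, and both function names are hypothetical.

```python
def split_into_dynamic_windows(initiate_times, num_windows):
    """Split initiate timestamps (seconds) into num_windows consecutive groups
    of roughly N / W initiates each.

    Assumption: when N is not divisible by W, the first (N mod W) windows
    receive one extra initiate.
    """
    times = sorted(initiate_times)
    base, extra = divmod(len(times), num_windows)
    windows, start = [], 0
    for w in range(num_windows):
        size = base + (1 if w < extra else 0)
        windows.append(times[start:start + size])
        start += size
    return windows


def window_durations(windows):
    """Time period of each window: last initiate minus first initiate."""
    return [(w[-1] - w[0]) if w else 0 for w in windows]
```

Unlike the static 12-hour windows, these windows have equal activity counts but unequal durations, so a burst of initiates produces a short window instead of one overloaded window.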
Feature Designing
◆ Profile features: a one-hot vector of the profile attributes.
◆ Behaviour features: in each dynamic time window, each feature stores the proportion of initiates sent to a particular category of an attribute.
◆ Edit features: in each dynamic time window, each feature stores the proportion of time the user has spent in a particular category of an attribute.
◆ Other raw features: for each window, we also store the total interests sent and the time duration of that window.
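The four feature types listed above can be sketched for a single attribute and a single window as follows. The input shapes (a list of recipient categories, a list of `(category, seconds)` pairs) and all function names are assumptions made for illustration; the deck does not specify its data layout.

```python
from collections import Counter


def one_hot(value, categories):
    """Profile feature: 1.0 at the position of the user's category, else 0.0."""
    return [1.0 if value == c else 0.0 for c in categories]


def behaviour_features(recipient_categories, categories):
    """Proportion of a window's initiates sent to each category."""
    total = len(recipient_categories)
    if total == 0:
        return [0.0] * len(categories)
    counts = Counter(recipient_categories)
    return [counts[c] / total for c in categories]


def edit_features(category_durations, categories):
    """Proportion of a window's time the user's profile spent in each category.

    category_durations: list of (category, seconds) pairs (assumed layout).
    """
    total = sum(seconds for _, seconds in category_durations)
    if total == 0:
        return [0.0] * len(categories)
    spent = Counter()
    for category, seconds in category_durations:
        spent[category] += seconds
    return [spent[c] / total for c in categories]


def raw_features(recipient_categories, window_seconds):
    """Other raw features: total interests sent and window duration."""
    return [float(len(recipient_categories)), float(window_seconds)]
```

With the body-type categories Slim, Average, Athletic and Heavy, a window containing initiates to Slim, Slim, Heavy and Average gives behaviour features of 0.5, 0.25, 0.0 and 0.25.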
Feature Designing
[Figure: the feature vectors of the 0th window through the Nth window are concatenated (+) into a single vector]
Experimenting with the number of dynamic windows
Number of Windows   Precision   Recall   Accuracy
5 windows           0.170       0.510    0.8830
4 windows           0.192       0.635    0.8891
3 windows           0.230       0.780    0.8977
2 windows           0.242       0.804    0.8975
1 window            0.266       0.866    0.8972
Feature Selection on Best Model
Method                             Precision   Recall   Accuracy
Best Model                         0.266       0.866    0.8972
Best Model + Feature Selection     0.269       0.894    0.9083
Criterion used = [(Entropy for fake) - (Entropy for clean)] / (Entropy for fake)
Precision is still low!
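The selection criterion can be computed per feature as sketched below. The deck does not say how entropy is estimated or which score cut-off was used; here Shannon entropy (base 2) over discrete feature values is assumed, so continuous features would need to be binned first. Function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of discrete values."""
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def selection_score(fake_values, clean_values):
    """(entropy for fake - entropy for clean) / (entropy for fake).

    Returns None when the fake-class entropy is zero, since the ratio is
    undefined in that case.
    """
    h_fake = shannon_entropy(fake_values)
    h_clean = shannon_entropy(clean_values)
    if h_fake == 0:
        return None
    return (h_fake - h_clean) / h_fake
```

A feature whose values are spread widely among fake profiles but concentrated among clean profiles scores close to 1 under this criterion.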
Affinity Features along with Behaviour Features
◆ The affinity score between two categories i and j is the likelihood that a person in category i sends interests to a user in category j.
◆ Combined with behaviour features, affinity scores compare how a user is expected to behave with how he/she actually behaves on the platform.
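One way to estimate such affinity scores is as empirical conditional frequencies over observed interests. This is an assumption, since the deck gives no formula; the absolute-difference gap used below to compare expected and actual behaviour is likewise only an illustrative choice, and all names are hypothetical.

```python
from collections import Counter, defaultdict


def affinity_scores(interests):
    """Estimate affinity[i][j] as the fraction of interests sent by users in
    category i that went to users in category j.

    interests: iterable of (sender_category, receiver_category) pairs.
    """
    pair_counts = defaultdict(Counter)
    for sender, receiver in interests:
        pair_counts[sender][receiver] += 1
    scores = {}
    for sender, counter in pair_counts.items():
        total = sum(counter.values())
        scores[sender] = {receiver: n / total for receiver, n in counter.items()}
    return scores


def expected_vs_actual_gap(expected, actual):
    """Sum of absolute differences between the expected distribution (an
    affinity row) and a user's actual proportions of initiates per category."""
    categories = set(expected) | set(actual)
    return sum(abs(expected.get(c, 0.0) - actual.get(c, 0.0)) for c in categories)
```

A user whose initiates follow the typical pattern of their own category gets a gap near 0, while a user who contacts categories their peers rarely contact gets a larger gap.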
Affinity Features
Proposed Full-Length Feature Vector
Profile Features + Behaviour Features in Time Windows + Affinity Features + Edit Features in Time Windows
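Assembling the full-length vector amounts to concatenating the four groups in the order shown on the slide. The sketch assumes each group is already a list of numbers and that the windowed groups are lists of per-window lists; the function name is hypothetical.

```python
from itertools import chain


def full_feature_vector(profile, behaviour_windows, affinity, edit_windows):
    """Concatenate profile, windowed behaviour, affinity and windowed edit
    features into one flat vector, in the order shown on the slide."""
    return list(chain(
        profile,
        chain.from_iterable(behaviour_windows),
        affinity,
        chain.from_iterable(edit_windows),
    ))
```

Every user must use the same number of windows and the same category order, otherwise positions in the vector would not be comparable across users.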
Final Model Architecture
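The architecture figure is not recoverable from this transcript. As a hedged illustration of how an autoencoder is commonly used for this kind of detection, the toy sketch below trains a tied-weight linear autoencoder with one hidden unit on genuine samples only and flags inputs whose reconstruction error exceeds the largest error seen in training. The model size, training on genuine profiles only, the thresholding rule and the two-dimensional data are all assumptions, not the deck's actual model.

```python
def reconstruct(w, x):
    """Encode x to one hidden value (w . x), then decode with the same weights."""
    h = sum(wi * xi for wi, xi in zip(w, x))
    return [h * wi for wi in w]


def reconstruction_error(w, x):
    """Squared error between x and its reconstruction."""
    return sum((xh - xi) ** 2 for xh, xi in zip(reconstruct(w, x), x))


def train(samples, w, learning_rate=0.1, epochs=500):
    """Batch gradient descent on the mean squared reconstruction error."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in samples:
            h = sum(wi * xi for wi, xi in zip(w, x))
            residual = [h * wi - xi for wi, xi in zip(w, x)]
            r_dot_w = sum(ri * wi for ri, wi in zip(residual, w))
            for j in range(len(w)):
                # d/dw_j of ||h*w - x||^2 with h = w . x
                grad[j] += 2.0 * (r_dot_w * x[j] + h * residual[j])
        w = [wi - learning_rate * g / len(samples) for wi, g in zip(w, grad)]
    return w


# Toy "genuine" data: both features always move together.
genuine = [[t / 10, t / 10] for t in range(1, 11)]
weights = train(genuine, w=[0.5, 0.1])

# Assumed rule: flag anything reconstructed worse than every training sample.
threshold = max(reconstruction_error(weights, x) for x in genuine) + 1e-6


def is_flagged(x):
    return reconstruction_error(weights, x) > threshold
```

In this toy setting a point such as `[1.0, -1.0]`, whose features disagree with the pattern learned from genuine samples, reconstructs poorly and is flagged, while `[0.55, 0.55]` is not.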
Final Results
Method                               Precision   Recall   Accuracy
Proposed Features + Autoencoder      0.341       0.902    0.9176
The product team had asked for 25% precision at 60% recall!
Conclusion
◆ We first studied how behaviour, profile and edit patterns differ between genuine and fake users.
◆ We incorporated these characteristics as features computed over dynamic time windows.
◆ We then trained an autoencoder model to detect fake profiles on online matrimony.
Real World Impact
[Figures: Week 1, Week 2, Week 3, Week 4]
Limitations and Future Work
◆ More training samples for the autoencoder could lead to better generalisation.
◆ We detected fake profiles using categorical attributes only. Text spamming can be explored in future work.
Acknowledgement
◆ Committee Members
◆ Hunny, Adhish from InfoEdge India Ltd.
◆ Members of Precog family
◆ Family and friends
Thanks!
vaibhav17064@iiitd.ac.in
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 



Detecting Fake Profiles On Online Matrimony

  • 1. Detecting Fake Profiles On Online Matrimony. Vaibhav Garg. Dr. Ponnurangam Kumaraguru (Chair). linkedin.com/in/vaibhav-garg-0a708899, facebook.com/in/vaibhav.garg.104203, @rk_check
  • 2. Thesis Committee ◆ Dr. Arun Balaji Buduru, IIIT Delhi ◆ Dr. Siddhartha Asthana, United Health Group (Optum) ◆ Dr. Ponnurangam Kumaraguru, IIIT Delhi
  • 4. Core Thesis Question: How can fake profiles on online matrimony portals be detected automatically?
  • 5. Demo. Note: owing to the company's privacy policy, we cannot give a demo on the company's actual portal.
  • 6. Outline ◆ About Online Matrimony ◆ About the Data ◆ Characteristics of a fake profile ◆ Using only Behaviour Trends ◆ Using Behaviour, Edit and Profile Information ◆ Incorporating Community features ◆ Feature Engineering: Proposed Full-length feature vector ◆ Final Results ◆ Conclusion
  • 9. About the Data ◆ To study the problem, we chose India's leading matrimony website as a use case. ◆ Ground truth: 5,40,737 genuine profiles and very few fake profiles. ◆ Categorical attributes in the data: age, body type, caste, city, country, education, height, income, manglik, marital status, mother tongue, occupation, religion.
  • 10. Categorical Data
      Attribute | Number of Categories | Example Categories
      Caste | 470 | Hindu: Arora, Hindu: Aggarwal, Hindu: Brahmin, etc.
      Height | 37 | 5’0, 5’1, 5’2, 5’3, etc.
      Income | 25 | Rs. 0-1 Lakh, Rs. 1-2 Lakh, etc.
      Mother Tongue | 42 | Telugu, Bengali, Hindi-Delhi, etc.
      Occupation | 69 | Doctor, Analyst, IT-Engineer, etc.
  • 11. Categorical Data
      Attribute | Number of Categories | Example Categories
      Religion | 10 | Hindu, Muslim, Christian, etc.
      Body Type | 4 | Slim, Average, Athletic, Heavy
      Country | 214 | India, Afghanistan, Australia, etc.
      City | 3683 | Delhi, UP, Ahmedabad, etc.
      Manglik | 2 | Manglik, Non-Manglik
  • 12. Categorical Data
      Attribute | Number of Categories | Example Categories
      Marital Status | 4 | Never Married, Divorcee, Separated, Widowed
      Education | 53 | B.A, B.Com, B.Tech, etc.
  • 15. Inconsistent Edits: edit made 4 days after registration
  • 18. Behavioural Trend for the Caste Attribute. Experiment on 100 fake and 100 genuine profiles belonging to the Aggarwal community.
  • 19. Behavioural Trend for the Marital Status Attribute. Experiment on 100 fake and 100 genuine profiles belonging to the Non-Married community.
  • 20. Static Windows: the user's first 8 days of activity (Day 0 to Day 7) are split into 12-hour windows. The 0th window is the first 12 hours of Day 0, the 1st window is the last 12 hours of Day 0, and so on through the first and last 12 hours of Day 7 (labelled the 15th and 16th windows on the slide).
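The static-window scheme on this slide can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes that Day 0 starts at the moment of registration (the slides do not say whether calendar days are used instead), it numbers windows from 0 to 15, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW_HOURS = 12                              # each static window covers half a day
NUM_DAYS = 8                                   # Day 0 through Day 7
NUM_WINDOWS = NUM_DAYS * 24 // WINDOW_HOURS    # 16 half-day windows


def static_window_index(registered_at, event_at):
    """Return the 0-based 12-hour window an event falls in, or None if it
    lies outside the first 8 days after registration (assumed Day 0 start)."""
    elapsed = event_at - registered_at
    if elapsed < timedelta(0):
        return None
    index = int(elapsed.total_seconds() // (WINDOW_HOURS * 3600))
    return index if index < NUM_WINDOWS else None


def count_per_window(registered_at, event_times):
    """Count how many events land in each of the 16 static windows."""
    counts = [0] * NUM_WINDOWS
    for t in event_times:
        idx = static_window_index(registered_at, t)
        if idx is not None:
            counts[idx] += 1
    return counts
```

Per-window counts like these could then be broken down by attribute category to produce the behaviour features the next slide refers to.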
  • 21. Static Windows and Feature Generation
  • 22. Which Model to Choose?
  • 24. Offline Results on Behaviour Features
      Confusion Matrix | Predicted Fake | Predicted Clean
      Actual Fake | 2953 | 852
      Actual Clean | 168 | 17799
      These results were obtained on 3805 fake profiles and 17967 clean profiles.
      Drawback: a user must have been on the portal for 8 days before this approach can assess them.
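The slide reports only the raw confusion matrix. As a small worked example, the standard precision, recall and accuracy can be derived from those four counts, treating "fake" as the positive class. The helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Precision, recall and accuracy for a 2x2 confusion matrix,
    with 'fake' as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, accuracy


# Counts taken from the confusion matrix on the slide.
precision, recall, accuracy = binary_metrics(tp=2953, fn=852, fp=168, tn=17799)
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
# prints: precision=0.946 recall=0.776 accuracy=0.953
```

These figures are derived here from the slide's counts and are not quoted from the presentation itself.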
  • 25. LIVE Results: True Positives
  • 26. LIVE Results: False Negatives. Edit and profile features need to be incorporated.
  • 28. Edit Summary for the Mother Tongue Attribute. Experiment on 100 fake and 100 genuine profiles that registered with the Hindi-UP category.
  • 29. Edit Summary for the Income Attribute. Experiment on 100 fake and 100 genuine profiles that registered with the Rs 5-7.5 Lakh category.
  • 30. Concept of Dynamic Windows: Let the user's active lifetime on the portal be T seconds and the user's total number of initiates be N. If we select W windows, the 0th window is the time period of the first N/W initiates, the 1st window is the time period of the next N/W initiates, and so on; the last window is the time period of the last N/W initiates.
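The dynamic-window idea on this slide can be sketched as follows. This is an illustration under stated assumptions: the slide does not say what happens when N is not divisible by W, so the remainder is spread over the earliest windows here, and a window's duration is taken as the gap between its first and last initiate. Both function names are hypothetical.

```python
def dynamic_windows(initiate_times, num_windows):
    """Split a user's initiate timestamps into `num_windows` consecutive
    groups of roughly N/W initiates each (remainder handling is an assumption)."""
    if num_windows <= 0:
        raise ValueError("num_windows must be positive")
    times = sorted(initiate_times)
    base, extra = divmod(len(times), num_windows)
    windows, start = [], 0
    for w in range(num_windows):
        size = base + (1 if w < extra else 0)   # earliest windows absorb the remainder
        windows.append(times[start:start + size])
        start += size
    return windows


def window_durations(windows):
    """Time span covered by each window; 0 for an empty window."""
    return [(w[-1] - w[0]) if w else 0 for w in windows]
```

For example, `dynamic_windows([0, 10, 20, 30, 100, 400], 3)` groups the six timestamps into three windows of two initiates each, so a burst of early activity and a long quiet tail end up in different windows even though both hold the same number of initiates.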
  • 31. Feature Designing ◆ Profile features: a one-hot vector of the profile attributes. ◆ Behaviour features: within each dynamic time window, each feature stores the proportion of initiates sent to a particular category of an attribute. ◆ Edit features: within each dynamic time window, each feature stores the proportion of time the user has spent in a particular category of an attribute. ◆ Other raw features: for each window, we also store the total interests sent and the duration of that window.
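As a rough sketch of the behaviour features described on this slide, the code below computes, per window, the proportion of initiates sent to each category of one attribute and appends the raw count of interests in that window. The input layout (one list of receiver categories per window) and the function names are assumptions for illustration; edit features would follow the same pattern with time spent in place of counts.

```python
from collections import Counter


def proportion_features(values, categories):
    """Share of `values` falling in each category, in a fixed category order."""
    total = len(values)
    if total == 0:
        return [0.0] * len(categories)
    counts = Counter(values)
    return [counts[c] / total for c in categories]


def behaviour_features(windows, categories):
    """Concatenate per-window category proportions plus the raw number
    of interests sent in each window."""
    features = []
    for window in windows:
        features.extend(proportion_features(window, categories))
        features.append(float(len(window)))   # raw feature: total interests in window
    return features
```

With the four Body Type categories from the data slides, a user who sent four interests in the first window and one in the second yields a 10-element vector (four proportions and one count per window).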
  • 32. Feature Designing: the per-window features are combined from the 0th window through the Nth window.
  • 33. Experimenting with the number of dynamic windows
      No. of Windows | Precision | Recall | Accuracy
      5 | 0.170 | 0.510 | 0.8830
      4 | 0.192 | 0.635 | 0.8891
      3 | 0.230 | 0.780 | 0.8977
      2 | 0.242 | 0.804 | 0.8975
      1 | 0.266 | 0.866 | 0.8972
  • 34. Feature Selection on Best Model
      Method | Precision | Recall | Accuracy
      Best Model | 0.266 | 0.866 | 0.8972
      Best Model + Feature Selection | 0.269 | 0.894 | 0.9083
      Criterion used = ((Entropy for fake) - (Entropy for clean)) / (Entropy for fake)
      Precision is still low.
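The selection criterion on this slide can be sketched as below, reading it as the ratio (entropy for fake - entropy for clean) / (entropy for fake). Several things here are assumptions, since the slides do not spell them out: entropy is taken to be the Shannon entropy of a feature's observed values within each class, and how the resulting score is thresholded or ranked is left open. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def selection_score(fake_values, clean_values):
    """(H_fake - H_clean) / H_fake for one feature; None when H_fake is 0
    and the ratio is undefined."""
    h_fake = shannon_entropy(fake_values)
    h_clean = shannon_entropy(clean_values)
    if h_fake == 0:
        return None
    return (h_fake - h_clean) / h_fake
```

Under this reading, a feature whose values are spread widely among fake profiles but concentrated among clean profiles scores close to 1.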
  • 36. Affinity Features along with Behaviour Features ◆ An affinity score between two categories i and j is the likelihood that a person in category i sends interests to a user in category j. ◆ Combined with behaviour features, affinity scores compare how a user is expected to behave with how he/she actually behaves on the platform.
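One way to make the affinity score concrete is sketched below. The slides do not define how the likelihood is estimated, so this sketch assumes an empirical conditional frequency: among all interests sent by users in category i, the share that went to users in category j. The comparison of expected and actual behaviour is shown as a simple per-category difference, which is also an assumption; all names are hypothetical.

```python
from collections import defaultdict


def affinity_scores(interests):
    """interests: iterable of (sender_category, receiver_category) pairs.
    Returns affinity[i][j] = share of category-i interests sent to category j."""
    counts = defaultdict(lambda: defaultdict(int))
    for sender, receiver in interests:
        counts[sender][receiver] += 1
    affinity = {}
    for sender, row in counts.items():
        total = sum(row.values())
        affinity[sender] = {receiver: c / total for receiver, c in row.items()}
    return affinity


def expected_vs_actual(affinity, sender_category, actual_proportions):
    """Per-category gap between a user's actual proportions of interests
    and the proportions expected for the user's own category."""
    expected = affinity.get(sender_category, {})
    categories = sorted(set(expected) | set(actual_proportions))
    return {c: actual_proportions.get(c, 0.0) - expected.get(c, 0.0)
            for c in categories}
```

A user whose category usually sends interests within its own community, but who sends all interests elsewhere, would show large gaps in this comparison.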
  • 39. Proposed Full-Length Feature Vector: Profile Features + Behaviour Features in Time Windows + Affinity Features + Edit Features in Time Windows
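Assembling the full-length vector can be sketched as a one-hot encoding of profile attributes followed by simple concatenation of the four feature groups. The "+" on the slide is read here as concatenation, which is an assumption, and the function names are hypothetical.

```python
def one_hot(value, categories):
    """One-hot encode `value` against a fixed, ordered list of categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def full_feature_vector(profile, behaviour, affinity, edit):
    """Concatenate the four feature groups in the order shown on the slide."""
    return list(profile) + list(behaviour) + list(affinity) + list(edit)


# Illustrative values only; real vectors would be far longer.
profile = one_hot("Slim", ["Slim", "Average", "Athletic", "Heavy"])
vector = full_feature_vector(profile, [0.5, 0.5], [0.1], [1.0, 0.0])
```

Keeping the group order fixed matters because the downstream model sees only positions, not feature names.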
  • 42. Final Results
      Method | Precision | Recall | Accuracy
      Proposed Features + Autoencoder | 0.341 | 0.902 | 0.9176
      The product team had asked for 25% precision at 60% recall.
  • 44. Conclusion ◆ We first studied how behaviour, profile and edit patterns differ between genuine and fake users. ◆ We encoded these characteristics as features using dynamic time windows. ◆ We then trained an autoencoder model to detect fake profiles on online matrimony.
  • 47. Limitations and Future Work ◆ More training samples for the autoencoder may improve generalisation. ◆ We detected fake profiles using categorical attributes only; text spamming remains to be explored.
  • 48. Acknowledgement ◆ Committee Members ◆ Hunny, Adhish from InfoEdge India Ltd. ◆ Members of Precog family ◆ Family and friends