www.insight-centre.org

An Ontology-based Technique for
Online Profile Resolution
Keith Cortis, Simon Scerri, Ismael Rivera,
Siegfried Handschuh

International Conference on Social Informatics
Kyoto, Japan

27th November 2013
Introduction (1)
• Instance Matching: deciding whether two instances (representations) refer to the same real-world entity, e.g., a person
• Research Challenge: discovering multiple online profiles that refer to the same person identity on heterogeneous social networks
Introduction (2)
• Improved profile matching system extended with:
  • Named Entity Recognition
  • Linked Open Data
  • Semantic Matching
• Additional Benefit: Ontology used as a background schema
  • Advantage: Standard schema enables cross-network interoperability
Motivation
• Contact Matcher Applications:
  • Control sharing of personal data
  • Detection of fully or partly anonymous contacts
    o > 83 million fake accounts
  • New contact suggestions that are of direct interest to the user
Profile Resolution Technique
[Pipeline diagram]
1. User Profile Data Extraction
2. Semantic Lifting (NCO)
3. Named Entity Recognition (ANNIE IE System, Large KB Gazetteer; entities shown: Name, Surname, City, Country)
4. Hybrid Matching Process
   a. Attribute Value Matching
   b. Semantic-based Matching Extension (City, Country)
   c. Attribute Weighting Function
5. Online Profile Suggestions
6. Online Profile Merging
Semantic Lifting
• Lifting semi-/un-structured profile information from a remote schema
• Transforming this information into instances of the Contact Ontology (NCO)
  • NCO: represents identity-related online profile information
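The lifting step can be sketched as a field-to-property mapping. This is a minimal illustration, not the authors' implementation: the raw profile, the `urn:profile:` identifier scheme and the `FIELD_TO_NCO` mapping are all assumptions, and the NCO-style property names should be checked against the ontology specification before use.

```python
# Hypothetical raw profile, shaped loosely like a social-network API response.
raw_profile = {
    "id": "12345",
    "first_name": "Jane",
    "last_name": "Doe",
    "location": "Galway",
}

# Assumed mapping from remote-schema fields to NCO-style property names.
FIELD_TO_NCO = {
    "first_name": "nco:nameGiven",
    "last_name": "nco:nameFamily",
    "location": "nco:locality",
}


def lift_profile(raw, network):
    """Return (subject, predicate, object) triples for one remote profile."""
    subject = f"urn:profile:{network}:{raw['id']}"
    triples = [(subject, "rdf:type", "nco:PersonContact")]
    for field, prop in FIELD_TO_NCO.items():
        value = raw.get(field)
        if value:  # skip attributes the remote profile does not provide
            triples.append((subject, prop, value.strip()))
    return triples


lifted = lift_profile(raw_profile, "twitter")
```

Once every network is lifted to the same vocabulary, the later matching steps can compare like with like regardless of the source schema.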
Attribute Value Matching
• Direct Value Comparison
• String Matching
  • The best string matching metric is selected for each attribute type
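A per-attribute metric choice could look like the sketch below. The slides do not say which metric was selected for which attribute, so the three metrics and the `METRIC_FOR_ATTRIBUTE` table are purely illustrative assumptions.

```python
from difflib import SequenceMatcher


def ratio_similarity(a, b):
    """Character-level similarity in [0, 1] using difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Token-set overlap in [0, 1]; insensitive to word order."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def exact_match(a, b):
    """Direct value comparison: 1.0 if equal after trimming and lowercasing."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


# Assumed assignment of a metric to each attribute type (not from the slides).
METRIC_FOR_ATTRIBUTE = {
    "name": ratio_similarity,
    "role": token_jaccard,
    "email": exact_match,
}


def attribute_similarity(attribute, a, b):
    """Compare two values with the metric registered for their attribute type."""
    metric = METRIC_FOR_ATTRIBUTE.get(attribute, ratio_similarity)
    return metric(a, b)
```

Keeping the metric lookup in one table makes it easy to swap in whichever metric performs best for an attribute after evaluation.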
Semantic-based Matching
• Indirect semantic relations at a schema level
• Use-case: Location-related profile attributes
• Location sub-entities being semantically compared are: city, region and country
• Find the semantic relations between the sub-entities in question in a bi-directional manner
• E.g. Galway (profile 1) vs. Ireland (profile 2)
[Diagram: candidate relations between Galway and Ireland in both directions; relation labels shown: locatedWithin, country, isPartOf, isLocationOf, containsLocation, capital, largestCity]
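The bi-directional lookup can be illustrated on a handful of triples. This is only a sketch: the `TRIPLES` set is a toy stand-in for a Linked Open Data source, and the function names are hypothetical.

```python
# Toy triples standing in for a Linked Open Data source (assumed sample data).
TRIPLES = {
    ("Galway", "country", "Ireland"),
    ("Galway", "isPartOf", "Connacht"),
    ("Connacht", "isPartOf", "Ireland"),
}


def relations_between(a, b, triples):
    """Return the relations that link a and b directly, in either direction."""
    forward = sorted(p for (s, p, o) in triples if s == a and o == b)
    backward = sorted(p for (s, p, o) in triples if s == b and o == a)
    return {"forward": forward, "backward": backward}


def semantically_related(a, b, triples):
    """True if at least one direct relation exists in either direction."""
    found = relations_between(a, b, triples)
    return bool(found["forward"] or found["backward"])


galway_vs_ireland = relations_between("Galway", "Ireland", TRIPLES)
```

Checking both directions matters because one profile may state the city and the other the country, and the knowledge base may only record the relation one way round.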
Attribute Weighting Function
• Approach 1: Direct Similarity Score

Name 1           Name 2               Similarity Value
Justin Bieber    J. Bieber            0.90

• Approach 2: Normalised Similarity Score based on a threshold for each attribute type
Attribute Threshold for Name: 0.70

Name 1           Name 2               Metric Similarity Value   Similarity Value
Justin Bieber    J. Bieber            0.90                      1.0
Justin Bieber    Joffrey Baratheon    0.4                       0.0
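The two scoring approaches above can be written as two small functions. A minimal sketch: the function names are hypothetical, and treating a value exactly at the threshold as a match is an assumption the slides do not settle.

```python
NAME_THRESHOLD = 0.70  # attribute threshold for Name, as given on the slide


def direct_score(metric_value):
    """Approach 1: use the string-metric value unchanged."""
    return metric_value


def normalised_score(metric_value, threshold):
    """Approach 2: collapse the metric value to 1.0 or 0.0 around a threshold.

    Assumption: a value exactly equal to the threshold counts as a match.
    """
    return 1.0 if metric_value >= threshold else 0.0


close_names = normalised_score(0.90, NAME_THRESHOLD)    # Justin Bieber vs. J. Bieber
distant_names = normalised_score(0.4, NAME_THRESHOLD)   # Justin Bieber vs. Joffrey Baratheon
```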
Online Profile Suggestions
Similarity Threshold: 0.90

Attribute        Profile 1            Profile 2
Name             Joffrey Baratheon    Joff Baratheon
City             King’s Landing       King’s Landing
Role             King                 King
Date of Birth    286AL                286AL
Similarity Score: 0.95 (above the threshold)

Attribute        Profile 1            Profile 2
Name             Joffrey Baratheon    Joffrey Bieber
City             King’s Landing       London, Ontario
Role             King                 Singer
Date of Birth    286AL                01/03/1994
Similarity Score: 0.30 (below the threshold)
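One way to turn per-attribute scores into a profile-level suggestion is a weighted average compared against the similarity threshold. The sketch below assumes that aggregation; the weights and the per-attribute scores are hypothetical (the slides do not state them), so it does not reproduce the 0.95 and 0.30 figures above.

```python
# Hypothetical attribute weights; the slides do not state the values used.
WEIGHTS = {"name": 0.4, "city": 0.2, "role": 0.2, "date_of_birth": 0.2}


def profile_similarity(attribute_scores, weights):
    """Weighted average over the attributes that have both a score and a weight."""
    shared = [a for a in attribute_scores if a in weights]
    total_weight = sum(weights[a] for a in shared)
    if total_weight == 0:
        return 0.0
    return sum(weights[a] * attribute_scores[a] for a in shared) / total_weight


def is_suggestion(score, threshold=0.90):
    """Suggest the pair as the same person when the score reaches the threshold."""
    return score >= threshold


# Illustrative per-attribute scores for a near-identical and a dissimilar pair.
similar_pair = {"name": 0.9, "city": 1.0, "role": 1.0, "date_of_birth": 1.0}
different_pair = {"name": 0.5, "city": 0.0, "role": 0.0, "date_of_birth": 0.0}

similar_score = profile_similarity(similar_pair, WEIGHTS)
different_score = profile_similarity(different_pair, WEIGHTS)
```

Dividing by the weight of the shared attributes keeps the score comparable when a profile is missing some attributes.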
Experiments & Evaluation
• Two-stage evaluation:
1. Technique
   a) Which attribute similarity score approach performs best
   b) Whether NER and the semantic-based matching extension improve the overall technique
   c) The computational performance of the hybrid technique against the syntactic-based one
   d) A similarity threshold that determines profile equivalence within a satisfactory degree of confidence
2. Usability
   e) Level of precision of the profile matching
Technique Evaluation
• Two Datasets:
1. A controlled dataset of public profiles obtained from the Web (LinkedIn and Twitter)
   • 182 online profiles
     – 112 ambiguous real-world persons (common attributes)
     – 70 refer to 35 well-known sports journalists
   • Maximised False Positives
2. Private personal and contact-list profiles obtained from 5 consenting participants
Technique Evaluation – Experiment 1
• Profile attribute similarity score that fares best
[Charts: Precision, Recall and F1-Measure against threshold value (0.7 to 0.9), one chart for the Normalised Approach and one for the Direct Approach]
• Direct Approach outperforms Normalised Approach
• 8631 online profile pair comparisons
Technique Evaluation – Experiment 2
• String-based technique vs. String + NER + Semantic-based technique
[Charts: Precision, Recall and F1-Measure against threshold value (0.7 to 0.8), one chart for the String Technique and one for the Hybrid Technique]
• New hybrid technique improves the results considerably over the string-only one
• F-measure remains more or less stable for thresholds of 0.75 and 0.8
Technique Evaluation – Experiment 3
• Computational performance of hybrid technique vs. syntactic-only one
• For this test we selected profile pairs:
  • Having a number of common attributes
  • With at least 1 attribute candidate for semantic matching
[Chart: Time (ms) against Number of Common Attributes (1 to 15) for the Syntactic and Hybrid techniques]
• On average, the hybrid technique takes ≈15 ms more
Technique Evaluation – Experiment 4
• Find a deterministic similarity threshold with the highest degree of confidence

Threshold    0.8    0.82   0.84   0.86   0.88   0.9    0.92   0.94   0.96
Precision    0.290  0.317  0.550  0.694  0.806  0.876  0.940  0.947  0.988
Recall       0.805  0.784  0.654  0.600  0.584  0.573  0.508  0.486  0.454
F1-Measure   0.426  0.452  0.598  0.643  0.677  0.693  0.660  0.643  0.622

• Optimal threshold is 0.9 -> F-measure of 0.693
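The threshold choice can be rechecked from the reported precision and recall values, since F1 is their harmonic mean. The sketch below recomputes F1 per threshold and picks the maximum; the variable names are hypothetical, and small differences from the slide's F1 figures come from the precision and recall values being rounded to three decimals.

```python
# (precision, recall) per similarity threshold, as reported in Experiment 4.
RESULTS = {
    0.80: (0.290, 0.805),
    0.82: (0.317, 0.784),
    0.84: (0.550, 0.654),
    0.86: (0.694, 0.600),
    0.88: (0.806, 0.584),
    0.90: (0.876, 0.573),
    0.92: (0.940, 0.508),
    0.94: (0.947, 0.486),
    0.96: (0.988, 0.454),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_by_threshold = {t: f1(p, r) for t, (p, r) in RESULTS.items()}
best_threshold = max(f1_by_threshold, key=f1_by_threshold.get)
```

Precision keeps rising past 0.9 while recall falls, so the F1 maximum marks the point where tightening the threshold starts to cost more matches than it saves.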
Usability Evaluation (1)
• Quantitative & Qualitative
• Performance of the profile matching technique
• Contact matcher run against the two social networks on which the user is most active
• Social Networks chosen: LinkedIn, Facebook and Twitter
• Number of participants: 16
• Person suggestion page
• Short survey about their user experience
Usability Evaluation (2)
• Usability Evaluation Results:
  Distinct profiles: 8,415
  Average profiles per social network per participant: 262
  Comparisons: 1,041,279
  Person matching suggestions: 1,195
  Correct matches: 975
  Incorrect matches: 220
  Precision rate: 0.816
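The precision rate follows directly from the match counts above: correct suggestions divided by all suggestions. A short check, using only the reported figures (variable names are illustrative):

```python
correct_matches = 975
incorrect_matches = 220

# Every suggestion was judged either correct or incorrect.
suggestions = correct_matches + incorrect_matches
precision = correct_matches / suggestions
```

Recall cannot be derived from these figures, since the slides do not report how many true matches went unsuggested.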
Usability Evaluation (3)
• Statistics & Results:
Social Network Integration
– 56.25%: LinkedIn and Facebook
– 25%: LinkedIn and Twitter
– 18.75%: Facebook and Twitter

User Satisfaction
– 50%: Extremely
– 43.8%: Quite a bit
– 0%: Moderately
– 6.3%: A little
– 0%: Not at all
Usability Evaluation (4)

Application 1: Management & Sharing

Application 2: Enhanced Security

Application 3: Networking & Suggestions
Limitations
• A person’s gender is not provided by all social network APIs
  • Identify gender based on first name or surname through NER
• Weights of some profile attributes, e.g., first name and surname, are too high
  • In some cases they impact the final result too strongly
  • More experiments will be conducted to fine-tune these weights
Future Work
• Consider identification of higher degrees of semantic relatedness
• Enrich technique with other LOD cloud datasets
• Target additional social networks
Conclusion
• Profile matching algorithm with:
  • Semantic Lifting
  • NER on semi-/un-structured profile information
  • Linked Open Data to improve the NER process
  • Semantic matching at the schema level to find any possible indirect semantic relations
  • Weighted Profile Attribute Matching
• Quantitative & Qualitative Evaluation
Thank you for your attention
Related Work Comparison
• Existing Profile Matching Approaches based on:
  • User’s friends
  • Specific Inverse Functional Properties, e.g., email address
  • String matching of all profile attributes
  • Semantic relatedness between text, depending on remote Knowledge Bases, e.g., Wikipedia
• Evaluation of these Approaches:
  • Technique Evaluation on controlled datasets
  • No Usability Evaluation

KatiaHIMEUR1
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 

Recently uploaded (20)

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 

An Ontology-based Technique for Online Profile Resolution

  • 1. An Ontology-based Technique for Online Profile Resolution. Keith Cortis, Simon Scerri, Ismael Rivera, Siegfried Handschuh (www.insight-centre.org). International Conference on Social Informatics, Kyoto, Japan, 27th November 2013.
  • 2. Introduction (1). Instance matching: deciding whether or not two instances / representations refer to the same real-world entity, e.g., persons. Research challenge: discovery of multiple online profiles that refer to the same person identity on heterogeneous social networks.
  • 3. Introduction (2). Improved profile matching system extended with: Named Entity Recognition; Linked Open Data; Semantic Matching. Additional benefit: ontology used as a background schema. Advantage: a standard schema enables cross-network interoperability.
  • 4. Motivation. Contact matcher applications: control over the sharing of personal data; detection of fully or partly anonymous contacts (> 83 million fake accounts); suggestions of new contacts that are of direct interest to the user.
  • 5. Profile Resolution Technique (pipeline diagram). Step 1: User Profile Data Extraction. Step 2: Semantic Lifting (NCO). Step 3: Named Entity Recognition (ANNIE IE system with a large KB gazetteer; name, surname, city, country). Step 4: Hybrid Matching Process: (a) Attribute Value Matching, (b) Semantic-based Matching Extension (city, country), (c) Attribute Weighting Function. Step 5: Online Profile Suggestions. Step 6: Online Profile Merging.
  • 6. Profile Resolution Technique (pipeline diagram, steps 1 and 2: User Profile Data Extraction, Semantic Lifting).
  • 7. Semantic Lifting. Lifting of semi-/un-structured profile information from a remote schema. The information is transformed into instances of the Contact Ontology (NCO). NCO covers identity-related online profile information.
  • 8. Profile Resolution Technique (pipeline diagram, steps 1 to 4a: up to Attribute Value Matching).
  • 9. Attribute Value Matching. Direct value comparison. String matching: the best string matching metric is used for each attribute type.
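As an illustration of per-attribute metric selection, the sketch below picks a string similarity function by attribute type. The two metrics (difflib's ratio and a token-level Jaccard overlap) and the mapping in METRIC_BY_ATTRIBUTE are assumptions made for this example; the slides do not say which metric was chosen for which attribute.

```python
from difflib import SequenceMatcher


def ratio_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Overlap of whitespace-separated tokens in [0, 1], case-insensitive."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical mapping: which metric suits which attribute is an assumption.
METRIC_BY_ATTRIBUTE = {
    "name": ratio_similarity,
    "city": token_jaccard,
}


def attribute_similarity(attribute: str, a: str, b: str) -> float:
    """Compare two attribute values with the metric registered for that type."""
    metric = METRIC_BY_ATTRIBUTE.get(attribute, ratio_similarity)
    return metric(a, b)
```

Keeping the metric choice in a lookup table means a different metric can be swapped in for one attribute type without touching the comparison code.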
  • 10. Profile Resolution Technique (pipeline diagram, steps 1 to 4b: up to the Semantic-based Matching Extension).
  • 11. Semantic-based Matching. Indirect semantic relations at a schema level. Use case: location-related profile attributes. The location sub-entities being semantically compared are city, region and country. The semantic relations between the sub-entities in question are searched for in a bi-directional manner. E.g., Galway (profile 1) vs. Ireland (profile 2): Galway locatedWithin Ireland. [Diagram: relations between Galway and Ireland, labelled country, isPartOf, isLocationOf, containsLocation, capital, largestCity.]
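The bi-directional lookup on this slide can be sketched over a small set of (subject, predicate, object) triples. The triple store here is a toy in-memory set containing only the slide's Galway example; the function names are hypothetical, and a real system would query a Linked Open Data source instead.

```python
# Toy data: the single relation given on the slide.
TRIPLES = {
    ("Galway", "locatedWithin", "Ireland"),
}


def relations_between(a, b, triples):
    """Return the predicates linking a to b, and those linking b to a."""
    forward = {p for s, p, o in triples if s == a and o == b}
    backward = {p for s, p, o in triples if s == b and o == a}
    return forward, backward


def semantically_related(a, b, triples):
    """True if any relation holds between the two values in either direction."""
    forward, backward = relations_between(a, b, triples)
    return bool(forward or backward)
```

Checking both directions means the result does not depend on which profile holds the city and which holds the country.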
  • 12. Profile Resolution Technique (pipeline diagram, steps 1 to 4c: up to the Attribute Weighting Function).
  • 13. Attribute Weighting Function.
      Approach 1: direct similarity score.
        Name: "Justin Bieber" vs. "J. Bieber"; similarity value 0.90.
      Approach 2: normalised similarity score based on a threshold for each attribute type (attribute threshold for name: 0.70).
        Name: "Justin Bieber" vs. "J. Bieber"; metric similarity value 0.90; similarity value 1.0.
        Name: "Justin Bieber" vs. "Joffrey Baratheon"; metric similarity value 0.4; similarity value 0.0.
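The two scoring approaches on this slide can be written as two small functions. This is a minimal sketch: the function names are made up for the example, and treating a score exactly equal to the threshold as a match is an assumption, since the slide only shows values clearly above and below it.

```python
def direct_score(metric_similarity: float) -> float:
    """Approach 1: use the metric's similarity value unchanged."""
    return metric_similarity


def normalised_score(metric_similarity: float, threshold: float) -> float:
    """Approach 2: collapse the metric value to 1.0 or 0.0 around a threshold.

    Assumption: a value equal to the threshold counts as a match.
    """
    return 1.0 if metric_similarity >= threshold else 0.0


NAME_THRESHOLD = 0.70  # attribute threshold for names, as given on the slide

print(direct_score(0.90))                      # 0.9
print(normalised_score(0.90, NAME_THRESHOLD))  # 1.0
print(normalised_score(0.4, NAME_THRESHOLD))   # 0.0
```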
  • 14. Profile Resolution Technique (pipeline diagram, steps 1 to 5: up to Online Profile Suggestions).
  • 15. Online Profile Suggestions. Similarity threshold: 0.90.
      Comparison 1 (similarity score 0.95):
        Name:           Joffrey Baratheon   vs.   Joff Baratheon
        City:           King’s Landing      vs.   King’s Landing
        Role:           King                vs.   King
        Date of Birth:  286AL               vs.   286AL
      Comparison 2 (similarity score 0.30):
        Name:           Joffrey Baratheon   vs.   Joffrey Bieber
        City:           King’s Landing      vs.   London, Ontario
        Role:           King                vs.   Singer
        Date of Birth:  286AL               vs.   01/03/1994
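One plausible way to turn per-attribute scores into a single profile similarity score is a weighted average compared against the suggestion threshold. The slides do not give the actual aggregation formula or the weights, so the weighted average, the weights and the per-attribute scores below are all assumptions for illustration; only the 0.90 threshold comes from the slide.

```python
def profile_similarity(scores: dict, weights: dict) -> float:
    """Weighted average of per-attribute similarity scores (assumed formula)."""
    total_weight = sum(weights[attr] for attr in scores)
    if total_weight == 0:
        return 0.0
    return sum(scores[attr] * weights[attr] for attr in scores) / total_weight


def is_suggestion(scores: dict, weights: dict, threshold: float = 0.90) -> bool:
    """Suggest the pair as the same person if the overall score reaches the threshold."""
    return profile_similarity(scores, weights) >= threshold


# Hypothetical weights and scores, chosen only to exercise the functions.
weights = {"name": 1.0, "city": 1.0}
close_pair = {"name": 0.9, "city": 1.0}
distant_pair = {"name": 0.3, "city": 0.0}
```

With these made-up inputs the first pair lands above the 0.90 threshold and the second well below it, mirroring the accept and reject cases on the slide without reproducing its exact scores.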
  • 17. Profile Resolution Technique (pipeline diagram, steps 1 to 6: up to Online Profile Merging).
  • 18. Experiments & Evaluation. Two-staged evaluation. 1. Technique: a) the best attribute similarity score approach; b) whether NER and the semantic-based matching extension improve the overall technique; c) the computational performance of the hybrid technique against the syntactic-based one; d) a similarity threshold that determines profile equivalence within a satisfactory degree of confidence. 2. Usability: e) the level of precision for the profile matching.
  • 19. Technique Evaluation. Two datasets. 1. A controlled dataset of public profiles obtained from the Web (LinkedIn and Twitter): 182 online profiles, of which 112 refer to ambiguous real-world persons (common attributes) and 70 refer to 35 well-known sports journalists; false positives maximised. 2. Private personal and contact-list profiles obtained from 5 consenting participants.
  • 20. Technique Evaluation – Experiment 1. Profile attribute similarity score that fares best. [Two charts, "Normalised Approach" and "Direct Approach": precision, recall and F1-measure (result, 0 to 1) against threshold value (0.7 to 0.9).] The Direct Approach outperforms the Normalised Approach. 8631 online profile pair comparisons.
  • 21. Technique Evaluation – Experiment 2. String-based technique vs. String + NER + Semantic-based technique. [Two charts, "String Technique" and "Hybrid Technique": precision, recall and F1-measure (result, 0 to 1) against threshold value (0.7 to 0.8).] The new hybrid technique improves the results considerably over the string-only one. The F-measure is more or less stable for thresholds of 0.75 and 0.8.
  • 22. Technique Evaluation – Experiment 3. Computational performance of the hybrid technique vs. the syntactic-only one. For this test we selected profile pairs having a number of common attributes and at least 1 attribute that is a candidate for semantic matching. [Chart: time (ms, 0 to 40) against number of common attributes (1 to 15), for the Syntactic and Hybrid techniques.] On average, the hybrid technique takes ≈15 ms more.
  • 23. Technique Evaluation – Experiment 4. Find a deterministic similarity threshold with the highest degree of confidence.
      Threshold   Precision   Recall   F1-Measure
      0.8         0.290       0.805    0.426
      0.82        0.317       0.784    0.452
      0.84        0.550       0.654    0.598
      0.86        0.694       0.600    0.643
      0.88        0.806       0.584    0.677
      0.9         0.876       0.573    0.693
      0.92        0.940       0.508    0.660
      0.94        0.947       0.486    0.643
      0.96        0.988       0.454    0.622
      The optimal threshold is 0.9, with an F-measure of 0.693.
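The threshold choice in Experiment 4 can be checked by recomputing the F1-measure (the harmonic mean of precision and recall) from the reported precision and recall values and taking the threshold with the highest F1. The figures are the ones on the slide; the variable and function names are only for this sketch.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# threshold -> (precision, recall), as reported on the slide
RESULTS = {
    0.80: (0.290, 0.805),
    0.82: (0.317, 0.784),
    0.84: (0.550, 0.654),
    0.86: (0.694, 0.600),
    0.88: (0.806, 0.584),
    0.90: (0.876, 0.573),
    0.92: (0.940, 0.508),
    0.94: (0.947, 0.486),
    0.96: (0.988, 0.454),
}

# Pick the threshold whose recomputed F1 is highest.
best_threshold = max(RESULTS, key=lambda t: f1_measure(*RESULTS[t]))
best_f1 = round(f1_measure(*RESULTS[best_threshold]), 3)
print(best_threshold, best_f1)  # 0.9 0.693
```

Precision rises and recall falls as the threshold increases, so F1 peaks where the two are best balanced.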
  • 24. Usability Evaluation (1). Quantitative and qualitative. Performance of the profile matching technique. Contact matcher run against the two social networks on which the user is most active. Social networks chosen: [images not captured in the transcript]. Number of participants: 16. Person suggestion page. Short survey about their user experience.
  • 25. Usability Evaluation (2). Usability evaluation results. Distinct profiles: 8,415. Average profiles per social network per participant: 262. Comparisons: 1,041,279. Person matching suggestions: 1,195. Correct matches: 975. Incorrect matches: 220. Precision rate: 0.816.
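The reported precision rate follows from the match counts on this slide: correct matches divided by all suggestions made. A short check, with a function name chosen only for this sketch:

```python
def precision(correct: int, incorrect: int) -> float:
    """Fraction of suggested matches that were correct."""
    suggested = correct + incorrect
    return correct / suggested if suggested else 0.0


print(975 + 220)                      # 1195 person matching suggestions
print(round(precision(975, 220), 3))  # 0.816
```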
  • 26. Usability Evaluation (3). Statistics & results. Social network integration: 56.25% LinkedIn and Facebook; 25% LinkedIn and Twitter; 18.75% Facebook and Twitter. User satisfaction: 50% extremely; 43.8% quite a bit; 0% moderately; 6.3% a little; 0% not at all.
  • 27. Usability Evaluation (4). Application 1: Management & Sharing. Application 2: Enhanced Security. Application 3: Networking & Suggestions.
  • 28. Limitations. A person’s gender is not provided by all social network APIs; identify gender based on first name or surname through NER. The weights of some profile attributes (e.g., first name, surname) are too high, and in some cases they impact the final result too strongly; more experiments will be conducted to fine-tune these weights.
  • 29. Future Work. Consider the identification of higher degrees of semantic relatedness. Enrich the technique with other LOD cloud datasets. Target additional social networks.
  • 30. Conclusion. Profile matching algorithm with: semantic lifting; NER on semi-/un-structured profile information; Linked Open Data to improve the NER process; semantic matching at the schema level to find any possible indirect semantic relations; weighted profile attribute matching. Quantitative and qualitative evaluation. Thank you for your attention.
  • 31. Related Work Comparison. Existing profile matching approaches are based on: the user’s friends; specific inverse functional properties, e.g., email address; string matching of all profile attributes; semantic relatedness between text, depending on remote knowledge bases, e.g., Wikipedia. Evaluation of these approaches: technique evaluation on controlled datasets; no usability evaluation.