PhD candidate: Hussein Hazimeh
Director: Prof. Philippe Cudré-Mauroux / UNI-FR
Co-Director: Prof. Elena Mugellini / HES-SO
28.06.2019
Automatic Knowledge Graph
Entity Refinement Based on Social
Networks
• Introduction
• Research problems
• Research questions
• Contributions
• Conclusions & open problems
29.06.2019 Hussein Hazimeh PhD presentation 2
Agenda
Introduction
• What is a knowledge graph?
  ◦ Objective: KGs allow users to visualize facts about real-world entities (nodes) and the relations between them (edges).
  ◦ Data sources:
    • Incorporate knowledge from structured repositories such as DBpedia.
    • Extract knowledge from semi-structured web resources such as Wikipedia.
  ◦ Privacy: KGs are divided into private (their knowledge cannot be used or analysed) and public (their knowledge can be used and analysed).
Introduction: Knowledge Graphs (1/3)
Introduction Challenges & RQs SOA Contributions Conclusions Future work
Introduction: Knowledge Graphs (2/3)
• Why use knowledge graphs?
  ◦ Transform data into knowledge.
  ◦ Knowledge is represented in the form of entities and relations.
  ◦ Connect different types of data.
  ◦ Improve decisions by finding things faster.
  ◦ Re-use publicly available industry graphs and ontologies.
  ◦ Readable by humans and machines.
  ◦ Already in use across many industries (gas, pharmaceutical, banking, and retail).
Introduction: Knowledge Graphs facts (3/3)
[Figures: Google knowledge panel; Wikidata KG; YAGO KG]
Research problems
Research problems – (1) knowledge graphs
• Problem (P1) – missing links in knowledge graphs:
  • Knowledge graphs (KGs) are missing entities' social links (e.g., Facebook, Twitter).
  • Leaving this problem unsolved leads to a time-consuming manual search for these links.
  • Consequently, because many systems (data sources) rely on these KGs, they will be missing these links as well.
• Importance of solving this problem:
  1. Automation
    • Manual => time-consuming
    • Automatic => faster
  2. Utility for other systems
    • Digital libraries
    • Recommendation systems
    • Recruiting systems
• Problem (P2) – matching challenges on social networks:
  • The number of Online Social Network (OSN) users is increasing, so the user profile de-anonymization task becomes harder. For example, 75,980 Facebook users are named “John Smith”.
  • Privacy and access control policies applied by users limit access to certain information about individuals.
  • Outdated user profile data (location, work, profile image, etc.).
  • Non-identical information (users can share different information on their different OSN profiles).
• Importance of solving this problem:
  • Automate the user linking process to make it time-efficient.
  • Find new privacy-aware attributes for matching (biographies).
  • Find new context-aware attributes for matching (life events).
Research Challenges – (2) online social networks
[Figure: growth of OSN users; 2008: <100M, 2019: >2B]
• 100 entities per class are examined.
• Both private (Google) and public (Wikidata) knowledge graphs are considered.
• Examination results show that:
  • On average, the share of entities with social links does not exceed 50%.
  • Academic entities have the lowest share of social links (<4%).
Motivating scenario
Research questions
1. General research question to address:
  • Given a certain knowledge graph, how can we reinforce the entities inside this graph?
2. Specific research questions to address:
  • RQ1: How can we profile the users of online social networking platforms?
  • RQ2: Given an entity and its knowledge graph, what are the likely corresponding OSN profile links of this entity?
  • RQ3: Given a user profile on Facebook, what are the corresponding user profile links on other OSNs?
Research Questions
Concerns P1.
Concerns P2.
Contributions
C1: User Profile Analytics and Discovery on Social Networks.
C2: Comparative Review of Social Network Profile Matching Methods.
C3: Novel Method for Interlinking User Profiles on Social Networks.
C4: Automatic Embedding of Academic Entity Social Links into Knowledge Graphs.
C5: Automatic Embedding of Social Event Sentiment Polarities into Knowledge Graphs.
Contributions
[Figure: contributions overview. Labels: data sources (Google Scholar, knowledge bases); profiling; find social link and sentiment; features (life events, biographies, sentiment features); embedding into existing KGs; building new KGs. Annotations: Answers RQ1; Answers RQ3; Answers RQ2.]
• Knowledge graph embedding:
  • Approaches for knowledge graph completion:
    • General methods: internal and external.
    • Translation-based methods, which translate an entity/relation from head to tail: TransE [9], TransH [102], TransR [56], TranSparse [43].
• Dataset profiling:
  • Social networks [65]: YouTube, Flickr, Orkut.
  • Other datasets: knowledge graphs [78].
• Sentiment analysis on social events:
  • Lexicon-based [48, 105].
  • Supervised [58].
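To illustrate what "translate from head to tail" means in translation-based methods, here is a minimal sketch of the TransE plausibility score, where a triple (head, relation, tail) is plausible when head + relation lies close to tail. The 2-dimensional embeddings below are made-up values for illustration only, not taken from the thesis.

```python
import math

def transe_score(head, relation, tail):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Hypothetical 2-d embeddings (illustrative values only).
h = [0.1, 0.3]
r = [0.4, 0.1]
t_good = [0.5, 0.4]   # close to h + r
t_bad = [0.9, -0.2]   # far from h + r

good = transe_score(h, r, t_good)
bad = transe_score(h, r, t_bad)
print(good < bad)  # the well-aligned tail gets the lower (better) score
```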
Related work
Reference        Method              Sources
[103]            External            Search engines
[46]             Internal            Reinforcement learning
[96]             External            Social media
[86]             External            DBLP, Microsoft Academic Search
[31]             Internal            LSH
[19]             Internal, external  External knowledge graphs
[92]             External            Social networks
[114, 89, 26]    Internal            Machine learning
C1: User Profile Analytics and
Discovery on Social Networks
• Dataset profiling is the task of creating descriptive metadata about entities. An entity can be a person, organization, database, etc.
• Profiling methods and measures:
  • Global measures: measure the facts of a dataset (its size, its shape, etc.).
  • Platform measures: study the dataset features according to its shape (for example, a database or a graph dataset).
• Our task: analyze datasets of user profiles from 4 OSNs:
  • Facebook (D_F), Google+ (D_G+), LinkedIn (D_L), Twitter (D_T)
User Profile Analytics and Discovery (UPAD)
I. Platform-based measures
  (1) Attribute availability
  (2) Activity frequency
  (3) Profile completeness
  (4) Mutability index
II. Entity-based measure
  (1) Profile confidentiality
• For each attribute A_S in a dataset D_S, we study its availability (%), i.e. the availability of A_S on D_S.
• Results:
  • Most available attribute: screennames.
  • LinkedIn and Google+: highest and lowest image availability, respectively.
  • Google+: lower education availability than Facebook; location is rarely available.
  • LinkedIn: location is available, at a level between those of Facebook and Twitter.
  • Biographies are highly available on LinkedIn and Google+, and are similarly available on Facebook and Twitter.
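The availability measure can be sketched as the percentage of profiles in a dataset that expose a non-empty value for each attribute. The function name, the dictionary representation of a profile, and the sample values below are assumptions made for illustration; they are not the thesis implementation.

```python
def attribute_availability(profiles, attributes):
    """Percentage of profiles in which each attribute has a non-empty value."""
    total = len(profiles)
    result = {}
    for attr in attributes:
        filled = sum(1 for p in profiles if p.get(attr) not in (None, ""))
        result[attr] = 100.0 * filled / total if total else 0.0
    return result

# Hypothetical profiles from one OSN dataset (illustrative values only).
profiles = [
    {"screenname": "jsmith", "location": "Fribourg", "bio": ""},
    {"screenname": "asmith", "location": None, "bio": "Researcher"},
    {"screenname": "bdoe", "bio": "Engineer"},
    {"screenname": "cdoe", "location": "Bern"},
]
availability = attribute_availability(profiles, ["screenname", "location", "bio"])
print(availability)
```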
Platform-based Measure: (1) Attribute Availability
[Figure: attribute availability results; labelled values: 35.6%, 38.1%]
• Activity frequency: the amount of public content (tweets / posts / shares / retweets) published on a user timeline (wall).
• 5 features analyzed.
  • Only the highly available features are considered.
• Results:
  • Text is the most available content type on all OSNs except LinkedIn.
  • Facebook: highest amount of content per user.
Platform-based Measure: (2) Activity Frequency
[Figure: activity frequency results. Content types: number of links (l), text (t), check-ins (c), photos (p), tags (t), mentions (m)]
• Profile completeness: the portion of public-only attributes.
• Classes: complete-without-A, only-A, fully complete, incomplete.
• Results:
  • Surprisingly, only a very small portion of user profiles on OSNs are totally incomplete.
Platform-based Measure: (3) User Profile Completeness
• Mutability index:
  • A user U has an attribute A, where the old value of A is A_o and the new value of A is A_n.
• Results:
  • Facebook, Google+, and Twitter: the biography is the most mutable attribute.
  • LinkedIn: publications are the most mutable attribute.
  • In contrast, screenname, gender, and birthdate were the least mutable profile attributes.
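The slide does not give the exact formula for the mutability index. As a hedged sketch, one simple reading is the fraction of users whose old value A_o differs from the new value A_n for a given attribute. The function name, the two-snapshot data layout, and the sample values are assumptions for illustration.

```python
def mutability_index(old_snapshot, new_snapshot, attribute):
    """Fraction of users (present in both snapshots) whose attribute value changed."""
    users = old_snapshot.keys() & new_snapshot.keys()
    if not users:
        return 0.0
    changed = sum(
        1 for u in users
        if old_snapshot[u].get(attribute) != new_snapshot[u].get(attribute)
    )
    return changed / len(users)

# Hypothetical snapshots of the same four users at two points in time.
old = {
    "u1": {"bio": "PhD student", "screenname": "u1"},
    "u2": {"bio": "Engineer", "screenname": "u2"},
    "u3": {"bio": "Traveller", "screenname": "u3"},
    "u4": {"bio": "Teacher", "screenname": "u4"},
}
new = {
    "u1": {"bio": "Postdoc", "screenname": "u1"},
    "u2": {"bio": "Senior engineer", "screenname": "u2"},
    "u3": {"bio": "Photographer", "screenname": "u3"},
    "u4": {"bio": "Teacher", "screenname": "u4"},
}
bio_mutability = mutability_index(old, new, "bio")
name_mutability = mutability_index(old, new, "screenname")
```

Under this reading, an attribute that changes for most users (here the biography) gets a high index and a stable one (here the screenname) gets a low index.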
Platform-based Measure: (4) Mutability Index
[Figures: hypothesis; results]
• Confidentiality model: calculates the probability that a user account is confident (real) or non-confident (fake).
• To study confidentiality, we consider a set of profile attributes and assign a weight to each one.
• Results:
  • LinkedIn and Facebook profiles are the most confident compared to the other OSNs, with 95.3% and 83.78% respectively.
  • Google+ and Twitter: mostly “unlikely confident”, with a non-confident score of 22.2% for Google+ and 37.76% for Twitter.
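A weighted-attribute score of this kind could be sketched as follows. The attribute names and the weights are hypothetical placeholders (the actual weights of the confidentiality model are not given on the slide); the score is the weight of the attributes that are present, normalized by the total weight.

```python
# Hypothetical attribute weights; the real model's weights are not stated in the slides.
WEIGHTS = {"image": 3, "biography": 3, "location": 2, "education": 2}

def confidence_score(profile, weights=WEIGHTS):
    """Normalized weighted share of attributes that have a non-empty value, in [0, 1]."""
    total = sum(weights.values())
    present = sum(w for attr, w in weights.items() if profile.get(attr))
    return present / total

# Illustrative profiles.
rich_profile = {"image": "photo.jpg", "biography": "Researcher", "location": "Fribourg", "education": "PhD"}
sparse_profile = {"image": "photo.jpg", "biography": "Researcher"}
rich = confidence_score(rich_profile)
sparse = confidence_score(sparse_profile)
```

A threshold on this score could then separate "confident" from "non-confident" accounts; the choice of threshold is left open here.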
Entity-based Measure: (1) Profile Confidentiality
[Figures: confidentiality model; results (chart values: 95.3%, 83.7%, 37.7%, 22.2%)]
Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled. « Reliable User Profile Analytics and Discovery on
Social Networks. » In 8th International Conference on Software and Computer Applications - ACM (ICSCA
2019). Penang, Malaysia
C2: Comparative Review of Social
Network Profile Matching Methods
• Problem:
  ◦ User profiles belonging to a single user exist across different social networking sites.
• Challenge:
  ◦ Linking the same user's profiles across OSNs is hard because the information differs between these profiles.
• How?
  ◦ Leverage a set of features from user profiles and the social network.
  ◦ Introduce a similarity measure that depends on the context of the attribute value (text, date, etc.).
  ◦ Develop the matching algorithm.
Profile matching definition
Analogy (figure): match the kiwifruit in both baskets (b1 and b2) using its features:
• Color: green
• Shape: oval
• Seeds: edible
• Texture: soft
Likewise: match John's profiles on multiple OSNs.
• More than 16 distinct features have been used.
  • Attribute-based: location, work, image.
  • Behavioral-based: timestamps, writing style, topic detection.
• Are all features used by the state of the art (SOA)? No!
• Do all features lead to efficient matching results? No!
• Cons of behavioral-based features:
  • The trade-off in activity between different accounts.
• Cons of attribute-based features:
  • Privacy and access control methods applied by users.
Profile matching approaches: comparative analysis
[Figures:
  Resources: Facebook 36%, Twitter 33%, LinkedIn 12%, Flickr 10%, Google+ 9%.
  Associations: FB<->TW 44%, TW<->Lin 22%, FB<->Lin 17%, TW<->FL 17%.
  Feature types: profile attributes 55%, content and behavioral 39%, both 6%.
  Publications by year (2007 to 2015; bar values 1, 1, 3, 2, 3, 3, 5, 6, 6).
  Similarity methods tree.]
Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. «Linking user profiles in social networks: a
comparative review. » International Journal of Social Network Mining, Volume 2: 333-361 - 2017.
C3: Automatic Embedding of Academic
Entity Social Links into Knowledge Graphs
• We propose a query-based approach for
social link embedding.
1. Query: entity name
A. Knowledge acquisition:
i. Google Scholar mainly
ii. Wikidata Metaphacts.
2. Profile matching
A. Knowledge base to social network matching
i. F-Link
ii. K-Link
3. Machine learning
A. Bottom-up paradigm
i. Clustering
ii. Classification
4. Enrichment and storage
A. Results embedding to Wikidata
B. Semantic storage and visualization
Embedding social profile links to knowledge graphs: architecture
• F-Link is an algorithm that finds the Facebook profile link of a specific entity.
• It is composed of 5 main steps:
1. Knowledge base construction.
2. Facebook search (by name).
3. Similarity calculation.
4. Classification/clustering.
5. Facebook profile link output (with knowledge base).
Embedding social profile links to knowledge graphs
F-Link algorithm (Matcher 1 – M1)
F-Link steps: Knowledge base → Facebook search → Similarity calculation → Classification/Clustering → Facebook profile link
• Knowledge acquisition
• The initial Knowledge Base (KB) is constructed
from Google Scholar (GS) and Wikidata (WD).
𝐾𝐵 = 𝐺𝑆 ⊕ 𝑊𝐷
• Knowledge from both resources is integrated into
one KB.
• GS feature extraction:
• Profile headers.
• Profile publications.
• Wikidata feature extraction
• The Wikidata knowledge graph of the corresponding entity name.
• Result: knowledge base.
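The integration step 𝐾𝐵 = 𝐺𝑆 ⊕ 𝑊𝐷 can be pictured as a union of the attributes extracted from both sources. The sketch below is one possible reading of ⊕, assuming each source is a flat dictionary; the merge rule (collect distinct values per key) and the sample records are illustrative assumptions, not the thesis implementation.

```python
def merge_knowledge(gs, wd):
    """Union of two attribute dictionaries; distinct values per key are kept in a list."""
    kb = {}
    for source in (gs, wd):
        for key, value in source.items():
            values = value if isinstance(value, list) else [value]
            bucket = kb.setdefault(key, [])
            for v in values:
                if v not in bucket:
                    bucket.append(v)
    return kb

# Illustrative records for one entity (values are placeholders).
gs = {
    "name": "Robert Tibshirani",
    "affiliation": "Stanford University",
    "interests": ["statistics", "machine learning"],
}
wd = {
    "name": "Robert Tibshirani",
    "occupation": "statistician",
    "interests": ["statistics"],
}
kb = merge_knowledge(gs, wd)
print(kb)
```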
F-link method: Knowledge base construction
[Figure: Google Scholar sample ⊕ Wikidata sample → KB]
• F-Link: F stands for Facebook; goal: find the Facebook profile link.
1. Facebook (FB):
  • Profile data extraction.
  • Extracted data is stored in the knowledge base 𝐾𝐵𝑓.
  • Other profile extractions, such as workplace and place of residence.
2. Google Scholar:
  • Profile headers.
  • Biography from PDF publications.
3. Wikidata
4. F-Link produces a link 𝐿𝑓, where 𝐿𝑓 = 𝐾𝐵 ⊗ 𝐾𝐵𝑓
F-link algorithm
• Screenname: first name and last name.
• Biography: short description of the profile owner.
• Content: collection of posts, shares, etc.
[Figure: a name search returns n profiles; KB ⊗ KBf selects 1 profile, e.g. facebook.com/rob.tibshirani]
• Each profile feature has a context:
• Semantic
• Syntactic
• Pairs {𝑠1, 𝑠2} of features from 𝐾𝐵 and 𝐾𝐵𝑓 are matched using the similarity measure compatible with their context.
• Entity name
• N-gram
• [𝐾𝐵 : “Robert Tibshirani”] , [𝐾𝐵𝑓: “Rob
Tibshirani”]
• Affiliation and biography
• Stop words removal
• NER
• Cosine similarity
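The two measures named above can be sketched in a few lines. This is a minimal illustration, assuming character bigrams with a Dice coefficient for names and a bag-of-words cosine for affiliations and biographies; the stop-word list is a tiny placeholder and the NER step is omitted. The exact variants used in the thesis are not specified on the slide.

```python
import math
from collections import Counter

def char_ngrams(text, n=2):
    """Set of character n-grams of the lower-cased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_similarity(a, b, n=2):
    """Dice coefficient over character n-gram sets, in [0, 1]."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

STOP_WORDS = {"at", "the", "of", "a", "an", "in", "and"}  # placeholder list

def cosine_similarity(a, b):
    """Cosine similarity between word-count vectors after stop-word removal."""
    va = Counter(w for w in a.lower().split() if w not in STOP_WORDS)
    vb = Counter(w for w in b.lower().split() if w not in STOP_WORDS)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

name_sim = ngram_similarity("Robert Tibshirani", "Rob Tibshirani")
aff_sim = cosine_similarity("Professor at the University of Stanford",
                            "Professor at Stanford University")
print(round(name_sim, 3), round(aff_sim, 3))
```

The name pair from the slide scores high but below 1, which is the behaviour one wants for nicknames and abbreviations.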
F-link method: similarity measures
Affiliation similarity workflow: Affiliation (text) → Tokenize (NER) → Syntactic (metric).
Example: “Professor at the University of Stanford” → e = [“University of Stanford”: Organization, “Professor”: Object] → score in [0, 1].
• K-Link: K stands for Twitter and LinkedIn; goal: find the Twitter and LinkedIn profile links, i.e. the pairs (FB, TW) and (FB, LIn).
• It is composed of 5 main steps:
1. Knowledge base construction.
2. Twitter/LinkedIn search (by name).
3. Similarity calculation.
4. Classification/clustering.
5. Twitter/LinkedIn profile link output.
Embedding social profile links to knowledge graphs
K-Link algorithm (Matcher 2 – M2)
K-Link steps: Knowledge base → Twitter/LinkedIn search → Similarity calculation → Classification/Clustering → Twitter/LinkedIn profile link
• Initial input of the matcher: the set of “reliable” Facebook profile links produced by the F-Link matcher.
• The knowledge base is constructed from the data inside each
profile.
• Screenname, affiliation, life events, biography, etc.
• The key features of this matcher:
1. Life events
2. Biographies
K-link algorithm
[Figure: a name search returns n profiles; KBf ⊗ KBk selects 1 profile, e.g. linkedin.com/rob.tibshirani]
• Facebook, Twitter, and LinkedIn
  1. Basic profile attributes
  2. Life events
  3. Biography
• Formally, let T, L and F represent the Twitter, LinkedIn and Facebook OSNs, respectively.
• The profile of a user i in either T, L or F is represented as P_i^s, where s ∈ {T, L, F}.
• The profile attributes of a user i are modeled as follows: P_i^s = {n_i^s, l_i^s, e_i^s, b_i^s, p_i^s, d_i^s}.
• n denotes the screenname, l the location, e the life events, b the profile biography, p the profession and d the date of birth.
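The profile model P_i^s maps naturally onto a small record type. The sketch below is one way to encode it; the class name, field names, and field types are assumptions chosen for readability, with one field per symbol (n, l, e, b, p, d).

```python
from dataclasses import dataclass, field
from typing import List, Optional

NETWORKS = {"T", "L", "F"}  # Twitter, LinkedIn, Facebook

@dataclass
class Profile:
    network: str                                           # s
    screenname: str                                        # n
    location: Optional[str] = None                         # l
    life_events: List[str] = field(default_factory=list)   # e
    biography: Optional[str] = None                        # b
    profession: Optional[str] = None                       # p
    birthdate: Optional[str] = None                        # d

    def __post_init__(self):
        if self.network not in NETWORKS:
            raise ValueError(f"unknown network: {self.network!r}")

    def available_attributes(self) -> List[str]:
        """Names of the attributes that carry a non-empty value."""
        names = ["screenname", "location", "life_events",
                 "biography", "profession", "birthdate"]
        return [name for name in names if getattr(self, name)]

# Illustrative profile.
p = Profile(network="F", screenname="Rob Tibshirani",
            life_events=["new job"], biography="Professor")
print(p.available_attributes())
```

Keeping missing attributes as empty values makes it easy to skip them when similarities are computed per attribute pair.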
K-link method: feature extraction
Life event similarity workflow: Life event (text) → Semantic (LDA).
Example: “Started a new job at Google” → e = “new job”.
[Figure: life event search mechanism]
• Similarity is measured over a set of attributes:
• Screennames
• Life events
• Biographies
• Affiliations
• Locations
• Birthdates
• IDs
• Each attribute is matched with a specific similarity metric.
K-link method: similarity measures
• In step 3, the matching results are processed in a bottom-up machine learning paradigm.
1. Clustering
2. Classification
• The dataset used for this task includes the similarity result of each pair of
attributes from two social networks.
• Why did we combine clustering and classification models?
Combine clustering and classification methods to find the social profile links
[Figure: classification vs. clustering; labels: label prediction, more powerful, class prediction, insufficient labeled data]
• Clustering
  • Each profile similarity represents a feature vector.
  • Only the cluster with the highest confidence (𝐶𝑓) is kept.
  • The average of each cluster is calculated.
  • The clusters are filtered, and the one with the highest average is passed to the classification task.
1- Data clustering
Example: C1 = {0.2, 0.2, 0.5}, C2 = {0.4, 0.4, 0.5}, C3 = {0.8, 0.8, 0.8}. Comparing the cluster averages selects C3, so 𝑪𝒇 = 0.8.
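The cluster filtering in this example can be reproduced with a few lines. The sketch assumes the clusters have already been formed (the clustering algorithm itself is not shown) and only implements the "keep the cluster with the highest average" rule; the function name is hypothetical.

```python
def select_confident_cluster(clusters):
    """Return (name, average) of the cluster with the highest mean similarity."""
    averages = {name: sum(values) / len(values) for name, values in clusters.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]

# Values from the slide example.
clusters = {
    "C1": [0.2, 0.2, 0.5],
    "C2": [0.4, 0.4, 0.5],
    "C3": [0.8, 0.8, 0.8],
}
best_cluster, confidence = select_confident_cluster(clusters)
print(best_cluster, round(confidence, 2))
```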
• Classification
  • Each profile in the previously selected confidence cluster is classified.
  • The binary classification task has two classes: “match” or “not match”.
  • Only the cluster with the highest confidence (𝐶𝑓) is used.
  • We use a naïve Bayes classifier (BNC).
  • The feature vector is 𝑣 = [𝑥1, 𝑥2, … , 𝑥𝑛], where each 𝑥𝑛 is the similarity between a pair of attributes (e.g., life events).
  • In addition, we compared the performance of BNC with two other methods: SVM and decision trees.
• Result of the classifier:
  • Facebook link (𝐿𝑓) and Twitter/LinkedIn link (𝐿𝑘).
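To make the classification step concrete, here is a small self-contained Gaussian naïve Bayes sketch over similarity vectors. It is an illustration under assumptions: the Gaussian likelihood, the variance smoothing constant, and the training vectors (name, biography, life-event similarity) are all made up for the example and are not the thesis model or data.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means and smoothed variances for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-3   # smoothing avoids zero variance
                     for col, m in zip(columns, means)]
        model[label] = (n / len(y), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(prior, means, variances):
        total = math.log(prior)
        for v, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(*model[label]))

# Hypothetical labeled similarity vectors: [name, biography, life event].
X = [[0.90, 0.6, 0.8], [0.85, 0.5, 0.7], [0.95, 0.7, 0.9],
     [0.30, 0.2, 0.1], [0.40, 0.1, 0.2], [0.20, 0.3, 0.0]]
y = ["match", "match", "match", "not match", "not match", "not match"]

model = fit_gaussian_nb(X, y)
print(predict(model, [0.88, 0.55, 0.75]))
print(predict(model, [0.25, 0.20, 0.10]))
```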
2- Data classification
[Figure: BNC tree example]
• Data sources
• 5,694 are used in M1 and M2.
• Maximum life events / user profile is 8 events. In addition,
each class of event has a total of 2.2K at maximum.
• Up to 83 name matches from Facebook.
• Machine learning dataset:
  • 300 manually labeled instances.
  • Each newly classified instance is added to the main dataset to improve the classifier's performance.
Implementation: dataset facts
Source                             Type
Google Scholar                     Scholarly
Wikidata                           Knowledge graph
Facebook, Twitter, and LinkedIn    Online social networks
[Figures: life events statistics; M1 name search matching results (chart values: 83, 50, 25, 10)]
• We compare the precision and recall trade-offs across multiple domains.
• Four domains are considered in our study:
  • CS = Computer Science, P = Physics, C = Chemistry, M = Medicine.
• Results:
  • LinkedIn (LIn) has the highest precision and recall:
    • The platform is more structured.
    • User information is more confident.
    • Profiles are updated regularly.
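For reference, precision and recall over a set of predicted links can be computed as below. The (entity, profile) pairs are hypothetical identifiers used only to exercise the function; they are not results from the thesis.

```python
def precision_recall(predicted_links, true_links):
    """Precision = correct / predicted; recall = correct / expected."""
    predicted, truth = set(predicted_links), set(true_links)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical (entity, profile) pairs.
truth = [("e1", "p1"), ("e2", "p2"), ("e3", "p3"), ("e4", "p4"), ("e5", "p5")]
predicted = [("e1", "p1"), ("e2", "p2"), ("e3", "p3"), ("e4", "p9")]
precision, recall = precision_recall(predicted, truth)
print(precision, recall)
```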
Evaluation on different domains: precision and recall
[Figure: multiple domain comparison results]
We validate that our approach can match multiple domains.
• HYDRA [61]: a system for linking identical user accounts by analyzing and comparing the behavior of users.
• BM25 [39]: an approach for identifying a user across social networks by comparing tagging practices and usernames.
• MOBIUS [113]: connects user profiles across social networks by comparing behavioral characteristics such as the timestamps between posts.
• OPL [115]: an approach for connecting social networking user profiles using internal and external features.
Benchmarking with baselines (1/2): precision and recall
• Comparison with other
knowledge graphs
• Automatic method better than
manual.
• Multiple domains are compared.
• K-link baselines
Benchmarking with baselines (2/2): precision and recall
• Baseline: other approaches.
  • HYDRA [61]: link users by behavior matching.
• Baseline: with/without named entities.
[Figures: successfully found links; comparison to other approaches]
We validate that our approach outperforms many baselines and is similar to HYDRA in precision and recall.
• Cases: supervised methods only, unsupervised methods only, and their combination.
• With biographies (WB) and without (WoB).
• With life events (WL) and without (WoL).
• Using both methods yields higher precision and recall than using only one.
Evaluation on supervised, un-supervised, & both: precision and recall
[Figure: comparing the matching results on 3 machine learning cases]
We validate that biographies and life events enhance precision and recall. Combining classification and clustering also enhances the results in our case.
• We study the impact of considering the profile biographies that exist inside the PDF publications.
• We show how the number of matches obtained before using any biography clearly improves after using one or more biographies.
• Why does the number of matches improve?
  • Biographies contain extra information.
Positive impact of PDF biographies
Comparing the results of matching 4 researchers before and after using biographies
We validate that biographies can enhance the accuracy of the matching results.
Name                 #matches before (b=0)   #matches after (b≥1)   Difference
Jeff Offutt          64                      78
Trevor Hastie        55                      66
Eric Yu              61                      70
Robert Tibshirani    71                      84
Average              62.75                   74.5                   11.75%
(#matches before and after using biographies; total = 100 in both cases.)
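The averages in the table can be checked directly from the four rows. Because each total is 100, the difference of 11.75 matches corresponds to 11.75 percentage points. The variable names are illustrative.

```python
# Matches out of 100, taken from the table above.
before = {"Jeff Offutt": 64, "Trevor Hastie": 55, "Eric Yu": 61, "Robert Tibshirani": 71}
after = {"Jeff Offutt": 78, "Trevor Hastie": 66, "Eric Yu": 70, "Robert Tibshirani": 84}

avg_before = sum(before.values()) / len(before)   # (64 + 55 + 61 + 71) / 4
avg_after = sum(after.values()) / len(after)      # (78 + 66 + 70 + 84) / 4
gain = avg_after - avg_before

print(avg_before, avg_after, gain)
```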
• Pros of including biographies:
  • Solve the problem of private user profile information.
  • Image is the highest available attribute.
  • Related approaches did not consider it.
• Pros of including life events:
  • Content characteristic:
    • Solve the content trade-offs among OSNs (text-only on Facebook vs. image-only on Twitter).
    • Solve the problem of outdated profiles (e.g., the last post on Twitter is 3 months older than the last post on Facebook).
    • Solve the volume trade-off problem: e.g., zero tweets on Twitter compared to n posts on Facebook and LinkedIn.
    • Solve the language difference issue (English on Facebook vs. Chinese on Twitter).
Pros of our approaches: case studies
[Figures: biography vs. attribute-based approaches; life events vs. behavioral-based approaches]
• We study the accuracy of each similarity function.
• We consider the following profile attributes: screenname, location, life event, biography, profession, date of birth, and gender.
• Similarity scores for biographies usually fall in the range [0.4, 0.7].
• Location, birthdate, and profession usually have scores in the range [0.8, 1].
  • These attributes take specific and limited values (e.g., gender: male, female).
Similarity measure scores
[Figures: similarity measure scores for different attributes; panels: K-Link scores, F-Link scores]
Hussein Hazimeh, Elena Mugellini, Simon Ruffieux, Omar Abou Khaled, Philippe Cudré-Mauroux.« Automatic Embedding of Social Network Profile Links into Knowledge
Graphs. » In 9th International Symposium on Info & Communication Technology - ACM (SoICT 2018). Da Nang, Vietnam.
Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. «SocialMatching++: A Novel Approach for Interlinking User Profiles on Social Networks. » In
PROFILES@ISWC 2017. Vienna, Austria.
Conclusions and future works
• In this thesis, we introduced new methods to reinforce entities in a knowledge graph.
• Main contributions recap:
• (1) comparative review on user profile matching on OSNs.
• (2) profiling online social networking users.
• (3) embedding social network profile links to academic entities extracted from the Wikidata knowledge graph.
• (4) introduced a new method for linking social profiles across different OSNs.
• (5) calculated and added sentiment polarities for social event entities extracted from the Wikidata knowledge graph. (Not presented because of the time limit; it can be discussed during Q&A.)
Conclusions
• Methods:
• We used new resources and features in all of our
methods, which are not used in the related work.
• Results:
• We show that our methods can outperform the existing
methods in terms of precision, recall, and accuracy.
Lessons learned and limitations
Lessons learned
UPAD
• Why some attributes lack information.
• Which attributes contain more information than others.
• User engagement with OSNs.
Finding social links
• Life event importance.
• Machine learning model efficiency.
Finding sentiment
• Integrating features other than text can increase the certainty of the sentiment polarity.
• Temporal sentiment tracking showed remarkable changes.
Limitations
• User profile level: location detection.
• Profile matching level: matching failure between a particular pair of events.
• Multimedia contents were not studied.
• OSN API updates and structure modifications (recently: Facebook, Twitter, and LinkedIn).
Open problems
Resources
• Integrate additional resources.
• More OSNs (Medium, Reddit, …).
Entities
• Cover more entities.
• Measure the quality of links.
Features
• Take into consideration images, tags, and check-ins.
Matching algorithms
• Develop matching algorithms for multimedia contents.
Publications
1. Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. «Linking user profiles in social
networks: a comparative review. » International Journal of Social Network Mining, Volume 2: 333-361 - 2017.
2. Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled. « Reliable User Profile Analytics and Discovery on Social
Networks. » In 8th International Conference on Software and Computer Applications - ACM (ICSCA 2019).Penang,
Malaysia.
3. Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. «SocialMatching++: A Novel
Approach for Interlinking User Profiles on Social Networks. » In PROFILES@ISWC 2017. Vienna, Austria.
4. Hussein Hazimeh, Elena Mugellini, Simon Ruffieux, Omar Abou Khaled, Philippe Cudré-Mauroux. « Automatic
Embedding of Social Network Profile Links into Knowledge Graphs. » In 9th International Symposium on Info &
Communication Technology - ACM (SoICT 2018). Da Nang, Vietnam.
5. Hussein Hazimeh, Mohammad Harissa, Elena Mugellini, Omar Abou Khaled. « Temporal Sentiment Analysis and
Tracking of Large-scale Social Events. » In 8th International Conference on Software and Computer Applications -
ACM (ICSCA 2019). Penang, Malaysia.
6. Hussein Hazimeh, Ahmad Traboulsi, Hasan Noureddine, Elena Mugellini, Omar Abou Khaled. « Social Networks
Serving Web Feeds: An Approach for Web Feed Enrichment. » In 10th International Conference on Information
Management and Engineering - ACM (ICIME 2018). Manchester, UK.
7. Sajida Chamass, Hussein Hazimeh, Jawad Makki, Elena Mugellini, Omar Abou Khaled. «Lexicon-based sentiment
analysis approach for ranking event entities. » In International Journal of Services and Standards, Volume 12: 126-139.
(The first author was a master's student under my supervision.)
8. H Hussein, Y Iman, M Jawad, N Hassan, T Julien, AK Omar, M Elena. «Leveraging Co-authorship and Biographical
Information for Author Ambiguity Resolution in DBLP. » The 30-th IEEE International Conference on Advanced
Information Networking, AINA 2016. Crans-Montana, Switzerland.
[32] O. Goga, H. Lei, S. Hari, G. Friedland, R. Sommer, and R. Teixeir. Exploiting innocuous activity for correlating users across sites. In 22nd International World Wide Web
Conference, WWW 2013, pages 447–458.
[87] Y. Sha, Q. Liang, and K. Zheng. Matching user accounts across social networks based on user message. In International Conference on Computational Science, ICCS 2016, pages
2423–2427.
[112] R. Zafarani and H. Liu. Connecting corresponding identities across communities. In ICWSM.
[33] O. Goga, P. Loiseau, R. Sommer, R. Teixeira, and K.P. Gummadi. On the reliability of profile matching across large online social networks. In KDD.
[#] N. Bennacer, C.N. Jipmo, A. Penta, and G. Quercini. Matching user profiles across social networks. In Advanced Information Systems Engineering - 26th International Conference,
CAiSE 2014, pages 424–438.
[95] T. Van Le, T.N. Truong, and T. Vu Pham. A content-based approach for user profile modeling and matching on social networks. In Multi-disciplinary Trends in Artificial
Intelligence - 8th International Workshop, MIWAI 2014, pages 232–243
[82] E. Raad, R. Chbeir, and A. Dipanda. User profile matching in social networks. In The 13th International Conference on Network-Based Information Systems, NBiS 2010, pages
297–304.
[41] P. Jain, P. Kumaraguru, and A. Joshi. @i seek ’fb.me’: identifying users across multiple online social networks. In WWW (Companion Volume).
[61] S. Liu, S. Wang, F. Zhu, J. Zhang, and R. Krishnan. Hydra: large scale social identity linkage via heterogeneous behavior modeling. In International Conference on Management of
Data, SIGMOD 2014, pages 51–62.
[113] R. Zafarani and H. Liu. Connecting users across social media sites: a behavioralmodeling approach. In The 19th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, KDD 2013, pages 41–49.
[73] A. Nunes, P. Calado, and B. Martins. Resolving user identities over social networks through supervised learning and rich similarity features. In Proceedings of the 27th Annual
ACM Symposium on Applied Computing SAC 2012, pages 728–729.
[66] M. Motoyama and G. Varghese. I seek you: searching and matching individuals in social networks. In Proceedings of the 5th ACM International Conference on Web Search and
Data Mining (WSDM) 2009, pages 67–75.
(NaBIC) 2015, pages 417–428.
29.06.2019 53
References (1/2)
| Hussein Hazimeh PhD presentation
[2] S. Bartunov, A. Korshunov, S. Taek Park, W. Ryu, and H. Lee. Joint link-attribute user identity resolution in online social networks. In SNAKDD Workshop.
[98] J. Vosecky, D. Hong, and V.Y. Shen. User identification across multiple social networks. In Networked Digital Technologies, First International Conference 2009
[88] ] Y. Shen and H. Jin. Controllable information sharing for user accounts linkage across multiple online social networks. In CIKM.
[54] W. Liang, B. Meng, and L. Xianchao. Gcm: A greedy-based cross-matching algorithm for identifying users across multiple online social networks. In PAISI.
[84] R. Roedler, D. Kergl, and G. Dreo Rodosek. Profile matching across online social networks based on geo-tags. In Proceedings of the 7th World Congress on Nature and
Biologically Inspired Computing
[75] A. Panchenko, D. Babaev, and S. Obiedkov. Large-scale parallel matching of social network profiles. In AIST.
[79] O. Peled, M. Fire, and Y. Elovici. Matching entities across online social networks. In Neurocomputing, page 91–206.
[40] P. Jain and P. Kumaraguru. Other times, other values: leveraging attribute history to link user profiles across online social networks. In ACM (HT).
[80] D. Perito, C. Castelluccia, M. Ali Kaafar, and P. Manils. How unique and traceable are usernames? privacy enhancing technologies. In PETS.
[93] M. Szomszor, I. Cantador, and H. Alani. Correlating user profiles from multiple folksonomies. In Proceedings of the 19th ACM Conference on Hypertext and Hypermedia 2008,
pages 33–42.
[39] T. Iofciu, P. Fankhauser, F. Abel, and K. Bischoff. Identifying users across social tagging systems. In Proceedings of the Fifth International Conference on Weblogs and Social
Media 2011.
[62] ] A. Malhotra, L.C. Totti, Meira Jr. W., P. Kumaraguru, A. Virgílio, and F. Almeida. Studying user footprints in different online social networks. In International Conference on
Advances in Social Networks Analysis and Mining, ASONAM 2012, pages 1065–1070.
[115] H. Zhang, M-Y. Kan, Y. Liu, and S. Ma. Online social network profile linkage. In Information Retrieval Technology - 10th Asia Information Retrieval Societies Conference,
AIRS 2014, pages 197–208.
[99] S. Vosoughi, H. Zhou, and D. Roy. Digital stylometry: linking profiles across social networks. In 8th International Conference Social Informatics, SocInfo.
29.06.2019 54
References (2/2)
| Hussein Hazimeh PhD presentation

Final PhD defense presentation

  • 1. PhD candidate: Hussein Hazimeh Director: Prof. Philippe Cudré-Mauroux / UNI-FR Co-Director: Prof. Elena Mugellini / HES-SO 28.06.2019 Automatic Knowledge Graph Entity Refinement Based on Social Networks
  • 2. Introduction Research problems Research questions Contributions Conclusions & open problems 29.06.2019 Hussein Hazimeh PhD presentation 2 Agenda
  • 4. Introduction: Knowledge Graphs (1/3)
    • What is a knowledge graph?
      • Objectives: KGs allow users to visualize knowledge facts about real-world entities (nodes) and the interrelations between them (edges).
      • Data sources: incorporate knowledge from structured repositories such as DBpedia; extract knowledge from semi-structured web resources such as Wikipedia.
      • Privacy: KGs are divided into private (their knowledge cannot be used or analysed) and public (their knowledge can be used and analysed).
  • 5. Introduction: Knowledge Graphs (2/3)
    • Why use knowledge graphs?
      • Transform data into knowledge.
      • Represent knowledge in the form of entities and relations.
      • Connect different types of data.
      • Improve decisions by finding things faster.
      • Re-use publicly available industry graphs and ontologies.
      • Readable by humans and machines.
      • Already in use in all types of industries (gas, pharmaceutical, banking, and retail).
  • 6. Introduction: Knowledge Graphs facts (3/3)
    • [Figures: Google knowledge panel, Wikidata KG, YAGO KG]
  • 7. Research problems
  • 8. Research problems: (1) knowledge graphs
    • Problem (P1), missing links in knowledge graphs:
      • Knowledge graphs (KGs) lack their entities' social links (e.g., Facebook, Twitter).
      • Leaving this problem unsolved forces a time-consuming manual search for these links.
      • Because many systems (data sources) rely on these KGs, they miss these links as well.
    • Importance of solving this problem:
      1. Automation: manual linking is time-consuming; automatic linking is faster.
      2. Utility for other systems: digital libraries, recommendation systems, recruiting systems.
  • 9. Research challenges: (2) online social networks
    • Problem (P2), matching challenges on social networks:
      • The number of Online Social Network (OSN) users keeps increasing, so user profile de-anonymization becomes harder. For example, 75,980 Facebook users are named “John Smith”.
      • Privacy and access control policies applied by users limit access to certain information about individuals.
      • Outdated user profile data (location, work, profile image, etc.).
      • Non-identical information (users can share different information on each of their OSN profiles).
    • Importance of solving this problem:
      • Automate the user linking process to make it time-efficient.
      • Find new privacy-aware attributes for matching (biographies).
      • Find new context-aware attributes for matching (life events).
    • [Chart labels: 2008: <100M; 2019: >2B]
  • 10. Motivating scenario
    • 100 entities per class are examined.
    • Both private (Google) and public (Wikidata) knowledge graphs are considered.
    • Examination results show that:
      • The average share of entities with social links does not exceed 50%.
      • Academic entities have the lowest share of social links (<4%).
  • 11. Research questions
  • 12. Research Questions
    1. General research question to address:
      • Given a knowledge graph, how can we reinforce the entities inside this graph?
    2. Specific research questions to address:
      • RQ1: How to profile the users of online social networking platforms?
      • RQ2: Given an entity and its knowledge graph, what are the likely corresponding OSN profile links of this entity?
      • RQ3: Given a user profile on Facebook, what are the corresponding user profile links on other OSNs?
    • [Slide annotations: “Concerns P1.”, “Concerns P2.”]
  • 14. Contributions
    • C1: User Profile Analytics and Discovery on Social Networks.
    • C2: Comparative Review of Social Network Profile Matching Methods.
    • C3: Novel Method for Interlinking User Profiles on Social Networks.
    • C4: Automatic Embedding of Academic Entity Social Links into Knowledge Graphs.
    • C5: Automatic Embedding of Social Event Sentiment Polarities into Knowledge Graphs.
    • [Diagram labels: data sources (Google Scholar), knowledge bases (KB), existing and new KGs, embedding, building new KGs, life events, biographies, sentiment features, profiling, finding social links and sentiment. Annotations: “Answers RQ1.”, “Answers RQ3.”, “Answers RQ2.”]
  • 15. Related work
    • Knowledge graph embedding:
      • Approaches for knowledge graph completion:
        • General methods: internal and external.
        • Translation-based methods, which translate the entity/relation from head to tail: TransE [9], TransH [102], TransR [56], TranSpace [43].
    • Dataset profiling:
      • Social networks [65]: YouTube, Flickr, Orkut.
      • Other datasets: knowledge graphs [78].
    • Sentiment analysis on social events:
      • Lexicon-based [48, 105].
      • Supervised [58].
    • Table from the slide:
      Reference       Method               Sources
      [103]           External             Search engines
      [46]            Internal             Reinforcement learning
      [96]            External             Social media
      [86]            External             DBLP, Microsoft Academic Search
      [31]            Internal             LSH
      [19]            Internal, external   External knowledge graphs
      [92]            External             Social networks
      [114, 89, 26]   Internal             Machine learning
    • [Figure labels: car driver, farmer, Farming, Skills]
  • 16. C1: User Profile Analytics and Discovery on Social Networks
  • 17. User Profile Analytics and Discovery (UPAD)
    • Dataset profiling is the task of creating descriptive metadata about entities. An entity can be a person, an organization, a database, etc.
    • Profiling methods and measures:
      • Global measures: measure the facts of a dataset (its size, its shape, etc.).
      • Platform measures: study the dataset features according to its form (for example, a database or a graph dataset).
    • Our task: analyze datasets of user profiles from 4 OSNs: Facebook ($D_F$), Google+ ($D_{G+}$), LinkedIn ($D_L$), and Twitter ($D_T$).
    • Measures:
      I. Platform-based measures: (1) attribute availability, (2) activity frequency, (3) profile completeness, (4) mutability index.
      II. Entity-based measure: (1) profile confidentiality.
  • 18. Platform-based Measure: (1) Attribute Availability
    • For each attribute in a dataset $D_S$, study its percentage of availability (availability of $A_S$ on $D_S$).
    • Results:
      • Most available attribute: screennames.
      • LinkedIn and Google+ have the highest and lowest image availability, respectively.
      • Google+: lower education availability than Facebook; location rarely available.
      • LinkedIn: location is available, at a level between Facebook and Twitter.
      • Biographies are highly available on LinkedIn and Google+, and similarly available on Facebook and Twitter.
    • [Chart labels: 35.6%, 38.1%]
  • 19. Platform-based Measure: (2) Activity Frequency
    • Activity frequency: the amount of public content (tweets, posts, shares, retweets) published on a user timeline (wall).
    • 5 features analyzed; only the highly available features are considered.
    • Feature legend from the slide: number of links (l), text (t), check-ins (c), photos (p), tags (t), mentions (m).
    • Results:
      • Text is the most available content type on all OSNs except LinkedIn.
      • Facebook has the highest amount of content per user.
  • 20. Platform-based Measure: (3) User Profile Completeness
    • Profile completeness: the portion of public-only attributes.
    • Classes: complete-without-A, only-A, fully complete, incomplete.
    • Results: surprisingly, only a very small portion of user profiles on OSNs are totally incomplete.
  • 21. Platform-based Measure: (4) Mutability Index
    • Mutability index: a user $U$ has an attribute $A$, where the old value of $A$ is $A_o$ and the new value of $A$ is $A_n$.
    • Results:
      • Facebook, Google+, and Twitter: the biography is the most mutable attribute.
      • LinkedIn: publications are the most mutable attribute.
      • In contrast, screenname, gender, and birthdate were the least mutable profile attributes.
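The transcript does not preserve the slide's formula for the mutability index, so the following is only a sketch under one plausible assumption: the index of an attribute is the fraction of observed users whose old value $A_o$ differs from the new value $A_n$. The function name and the toy data are hypothetical.

```python
from typing import List, Tuple

# One observation per user: (old value A_o, new value A_n) of a single attribute.
Observation = Tuple[str, str]


def mutability_index(observations: List[Observation]) -> float:
    """Assumed definition: share of users whose attribute value changed."""
    if not observations:
        return 0.0
    changed = sum(1 for old, new in observations if old != new)
    return changed / len(observations)


# Hypothetical toy data, not taken from the thesis datasets.
biography = [("PhD student", "Postdoc"), ("Engineer", "Engineer"), ("Runner", "Marathon runner")]
birthdate = [("1990-01-01", "1990-01-01"), ("1985-05-05", "1985-05-05")]

bio_index = mutability_index(biography)
birth_index = mutability_index(birthdate)
```

Under this assumed definition a frequently edited attribute such as the biography scores higher than a stable one such as the birthdate, which matches the direction of the results reported on the slide.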
  • 22. Entity-based Measure: (1) Profile Confidentiality
    • Confidentiality model: calculates the probability that a user account is confident (real) or non-confident (fake).
    • To study confidentiality, we consider a set of profile attributes and assign a weight to each one.
    • Results:
      • LinkedIn and Facebook profiles are the most confident among the studied OSNs, with 95.3% and 83.78% respectively.
      • Google+ and Twitter are mostly “unlikely confident”, with a non-confident score of 22.2% for Google+ and 37.76% for Twitter.
    • Publication: Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled. « Reliable User Profile Analytics and Discovery on Social Networks. » In 8th International Conference on Software and Computer Applications - ACM (ICSCA 2019). Penang, Malaysia.
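A weighted-attribute score of the kind described above could look like the sketch below. The attribute names, weights, and threshold are hypothetical placeholders; the transcript does not state which attributes or weights the thesis uses.

```python
from typing import Dict

# Hypothetical weights; the actual attributes and weights are not given in the slides.
WEIGHTS: Dict[str, float] = {
    "image": 0.3,
    "biography": 0.3,
    "location": 0.2,
    "education": 0.2,
}


def confidentiality_score(profile: Dict[str, str]) -> float:
    """Weighted share of the attributes that are present and non-empty."""
    total = sum(WEIGHTS.values())
    present = sum(weight for attr, weight in WEIGHTS.items() if profile.get(attr))
    return present / total


def is_confident(profile: Dict[str, str], threshold: float = 0.5) -> bool:
    """Label a profile as confident when its score reaches the (assumed) threshold."""
    return confidentiality_score(profile) >= threshold


sample = {"image": "avatar.jpg", "biography": "Professor of statistics"}
sample_score = confidentiality_score(sample)
```

An empty profile scores 0 and a profile with every weighted attribute scores 1, so the score can be read as a rough probability-like value in [0, 1].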
  • 23. C2: Comparative Review of Social Network Profile Matching Methods
  • 24. Profile matching definition
    • Problem: a single user has profiles on several different social networking sites.
    • Challenge: linking the same user's profiles across OSNs is hard because the information differs between these profiles.
    • How?
      • Leverage a set of features from user profiles and the social network.
      • Introduce a similarity measure that depends on the context of the attribute value (text, date, etc.).
      • Develop the matching algorithm.
    • Analogy from the slide: match the kiwifruit in two baskets (b1 and b2) using its features (color: green; shape: oval; seeds: edible; texture: soft), in the same way that John's profiles are matched across multiple OSNs.
  • 25. Profile matching approaches: comparative analysis
    • More than 16 distinct features have been used.
      • Attribute-based: location, work, image.
      • Behavioral-based: timestamps, writing style, topic detection.
    • Are all features used by the state of the art (SOA)? No.
    • Do all features lead to efficient matching results? No.
    • Cons of behavioral-based features: the trade-off in activity between different accounts.
    • Cons of attribute-based features: the privacy and access control methods users apply.
    • Chart data:
      • Resources: Facebook 36%, Twitter 33%, Google+ 9%, LinkedIn 12%, Flickr 10%.
      • Associations: FB<->TW 44%, FB<->LIn 17%, TW<->LIn 22%, TW<->FL 17%.
      • Feature types: content and behavioral 39%, profile attributes 55%, both 6%.
      • Publications by year (chart spanning 2007 to 2015): 1, 1, 3, 2, 3, 3, 5, 6, 6.
      • [Figure: similarity methods tree]
    • Publication: Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. « Linking user profiles in social networks: a comparative review. » International Journal of Social Network Mining, Volume 2: 333-361 - 2017.
  • 26. C3: Automatic Embedding of Academic Entity Social Links into Knowledge Graphs
  • 27. Embedding social profile links to knowledge graphs: architecture
    • We propose a query-based approach for social link embedding.
      1. Query: entity name
        A. Knowledge acquisition: (i) mainly Google Scholar, (ii) Wikidata Metaphacts.
      2. Profile matching
        A. Knowledge base to social network matching: (i) F-Link, (ii) K-Link.
      3. Machine learning
        A. Bottom-up paradigm: (i) clustering, (ii) classification.
      4. Enrichment and storage
        A. Embedding the results into Wikidata.
        B. Semantic storage and visualization.
  • 28. Embedding social profile links to knowledge graphs: F-Link algorithm (Matcher 1, M1)
    • F-Link is an algorithm that finds the Facebook profile link of a specific entity.
    • It is composed of 5 main steps:
      1. Knowledge base construction.
      2. Facebook search (by name).
      3. Similarity calculation.
      4. Classification/clustering.
      5. Facebook profile link output (with knowledge base).
  • 29. F-Link method: knowledge base construction
    • Knowledge acquisition:
      • The initial knowledge base (KB) is constructed from Google Scholar (GS) and Wikidata (WD): $KB = GS \oplus WD$.
      • Knowledge from both resources is integrated into one KB.
    • GS feature extraction: profile headers, profile publications.
    • Wikidata feature extraction: the Wikidata knowledge graph of the corresponding entity name.
    • Result: knowledge base.
    • [Figures: Google Scholar sample, Wikidata sample]
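One way to read $KB = GS \oplus WD$ is as an attribute-wise union of the two sources. The sketch below assumes exactly that; the slides do not define the operator formally, and the dictionary layout, function name, and sample values are illustrative only.

```python
from typing import Dict, List

KnowledgeBase = Dict[str, List[str]]


def merge_kb(gs: KnowledgeBase, wd: KnowledgeBase) -> KnowledgeBase:
    """Assumed reading of KB = GS (+) WD: union the values of each attribute,
    dropping duplicates while preserving first-seen order."""
    kb: KnowledgeBase = {}
    for source in (gs, wd):
        for attribute, values in source.items():
            bucket = kb.setdefault(attribute, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return kb


# Illustrative records; the field names are hypothetical.
gs_record = {"name": ["Robert Tibshirani"], "affiliation": ["Stanford University"]}
wd_record = {"name": ["Robert Tibshirani"], "occupation": ["statistician"]}

kb = merge_kb(gs_record, wd_record)
```

Keeping every attribute as a list lets conflicting values from the two sources coexist, so the later similarity step can compare a candidate profile against each of them.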
  • 30. F-Link algorithm
    • F-Link: F stands for Facebook; the goal is to find the Facebook profile link.
      1. Facebook (FB):
        • Profile data extraction: screenname (first name and last name), biography (short description of the profile owner), content (collection of posts, shares, etc.).
        • Extracted data is stored in the knowledge base $KB_f$.
        • Other profile extractions, such as workplace and place of residence.
      2. Google Scholar: profile headers; biography from PDF publications.
      3. Wikidata.
      4. F-Link produces a link $L_f$, with $L_f = KB \otimes KB_f$.
    • [Diagram: $KB \otimes KB_f$ reduces the n profiles found for a name to 1 profile, e.g. facebook.com/rob.tibshirani]
  • 31. F-Link method: similarity measures
    • Each profile feature has a context: semantic or syntactic.
    • Pairs $\{s_1, s_2\}$ of features from $KB$ and $KB_f$ are matched using the compatible similarity measure.
      • Entity name: N-gram, e.g. [$KB$: “Robert Tibshirani”], [$KB_f$: “Rob Tibshirani”].
      • Affiliation and biography: stop-word removal, NER, cosine similarity.
    • Affiliation similarity workflow: affiliation text → tokenize → NER → syntactic metric → score in [0, 1].
      • Example: “Professor at the University of Stanford” yields e = [“University of Stanford”: Organization, “Professor”: Object].
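The two measures named on this slide can be sketched as follows. This is a minimal illustration, assuming character bigrams with Jaccard overlap for names and bag-of-words cosine similarity for longer text; the slide does not specify the n-gram size or the exact formula, and stop-word removal and NER are omitted here.

```python
import math
from collections import Counter
from typing import Set


def char_ngrams(text: str, n: int = 2) -> Set[str]:
    """Set of lowercase character n-grams of a string."""
    lowered = text.lower()
    return {lowered[i:i + n] for i in range(len(lowered) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Jaccard overlap of character n-grams (one possible N-gram measure)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors."""
    counts_a, counts_b = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(counts_a[token] * counts_b[token] for token in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


name_score = ngram_similarity("Robert Tibshirani", "Rob Tibshirani")
affiliation_score = cosine_similarity(
    "Professor at the University of Stanford",
    "Professor University of Stanford",
)
```

Both functions return a value in [0, 1], which fits the score range shown in the affiliation workflow; a shortened first name such as “Rob” still shares most bigrams with “Robert”, so the name score stays high without being an exact match.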
  • 32. Embedding social profile links to knowledge graphs: K-Link algorithm (Matcher 2, M2)
    • K-Link: K stands for Twitter and LinkedIn; the goal is to find the Twitter and LinkedIn profile links, i.e. the pairs (FB, TW) and (FB, LIn).
    • It is composed of 5 main steps:
      1. Knowledge base construction.
      2. Twitter/LinkedIn search (by name).
      3. Similarity calculation.
      4. Classification/clustering.
      5. Twitter/LinkedIn profile link output.
  • 33. K-Link algorithm
    • Initial input of the matcher: the set of “reliable” Facebook profile links from the F-Link matcher.
    • The knowledge base is constructed from the data inside each profile: screenname, affiliation, life events, biography, etc.
    • The key features of this matcher: (1) life events, (2) biographies.
    • [Diagram: $KB_f \otimes KB_k$ reduces the n profiles found for a name to 1 profile, e.g. linkedin.com/rob.tibshirani]
  • 34. K-Link method: feature extraction
    • Features extracted from Facebook, Twitter, and LinkedIn: (1) basic profile attributes, (2) life events, (3) biography.
    • Formally, let $T$, $L$, and $F$ represent the Twitter, LinkedIn, and Facebook OSNs, respectively.
    • The profile of a user $i$ in $T$, $L$, or $F$ is represented as $P_i^s$, where $s \in \{T, L, F\}$.
    • The profile attributes of a user $i$ are modeled as $P_i^s = \{n_i^s, l_i^s, e_i^s, b_i^s, p_i^s, d_i^s\}$, where $n$ denotes the screenname, $l$ the location, $e$ the life events, $b$ the profile biography, $p$ the profession, and $d$ the date of birth.
    • Life event similarity workflow: life event text → semantic analysis → LDA. Example: “Started a new job at Google” yields e = “new job”.
    • [Figure: life event search mechanism]
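The profile model $P_i^s$ maps naturally onto a small record type. The sketch below is one possible encoding; the class and field names are hypothetical, and only the six attributes and the three network codes come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NETWORKS = {"T", "L", "F"}  # Twitter, LinkedIn, Facebook


@dataclass
class Profile:
    """One user profile P_i^s; comments give the slide's symbol for each field."""
    network: str                                          # s
    screenname: str                                       # n
    location: Optional[str] = None                        # l
    life_events: List[str] = field(default_factory=list)  # e
    biography: Optional[str] = None                       # b
    profession: Optional[str] = None                      # p
    birthdate: Optional[str] = None                       # d

    def __post_init__(self) -> None:
        if self.network not in NETWORKS:
            raise ValueError(f"unknown network code: {self.network}")


fb_profile = Profile(network="F", screenname="Rob Tibshirani",
                     life_events=["new job"])
```

Optional fields default to empty because, as the earlier profiling results show, many attributes are private or missing on a given network.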
  • 35. K-Link method: similarity measures
    • The similarity measure is computed over a set of attributes: screennames, life events, biographies, affiliations, locations, birthdates, IDs.
    • Each attribute is matched with a specific similarity metric.
  • 36. Combining clustering and classification methods to find the social profile links
    • In step 3, the matching results are processed in a bottom-up machine learning paradigm: (1) clustering, (2) classification.
    • The dataset used for this task includes the similarity result of each pair of attributes from two social networks.
    • Why did we combine clustering and classification models?
    • [Diagram labels: clustering, classification, label prediction, more powerful, class prediction, insufficient labeled data]
  • 37. 1- Data clustering
    • Each profile similarity represents a feature vector.
    • The average of each cluster is calculated.
    • The clusters are filtered, and only the one with the highest average, i.e. the highest confidence ($C_f$), is passed to the classification task.
    • Example from the slide: C1 = (0.2, 0.2, 0.5), C2 = (0.4, 0.4, 0.5), C3 = (0.8, 0.8, 0.8); C3 has the highest average, so $C_f = 0.8$.
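The cluster-selection step can be illustrated with the slide's own numbers. The sketch below covers only the filtering by average; it assumes the clusters have already been produced by some clustering algorithm, which the transcript does not name, and the function name is hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Similarity values per cluster, copied from the slide's example.
clusters: Dict[str, List[float]] = {
    "C1": [0.2, 0.2, 0.5],
    "C2": [0.4, 0.4, 0.5],
    "C3": [0.8, 0.8, 0.8],
}


def highest_confidence_cluster(groups: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the cluster with the highest average similarity and that average (C_f)."""
    averages = {name: mean(values) for name, values in groups.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]


best_cluster, confidence = highest_confidence_cluster(clusters)
```

With these inputs the averages are 0.3, about 0.43, and 0.8, so C3 is kept and the other two clusters are discarded before classification.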
  • 38. 2- Data classification
    • Each profile in the highest-confidence cluster ($C_f$) from the previous step is classified.
    • The binary classification task has two classes: “match” or “not match”.
    • We use a naïve Bayes classifier (abbreviated BNC on the slides).
    • The feature vector is $v = [x_1, x_2, \dots, x_n]$, where each component is the similarity between a pair of attributes (e.g., life events).
    • In addition, we compared the performance of BNC with two other methods: SVM and decision trees.
    • Result of the classifier: the Facebook link ($L_f$) and the Twitter/LinkedIn link ($L_k$).
    • [Figure: BNC tree example]
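To make the classification step concrete, here is a small Gaussian naïve Bayes written from scratch. It is a sketch only: the slides do not say which naïve Bayes variant was used, and the training vectors, the variance smoothing constant, and the function names are assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def fit_gaussian_nb(X: Sequence[Sequence[float]], y: Sequence[str],
                    smoothing: float = 1e-3) -> Dict[str, dict]:
    """Estimate a prior plus per-feature mean and variance for each class."""
    model: Dict[str, dict] = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # The smoothing term avoids zero variance on tiny training sets.
        variances = [sum((v - m) ** 2 for v in col) / len(col) + smoothing
                     for col, m in zip(columns, means)]
        model[label] = {"prior": len(rows) / len(y), "means": means, "vars": variances}
    return model


def predict(model: Dict[str, dict], x: Sequence[float]) -> str:
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(params: dict) -> float:
        score = math.log(params["prior"])
        for value, mu, var in zip(x, params["means"], params["vars"]):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical similarity vectors (e.g. name, biography, life-event similarity).
X_train = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.9], [0.2, 0.1, 0.3], [0.1, 0.3, 0.2]]
y_train = ["match", "match", "not match", "not match"]

nb_model = fit_gaussian_nb(X_train, y_train)
high = predict(nb_model, [0.85, 0.8, 0.8])
low = predict(nb_model, [0.15, 0.2, 0.2])
```

Each component of the input vector is one attribute-pair similarity, so a candidate profile whose similarities are uniformly high lands in the “match” class and one with uniformly low similarities in “not match”.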
  • 39. Implementation: dataset facts
    • Data sources:
      Source                            Type
      Google Scholar                    Scholarly
      Wikidata                          Knowledge graph
      Facebook, Twitter, and LinkedIn   Online social networks
    • 5,694 are used in M1 and M2.
    • The maximum number of life events per user profile is 8. In addition, each class of event has a total of at most 2.2K.
    • Up to 83 name matches from Facebook.
    • Machine learning dataset:
      • 300 manually labeled instances.
      • Each newly classified instance is added to the main dataset to improve performance.
    • [Figures: life events statistics; M1 name search matching results (chart values: 83, 50, 25, 10)]
  • 40. Evaluation on different domains: precision and recall
    • We compare the precision and recall trade-offs across multiple domains.
    • Four domains are considered in our study: CS = Computer Science, P = Physics, C = Chemistry, M = Medicine.
    • Results: LIn (LinkedIn) has the highest precision and recall:
      • The platform is more structured.
      • User information is more confident.
      • Profiles are updated regularly.
    • We validate that our approach can match multiple domains.
    • [Figure: multiple domain comparison results]
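For reference, precision and recall over a set of predicted profile links can be computed as in the sketch below. The link identifiers are hypothetical toy values, not results from the thesis.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Precision = correct predictions / all predictions;
    recall = correct predictions / all true links."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy example: 4 links predicted, 3 links in the ground truth, 2 in common.
predicted_links = {"link_a", "link_b", "link_c", "link_d"}
true_links = {"link_a", "link_b", "link_e"}

precision, recall = precision_recall(predicted_links, true_links)
```

Precision penalizes wrongly embedded links while recall penalizes missed ones, which is why the evaluation slides report the two together.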
  • 41. Benchmarking with baselines (1/2): precision and recall
    • HYDRA [61]: a system for linking identical user accounts by analyzing and comparing the behavior of users.
    • BM25 [39]: an approach for identifying a user across social networks by comparing their tagging practices and usernames.
    • MOBIUS [113]: connects user profiles across social networks by comparing behavioral characteristics such as the timestamps between posts.
    • OPL [115]: an approach for connecting social networking user profiles using internal and external features.
  • 42. Benchmarking with baselines (2/2): precision and recall
    • Comparison with other knowledge graphs:
      • The automatic method performs better than the manual one.
      • Multiple domains are compared.
    • K-Link baselines:
      • Baseline: other approaches, e.g. HYDRA [61], which links users by behavior matching.
      • Baseline: with/without named entities.
    • We validate that our approach outperforms many baselines and is similar to HYDRA in precision and recall.
    • [Figures: successfully found links; comparison to other approaches]
  • 43. Evaluation on supervised, unsupervised, and combined methods: precision and recall
    • Cases: supervised methods only, unsupervised methods only, and their combination.
    • With biographies (WB) and without (WoB).
    • With life events (WL) and without (WoL).
    • Using both methods yields higher precision and recall than using only one.
    • We validate that biographies and life events enhance precision and recall. Combining classification and clustering also enhances the results in our case.
    • [Figure: comparing the matching results on the 3 machine learning cases]
  • 44. Positive impact of PDF biographies
    • We study the impact of considering the profile biographies found inside PDF publications.
    • We show how the number of matches obtained without any biography clearly improves after using one or more biographies.
    • Why does the number of matches improve? Because of the extra information inside biographies.
    • Comparison of the matching results for 4 researchers before and after using biographies (Total = 100 in both cases):
      Name                #matches before (b = 0)   #matches after (b ≥ 1)
      Jeff Offutt         64                        78
      Trevor Hastie       55                        66
      Eric Yu             61                        70
      Robert Tibshirani   71                        84
      Average             62.75                     74.5
      Average improvement: 11.75%.
    • We validate that biographies can enhance the accuracy of the matching results.
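The averages in the table can be checked with a few lines of arithmetic. The figures below are copied from the slide; since each total is 100, the 11.75 difference is in percentage points.

```python
# (#matches before biographies, #matches after biographies), out of 100 each.
matches = {
    "Jeff Offutt": (64, 78),
    "Trevor Hastie": (55, 66),
    "Eric Yu": (61, 70),
    "Robert Tibshirani": (71, 84),
}

before_avg = sum(before for before, _ in matches.values()) / len(matches)
after_avg = sum(after for _, after in matches.values()) / len(matches)
gain = after_avg - before_avg

print(before_avg, after_avg, gain)  # 62.75 74.5 11.75
```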
  • 45. Pros of our approaches: case studies
    • Pros of including biographies (biography vs. attribute-based approaches):
      • Solve the problem of private user profile information.
      • Image is the highest available attribute.
      • Related approaches did not consider it.
    • Pros of including life events (life events vs. behavioral-based approaches):
      • Content characteristic: solve the content trade-offs among OSNs (text-only on Facebook vs. image-only on Twitter).
      • Solve the problem of outdated profiles (e.g., the last post on Twitter is 3 months older than the last post on Facebook).
      • Solve the volume trade-off problem (e.g., zero tweets on Twitter compared to n posts on Facebook and LinkedIn).
      • Solve the language difference issue (English on Facebook vs. Chinese on Twitter).
  • 46. Similarity measure scores
    • We study the accuracy of each similarity function.
    • We consider 6 profile attributes: screenname, location, life event, biography, profession, date of birth, and gender.
    • Similarity scores for biographies mostly fall within [0.4, 0.7].
    • Location, birthdate, and profession usually have scores within [0.8, 1], because these attributes take specific and limited values (e.g., gender: Male, Female).
    • [Figures: similarity measure scores for different attributes; K-Link scores; F-Link scores]
    • Publications:
      • Hussein Hazimeh, Elena Mugellini, Simon Ruffieux, Omar Abou Khaled, Philippe Cudré-Mauroux. « Automatic Embedding of Social Network Profile Links into Knowledge Graphs. » In 9th International Symposium on Info & Communication Technology - ACM (SoICT 2018). Da Nang, Vietnam.
      • Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. « SocialMatching++: A Novel Approach for Interlinking User Profiles on Social Networks. » In PROFILES@ISWC 2017. Vienna, Austria.
  • 47. Conclusions and future works
  • 48. Conclusions
    • In this thesis, we introduced new methods to reinforce entities in a knowledge graph.
    • Main contributions recap:
      • (1) A comparative review of user profile matching on OSNs.
      • (2) Profiling online social networking users.
      • (3) Embedding social network profile links into academic entities extracted from the Wikidata knowledge graph.
      • (4) A new method for linking social profiles across different OSNs.
      • (5) Calculating and adding sentiment polarities for social event entities extracted from the Wikidata knowledge graph (not presented because of the time limit, but open for discussion during the Q&A).
    • Methods: we used new resources and features in all of our methods, which are not used in the related work.
    • Results: we show that our methods can outperform the existing methods in terms of precision, recall, and accuracy.
  • 49. Lessons learned and limitations
    • Lessons learned:
      • UPAD: why some attributes lack information; which attributes contain more information than others; user engagement with OSNs.
      • Finding social links: importance of life events; efficiency of the machine learning model.
      • Finding sentiment: integrating features other than text can increase the certainty of the sentiment polarity; temporal sentiment tracking showed remarkable changes.
    • Limitations:
      • User profile level: location detection.
      • Profile matching level: matching failures between particular pairs of events; multimedia content was not studied.
      • OSN API updates and structure modifications (recently: Facebook, Twitter, and LinkedIn).
  • 50. Open problems
    • Resources: integrate additional resources; more OSNs (Medium, Reddit, …).
    • Entities: cover more entities; measure the quality of links.
    • Features: take into consideration images, tags, and check-ins.
    • Matching algorithms: develop matching algorithms for multimedia content.
  • 51. Publications (by contribution, C1 to C5)
    1. Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. « Linking user profiles in social networks: a comparative review. » International Journal of Social Network Mining, Volume 2: 333-361 - 2017.
    2. Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled. « Reliable User Profile Analytics and Discovery on Social Networks. » In 8th International Conference on Software and Computer Applications - ACM (ICSCA 2019). Penang, Malaysia.
    3. Hussein Hazimeh, Elena Mugellini, Omar Abou Khaled, Philippe Cudré-Mauroux. « SocialMatching++: A Novel Approach for Interlinking User Profiles on Social Networks. » In PROFILES@ISWC 2017. Vienna, Austria.
    4. Hussein Hazimeh, Elena Mugellini, Simon Ruffieux, Omar Abou Khaled, Philippe Cudré-Mauroux. « Automatic Embedding of Social Network Profile Links into Knowledge Graphs. » In 9th International Symposium on Info & Communication Technology - ACM (SoICT 2018). Da Nang, Vietnam.
    5. Hussein Hazimeh, Mohammad Harissa, Elena Mugellini, Omar Abou Khaled. « Temporal Sentiment Analysis and Tracking of Large-scale Social Events. » In 8th International Conference on Software and Computer Applications - ACM (ICSCA 2019). Penang, Malaysia.
  • 52. Publications
    6. Hussein Hazimeh, Ahmad Traboulsi, Hasan Noureddine, Elena Mugellini, Omar Abou Khaled. «Social Networks Serving Web Feeds: An Approach for Web Feed Enrichment.» In 10th International Conference on Information Management and Engineering - ACM (ICIME 2018). Manchester, UK.
    7. Sajida Chamass, Hussein Hazimeh, Jawad Makki, Elena Mugellini, Omar Abou Khaled. «Lexicon-based sentiment analysis approach for ranking event entities.» In International Journal of Services and Standards, Volume 12: 126-139. (The first author was a master's student under my supervision.)
    8. H Hussein, Y Iman, M Jawad, N Hassan, T Julien, AK Omar, M Elena. «Leveraging Co-authorship and Biographical Information for Author Ambiguity Resolution in DBLP.» The 30th IEEE International Conference on Advanced Information Networking, AINA 2016. Crans-Montana, Switzerland.
  • 53. References (1/2)
    [32] O. Goga, H. Lei, S. Hari, G. Friedland, R. Sommer, and R. Teixeira. Exploiting innocuous activity for correlating users across sites. In 22nd International World Wide Web Conference, WWW 2013, pages 447–458.
    [87] Y. Sha, Q. Liang, and K. Zheng. Matching user accounts across social networks based on user message. In International Conference on Computational Science, ICCS 2016, pages 2423–2427.
    [112] R. Zafarani and H. Liu. Connecting corresponding identities across communities. In ICWSM.
    [33] O. Goga, P. Loiseau, R. Sommer, R. Teixeira, and K.P. Gummadi. On the reliability of profile matching across large online social networks. In KDD.
    [#] N. Bennacer, C.N. Jipmo, A. Penta, and G. Quercini. Matching user profiles across social networks. In Advanced Information Systems Engineering - 26th International Conference, CAiSE 2014, pages 424–438.
    [95] T. Van Le, T.N. Truong, and T. Vu Pham. A content-based approach for user profile modeling and matching on social networks. In Multi-disciplinary Trends in Artificial Intelligence - 8th International Workshop, MIWAI 2014, pages 232–243.
    [82] E. Raad, R. Chbeir, and A. Dipanda. User profile matching in social networks. In The 13th International Conference on Network-Based Information Systems, NBiS 2010, pages 297–304.
    [41] P. Jain, P. Kumaraguru, and A. Joshi. @i seek ’fb.me’: identifying users across multiple online social networks. In WWW (Companion Volume).
    [61] S. Liu, S. Wang, F. Zhu, J. Zhang, and R. Krishnan. Hydra: large scale social identity linkage via heterogeneous behavior modeling. In International Conference on Management of Data, SIGMOD 2014, pages 51–62.
    [113] R. Zafarani and H. Liu. Connecting users across social media sites: a behavioral-modeling approach. In The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, pages 41–49.
    [73] A. Nunes, P. Calado, and B. Martins. Resolving user identities over social networks through supervised learning and rich similarity features. In Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC 2012, pages 728–729.
    [66] M. Motoyama and G. Varghese. I seek you: searching and matching individuals in social networks. In Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM) 2009, pages 67–75.
  • 54. References (2/2)
    [2] S. Bartunov, A. Korshunov, S. Taek Park, W. Ryu, and H. Lee. Joint link-attribute user identity resolution in online social networks. In SNAKDD Workshop.
    [98] J. Vosecky, D. Hong, and V.Y. Shen. User identification across multiple social networks. In Networked Digital Technologies, First International Conference 2009.
    [88] Y. Shen and H. Jin. Controllable information sharing for user accounts linkage across multiple online social networks. In CIKM.
    [54] W. Liang, B. Meng, and L. Xianchao. Gcm: A greedy-based cross-matching algorithm for identifying users across multiple online social networks. In PAISI.
    [84] R. Roedler, D. Kergl, and G. Dreo Rodosek. Profile matching across online social networks based on geo-tags. In Proceedings of the 7th World Congress on Nature and Biologically Inspired Computing (NaBIC) 2015, pages 417–428.
    [75] A. Panchenko, D. Babaev, and S. Obiedkov. Large-scale parallel matching of social network profiles. In AIST.
    [79] O. Peled, M. Fire, and Y. Elovici. Matching entities across online social networks. In Neurocomputing, page 91–206.
    [40] P. Jain and P. Kumaraguru. Other times, other values: leveraging attribute history to link user profiles across online social networks. In ACM (HT).
    [80] D. Perito, C. Castelluccia, M. Ali Kaafar, and P. Manils. How unique and traceable are usernames? privacy enhancing technologies. In PETS.
    [93] M. Szomszor, I. Cantador, and H. Alani. Correlating user profiles from multiple folksonomies. In Proceedings of the 19th ACM Conference on Hypertext and Hypermedia 2008, pages 33–42.
    [39] T. Iofciu, P. Fankhauser, F. Abel, and K. Bischoff. Identifying users across social tagging systems. In Proceedings of the Fifth International Conference on Weblogs and Social Media 2011.
    [62] A. Malhotra, L.C. Totti, Meira Jr. W., P. Kumaraguru, A. Virgílio, and F. Almeida. Studying user footprints in different online social networks. In International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012, pages 1065–1070.
    [115] H. Zhang, M-Y. Kan, Y. Liu, and S. Ma. Online social network profile linkage. In Information Retrieval Technology - 10th Asia Information Retrieval Societies Conference, AIRS 2014, pages 197–208.
    [99] S. Vosoughi, H. Zhou, and D. Roy. Digital stylometry: linking profiles across social networks. In 8th International Conference Social Informatics, SocInfo.
  • 55. PhD candidate: Hussein Hazimeh. Director: Prof. Philippe Cudré-Mauroux / UNI-FR. Co-Director: Prof. Elena Mugellini / HES-SO. 28.06.2019. Automatic Knowledge Graph Entity Reinforcement Based on Social Networks.