SlideShare a Scribd company logo
1 of 60
Download to read offline
What sets Verified Users
apart?
Insights into, Analysis of and Prediction of
Verified Users on Twitter
Indraneil Paul
Masters Thesis Defense
14th September 2019
IIIT Hyderabad
Committee Members
Dr. Ponnurangam Kumaraguru (Advisor)
Dr. Kamalakar Karlapalem
Dr. Vikram Pudi
Outline
A: PROBLEM AND MOTIVATION
➢ Perceived influence of verification
➢ Understanding what sets verified
users apart
2
B: DATASET DESCRIPTION
➢ Description of data collection
➢ Summary data statistics
D: TOPIC ANALYSIS
➢ Study divergence between verified
users and the rest for tweet topics
➢ Study divergence in topic diversity
C: METADATA/ACTIVITY
ANALYSIS
➢ Study divergence of verified users
from the rest for temporal activity
and metadata signatures
➢ Deconstruct users into profiles
Outline
A: PROBLEM AND MOTIVATION
➢ Perceived influence of verification
➢ Understanding what sets verified
users apart
3
B: DATASET DESCRIPTION
➢ Description of data collection
➢ Summary data statistics
D: TOPIC ANALYSIS
➢ Study divergence between verified
users and the rest for tweet topics
➢ Study divergence in topic diversity
C: METADATA/ACTIVITY
ANALYSIS
➢ Study divergence of verified users
from the rest for temporal activity
and metadata signatures
➢ Deconstruct users into profiles
Ambiguity in Perception
● Twitter, Facebook and Instagram have incorporated a verification
process to authenticate handles they deem important enough to be
worth impersonating.
● However, despite repeated statements by Twitter about verification not
being equivalent to endorsement, users have misinterpreted the
authenticity it is meant to convey with credibility.
● The rarity of the status and its prominent visual signalling are the likely
causes for this.
4
This perception of verification lending credence has led Twitter to receive a lot
of flak in recent times, especially for harbouring bias against certain groups.
We try to demonstrate that the attainment of verified status by users can be
explained away by less insidious factors based on user activity trajectory,
tweet sentiment and tweet contents.
5
Ambiguity in Perception
Visual Incentive
1. Presence of authority and authenticity indicators:
Lends further credibility to the Tweets made by a user handle
2. Presentation over relevance:
Psychological testing reveals that credibility evaluation of online
content is influenced by its presentation rather than its relevance or
apparent credulity
Attaining verified status might lead to a user’s content being more frequently
liked and retweeted.
6
Heuristic Models
The average user devotes only three seconds of attention per Tweet. This is
symptomatic of users resorting to content evaluation heuristics.
One such relevant heuristic is the Endorsement heuristic, which is
associated with credibility conferred to content by visual markers.
The presence of a marker such as a verified badge could hence, be the
difference between a user reading a Tweet in a congested feed or
completely ignoring it.
7
Heuristic Models
Another pertinent heuristic is the Consistency heuristic, which stems from
endorsements by several authorities. This is important because a verified
user on one social media platform is likelier to be verified on other platforms
as well.
Hence, we posit that possessing a verified status can make a world of
difference in the outreach/influence of a brand or individual in terms of the
extent and quality.
8
Coveted Nature
Unsurprisingly, a verified status is highly
sought after by preeminent entities and
businesses, as evidenced by the
prevalence of get-verified-quick
schemes.
Instead of resorting to questionable
schemes, accounts can follow our insights
to increase their platform reach and
improve their chances of verification.
9
Outline
A: PROBLEM AND MOTIVATION
➢ Perceived influence of verification
➢ Understanding what sets verified
users apart
10
B: DATASET DESCRIPTION
➢ Description of data collection
➢ Summary data statistics
D: TOPIC ANALYSIS
➢ Study divergence between verified
users and the rest for tweet topics
➢ Study divergence in topic diversity
C: METADATA/ACTIVITY
ANALYSIS
➢ Study divergence of verified users
from the rest for temporal activity
and metadata signatures
➢ Deconstruct users into profiles
Collection Approach
We queried the Twitter REST API for the following:
1. The @verified handle on Twitter follows all accounts on the platform
that are currently verified. We queried this handle on the 18th of July
2018 and extracted the user IDs.
2. We obtained the user objects for all verified users and subsetted for
English speaking users obtaining 231,235 users.
3. Additionally, we leveraged Twitter’s Firehose API – a near real-time
stream of public tweets and accompanying author metadata.
11
Collection Approach
We used the Firehose to sample a set of
175,930 non-verified users by controlling
for number of followers - a conventional
metric of public interest.
This was done by ensuring that the number
of followers of every non-verified user was
within 2% of that of a unique verified user
we had previously acquired. For each of
the aforementioned user, data and
metadata including friends, tweet content
and sentiment, activity time series, and
profile reach trajectories was gathered.
12
Collected Features
13
Collected Metadata
For each verified member, we collected the following metadata:
1. Followers count
2. Friends count
3. Status count
4. Public list memberships
14
Collected Features
15
494 million
Tweets collected over a one year period
175,930
English language Twitter non-verified users
16
Verified User Network
English language Twitter verified users
231,235
Class Imbalance
To prevent any effects of a skewed
class distribution from affecting
results, we applied two class
rebalancing methods to rectify this.
A minority oversampling technique
called ADASYN was used. It creates
synthetic minority samples based
on interpolation between already
existing samples.
17
Class Imbalance
Additionally, we use a hybrid over and under sampling technique called
SMOTE Tomek that also eliminates samples of the overrepresented class.
For a pair of opposing class points that are each other's closest
neighbours (tomek link), the majority class point is eliminated.
18
0.1583
Low avg. clustering coefficient
-0.04
Degree assortativity
19
Miscellaneous Trivia
Most connected user:
Influencer @6BillionPeople
114,815
Outline
A: PROBLEM AND MOTIVATION
➢ Perceived influence of verification
➢ Understanding what sets verified
users apart
20
B: DATASET DESCRIPTION
➢ Description of data collection
➢ Summary data statistics
D: TOPIC ANALYSIS
➢ Study divergence between verified
users and the rest for tweet topics
➢ Study divergence in topic diversity
C: METADATA/ACTIVITY
ANALYSIS
➢ Study divergence of verified users
from the rest for temporal activity
and metadata signatures
➢ Deconstruct users into profiles
Attracting Components
Attracting components are components in
a directed graph in which, if a random walk
enters, it can never leave.
The acquired network consists of 6091
attracting components.
At the core of these components lie
famous personalities (high in-degree users)
who do not follow any other handle.
21
Reciprocity
The verified network has a reciprocity rate of 33.7%.
This is lower than usually seen in other social networks such as Flickr (68%).
The likely cause of this is the prevalence of brands and third-party sources
of curated and crawled information, which typically do not reciprocate
engagements.
22
Power Law
Power-law is a key component in characterizing degree distribution of
networks gathered from various sources. It refers to the presence of the
following distributional property:
This is closely related to the concept of the Pareto distribution or the 80-20
rule, where 20 percent of an entity is responsible for 80 percent of its
characteristics.
We explore the presence of power laws in the network degree distribution
and laplacian eigenvalue distribution.
23
Power Law Inference
The modern statistically-sound methodology for testing the presence of
power laws involves a two step procedure for estimating xmin and then α.
The value of the exponent is given using the closed form expression after
performing the MLE inference shown below:
24
Power Law Inference
The optimal value of xmin is determined by iterating over various candidate
intervals and choosing the value for which the Kolmogorov-Smirnov
distance between the CDF of the data in the interval and the power law that
best approximates it is the smallest.
25
Eigenvalue Distribution
We computed the 10,000 largest eigenvalues of the Laplacian matrix. The
eigenvalues were computed using the power iteration method in existing
solvers.
Inference of power-law parameters α and xmin is done using the continuous
maximum-likelihood algorithm. Continuous MLE inference for the degree
distribution yields parameter estimates of 3.18 for α and 9377.26 for xmin with
a p value of 0.3
This is in keeping with earlier such findings in Laplacian eigenvalue
distributions of synthetic and real world undirected social network datasets.
26
Degree Distribution
Further, we carry out a similar inference procedure for the out degree
distribution of the nodes.
Inference of power-law parameters α and xmin is done using the discrete
maximum-likelihood algorithm. Discrete MLE inference for the degree
distribution yields parameter estimates of 3.24 for α and 1334 for xmin with a p
value of 0.13
Our findings are in contrast with the absence of a power-law in the degree
distribution when analyzing the whole Twitter network, as reported by
existing work.
27
Degrees of Separation
Existing work such as the 6 degrees of
separation and the small-world model
after named after findings that many
social and technological networks
possessed small average path lengths.
The verified network is even more
extreme in this aspect with an average
node distance of 2.74 which is much
lower than previous sampling estimates
for all of Twitter (3.43, 4.12)
28
Bio Analysis
Each user on Twitter can have a biography (or bio) allowing him/her to
describe themselves using a limited number of characters.
We attempt to gain insights from some of the most popular unigrams,
bigrams and trigrams occurring in the bios of verified users.
We also filter out n-grams constituted largely of non-informative words.
A running theme common to all three cases is the dominance of journalists
and news and weather outlets. Being a preeminent journalist in an English
media outlet seems to be one of the surest ways to get verified on Twitter.
29
Bio Analysis
The most frequent unigrams portray
several underlying themes such as:
1. They include cross-links to other
social media handles (e.g. Instagram)
2. Professional descriptors (e.g. Tech)
Bigrams and trigrams reiterate a largely
similar narrative, dominated by generic
descriptors (e.g. Official Account) and
business descriptors (e,g, Weather Alerts)
30
Bio Analysis
31
Network Centrality
We delve into how a user’s centrality in this network correlates with
conventional metrics of reach such as follower and list membership count.
Public list membership has been shown to be a robust predictor of
influence and topical relevance on Twitter.
32
Network Centrality
We observe that public list membership and follower count in the entire
Twitter network is positively correlated with PageRank and Betweenness
centrality of that user in the English verified user sub-graph.
This backs up the general perception that a verified status is afforded, not
just as a mark of authenticity, but also sufficient public interest.
33
Autocorrelation
We check for existing auto correlations in the time series using the Ljung-
Box and the Box-Pierce portmanteau tests.
If the p values returned by the test are greater than 0.05, then the time-
lagged correlation cannot be ruled out with a 95% significance level. The
Ljung-Box and Box-Pierce test results indicate a maximum p value of
3.81×10-38 and 7.57×10-38 respectively, thus strongly ruling out any lagged
correlation.
This counters intuitive expectations that there would be a significant auto
correlation in a week’s lag given that activity rates on Sundays are reliably
lower than those on weekdays.
34
Tweet Activity Pattern
35
Stationarity
We next inquire whether the activity time series is stationary or not using a
time series changepoint detection mechanism called Pruned Exact Linear
Time (PELT).
We assume that this time series is drawn from a normal distribution, with
mean and variance that can change at a discrete number of change-points.
We use the PELT algorithm to maximize the log-likelihood for the means
and variances of the underlying distribution with a penalty for the number of
change-points.
36
Stationarity
Results from several runs of the algorithm are recorded while cooling down
the penalty factor and ramping up the number of change-points. Dates that
fall in the change-point list in a significant number of runs of the algorithm
are considered viable change-point candidates
We only find weak evidence for a changepoint around Christmas of 2017.
37
Stationarity
Existing work on smaller social networks, such as Gab, reveal that the
activity time series drastically change in response to socio-political events
occurring outside the network.
Hence, to investigate further, we employ an Augmented Dickey-Fuller test
with both a constant term and a trend term. For upwards of 250
observations (we have 366) the critical value of the test is −3.42 when using a
constant and a trend term at the 95% significance level. If the test statistic
value is more negative than the critical threshold, we reject the null
hypothesis of a unit root and conclude the presence of stationarity.
Our test, returns a test statistic of −3.86 which is significantly more negative
than the critical threshold, thus strongly suggesting stationarity
38
User Data Classification
We commence our analysis by eliminating all features that could be
deemed surplus to requirements. To this end, we employed an all-relevant
feature selection model which classifies features into three categories:
confirmed, tentative and rejected. We only retain features that the model
is able to confirm over 100 iterations.
Using the rich set of features collected, we are able to consistently attain a
near-perfect classification accuracy of over 98%. Our results suggest that a
very competent classification of the Twitter user verification status is
possible without resorting to complex deep-learning pipelines that sacrifice
interpretability.
39
User Data Classification
40
Feature Importance
To compare the usefulness of various
categories of features, we trained
gradient boosting classifier, our most
competitive model, using each category
of features alone.
Evaluated on randomized train-test
splits of our dataset, user metadata and
content features were both able to
consistently surpass 0.88 AUC. Also,
temporal features alone are able to
consistently attain an AUC of over 0.79.
41
Feature Importance
The individual feature importances
were determined using the Gini
impurity reduction metric output by the
gradient boosting model.
To rank the most important features
reliably, the model was trained 100 times
with varying combinations of
hyperparameters.
The most reliable discriminative features
are shown.
42
Feature Importance
Some features are intuitively separable,
making an informed prediction possible.
The top 6 features are sufficient to
attain 0.9 AUC on their own right.
For instance, the very highest public list
membership counts and prevalence of
positive sentiment in Tweets are
populated exclusively by verified users
while the very lowest propensities for
authoritative speech as indicated by
LIWC Clout summary scores are
exclusively shown by non-verified users.
43
Profile Clustering
In order to characterize accounts with a
higher resolution, we attempt to cluster
them. We apply K-Means++ on the
normalized user vectors selecting the 30
most discriminative features indicated by
the XGBoost model, eventually settling
on 8 different clusters by tuning the
perplexity metric.
In the interest of intuitive visualization,
two dimensional embeddings obtained
via t-SNE are shown alongside.
44
Strongly Non-Verified
Cluster C0 can largely be characterized
as the Twitter layman with a high
proportion of experiential tweets. They
have short tweets, high incidence of
verb usage and score very high in the
LIWC Authenticity summary.
Cluster C2 can be characterized as an
amalgamation of accounts exhibiting bot-
like behavior. Members of this cluster
scored highly on the network and
content automation scores in our feature
set. Extensive usage of hashtags and
outlinks are observed.
45
Strongly Verified
Cluster C4 having a tendency to post
longer tweets and retweet more
frequently than author content, while
members of Cluster C6 almost
exclusively retweet on the platform.
Cluster C5 is nearly entirely comprised of
verified users and includes elite Twitter
users that comprise the core of verified
users on the platform. These users have
by far the highest list memberships on
average.
46
Mixed Clusters
Clusters C1, C3 and C7 are comprised of
a mix of verified and non-verified users.
Members of cluster C1 are ascendant
both in terms of reach and activity levels
as evidenced by the proportion of their
followers gained and statuses authored
recently. Many users in C1 have obtained
verification in the data collection period.
Members of C3 and C7 who are either
stagnant or declining in their reach and
activity levels and show very low
engagement with the rest of the platform
in terms of retweets and mentions.
47
Outline
A: PROBLEM AND MOTIVATION
➢ Perceived influence of verification
➢ Understanding what sets verified
users apart
48
B: DATASET DESCRIPTION
➢ Description of data collection
➢ Summary data statistics
D: TOPIC ANALYSIS
➢ Study divergence between verified
users and the rest for tweet topics
➢ Study divergence in topic diversity
C: METADATA/ACTIVITY
ANALYSIS
➢ Study divergence of verified users
from the rest for temporal activity
and metadata signatures
➢ Deconstruct users into profiles
Topic Classification
To glean into Tweet topics we ran the Gibbs Sampling based LDA over
1000 iterations of sampling.
The number of topics was optimally fine tuned to 100 after trying out various
values from 30 to 300 using perplexity values.
Instead of topic modelling on a per-Tweet basis and aggregating per user
we apply the author-topic model collating all of a user’s Tweets and topic
modelling in one go. This is done to work around the fact that most Tweets
are too short to meaningfully infer topics.
We use the default document-topic densities as well as term-topic
densities as suggested in prior topic modelling studies.
49
Topic Classification
50
Our classification models demonstrate that it is eminently possible to infer the
verification status of a user purely using the distribution across topics they tweet
about, with a high accuracy.
The most competitive classifier attained a classification accuracy of 88.2 %.
Topic Importance
In the interest of interpretability, we
evaluate the predictive power of each
topic with respect to verification status.
We obtain individual topic importances
using the ANOVA F-Scores output by
GAM.
The procedure is run on 50 random
train-test splits of the dataset and the
topics with the lowest F-Scores noted.
Most discriminative topics with their top
3 keywords were noted.
51
Topic Importance
Though there is some overlap between
topics, there are clear patterns to be
observed on some topics using which an
informed prediction can be made.
For instance the users who tweet most
frequently about consequential topics
like climate change and national
politics are all verified while
controversial topics like middle-east
geopolitics and mundane topics like
online sales are something verified
users devote limited attention to.
52
Topical Span
We next inquire about the diversity of
Tweet topics.
In order to obtain an optimal mix of the
number of topics per user in an
unsupervised manner, we leveraged
the use of an Hierarchical Dirichlet
Process.
Inference is done using an Online
Variational Bayes estimation using the
previously stated hyperparameters.
53
Topical Span
A trend is observed with non-verified users
clearly being over-represented in the
lower reaches of the distribution (1–4
topics), while a comparatively substantial
portion of verified users are situated in the
middle of the distribution (5–10 topics).
Also noteworthy is the fact that the very
upper echelons of topical variety in tweets
are occupied exclusively by verified users.
Shown are the two most topically diverse
handles with 13 and 21 topics respectively.
54
Wrapping Up
Summary of contributions and possible future applications
Key Contributions
Full Featured
Dataset
Released a fully
featured dataset of
407k+ users, containing
79+ million edges and
494+ million time
stamped Tweets.
Successful
Classification
We are the first study to
successfully attempt at
discerning as well as
classifying verification
worthy users on Twitter.
We obtain a near perfect
classifier in the process.
Actionable
Findings
We unravel the aspects
of a profile’s activity and
presence that have the
greatest bearing on a
user’s verification status.
56
Future Applications
1. Superior verification heuristic
Aforementioned deviations likely constitute a unique fingerprint for
verified users which can be leveraged gauge the strength of a user’s
case for such status
2. Actionable insights to improve online presence
Obtained insights can be used to significantly enhance the quality and
reach of one’s online presence before resorting to prohibitively priced
social media management solutions
3. Realistic synthetic influential profile generation
57
Related Publications
1. Elites Tweet? Characterizing the Twitter Verified User Network
ICDE Workshop on Large Scale Graph Data Analytics 2019, Macau SAR
1. What sets Verified Users apart? Insights, Analysis and Prediction of
Verified Users on Twitter
WebSci 2019, Boston
58
Research
Acknowledgements
59
IIIT
Hyderabad
IIIT
Delhi
Microsoft
India
60
Thanks!
Any questions?
Find me at ineil77.github.io
Contact me at indraneil.paul@research.iiit.ac.in

More Related Content

What's hot

Probabilistic Relational Models for Link Prediction Problem
Probabilistic Relational Models for Link Prediction ProblemProbabilistic Relational Models for Link Prediction Problem
Probabilistic Relational Models for Link Prediction ProblemSina Sajadmanesh
 
Tweet Segmentation and Its Application to Named Entity Recognition
Tweet Segmentation and Its Application to Named Entity RecognitionTweet Segmentation and Its Application to Named Entity Recognition
Tweet Segmentation and Its Application to Named Entity Recognition1crore projects
 
Graph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in TwitterGraph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in Twitterraghavr186
 
Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?
Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?
Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?Molly Gibbons (she/her)
 
IJSRED-V2I2P09
IJSRED-V2I2P09IJSRED-V2I2P09
IJSRED-V2I2P09IJSRED
 
Tweet sentiment analysis
Tweet sentiment analysisTweet sentiment analysis
Tweet sentiment analysisAnil Shrestha
 
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...pharmaindexing
 
A Survey Of Collaborative Filtering Techniques
A Survey Of Collaborative Filtering TechniquesA Survey Of Collaborative Filtering Techniques
A Survey Of Collaborative Filtering Techniquestengyue5i5j
 
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELING
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELINGDYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELING
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELINGAndry Alamsyah
 
Prediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social NetworksPrediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social NetworksMohamed El-Geish
 
Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)es712
 
IRJET- Quantify Mutually Dependent Privacy Risks with Locality Data
IRJET- Quantify Mutually Dependent Privacy Risks with Locality DataIRJET- Quantify Mutually Dependent Privacy Risks with Locality Data
IRJET- Quantify Mutually Dependent Privacy Risks with Locality DataIRJET Journal
 
KnowMe and ShareMe: Understanding Automatically Discovered Personality Trai...
 KnowMe and ShareMe:  Understanding Automatically Discovered Personality Trai... KnowMe and ShareMe:  Understanding Automatically Discovered Personality Trai...
KnowMe and ShareMe: Understanding Automatically Discovered Personality Trai...Wookjae Maeng
 
Safeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksSafeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksParang Saraf
 

What's hot (19)

Probabilistic Relational Models for Link Prediction Problem
Probabilistic Relational Models for Link Prediction ProblemProbabilistic Relational Models for Link Prediction Problem
Probabilistic Relational Models for Link Prediction Problem
 
Tweet Segmentation and Its Application to Named Entity Recognition
Tweet Segmentation and Its Application to Named Entity RecognitionTweet Segmentation and Its Application to Named Entity Recognition
Tweet Segmentation and Its Application to Named Entity Recognition
 
Graph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in TwitterGraph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in Twitter
 
Link prediction
Link predictionLink prediction
Link prediction
 
Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?
Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?
Are Positive or Negative Tweets More "Retweetable" in Brazilian Politics?
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
IJSRED-V2I2P09
IJSRED-V2I2P09IJSRED-V2I2P09
IJSRED-V2I2P09
 
Tweet sentiment analysis
Tweet sentiment analysisTweet sentiment analysis
Tweet sentiment analysis
 
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
 
A Survey Of Collaborative Filtering Techniques
A Survey Of Collaborative Filtering TechniquesA Survey Of Collaborative Filtering Techniques
A Survey Of Collaborative Filtering Techniques
 
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELING
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELINGDYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELING
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELING
 
Who gives a tweet
Who gives a tweetWho gives a tweet
Who gives a tweet
 
Prediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social NetworksPrediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social Networks
 
Ijsea04031005
Ijsea04031005Ijsea04031005
Ijsea04031005
 
Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)
 
IRJET- Quantify Mutually Dependent Privacy Risks with Locality Data
IRJET- Quantify Mutually Dependent Privacy Risks with Locality DataIRJET- Quantify Mutually Dependent Privacy Risks with Locality Data
IRJET- Quantify Mutually Dependent Privacy Risks with Locality Data
 
KnowMe and ShareMe: Understanding Automatically Discovered Personality Trai...
 KnowMe and ShareMe:  Understanding Automatically Discovered Personality Trai... KnowMe and ShareMe:  Understanding Automatically Discovered Personality Trai...
KnowMe and ShareMe: Understanding Automatically Discovered Personality Trai...
 
Safeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksSafeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist Networks
 
Evaluation Social Ties and Trust in Online Social Network
Evaluation Social Ties and Trust in Online Social NetworkEvaluation Social Ties and Trust in Online Social Network
Evaluation Social Ties and Trust in Online Social Network
 

Similar to What Sets Verified Users apart? Insights Into, Analysis of and Prediction of Verified Users on Twitter

Final Poster for Engineering Showcase
Final Poster for Engineering ShowcaseFinal Poster for Engineering Showcase
Final Poster for Engineering ShowcaseTucker Truesdale
 
DP1_160430723010_Divya.pptx
DP1_160430723010_Divya.pptxDP1_160430723010_Divya.pptx
DP1_160430723010_Divya.pptxDivyaPatel729457
 
Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Serge Beckers
 
Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Serge Beckers
 
PURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRID
PURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRIDPURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRID
PURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRIDijngnjournal
 
An Efficient Trust Evaluation using Fact-Finder Technique
An Efficient Trust Evaluation using Fact-Finder TechniqueAn Efficient Trust Evaluation using Fact-Finder Technique
An Efficient Trust Evaluation using Fact-Finder TechniqueIJCSIS Research Publications
 
Icdec2020_presentation_slides_13
Icdec2020_presentation_slides_13Icdec2020_presentation_slides_13
Icdec2020_presentation_slides_13ICDEcCnferenece
 
IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...
IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...
IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...IRJET Journal
 
IMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPING
IMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPINGIMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPING
IMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPINGijp2p
 
Scalable recommendation with social contextual information
Scalable recommendation with social contextual informationScalable recommendation with social contextual information
Scalable recommendation with social contextual informationeSAT Journals
 
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Editor IJAIEM
 
AIST 2015 Conference Paper Presentation
AIST 2015 Conference Paper PresentationAIST 2015 Conference Paper Presentation
AIST 2015 Conference Paper PresentationFalguni Roy
 
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNINGSENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNINGIRJET Journal
 
Marketing analysis
Marketing analysisMarketing analysis
Marketing analysisGaurav Dubey
 
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...reshma reshu
 
Participatory Sensing through Social Networks: The Tension between Participat...
Participatory Sensing through Social Networks: The Tension between Participat...Participatory Sensing through Social Networks: The Tension between Participat...
Participatory Sensing through Social Networks: The Tension between Participat...Ioannis Krontiris
 
Quality of Claim Metrics in Social Sensing Systems: A case study on IranDeal
Quality of Claim Metrics in Social Sensing Systems: A case study on IranDealQuality of Claim Metrics in Social Sensing Systems: A case study on IranDeal
Quality of Claim Metrics in Social Sensing Systems: A case study on IranDealMaynooth University
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)paperpublications3
 
Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service iiKan-Han (John) Lu
 
Questions about questions
Questions about questionsQuestions about questions
Questions about questionsmoresmile
 

Similar to What Sets Verified Users apart? Insights Into, Analysis of and Prediction of Verified Users on Twitter (20)

Final Poster for Engineering Showcase
Final Poster for Engineering ShowcaseFinal Poster for Engineering Showcase
Final Poster for Engineering Showcase
 
DP1_160430723010_Divya.pptx
DP1_160430723010_Divya.pptxDP1_160430723010_Divya.pptx
DP1_160430723010_Divya.pptx
 
Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?
 
Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?
 
PURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRID
PURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRIDPURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRID
PURGING OF UNTRUSTWORTHY RECOMMENDATIONS FROM A GRID
 
An Efficient Trust Evaluation using Fact-Finder Technique
An Efficient Trust Evaluation using Fact-Finder TechniqueAn Efficient Trust Evaluation using Fact-Finder Technique
An Efficient Trust Evaluation using Fact-Finder Technique
 
Icdec2020_presentation_slides_13
Icdec2020_presentation_slides_13Icdec2020_presentation_slides_13
Icdec2020_presentation_slides_13
 
IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...
IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...
IRJET- Personalised Privacy-Preserving Social Recommendation based on Ranking...
 
IMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPING
IMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPINGIMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPING
IMPROVING HYBRID REPUTATION MODEL THROUGH DYNAMIC REGROUPING
 
Scalable recommendation with social contextual information
Scalable recommendation with social contextual informationScalable recommendation with social contextual information
Scalable recommendation with social contextual information
 
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
 
AIST 2015 Conference Paper Presentation
AIST 2015 Conference Paper PresentationAIST 2015 Conference Paper Presentation
AIST 2015 Conference Paper Presentation
 
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNINGSENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
 
Marketing analysis
Marketing analysisMarketing analysis
Marketing analysis
 
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
 
Participatory Sensing through Social Networks: The Tension between Participat...
Participatory Sensing through Social Networks: The Tension between Participat...Participatory Sensing through Social Networks: The Tension between Participat...
Participatory Sensing through Social Networks: The Tension between Participat...
 
Quality of Claim Metrics in Social Sensing Systems: A case study on IranDeal
Quality of Claim Metrics in Social Sensing Systems: A case study on IranDealQuality of Claim Metrics in Social Sensing Systems: A case study on IranDeal
Quality of Claim Metrics in Social Sensing Systems: A case study on IranDeal
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
 
Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service ii
 
Questions about questions
Questions about questionsQuestions about questions
Questions about questions
 

More from IIIT Hyderabad

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayIIIT Hyderabad
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesIIIT Hyderabad
 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasIIIT Hyderabad
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIIIT Hyderabad
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityIIIT Hyderabad
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...IIIT Hyderabad
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper IIIT Hyderabad
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasIIIT Hyderabad
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in IndiaIIIT Hyderabad
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in IndiaIIIT Hyderabad
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...IIIT Hyderabad
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceIIIT Hyderabad
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...IIIT Hyderabad
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesIIIT Hyderabad
 

More from IIIT Hyderabad (20)

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success stories
 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake News
 
#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBias
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian Languages
 

Recently uploaded

Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 

Recently uploaded (20)

Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 

What Sets Verified Users apart? Insights Into, Analysis of and Prediction of Verified Users on Twitter

  • 1. What sets Verified Users apart? Insights into, Analysis of and Prediction of Verified Users on Twitter Indraneil Paul Masters Thesis Defense 14th September 2019 IIIT Hyderabad Committee Members Dr. Ponnurangam Kumaraguru (Advisor) Dr. Kamalakar Karlapalem Dr. Vikram Pudi
  • 2. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 2 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  • 3. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 3 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  • 4. Ambiguity in Perception ● Twitter, Facebook and Instagram have incorporated a verification process to authenticate handles they deem important enough to be worth impersonating. ● However, despite repeated statements by Twitter about verification not being equivalent to endorsement, users have misinterpreted the authenticity it is meant to convey with credibility. ● The rarity of the status and its prominent visual signalling are the likely causes for this. 4
  • 5. This perception of verification lending credence has led Twitter to receive a lot of flak in recent times, especially for harbouring bias against certain groups. We try to demonstrate that the attainment of verified status by users can be explained away by less insidious factors based on user activity trajectory, tweet sentiment and tweet contents. 5 Ambiguity in Perception
  • 6. Visual Incentive 1. Presence of authority and authenticity indicators: Lends further credibility to the Tweets made by a user handle 2. Presentation over relevance: Psychological testing reveals that credibility evaluation of online content is influenced by its presentation rather than its relevance or apparent credulity Attaining verified status might lead to a user’s content being more frequently liked and retweeted. 6
  • 7. Heuristic Models The average user devotes only three seconds of attention per Tweet. This is symptomatic of users resorting to content evaluation heuristics. One such relevant heuristic is the Endorsement heuristic, which is associated with credibility conferred to content by visual markers. The presence of a marker such as a verified badge could hence, be the difference between a user reading a Tweet in a congested feed or completely ignoring it. 7
  • 8. Heuristic Models Another pertinent heuristic is the Consistency heuristic, which stems from endorsements by several authorities. This is important because a verified user on one social media platform is likelier to be verified on other platforms as well. Hence, we posit that possessing a verified status can make a world of difference in the outreach/influence of a brand or individual in terms of the extent and quality. 8
  • 9. Coveted Nature Unsurprisingly, a verified status is highly sought after by preeminent entities and businesses, as evidenced by the prevalence of get-verified-quick schemes. Instead of resorting to questionable schemes, accounts can follow our insights to increase their platform reach and improve their chances of verification. 9
  • 10. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 10 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  • 11. Collection Approach We queried the Twitter REST API for the following: 1. The @verified handle on Twitter follows all accounts on the platform that are currently verified. We queried this handle on the 18th of July 2018 and extracted the user IDs. 2. We obtained the user objects for all verified users and subsetted for English speaking users obtaining 231,235 users. 3. Additionally, we leveraged Twitter’s Firehose API – a near real-time stream of public tweets and accompanying author metadata. 11
  • 12. Collection Approach We used the Firehose to sample a set of 175,930 non-verified users by controlling for number of followers - a conventional metric of public interest. This was done by ensuring that the number of followers of every non-verified user was within 2% of that of a unique verified user we had previously acquired. For each of the aforementioned user, data and metadata including friends, tweet content and sentiment, activity time series, and profile reach trajectories was gathered. 12
  • 14. Collected Metadata For each verified member, we collected the following metadata: 1. Followers count 2. Friends count 3. Status count 4. Public list memberships 14
  • 16. 494 million Tweets collected over a one year period 175,930 English language Twitter non-verified users 16 Verified User Network English language Twitter verified users 231,235
  • 17. Class Imbalance To prevent any effects of a skewed class distribution from affecting results, we applied two class rebalancing methods to rectify this. A minority oversampling technique called ADASYN was used. It creates synthetic minority samples based on interpolation between already existing samples. 17
  • 18. Class Imbalance Additionally, we use a hybrid over and under sampling technique called SMOTE Tomek that also eliminates samples of the overrepresented class. For a pair of opposing class points that are each other's closest neighbours (tomek link), the majority class point is eliminated. 18
  • 19. 0.1583 Low avg. clustering coefficient -0.04 Degree assortativity 19 Miscellaneous Trivia Most connected user: Influencer @6BillionPeople 114,815
  • 20. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 20 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  • 21. Attracting Components Attracting components are components in a directed graph in which, if a random walk enters, it can never leave. The acquired network consists of 6091 attracting components. At the core of these components lie famous personalities (high in-degree users) who do not follow any other handle. 21
  • 22. Reciprocity The verified network has a reciprocity rate of 33.7%. This is lower than usually seen in other social networks such as Flickr (68%). The likely cause of this is the prevalence of brands and third-party sources of curated and crawled information, which typically do not reciprocate engagements. 22
  • 23. Power Law Power-law is a key component in characterizing degree distribution of networks gathered from various sources. It refers to the presence of the following distributional property: This is closely related to the concept of the Pareto distribution or the 80-20 rule, where 20 percent of an entity is responsible for 80 percent of its characteristics. We explore the presence of power laws in the network degree distribution and laplacian eigenvalue distribution. 23
  • 24. Power Law Inference The modern statistically-sound methodology for testing the presence of power laws involves a two step procedure for estimating xmin and then α. The value of the exponent is given using the closed form expression after performing the MLE inference shown below: 24
  • 25. Power Law Inference The optimal value of xmin is determined by iterating over various candidate intervals and choosing the value for which the Kolmogorov-Smirnov distance between the CDF of the data in the interval and the power law that best approximates it is the smallest. 25
  • 26. Eigenvalue Distribution We computed the 10,000 largest eigenvalues of the Laplacian matrix. The eigenvalues were computed using the power iteration method in existing solvers. Inference of power-law parameters α and xmin is done using the continuous maximum-likelihood algorithm. Continuous MLE inference for the degree distribution yields parameter estimates of 3.18 for α and 9377.26 for xmin with a p value of 0.3 This is in keeping with earlier such findings in Laplacian eigenvalue distributions of synthetic and real world undirected social network datasets. 26
  • 27. Degree Distribution Further, we carry out a similar inference procedure for the out degree distribution of the nodes. Inference of power-law parameters α and xmin is done using the discrete maximum-likelihood algorithm. Discrete MLE inference for the degree distribution yields parameter estimates of 3.24 for α and 1334 for xmin with a p value of 0.13 Our findings are in contrast with the absence of a power-law in the degree distribution when analyzing the whole Twitter network, as reported by existing work. 27
  • 28. Degrees of Separation Existing work such as the 6 degrees of separation and the small-world model after named after findings that many social and technological networks possessed small average path lengths. The verified network is even more extreme in this aspect with an average node distance of 2.74 which is much lower than previous sampling estimates for all of Twitter (3.43, 4.12) 28
  • 29. Bio Analysis Each user on Twitter can have a biography (or bio) allowing him/her to describe themselves using a limited number of characters. We attempt to gain insights from some of the most popular unigrams, bigrams and trigrams occurring in the bios of verified users. We also filter out n-grams constituted largely of non-informative words. A running theme common to all three cases is the dominance of journalists and news and weather outlets. Being a preeminent journalist in an English media outlet seems to be one of the surest ways to get verified on Twitter. 29
  • 30. Bio Analysis The most frequent unigrams portray several underlying themes such as: 1. They include cross-links to other social media handles (e.g. Instagram) 2. Professional descriptors (e.g. Tech) Bigrams and trigrams reiterate a largely similar narrative, dominated by generic descriptors (e.g. Official Account) and business descriptors (e,g, Weather Alerts) 30
  • 32. Network Centrality We delve into how a user’s centrality in this network correlates with conventional metrics of reach such as follower and list membership count. Public list membership has been shown to be a robust predictor of influence and topical relevance on Twitter. 32
  • 33. Network Centrality We observe that public list membership and follower count in the entire Twitter network is positively correlated with PageRank and Betweenness centrality of that user in the English verified user sub-graph. This backs up the general perception that a verified status is afforded, not just as a mark of authenticity, but also sufficient public interest. 33
  • 34. Autocorrelation We check for existing auto correlations in the time series using the Ljung- Box and the Box-Pierce portmanteau tests. If the p values returned by the test are greater than 0.05, then the time- lagged correlation cannot be ruled out with a 95% significance level. The Ljung-Box and Box-Pierce test results indicate a maximum p value of 3.81×10-38 and 7.57×10-38 respectively, thus strongly ruling out any lagged correlation. This counters intuitive expectations that there would be a significant auto correlation in a week’s lag given that activity rates on Sundays are reliably lower than those on weekdays. 34
  • 36. Stationarity We next inquire whether the activity time series is stationary or not using a time series changepoint detection mechanism called Pruned Exact Linear Time (PELT). We assume that this time series is drawn from a normal distribution, with mean and variance that can change at a discrete number of change-points. We use the PELT algorithm to maximize the log-likelihood for the means and variances of the underlying distribution with a penalty for the number of change-points. 36
  • 37. Stationarity Results from several runs of the algorithm are recorded while cooling down the penalty factor and ramping up the number of change-points. Dates that fall in the change-point list in a significant number of runs of the algorithm are considered viable change-point candidates We only find weak evidence for a changepoint around Christmas of 2017. 37
  • 38. Stationarity Existing work on smaller social networks, such as Gab, reveal that the activity time series drastically change in response to socio-political events occurring outside the network. Hence, to investigate further, we employ an Augmented Dickey-Fuller test with both a constant term and a trend term. For upwards of 250 observations (we have 366) the critical value of the test is −3.42 when using a constant and a trend term at the 95% significance level. If the test statistic value is more negative than the critical threshold, we reject the null hypothesis of a unit root and conclude the presence of stationarity. Our test, returns a test statistic of −3.86 which is significantly more negative than the critical threshold, thus strongly suggesting stationarity 38
  • 39. User Data Classification We commence our analysis by eliminating all features that could be deemed surplus to requirements. To this end, we employed an all-relevant feature selection model which classifies features into three categories: confirmed, tentative and rejected. We only retain features that the model is able to confirm over 100 iterations. Using the rich set of features collected, we are able to consistently attain a near-perfect classification accuracy of over 98%. Our results suggest that a very competent classification of the Twitter user verification status is possible without resorting to complex deep-learning pipelines that sacrifice interpretability. 39
  • 41. Feature Importance To compare the usefulness of various categories of features, we trained gradient boosting classifier, our most competitive model, using each category of features alone. Evaluated on randomized train-test splits of our dataset, user metadata and content features were both able to consistently surpass 0.88 AUC. Also, temporal features alone are able to consistently attain an AUC of over 0.79. 41
  • 42. Feature Importance The individual feature importances were determined using the Gini impurity reduction metric output by the gradient boosting model. To rank the most important features reliably, the model was trained 100 times with varying combinations of hyperparameters. The most reliable discriminative features are shown. 42
  • 43. Feature Importance Some features are intuitively separable, making an informed prediction possible. The top 6 features are sufficient to attain 0.9 AUC on their own right. For instance, the very highest public list membership counts and prevalence of positive sentiment in Tweets are populated exclusively by verified users while the very lowest propensities for authoritative speech as indicated by LIWC Clout summary scores are exclusively shown by non-verified users. 43
  • 44. Profile Clustering In order to characterize accounts with a higher resolution, we attempt to cluster them. We apply K-Means++ on the normalized user vectors selecting the 30 most discriminative features indicated by the XGBoost model, eventually settling on 8 different clusters by tuning the perplexity metric. In the interest of intuitive visualization, two dimensional embeddings obtained via t-SNE are shown alongside. 44
  • 45. Strongly Non-Verified Cluster C0 can largely be characterized as the Twitter layman with a high proportion of experiential tweets. They have short tweets, high incidence of verb usage and score very high in the LIWC Authenticity summary. Cluster C2 can be characterized as an amalgamation of accounts exhibiting bot- like behavior. Members of this cluster scored highly on the network and content automation scores in our feature set. Extensive usage of hashtags and outlinks are observed. 45
  • 46. Strongly Verified Cluster C4 having a tendency to post longer tweets and retweet more frequently than author content, while members of Cluster C6 almost exclusively retweet on the platform. Cluster C5 is nearly entirely comprised of verified users and includes elite Twitter users that comprise the core of verified users on the platform. These users have by far the highest list memberships on average. 46
  • 47. Mixed Clusters Clusters C1, C3 and C7 are comprised of a mix of verified and non-verified users. Members of cluster C1 are ascendant both in terms of reach and activity levels as evidenced by the proportion of their followers gained and statuses authored recently. Many users in C1 have obtained verification in the data collection period. Members of C3 and C7 who are either stagnant or declining in their reach and activity levels and show very low engagement with the rest of the platform in terms of retweets and mentions. 47
  • 48. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 48 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  • 49. Topic Classification To glean into Tweet topics we ran the Gibbs Sampling based LDA over 1000 iterations of sampling. The number of topics was optimally fine tuned to 100 after trying out various values from 30 to 300 using perplexity values. Instead of topic modelling on a per-Tweet basis and aggregating per user we apply the author-topic model collating all of a user’s Tweets and topic modelling in one go. This is done to work around the fact that most Tweets are too short to meaningfully infer topics. We use the default document-topic densities as well as term-topic densities as suggested in prior topic modelling studies. 49
  • 50. Topic Classification 50 Our classification models demonstrate that it is eminently possible to infer the verification status of a user purely using the distribution across topics they tweet about, with a high accuracy. The most competitive classifier attained a classification accuracy of 88.2 %.
  • 51. Topic Importance In the interest of interpretability, we evaluate the predictive power of each topic with respect to verification status. We obtain individual topic importances using the ANOVA F-Scores output by GAM. The procedure is run on 50 random train-test splits of the dataset and the topics with the lowest F-Scores noted. Most discriminative topics with their top 3 keywords were noted. 51
  • 52. Topic Importance Though there is some overlap between topics, there are clear patterns to be observed on some topics using which an informed prediction can be made. For instance the users who tweet most frequently about consequential topics like climate change and national politics are all verified while controversial topics like middle-east geopolitics and mundane topics like online sales are something verified users devote limited attention to. 52
  • 53. Topical Span We next inquire about the diversity of Tweet topics. In order to obtain an optimal mix of the number of topics per user in an unsupervised manner, we leveraged the use of an Hierarchical Dirichlet Process. Inference is done using an Online Variational Bayes estimation using the previously stated hyperparameters. 53
  • 54. Topical Span A trend is observed with non-verified users clearly being over-represented in the lower reaches of the distribution (1–4 topics), while a comparatively substantial portion of verified users are situated in the middle of the distribution (5–10 topics). Also noteworthy is the fact that the very upper echelons of topical variety in tweets are occupied exclusively by verified users. Shown are the two most topically diverse handles with 13 and 21 topics respectively. 54
  • 55. Wrapping Up Summary of contributions and possible future applications
  • 56. Key Contributions Full Featured Dataset Released a fully featured dataset of 407k+ users, containing 79+ million edges and 494+ million time stamped Tweets. Successful Classification We are the first study to successfully attempt at discerning as well as classifying verification worthy users on Twitter. We obtain a near perfect classifier in the process. Actionable Findings We unravel the aspects of a profile’s activity and presence that have the greatest bearing on a user’s verification status. 56
  • 57. Future Applications 1. Superior verification heuristic Aforementioned deviations likely constitute a unique fingerprint for verified users which can be leveraged gauge the strength of a user’s case for such status 2. Actionable insights to improve online presence Obtained insights can be used to significantly enhance the quality and reach of one’s online presence before resorting to prohibitively priced social media management solutions 3. Realistic synthetic influential profile generation 57
  • 58. Related Publications 1. Elites Tweet? Characterizing the Twitter Verified User Network ICDE Workshop on Large Scale Graph Data Analytics 2019, Macau SAR 1. What sets Verified Users apart? Insights, Analysis and Prediction of Verified Users on Twitter WebSci 2019, Boston 58
  • 60. 60 Thanks! Any questions? Find me at ineil77.github.io Contact me at indraneil.paul@research.iiit.ac.in