
What Sets Verified Users apart? Insights Into, Analysis of and Prediction of Verified Users on Twitter

Social network and publishing platforms, such as Twitter, support the concept of verification. Verified accounts are deemed worthy of platform-wide public interest and are separately authenticated by the platform itself. There have been repeated assertions by these platforms about verification not being tantamount to endorsement. However, a significant body of prior work suggests that possessing a verified status symbolizes enhanced credibility in the eyes of the platform audience. As a result, such a station is highly coveted among public figures and influencers. Hence, we attempt to characterize the network of verified users on Twitter and compare the results to similar analyses performed for the entire Twitter network. We extracted the whole graph of verified users on Twitter (as of July 2018) and obtained 231,246 English user profiles and 79,213,811 connections. Subsequently, in the network analysis, we found that the sub-graph of verified users mirrors the full Twitter user graph in some aspects, such as possessing a short diameter. However, our findings contrast with earlier results on multiple fronts, such as the possession of a power-law out-degree distribution, slight dissortativity, and a significantly higher reciprocity rate, as elucidated in the paper. Moreover, we attempt to gauge the presence of salient components within this sub-graph and detect the absence of homophily with respect to popularity, which again is in stark contrast to the full Twitter graph. Finally, we demonstrate stationarity in the time series of verified user activity levels.

It is against this backdrop that we attempt to deconstruct the extent to which Twitter's verification policy mingles the notions of authenticity and authority. To this end, we seek to unravel the aspects of a user's profile that likely engender or preclude verification. The aim of the paper is two-fold: first, we test whether discerning the verification status of a handle from profile metadata and content features is feasible; second, we unravel the characteristics that have the most significant bearing on a handle's verification status. We augmented our dataset with all 494 million tweets of the aforementioned users over a one-year collection period, along with their temporal social reach and activity characteristics. Our proposed models are able to reliably identify verification status (Area Under the Curve, AUC > 0.99). We show that the number of public list memberships, the presence of neutral sentiment in tweets and an authoritative language style are the most pertinent predictors of verification status.

To the best of our knowledge, this work represents the first quantitative attempt at characterizing verified users on Twitter, and also the first attempt at discerning and classifying verification-worthy users on Twitter.

What Sets Verified Users apart? Insights Into, Analysis of and Prediction of Verified Users on Twitter

  1. 1. What sets Verified Users apart? Insights into, Analysis of and Prediction of Verified Users on Twitter Indraneil Paul Master's Thesis Defense 14th September 2019 IIIT Hyderabad Committee Members Dr. Ponnurangam Kumaraguru (Advisor) Dr. Kamalakar Karlapalem Dr. Vikram Pudi
  2. 2. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 2 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  3. 3. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 3 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  4. 4. Ambiguity in Perception ● Twitter, Facebook and Instagram have incorporated a verification process to authenticate handles they deem important enough to be at risk of impersonation. ● However, despite repeated statements by Twitter about verification not being equivalent to endorsement, users have conflated the authenticity it is meant to convey with credibility. ● The rarity of the status and its prominent visual signalling are the likely causes for this. 4
  5. 5. Ambiguity in Perception This perception of verification lending credence has led Twitter to receive a lot of flak in recent times, especially for harbouring bias against certain groups. We try to demonstrate that the attainment of verified status by users can be explained by less insidious factors based on user activity trajectory, tweet sentiment and tweet content. 5
  6. 6. Visual Incentive 1. Presence of authority and authenticity indicators: Lends further credibility to the Tweets made by a user handle 2. Presentation over relevance: Psychological testing reveals that credibility evaluation of online content is influenced by its presentation rather than its relevance or apparent credibility Attaining verified status might lead to a user's content being more frequently liked and retweeted. 6
  7. 7. Heuristic Models The average user devotes only three seconds of attention per Tweet. This is symptomatic of users resorting to content evaluation heuristics. One such relevant heuristic is the Endorsement heuristic, which is associated with credibility conferred to content by visual markers. The presence of a marker such as a verified badge could hence be the difference between a user reading a Tweet in a congested feed and ignoring it completely. 7
  8. 8. Heuristic Models Another pertinent heuristic is the Consistency heuristic, which stems from endorsements by several authorities. This is important because a verified user on one social media platform is likelier to be verified on other platforms as well. Hence, we posit that possessing a verified status can make a world of difference to the outreach and influence of a brand or individual, in terms of both extent and quality. 8
  9. 9. Coveted Nature Unsurprisingly, a verified status is highly sought after by preeminent entities and businesses, as evidenced by the prevalence of get-verified-quick schemes. Instead of resorting to questionable schemes, accounts can follow our insights to increase their platform reach and improve their chances of verification. 9
  10. 10. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 10 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  11. 11. Collection Approach We queried the Twitter REST API for the following: 1. The @verified handle on Twitter follows all accounts on the platform that are currently verified. We queried this handle on the 18th of July 2018 and extracted the user IDs. 2. We obtained the user objects for all verified users and filtered for English-speaking users, obtaining 231,235 users. 3. Additionally, we leveraged Twitter's Firehose API – a near real-time stream of public tweets and accompanying author metadata. 11
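As a rough illustration, the sketch below reproduces this crawl against the 2018-era v1.1 REST API using a tweepy 3.x style client; the credential placeholders, the batching constant and the reliance on the since-retired user-level lang field are assumptions for illustration, not details taken from the thesis.

```python
import tweepy

# Placeholder credentials (assumption): obtained from a Twitter developer app.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# @verified follows every currently verified account, so its friend list
# enumerates the verified population.
verified_ids = []
for page in tweepy.Cursor(api.friends_ids, screen_name="verified").pages():
    verified_ids.extend(page)

# Hydrate user objects in batches of 100 (the lookup_users limit) and keep
# English-language profiles; the user-level lang field existed in 2018 but
# has since been retired, so this filter is an assumption.
verified_users = []
for i in range(0, len(verified_ids), 100):
    batch = api.lookup_users(user_ids=verified_ids[i:i + 100])
    verified_users.extend(u for u in batch if getattr(u, "lang", None) == "en")

print(len(verified_users), "English-language verified profiles")
```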
  12. 12. Collection Approach We used the Firehose to sample a set of 175,930 non-verified users by controlling for the number of followers – a conventional metric of public interest. This was done by ensuring that the number of followers of every non-verified user was within 2% of that of a unique verified user we had previously acquired. For each of the aforementioned users, we gathered data and metadata including friends, tweet content and sentiment, activity time series, and profile reach trajectories. 12
  13. 13. Collected Features 13
  14. 14. Collected Metadata For each verified member, we collected the following metadata: 1. Followers count 2. Friends count 3. Status count 4. Public list memberships 14
  15. 15. Collected Features 15
  16. 16. Verified User Network 231,235 English language Twitter verified users 175,930 English language Twitter non-verified users 494 million Tweets collected over a one year period 16
  17. 17. Class Imbalance To prevent a skewed class distribution from affecting results, we applied two class-rebalancing methods. The first is a minority oversampling technique called ADASYN, which creates synthetic minority samples by interpolating between already existing samples. 17
  18. 18. Class Imbalance Additionally, we use a hybrid over- and under-sampling technique called SMOTE-Tomek that also eliminates samples of the overrepresented class. For a pair of opposing-class points that are each other's closest neighbours (a Tomek link), the majority-class point is eliminated. 18
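A minimal sketch of how both rebalancing schemes can be applied with the imbalanced-learn library is given below; the toy feature matrix, labels and random seeds are placeholders, not values from the thesis.

```python
import numpy as np
from imblearn.over_sampling import ADASYN
from imblearn.combine import SMOTETomek

# Placeholder data: X holds user feature vectors, y holds verification labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))
y = np.array([1] * 700 + [0] * 300)

# ADASYN: synthesize minority samples by interpolating between existing ones,
# focusing on regions where the minority class is hardest to learn.
X_ada, y_ada = ADASYN(random_state=42).fit_resample(X, y)

# SMOTE-Tomek: oversample the minority class with SMOTE, then drop the majority
# point of every Tomek link (pairs of opposite-class mutual nearest neighbours).
X_st, y_st = SMOTETomek(random_state=42).fit_resample(X, y)

print(np.bincount(y), np.bincount(y_ada), np.bincount(y_st))
```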
  19. 19. Miscellaneous Trivia Low average clustering coefficient: 0.1583 Degree assortativity: −0.04 Most connected user: influencer @6BillionPeople with 114,815 connections 19
  20. 20. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 20 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  21. 21. Attracting Components Attracting components are components of a directed graph which a random walk, once it enters, can never leave. The acquired network consists of 6091 attracting components. At the core of these components lie famous personalities (high in-degree users) who do not follow any other handle. 21
  22. 22. Reciprocity The verified network has a reciprocity rate of 33.7%. This is lower than usually seen in other social networks such as Flickr (68%). The likely cause of this is the prevalence of brands and third-party sources of curated and crawled information, which typically do not reciprocate engagements. 22
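Both of these structural properties can be computed directly with networkx, as in the sketch below; the toy follow graph is a stand-in for the actual 231k-node verified-user network.

```python
import networkx as nx

# Toy directed follow graph standing in for the verified-user network.
G = nx.DiGraph()
G.add_edges_from([
    ("fan_a", "celebrity"), ("fan_b", "celebrity"),  # celebrity follows nobody
    ("fan_a", "fan_b"), ("fan_b", "fan_a"),          # a reciprocated pair
    ("outlet", "fan_a"),
])

# Attracting components: sets of nodes a random walk can enter but never leave.
attracting = list(nx.attracting_components(G))
print("attracting components:", attracting)

# Overall reciprocity: fraction of edges whose reverse edge also exists.
print("reciprocity:", nx.reciprocity(G))
```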
  23. 23. Power Law Power-law behaviour is a key component in characterizing the degree distributions of networks gathered from various sources. It refers to the presence of the following distributional property: p(x) ∝ x^(−α) for x ≥ x_min. This is closely related to the concept of the Pareto distribution or the 80-20 rule, where 20 percent of an entity is responsible for 80 percent of its characteristics. We explore the presence of power laws in the network degree distribution and Laplacian eigenvalue distribution. 23
  24. 24. Power Law Inference The modern statistically-sound methodology for testing the presence of power laws involves a two-step procedure: estimating x_min first and then α. The value of the exponent is given by the closed-form MLE expression α̂ = 1 + n [ Σ_{i=1}^{n} ln(x_i / x_min) ]^(−1). 24
  25. 25. Power Law Inference The optimal value of x_min is determined by iterating over candidate values and choosing the one for which the Kolmogorov-Smirnov distance between the empirical CDF of the data above the candidate and the power law that best approximates it is smallest. 25
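This two-step procedure (select x_min by minimising the Kolmogorov-Smirnov distance, then estimate α by maximum likelihood) is implemented by the powerlaw Python package; the sketch below assumes a synthetic out-degree sequence, and the bootstrap goodness-of-fit p-values quoted on the following slides would require an additional Clauset-style resampling step not shown here.

```python
import numpy as np
import powerlaw

# Placeholder out-degree sequence; in practice this comes from the follow graph.
rng = np.random.default_rng(0)
out_degrees = (rng.pareto(2.2, size=10_000) * 50).astype(int) + 1

# Fit: powerlaw scans candidate xmin values, picks the one minimising the
# KS distance, then estimates alpha by (discrete) maximum likelihood.
fit = powerlaw.Fit(out_degrees, discrete=True)
print("alpha =", fit.power_law.alpha)
print("xmin  =", fit.power_law.xmin)
print("KS    =", fit.power_law.D)

# Likelihood-ratio comparison against a competing heavy-tailed candidate.
R, p = fit.distribution_compare("power_law", "lognormal")
print("power law vs lognormal: R =", R, "p =", p)
```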
  26. 26. Eigenvalue Distribution We computed the 10,000 largest eigenvalues of the Laplacian matrix. The eigenvalues were computed using the power iteration method in existing solvers. Inference of power-law parameters α and x_min is done using the continuous maximum-likelihood algorithm. Continuous MLE inference for the eigenvalue distribution yields parameter estimates of 3.18 for α and 9377.26 for x_min, with a p-value of 0.3. This is in keeping with earlier such findings in Laplacian eigenvalue distributions of synthetic and real-world undirected social network datasets. 26
  27. 27. Degree Distribution Further, we carry out a similar inference procedure for the out-degree distribution of the nodes. Inference of power-law parameters α and x_min is done using the discrete maximum-likelihood algorithm. Discrete MLE inference for the degree distribution yields parameter estimates of 3.24 for α and 1334 for x_min, with a p-value of 0.13. Our findings contrast with the reported absence of a power law in the degree distribution of the whole Twitter network in existing work. 27
  28. 28. Degrees of Separation Existing concepts such as the six degrees of separation and the small-world model were named after findings that many social and technological networks possess small average path lengths. The verified network is even more extreme in this respect, with an average node distance of 2.74, which is much lower than previous sampling estimates for all of Twitter (3.43, 4.12). 28
  29. 29. Bio Analysis Each user on Twitter can have a biography (or bio) allowing them to describe themselves using a limited number of characters. We attempt to gain insights from some of the most popular unigrams, bigrams and trigrams occurring in the bios of verified users. We also filter out n-grams constituted largely of non-informative words. A running theme common to all three cases is the dominance of journalists and news and weather outlets. Being a preeminent journalist at an English media outlet seems to be one of the surest ways to get verified on Twitter. 29
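One way such an n-gram frequency study can be reproduced is with scikit-learn's CountVectorizer, as sketched below; the example bios and the stock English stop-word list stand in for the thesis' larger cleaning pipeline.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Placeholder bios; in practice these are the 231k verified-user bios.
bios = [
    "Official account of a weather alerts service",
    "Political journalist. Views my own. Instagram: someone",
    "Tech reporter covering startups and policy",
]

# Count unigrams, bigrams and trigrams, dropping common non-informative words.
vectorizer = CountVectorizer(ngram_range=(1, 3), stop_words="english")
counts = vectorizer.fit_transform(bios)

# Rank n-grams by total frequency across all bios.
totals = counts.sum(axis=0).A1
ranked = sorted(zip(vectorizer.get_feature_names_out(), totals),
                key=lambda pair: -pair[1])
print(ranked[:10])
```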
  30. 30. Bio Analysis The most frequent unigrams portray several underlying themes: 1. Cross-links to other social media handles (e.g. Instagram) 2. Professional descriptors (e.g. Tech) Bigrams and trigrams reiterate a largely similar narrative, dominated by generic descriptors (e.g. Official Account) and business descriptors (e.g. Weather Alerts) 30
  31. 31. Bio Analysis 31
  32. 32. Network Centrality We delve into how a user’s centrality in this network correlates with conventional metrics of reach such as follower and list membership count. Public list membership has been shown to be a robust predictor of influence and topical relevance on Twitter. 32
  33. 33. Network Centrality We observe that public list membership and follower count in the entire Twitter network is positively correlated with PageRank and Betweenness centrality of that user in the English verified user sub-graph. This backs up the general perception that a verified status is afforded, not just as a mark of authenticity, but also sufficient public interest. 33
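A sketch of this correlation check using networkx and scipy is shown below; the random toy graph and the synthetic list-membership counts are placeholders, and on the full graph betweenness centrality would typically be approximated via node sampling (the k parameter).

```python
import networkx as nx
from scipy.stats import spearmanr

# Placeholder directed follow graph and per-user public list membership counts.
G = nx.gnp_random_graph(200, 0.05, directed=True, seed=1)
list_memberships = {node: G.in_degree(node) * 3 + node % 7 for node in G}

pagerank = nx.pagerank(G)
betweenness = nx.betweenness_centrality(G, k=100, seed=1)

nodes = list(G)
rho_pr, _ = spearmanr([pagerank[n] for n in nodes],
                      [list_memberships[n] for n in nodes])
rho_bt, _ = spearmanr([betweenness[n] for n in nodes],
                      [list_memberships[n] for n in nodes])
print("Spearman rho (PageRank vs list memberships):   ", rho_pr)
print("Spearman rho (betweenness vs list memberships):", rho_bt)
```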
  34. 34. Autocorrelation We check for existing autocorrelations in the time series using the Ljung-Box and the Box-Pierce portmanteau tests. If the p-values returned by the tests are greater than 0.05, then time-lagged correlation cannot be ruled out at the 95% significance level. The Ljung-Box and Box-Pierce tests return maximum p-values of 3.81×10⁻³⁸ and 7.57×10⁻³⁸ respectively, thus strongly ruling out any lagged correlation. This counters intuitive expectations that there would be significant autocorrelation at a week's lag, given that activity rates on Sundays are reliably lower than those on weekdays. 34
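The sketch below shows how both portmanteau tests can be run with statsmodels on a daily activity series; the simulated series and the chosen weekly, fortnightly and monthly lags are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.diagnostic import acorr_ljungbox

# Placeholder daily activity counts over a 366-day collection window.
rng = np.random.default_rng(7)
activity = pd.Series(rng.poisson(lam=1_000_000, size=366))

# Ljung-Box and Box-Pierce statistics at weekly, fortnightly and monthly lags.
results = acorr_ljungbox(activity, lags=[7, 14, 30], boxpierce=True)
print(results)
```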
  35. 35. Tweet Activity Pattern 35
  36. 36. Stationarity We next inquire whether the activity time series is stationary or not using a time series changepoint detection mechanism called Pruned Exact Linear Time (PELT). We assume that this time series is drawn from a normal distribution, with mean and variance that can change at a discrete number of change-points. We use the PELT algorithm to maximize the log-likelihood for the means and variances of the underlying distribution with a penalty for the number of change-points. 36
  37. 37. Stationarity Results from several runs of the algorithm are recorded while cooling down the penalty factor and ramping up the number of change-points. Dates that fall in the change-point list in a significant number of runs of the algorithm are considered viable change-point candidates. We only find weak evidence for a change-point around Christmas of 2017. 37
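A minimal sketch of PELT change-point detection with the ruptures library is given below; the simulated activity series and the descending grid of penalty values stand in for the thesis' actual data and cooling schedule.

```python
import numpy as np
import ruptures as rpt

# Simulated daily activity series with one shift in mean and variance.
rng = np.random.default_rng(3)
signal = np.concatenate([rng.normal(100, 5, 200), rng.normal(120, 12, 166)])

# PELT with a Gaussian cost: penalised maximum likelihood over means/variances.
algo = rpt.Pelt(model="normal", min_size=7).fit(signal)

# Sweep the penalty downward; indices that recur across runs are viable change-points.
for pen in (50, 25, 10, 5):
    breakpoints = algo.predict(pen=pen)  # last element is len(signal)
    print(f"penalty={pen}: change-points at {breakpoints[:-1]}")
```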
  38. 38. Stationarity Existing work on smaller social networks, such as Gab, reveals that activity time series drastically change in response to socio-political events occurring outside the network. Hence, to investigate further, we employ an Augmented Dickey-Fuller test with both a constant term and a trend term. For upwards of 250 observations (we have 366), the critical value of the test at the 95% significance level, when using a constant and a trend term, is −3.42. If the test statistic is more negative than the critical threshold, we reject the null hypothesis of a unit root and conclude the presence of stationarity. Our test returns a test statistic of −3.86, which is significantly more negative than the critical threshold, thus strongly suggesting stationarity. 38
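The corresponding statsmodels call, with both a constant and a trend term (regression="ct"), might look as follows; the synthetic 366-day series is a placeholder.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Placeholder daily activity series (366 observations, as in the collection window).
rng = np.random.default_rng(11)
activity = 1_000_000 + rng.normal(0, 5_000, size=366)

# Augmented Dickey-Fuller test with constant and trend ("ct").
stat, pvalue, usedlag, nobs, crit_values, icbest = adfuller(activity, regression="ct")
print("test statistic:", stat)
print("95% critical value:", crit_values["5%"])
print("reject unit root (stationary):", stat < crit_values["5%"])
```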
  39. 39. User Data Classification We commence our analysis by eliminating all features that could be deemed surplus to requirements. To this end, we employed an all-relevant feature selection model which classifies features into three categories: confirmed, tentative and rejected. We only retain features that the model is able to confirm over 100 iterations. Using the rich set of features collected, we are able to consistently attain a near-perfect classification accuracy of over 98%. Our results suggest that a very competent classification of the Twitter user verification status is possible without resorting to complex deep-learning pipelines that sacrifice interpretability. 39
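The confirmed/tentative/rejected terminology matches the Boruta family of all-relevant feature selectors; the sketch below uses the BorutaPy implementation as an assumed stand-in for the exact tool, with placeholder data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from boruta import BorutaPy

# Placeholder feature matrix and verification labels.
rng = np.random.default_rng(5)
X = rng.normal(size=(2000, 40))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

# All-relevant selection: shadow features are shuffled copies; a real feature is
# confirmed only if it beats the best shadow feature significantly often.
forest = RandomForestClassifier(n_jobs=-1, max_depth=5)
selector = BorutaPy(forest, n_estimators="auto", max_iter=100, random_state=42)
selector.fit(X, y)

print("confirmed:", np.where(selector.support_)[0])
print("tentative:", np.where(selector.support_weak_)[0])
```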
  40. 40. User Data Classification 40
  41. 41. Feature Importance To compare the usefulness of various categories of features, we trained a gradient boosting classifier, our most competitive model, using each category of features alone. Evaluated on randomized train-test splits of our dataset, user metadata and content features were each able to consistently surpass 0.88 AUC. Temporal features alone are able to consistently attain an AUC of over 0.79. 41
  42. 42. Feature Importance The individual feature importances were determined using the Gini impurity reduction metric output by the gradient boosting model. To rank the most important features reliably, the model was trained 100 times with varying combinations of hyperparameters. The most reliable discriminative features are shown. 42
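A sketch of averaging impurity-based importances over repeated fits with varied hyperparameters is given below; scikit-learn's GradientBoostingClassifier and the small random hyperparameter grid are stand-ins for the exact booster and search used in the thesis.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import ParameterSampler

# Placeholder data and feature names.
rng = np.random.default_rng(9)
X = rng.normal(size=(1500, 10))
y = (X[:, 2] - X[:, 5] + rng.normal(scale=0.7, size=1500) > 0).astype(int)
feature_names = [f"feat_{i}" for i in range(X.shape[1])]

# Re-fit with varying hyperparameters and average the Gini-based importances.
param_grid = {"max_depth": [2, 3, 4], "learning_rate": [0.05, 0.1, 0.2],
              "n_estimators": [100, 200]}
samples = list(ParameterSampler(param_grid, n_iter=10, random_state=0))
importances = np.zeros(X.shape[1])
for params in samples:
    model = GradientBoostingClassifier(random_state=0, **params).fit(X, y)
    importances += model.feature_importances_
importances /= len(samples)

for name, score in sorted(zip(feature_names, importances), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```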
  43. 43. Feature Importance Some features are intuitively separable, making an informed prediction possible. The top 6 features are sufficient to attain 0.9 AUC in their own right. For instance, the very highest public list membership counts and prevalence of positive sentiment in Tweets are populated exclusively by verified users, while the very lowest propensities for authoritative speech, as indicated by LIWC Clout summary scores, are exhibited exclusively by non-verified users. 43
  44. 44. Profile Clustering In order to characterize accounts at a higher resolution, we attempt to cluster them. We apply K-Means++ on the normalized user vectors, selecting the 30 most discriminative features indicated by the XGBoost model, and eventually settle on 8 clusters by tuning the perplexity metric. In the interest of intuitive visualization, two-dimensional embeddings obtained via t-SNE are shown alongside. 44
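A sketch of the clustering and visualisation pipeline is given below; the placeholder feature matrix, the normalisation choice and the t-SNE perplexity value are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Placeholder: rows are users, columns the 30 most discriminative features.
rng = np.random.default_rng(13)
X = rng.normal(size=(3000, 30))

# Normalize, then cluster with k-means++ initialisation into 8 profiles.
X_norm = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=8, init="k-means++", n_init=10,
                random_state=0).fit_predict(X_norm)

# Two-dimensional t-SNE embedding for visual inspection of the clusters.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_norm)
print(embedding.shape, np.bincount(labels))
```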
  45. 45. Strongly Non-Verified Cluster C0 can largely be characterized as the Twitter layman with a high proportion of experiential tweets. They have short tweets, high incidence of verb usage and score very high on the LIWC Authenticity summary. Cluster C2 can be characterized as an amalgamation of accounts exhibiting bot-like behavior. Members of this cluster scored highly on the network and content automation scores in our feature set. Extensive usage of hashtags and outlinks is observed. 45
  46. 46. Strongly Verified Members of Cluster C4 have a tendency to post longer tweets and to retweet more frequently than they author content, while members of Cluster C6 almost exclusively retweet on the platform. Cluster C5 is comprised almost entirely of verified users and includes the elite Twitter users that form the core of verified users on the platform. These users have by far the highest list memberships on average. 46
  47. 47. Mixed Clusters Clusters C1, C3 and C7 are comprised of a mix of verified and non-verified users. Members of cluster C1 are ascendant both in terms of reach and activity levels, as evidenced by the proportion of their followers gained and statuses authored recently. Many users in C1 obtained verification during the data collection period. Members of C3 and C7 are either stagnant or declining in their reach and activity levels and show very low engagement with the rest of the platform in terms of retweets and mentions. 47
  48. 48. Outline A: PROBLEM AND MOTIVATION ➢ Perceived influence of verification ➢ Understanding what sets verified users apart 48 B: DATASET DESCRIPTION ➢ Description of data collection ➢ Summary data statistics D: TOPIC ANALYSIS ➢ Study divergence between verified users and the rest for tweet topics ➢ Study divergence in topic diversity C: METADATA/ACTIVITY ANALYSIS ➢ Study divergence of verified users from the rest for temporal activity and metadata signatures ➢ Deconstruct users into profiles
  49. 49. Topic Classification To gain insight into Tweet topics, we ran Gibbs-sampling-based LDA for 1000 sampling iterations. The number of topics was tuned to 100 after trying various values from 30 to 300 using perplexity. Instead of topic modelling on a per-Tweet basis and aggregating per user, we apply the author-topic model, collating all of a user's Tweets and topic modelling them in one go. This is done to work around the fact that most Tweets are too short to meaningfully infer topics from. We use the default document-topic densities as well as term-topic densities, as suggested in prior topic modelling studies. 49
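Gensim ships an author-topic model (using variational inference rather than Gibbs sampling, so only a stand-in for the exact sampler used here); the sketch below shows the collation of each user's Tweets under a single author with a toy corpus.

```python
from gensim.corpora import Dictionary
from gensim.models import AuthorTopicModel

# Placeholder: tokenised Tweets and an author2doc map collating each user's Tweets.
docs = [["climate", "policy", "vote"], ["match", "goal", "league"],
        ["climate", "summit", "talks"], ["sale", "discount", "code"]]
author2doc = {"user_a": [0, 2], "user_b": [1], "user_c": [3]}

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# Author-topic model: topics are inferred per author over all of their documents.
model = AuthorTopicModel(corpus=corpus, author2doc=author2doc,
                         id2word=dictionary, num_topics=2, passes=10)
for author in author2doc:
    print(author, model.get_author_topics(author))
```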
  50. 50. Topic Classification Our classification models demonstrate that it is eminently possible to infer the verification status of a user with high accuracy, purely from the distribution of topics they tweet about. The most competitive classifier attained a classification accuracy of 88.2%. 50
  51. 51. Topic Importance In the interest of interpretability, we evaluate the predictive power of each topic with respect to verification status. We obtain individual topic importances using the ANOVA F-scores output by a GAM. The procedure is run on 50 random train-test splits of the dataset and the topics with the lowest F-scores are noted. The most discriminative topics, with their top 3 keywords, were recorded. 51
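As a simple stand-in for this ranking, scikit-learn's f_classif yields a per-topic ANOVA F-score that can be averaged over repeated train splits, as sketched below; the random topic-proportion matrix and labels are placeholders.

```python
import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.model_selection import train_test_split

# Placeholder: per-user topic proportions (100 topics) and verification labels.
rng = np.random.default_rng(17)
topic_props = rng.dirichlet(np.ones(100), size=4000)
labels = rng.integers(0, 2, size=4000)

# Average each topic's ANOVA F-score over repeated random train splits.
n_splits = 50
f_totals = np.zeros(topic_props.shape[1])
for seed in range(n_splits):
    X_train, _, y_train, _ = train_test_split(topic_props, labels,
                                              test_size=0.2, random_state=seed)
    f_scores, _ = f_classif(X_train, y_train)
    f_totals += f_scores
f_mean = f_totals / n_splits

print("most discriminative topics:", np.argsort(f_mean)[::-1][:5])
```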
  52. 52. Topic Importance Though there is some overlap between topics, there are clear patterns on some topics from which an informed prediction can be made. For instance, the users who tweet most frequently about consequential topics like climate change and national politics are all verified, while verified users devote limited attention to controversial topics like Middle East geopolitics and mundane topics like online sales. 52
  53. 53. Topical Span We next inquire about the diversity of Tweet topics. In order to obtain an optimal number of topics per user in an unsupervised manner, we leveraged a Hierarchical Dirichlet Process. Inference is done using online variational Bayes estimation with the previously stated hyperparameters. 53
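Gensim's HdpModel implements an online variational HDP and returns a per-user topic mixture whose support size can serve as the topical-span measure; the toy corpus below is a placeholder for the per-user Tweet collections.

```python
from gensim.corpora import Dictionary
from gensim.models import HdpModel

# Placeholder: each entry is the concatenated, tokenised Tweet history of one user.
user_docs = [["election", "senate", "policy", "debate"],
             ["goal", "league", "transfer", "match"],
             ["sale", "deal", "discount", "shipping"]]

dictionary = Dictionary(user_docs)
corpus = [dictionary.doc2bow(doc) for doc in user_docs]

# Online variational HDP: the number of topics is inferred rather than fixed.
hdp = HdpModel(corpus=corpus, id2word=dictionary)

# Topical span per user: how many topics receive non-trivial probability mass.
for i, bow in enumerate(corpus):
    topics = [t for t, p in hdp[bow] if p > 0.05]
    print(f"user {i}: {len(topics)} topics")
```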
  54. 54. Topical Span A trend is observed with non-verified users clearly being over-represented in the lower reaches of the distribution (1–4 topics), while a comparatively substantial portion of verified users are situated in the middle of the distribution (5–10 topics). Also noteworthy is the fact that the very upper echelons of topical variety in tweets are occupied exclusively by verified users. Shown are the two most topically diverse handles with 13 and 21 topics respectively. 54
  55. 55. Wrapping Up Summary of contributions and possible future applications
  56. 56. Key Contributions Full Featured Dataset Released a fully featured dataset of 407k+ users, containing 79+ million edges and 494+ million time-stamped Tweets. Successful Classification Ours is the first study to successfully discern and classify verification-worthy users on Twitter; we obtain a near-perfect classifier in the process. Actionable Findings We unravel the aspects of a profile's activity and presence that have the greatest bearing on a user's verification status. 56
  57. 57. Future Applications 1. Superior verification heuristic The aforementioned deviations likely constitute a unique fingerprint for verified users, which can be leveraged to gauge the strength of a user's case for such status 2. Actionable insights to improve online presence The obtained insights can be used to significantly enhance the quality and reach of one's online presence before resorting to prohibitively priced social media management solutions 3. Realistic synthetic influential profile generation 57
  58. 58. Related Publications 1. Elites Tweet? Characterizing the Twitter Verified User Network ICDE Workshop on Large Scale Graph Data Analytics 2019, Macau SAR 2. What sets Verified Users apart? Insights, Analysis and Prediction of Verified Users on Twitter WebSci 2019, Boston 58
  59. 59. Research Acknowledgements 59 IIIT Hyderabad IIIT Delhi Microsoft India
  60. 60. 60 Thanks! Any questions? Find me at ineil77.github.io Contact me at indraneil.paul@research.iiit.ac.in
