Mining and Comparing Engagement Dynamics
Across Multiple Social Media Platforms
Matthew Rowe
Lancaster University, UK

Mining and Comparing Engagement Dynamics Across Multiple Social Media Platforms #websci14


Understanding what attracts users to engage with social media content is important in domains such as market analytics, advertising, and community management.
To date, many pieces of work have examined engagement dynamics in isolated platforms with little consideration or assessment of how these dynamics might vary between disparate social media systems. Additionally, such explorations have often used different features and notions of engagement, thus rendering the cross-platform comparison of engagement dynamics limited. In this paper we define a common framework of engagement analysis and examine and compare engagement dynamics across five social media platforms: Facebook, Twitter, Boards.ie, Stack Overflow and the SAP Community Network. We define a variety of common features (social and content) to capture the dynamics that correlate with engagement in multiple social media platforms, and present an evaluation pipeline intended to enable cross-platform comparison. Our comparison results demonstrate the varying factors at play in different platforms, while also exposing several similarities.

Published in: Social Media, Technology, Business

  1. Mining and Comparing Engagement Dynamics Across Multiple Social Media Platforms
     Matthew Rowe, Lancaster University, UK
     Harith Alani, Knowledge Media Institute, UK (@halani, harith-alani, http://people.kmi.open.ac.uk/harith/)
     ACM Web Science Conference (WebSci) 2014, Bloomington, Indiana
  2. Engagement in Social Media
  3. Moving on …
     • How can we move on from these (micro) studies?
     • Are results consistent across datasets and platforms?
     • One way forward is:
       • Multiple platforms
       • Multiple topics
  4. Publications on "social media analysis"
     [Figure: number of publications on "social media analysis" per year, 2006 to 2013; y-axis from 0 to 600]
  5. Papers studying single/multiple social media platforms
  6. Papers studying single/multiple social media platforms
  7. Papers studying single/multiple social media platforms
  8. Papers studying single/multiple social media platforms
  9. Apples and Oranges
     • We mix and compare different features, datasets, and platforms
     • Aim is to figure out their similarities and differences
  10. Contributions
      • Examine replying dynamics as a modality of engagement
      • Define a framework of engagement analysis that fits multiple social platforms
      • Show the varying features at play in different platforms, and where the similarities and differences are
      • Contrast the role of different features on engagement likelihood across five social media platforms
      • Compare results to relevant literature on the same or different platforms and engagement indicators
  11. 7 datasets from 5 platforms

      Platform                                Posts      Users    Seeds    Non-seeds  Replies
      Boards.ie                               6,120,008  65,528   398,508  81,273     5,640,227
      Twitter Random                          1,468,766  753,722  144,709  930,262    390,795
      Twitter (Haiti Earthquake)              65,022     45,238   1,835    60,686     2,501
      Twitter (Obama State of Union Address)  81,458     67,417   11,298   56,135     14,025
      SAP                                     427,221    32,926   87,542   7,276      332,403
      Server Fault                            234,790    33,285   65,515   6,447      162,828
      Facebook                                118,432    4,745    15,296   8,123      95,013

      Seed posts are those that receive a reply. Non-seed posts are those with no replies.
  12. Data Balancing

      Platform                                Seeds    Non-seeds  Instance Count
      Boards.ie                               398,508  81,273     162,546
      Twitter Random                          144,709  930,262    289,418
      Twitter (Haiti Earthquake)              1,835    60,686     3,670
      Twitter (Obama State of Union Address)  11,298   56,135     22,596
      SAP                                     87,542   7,276      14,552
      Server Fault                            65,515   6,447      12,894
      Facebook                                15,296   8,123      16,246
      Total                                                       521,922

      For each dataset, an equal number of seed and non-seed posts is used in the analysis, so each instance count is twice the size of the smaller class.
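The balancing rule on slide 12 can be sketched as follows. The slides only say that equal numbers of seed and non-seed posts are used; random undersampling of the larger class is an assumption here, and the names `balance`, `instance_count` and `DATASETS` are hypothetical.

```python
import random

# (seeds, non-seeds) per dataset, copied from the slide 12 table.
DATASETS = {
    "Boards.ie": (398508, 81273),
    "Twitter Random": (144709, 930262),
    "Twitter (Haiti Earthquake)": (1835, 60686),
    "Twitter (Obama State of Union Address)": (11298, 56135),
    "SAP": (87542, 7276),
    "Server Fault": (65515, 6447),
    "Facebook": (15296, 8123),
}


def instance_count(n_seeds, n_non_seeds):
    """Size of a balanced dataset: twice the smaller class."""
    return 2 * min(n_seeds, n_non_seeds)


def balance(seeds, non_seeds, rng_seed=0):
    """Randomly undersample the larger class (assumed sampling scheme)."""
    rng = random.Random(rng_seed)
    k = min(len(seeds), len(non_seeds))
    return rng.sample(seeds, k), rng.sample(non_seeds, k)


if __name__ == "__main__":
    for name, (s, n) in DATASETS.items():
        print(name, instance_count(s, n))
    print("Total", sum(instance_count(s, n) for s, n in DATASETS.values()))
```

Applied to the seed and non-seed counts in the table, this rule reproduces every instance count on the slide and the stated total.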
  13. Features
      Content features:
      • Post Length: number of words in the post
      • Complexity: measures the cumulative entropy of terms in a post
      • Readability: Gunning Fog index, which gauges how hard the post is for readers to parse, and the LIX readability metric, which determines the complexity of words based on their number of letters
      • Referral Count: number of URLs in the post
      • Informativeness: TF-IDF of the post
      • Polarity: average sentiment polarity of the post (using SentiWordNet)
      Social features:
      • In-degree: number of incoming social connections (explicit or implicit)
      • Out-degree: number of outgoing social connections (explicit or implicit)
      • Post Count: number of posts made in the previous 6 months
      • User Age: length of membership in the community, in days
      • Post Rate: number of posts by the user per day
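A few of the content features on slide 13 can be sketched in plain Python. This is only an illustration: the tokenizer, the URL pattern and all function names are assumptions, `term_entropy` uses Shannon entropy of the post's term distribution as one possible reading of "cumulative entropy of terms", and `lix` uses the standard LIX formula (words per sentence plus the percentage of words longer than six letters).

```python
import math
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z0-9']+")   # assumed tokenizer
URL_RE = re.compile(r"https?://\S+")     # assumed URL pattern


def tokens(text):
    """Lower-cased word tokens, with URLs removed first."""
    return WORD_RE.findall(URL_RE.sub(" ", text).lower())


def post_length(text):
    """Post Length: number of words in the post."""
    return len(tokens(text))


def referral_count(text):
    """Referral Count: number of URLs in the post."""
    return len(URL_RE.findall(text))


def term_entropy(text):
    """Shannon entropy (bits) of the term distribution in the post."""
    counts = Counter(tokens(text))
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def lix(text):
    """LIX readability: words/sentences + 100 * long_words/words."""
    clean = URL_RE.sub(" ", text)
    words = WORD_RE.findall(clean)
    if not words:
        return 0.0
    sentences = max(1, len(re.findall(r"[.!?]+", clean)))
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / sentences + 100.0 * long_words / len(words)
```

The Gunning Fog index, TF-IDF and SentiWordNet polarity are left out because they need syllable counts, corpus statistics and a lexicon respectively.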
  14. Classification of Posts (Seed Posts vs. Non-Seed Posts)
      • Binary classification model
      • Trained with social, content, and combined features
      • 80/20 training/testing
      • Compare results across platforms, to see how a change in each feature is associated with the likelihood of engagement
      • Compare engagement dynamics from our platforms against the literature
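The setup on slide 14 (binary seed/non-seed classifier, 80/20 split) can be sketched with a small logistic regression written from scratch. The slides do not give the training procedure, so batch gradient descent, the learning rate, the epoch count and all function names below are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle once, then hold out the last test_fraction of the rows."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(len(idx) * (1 - test_fraction)))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit coefficients by batch gradient descent; beta[0] is the intercept."""
    beta = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            err = p - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, x):
    """Label 1 (seed) when the predicted probability is at least 0.5."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0
```

Training the same model three times, on social features only, content features only, and both together, would mirror the three feature sets listed on the slide.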
  15. Classification Results
      Performance of the logistic regression classifier trained over different feature sets and applied to the test set. Platform labels were images on the slide; only the three Twitter dataset labels, (Random), (Haiti Earthquake) and (Obama’s State Union Address), survive as text, so the seven result blocks are listed in transcript order.

      Block  Feature set     P      R      F1
      1      Social          0.592  0.591  0.591
      1      Content         0.664  0.660  0.658
      1      Social+Content  0.670  0.666  0.665
      2      Social          0.561  0.561  0.560
      2      Content         0.612  0.612  0.611
      2      Social+Content  0.628  0.628  0.628
      3      Social          0.968  0.966  0.966
      3      Content         0.752  0.747  0.747
      3      Social+Content  0.974  0.973  0.973
      4      Social          0.542  0.540  0.539
      4      Content         0.650  0.642  0.639
      4      Social+Content  0.656  0.649  0.646
      5      Social          0.650  0.631  0.628
      5      Content         0.575  0.541  0.521
      5      Social+Content  0.652  0.632  0.629
      6      Social          0.528  0.380  0.319
      6      Content         0.626  0.380  0.275
      6      Social+Content  0.568  0.407  0.359
      7      Social          0.635  0.632  0.632
      7      Content         0.641  0.641  0.641
      7      Social+Content  0.660  0.660  0.660
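The slides do not say how P, R and F1 on slide 15 were averaged. Some reported rows (for example P 0.528, R 0.380, F1 0.319) have an F1 that is not the harmonic mean of the reported P and R, which is what one would see if the scores were computed per class and then averaged. The sketch below shows macro-averaging as one such scheme; it is an assumption, and the function names are hypothetical.

```python
def per_class_prf(y_true, y_pred, positive):
    """Precision, recall and F1 treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_prf(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class precision, recall and F1."""
    scores = [per_class_prf(y_true, y_pred, c) for c in classes]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))
```

With averaging of this kind, the averaged F1 is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.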
  16. Effect of features on engagement
      [Figure: logistic regression coefficients (β) for each platform's features, one panel per dataset. β-axis ranges: Boards.ie −2 to 2; Twitter Random −0.5 to 1.0; Twitter Haiti −6e+16 to 6e+16; Twitter Union −0.8 to 0.2; Server Fault −1.0 to 2.0; SAP −10 to 5; Facebook −0.1 to 0.5. Features plotted: In-degree, Out-degree, Post Count, Age, Post Rate, Post Length, Referrals Count, Polarity, Complexity, Readability, Readability Fog, Informativeness.]
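For reference, the standard form of a logistic regression model, the classifier named on slide 15, relates the coefficients β plotted on slide 16 to the likelihood that a post is a seed. The symbols x_j (feature values) and β_0 (intercept) are notation introduced here, not taken from the slides.

```latex
\[
P(\text{seed} \mid x) = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j} \beta_j x_j)\bigr)},
\qquad
\log \frac{P(\text{seed} \mid x)}{1 - P(\text{seed} \mid x)} = \beta_0 + \sum_{j} \beta_j x_j .
\]
```

Under this form, a positive β_j means the odds of receiving a reply are multiplied by e^{β_j} for each unit increase in feature j, holding the other features fixed, and a negative β_j means the odds fall.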
  17. Significance of regression coefficients
      [Figure: p-values of the regression coefficients, one panel per dataset (Boards.ie, Twitter Random, Twitter Haiti, Twitter Union, Server Fault, SAP, Facebook), each with a p-axis from 0.0 to 1.0. Features plotted: In-degree, Out-degree, Post Count, Age, Post Rate, Post Length, Referrals Count, Polarity, Complexity, Readability, Readability Fog, Informativeness.]
  18. Comparison to literature
      • How does the performance of our features compare to other studies on different datasets and platforms?
  19. [Figure legend: Positive impact, Negative impact, Mismatch, Match]
  20. [Figure legend: Positive impact, Negative impact, Mismatch, Match]
  21. Summary
      • We tested the consistency and applicability of engagement patterns across multiple platforms
      • Used 12 social/content features that map to 5 platforms
      • Studied the impact of those features on engagement across these platforms
      • Compared the impact of our features against generally relevant studies in the literature
      • Showed that the same features can play different roles in different platforms, or in different non-random datasets
  22. So what’s Next!
      • LOTS!
      • Apply the same study to more datasets from the same platforms, and from other platforms
      • Expand from replies to other engagement indicators
      • Improve classification of seeds/non-seeds with more common features
      • Further study the impact of topics and non-randomness on engagement dynamics
      • Take user type into account, e.g. posts from news agencies are more likely to be retweeted than replied to
  23. Questions!
      1. Why those specific datasets and platforms?
      2. What about platform-specific features?
      3. Could we ever get a full understanding of these dynamics across all social platforms?
      4. Could these findings be used to increase engagement?
      5. Who’s right/wrong when the same feature appears to have conflicting impact on the same platform?
      6. Couldn’t it be the case that the same feature is used differently in different platforms?
      7. How could we study event-specific engagement dynamics?
  24. @halani, harith-alani, http://people.kmi.open.ac.uk/harith/
      ACM Web Science Conference (WebSci) 2014, middle of nowhere!
