Gaining, Retaining and Losing Influence in Online Communities
Professor Adam N. Joinson
UWE Bristol
www.joinson.com
28th Feb 2014, SIIA conference
Acknowledgements
• Researchers: Simon Jones and James Dove
• Collaborators & advisors: Yorick Wilks, Louise Guthrie, Arthur Thomas, Martin Groen, Jan Noyes, Dawn Eubanks, Sam Hunter, John Horgan, Leon Watts, Andy Swarbrick, Niqi Cummings.
• Sponsors
What is ‘influence?’
What is online influence?
Active
Open
Connected
Quality Content
Consistent
Bath / UWE studies
Who are influentials?
How do they behave?
Who becomes influential?
Who loses influence? Why?
Samples
• RL
– ca. 2 million posts
– 35 subforums
– 7,000 active users
– 10 years archive
• IA
– ca. 500,000 posts
– 30 subforums
– 900 active users
– 7 year archive
• LH
– ca. 21,000 posts
– 20 subforums
– 250 active members
– 3 years archive
• Enron
– 520k emails, 150 people, 4 years
Data collection
• Developed our own ‘scraping’ system that works with circa 70% of forums on the Internet
• Hosted on a remote, offshore server
• Now contains > 5 million posts
• System runs over Tor
Meta-data collected & derived
• Structural Features
• In Degree
• Out Degree
• Reciprocity Features
• % of bi-directional Neighbours
• % of threads with reciprocal communication
• Persistence Features
• Average Posts per Thread
• Std Dev. Posts per Thread
• Average post length
• Std Dev. post length
• Content Features
• % of quotes posted
• % of posts containing ?’s
• % of posts containing URLs
Additional Meta-data
Time since joining
Tendency (RL only)
Initialisation Features
% of threads initiated by user
Diversity Features
% of threads participated in
% of sub forums participated in
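A minimal sketch of how the structural and reciprocity features listed above could be derived from a reply log. The record layout `(replier, original_poster, thread_id)`, the sample data and the function name `structural_features` are assumptions made for illustration; the slides do not describe the actual pipeline.

```python
from collections import defaultdict

# Hypothetical reply records: (replier, original_poster, thread_id).
replies = [
    ("ann", "bob", 1),
    ("bob", "ann", 1),
    ("cat", "ann", 1),
    ("ann", "cat", 2),
    ("dan", "ann", 3),
]

def structural_features(replies):
    """Return per-user in-degree, out-degree and % of bi-directional neighbours."""
    out_nbrs = defaultdict(set)  # users this user has replied to
    in_nbrs = defaultdict(set)   # users who have replied to this user
    for src, dst, _thread in replies:
        if src != dst:  # ignore self-replies
            out_nbrs[src].add(dst)
            in_nbrs[dst].add(src)
    features = {}
    for user in set(out_nbrs) | set(in_nbrs):
        neighbours = out_nbrs[user] | in_nbrs[user]
        mutual = out_nbrs[user] & in_nbrs[user]
        features[user] = {
            "in_degree": len(in_nbrs[user]),
            "out_degree": len(out_nbrs[user]),
            "pct_bidirectional": 100.0 * len(mutual) / len(neighbours),
        }
    return features

features = structural_features(replies)
for user in sorted(features):
    print(user, features[user])
```

Degrees here count distinct neighbours rather than individual replies; a weighted variant would count every reply instead.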
Reply-Network Construction
HOW DO ‘INFLUENTIALS’ USE LANGUAGE?
Part one
Sampling
• All posters with a single post removed
• Top 10% rep power and reputation chosen
• Stratified sample across remaining 90% matched
• 245 leaders / non-leaders from IA, 353 leaders / non-leaders from RL
Study 1: The language of ‘opinion leaders’ online
• Three step regression equations to predict leader vs. non-leader in IA / RL using:
– Linguistic markers (e.g. less 1st person, more past tense, more readability)
– Meta-data (URLs, question marks, num. posts)
• 90/10 split used for training / validating
• Final prediction accuracy: 85% (IA) and 94% (RL)
Group | Higher in opinion leaders | Lower in opinion leaders
Shared (RL & IA)
Past tense
Number of posts (total)
2nd person (‘you’)
Flesch Readability
Work words
1st person singular (‘I’)
Religion words
Ave. Word Count
Question marks
RL only
Negative Emotion
Adverbs
Words 6ltrs or more
Assent words
1st person plural (‘we’)
Positive Emotion
IA only
Assent words
Non-fluencies
Fillers
URL links
Words 6ltrs or more
Study 2: Language + Networks
• SNA metrics: Centrality, Page Rank, Clustering
• Meta-data: Activity levels
• LIWC: all 83 features
• Naïve Bayesian Classifier using 10-fold validation (supervised machine learning)
• IA and LH sample
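As a rough illustration of the 10-fold validation mentioned above: the sample is divided into ten folds, and each fold is held out once for testing while the classifier is trained on the other nine. The helper `k_fold_indices`, the seed and the sample size are assumptions for this sketch, and the classifier itself is left out.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)        # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]   # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), "training items,", len(test), "held-out items")
```

Accuracy would then be averaged over the ten held-out folds, so every user is classified exactly once by a model that never saw them during training.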
Background
• ‘The masses do not now take their opinions from dignitaries in Church or State, from ostensible leaders, or from books. Their thinking is done for them by men much like themselves, addressing or speaking in their name, on the spur of the moment . . .’ (John Stuart Mill, On Liberty)
• “. . . leadership at its simplest: it is casually exercised, sometimes unwitting and unbeknown, within the smallest groupings of friends, family members, and neighbors. It is not leadership on the high level of Churchill, nor of a local politico; it is the almost invisible, certainly inconspicuous form of leadership at the person-to-person level of ordinary, intimate, informal, everyday contact.” (Katz & Lazarsfeld, 1955)
Influence and the 2-step model
• Katz and Lazarsfeld (1955)
– Messages are intercepted by ‘influentials’
– “from radio and print to opinion leaders and from them to less active sections of the population” (p.32)
Accidental influencers: influence is due to position in the network (a contagion approach)
‘Cascades’ don’t differ in pattern, just speed and scale due to position (Watts & Dodds, 2007)
SNA measures
• Betweenness Centrality
– The number of shortest paths that pass through a vertex divided by the number of shortest paths in the network
– e.g. in a network of spies, who is the spy through whom most confidential information is likely to flow?
• Eigenvector Centrality
– A vertex’s eigenvector centrality is proportional to the sum of the eigenvector centralities of all vertices connected to it
– e.g. in a network of citations, who is the author most cited by other well-cited authors?
• PageRank
– The rank value indicates the importance of a particular page. A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank of all pages that link to it
• Clustering Coefficient
– The local clustering coefficient of a node in a network graph quantifies how close its neighbors are to being a clique (complete graph).
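Two of the measures above can be sketched in a few lines. This is an illustrative implementation, not the tooling used in the studies: the damping factor of 0.85, the iteration count, the function names and the toy graphs are all assumptions. In a reply network, a "link" from one user to another would be a reply.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank for a dict {node: [nodes it links to]}."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no out-links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in out_links.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank

def local_clustering(neighbours, node):
    """Fraction of pairs of a node's neighbours that are themselves connected."""
    nbrs = list(neighbours[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in neighbours[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

# Toy directed reply graph: a -> b -> c -> a, plus d -> a.
reply_graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(reply_graph)

# The same graph treated as undirected, for the clustering coefficient.
undirected = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(max(ranks, key=ranks.get), local_clustering(undirected, "a"))
```

Betweenness and eigenvector centrality need shortest-path and eigenvector computations respectively and are usually taken from a graph library rather than written by hand.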
Study 2: Classifier results
Community | % correctly classified | % non-leaders incorrectly classified as leaders | % leaders incorrectly classified as non-leaders
RL | 88.5% | 10.5% | 1%
IA | 83.0% | 13.9% | 3.1%
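A quick arithmetic check on the table above, assuming the three percentages in each row are shares of the whole classified sample (they sum to 100% in both communities). The dictionary keys and the helper `error_breakdown` are names chosen for this sketch.

```python
# Figures as reported on the slide (assumed to be percentages of all classified users).
results = {
    "RL": {"correct": 88.5, "false_leader": 10.5, "missed_leader": 1.0},
    "IA": {"correct": 83.0, "false_leader": 13.9, "missed_leader": 3.1},
}

def error_breakdown(row):
    """Return the total error and the share of errors that are false leaders."""
    total_error = row["false_leader"] + row["missed_leader"]
    return total_error, row["false_leader"] / total_error

for community, row in results.items():
    total_error, false_leader_share = error_breakdown(row)
    print(f"{community}: {total_error:.1f}% errors, "
          f"{false_leader_share:.0%} of them non-leaders labelled as leaders")
```

Under that reading, most misclassifications in both communities are non-leaders being labelled as leaders, not leaders being missed.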
Predictors: RL
Predictors: IA
So….
• Users with high reputation are characterised by:
• high activity / posting level
• highly connected, diverse networks
• relatively few 1st person singular; 3rd person depends on the context
• rhetorical flourishes
• Aristotle (350BC): ethos, pathos and/or logos
Study 3: Enron leadership
• Can these techniques be used to identify leaders elsewhere?
• Enron dataset: 150 users, 1.5m emails
• Cleaned to ‘pre 1991’ and ---original message--- removed.
• Job titles identified via existing sources, LinkedIn, court records.
Study 3: Method
• LIWC variables based on previous work (e.g. pronouns, tense, tentativeness, argumentation).
• Entered into Logistic Regression – Leaders (CEO -> Manager: n=69) vs. Non-Leaders (Traders -> Employee: n=71). Lawyers removed from data set.
• 74% accuracy
Study 3: Predictors
• Final model: R² = .41; 74% accuracy
• Pronouns: Less 1st person (‘I’), more 2nd person (‘you’)
• Argumentation: Less certainty, fewer ‘exclusion’ words, more tentativeness
• Emotion: More anxiety words
• Similar pattern then re: 1st person, negative emotion, ‘soft’ argumentation
Summary, Conclusions
• Possible to automate identification of opinion leaders online using:
– Meta-data, language, SNA: upwards of 85% accuracy
• Leadership more nuanced than expected:
– Softer – more tentative, less ‘bossy’
– Credible: readability, knowledge
• Methods might allow additional insights into the nature of opinion leadership and influence…
BECOMING INFLUENTIAL
Part two
Weakness of previous studies
• Reputation and reputation power based on vBulletin algorithm that is unknown
• You retain the highest level you’ve ever had, but people gain, and lose, position
• So, we adopted a behavioural approach to studying influence, using social roles
Study 4: Social roles
• Over 100 years of research on the topic. Social roles:
– Impose structure on interaction and organise behaviour
– They are “recognized, accepted, and used to accomplish pragmatic interaction goals in a community” (Callero, 1994, p. 232)
– Most easily identified through behavioural regularities and patterns of relationships – a ‘structural signature’ (Gleave et al., 2009)
Previous work (behaviours)
• Brush et al. (2005)
– Key contributor, Low volume replier, Questioner, Reader, Disengaged observer
• Golder & Donath (2004)
– Newbie, Celebrity, Lurker, Flamer, Troll, Ranter
• Kim (2000)
– Visitors, Novices, Regulars, Leaders, Elders
• Chan & Heyes (2010)
– Joining Conversationalists, Grunt, Taciturn, Popular Participants, Popular Initiator, Supporter, Ignored
• Waters & Gasson (2005)
– Initiator, Contributor, Facilitator, Knowledge-elicitor,
Roles in communities
Previous work (relations, SNA)
• Usenet: answer vs. discussion people
• Wikipedia: substantive experts vs. technical editors
Role = what you do + who you do it with
i.e. both behaviour and network
Method
• Expectation Maximization (EM) Clustering based on all meta-data (introduced earlier) using Weka 3.5
• Identifies clusters based on Gaussians (mean and covariance matrices)
• Assigns a probability distribution to each instance.
• Validated using 10 folds
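The idea behind EM clustering can be sketched for a single feature. This is a simplified one-dimensional, two-cluster illustration in plain Python, not the Weka implementation used in the study; the function `em_1d`, the starting means and the sample values are assumptions. Each instance ends up with a probability of belonging to each cluster, as described above.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_1d(data, means, iterations=50):
    """Fit a 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, responsibilities).
    """
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each cluster.
        resp = []
        for x in data:
            dens = [weights[c] * gaussian_pdf(x, means[c], variances[c]) for c in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from those probabilities.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            weights[c] = mass / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / mass
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / mass
            variances[c] = max(var, 1e-6)  # floor to avoid a collapsed cluster
    return weights, means, variances, resp

# Hypothetical "average posts per thread" values for eight users.
posts_per_thread = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
weights, means, variances, resp = em_1d(posts_per_thread, means=[0.0, 6.0])
print([round(m, 2) for m in means])
```

With several features per user, each cluster would carry a mean vector and a covariance matrix instead of a single mean and variance.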
Classifier Results
RL Classifier - 92% correctly classified
IA Classifier - 88% correctly classified
MAINTAINING INFLUENCE
Part three
Study 5: Do these roles change?
Active Population of IA Per Time Slice
Role Composition of IA Per Time Slice
Time slice (Oldest -> Newest)
Active Population of RL Per Time Slice
Role Composition of RL Per Time Slice
Time slice (Oldest -> Newest)
Conversationalist Popular Participant
Supporter
Reciprocating
Popular Participant
Information Provider
Low Volume
Supporter
Newbie Questioner
Taciturn
Contributor Collaborator Leader
Predicting movement between roles
• Multinomial logistic regression (RL only)
– Leader -> Leader
– Collaborator -> Leader
– Contributor -> Leader
– Contributor / Collaborator -> Contributor / Collaborator (reference category)
• Predictors:
– Network metrics (centrality, page rank)
– Language (Harvard Inquirer categories for influence), vector similarity, readability
– Some simple meta-data (e.g. URL posting)
Results
• Always a leader vs. never a leader – highly accurate (98%)
– More central, influential ties (‘page rank’), active (word count), readable, more ‘thanks’, ‘strong’ / ‘political’ language; fewer URLs, less ‘submissive’ language and fewer personal pronouns.
• Promotion from contributor (note: only 6% of contributors ever made this move)
– SNA centrality / page rank, word count, ‘political’ words, readability. 88% accurate in matched sample.
• Promotion from collaborator (around 70% accuracy)
– SNA page rank / centrality, vector similarity to expert texts, readability, thank rate, ‘political’ words.
Summary, Conclusions
• Identifiable social roles within these communities
• Most shared between the three, similar behavioural characteristics
• Large churn – sample role composition
– Suggests that a way of identifying the resilience of a community may be via the roles users inhabit
– And, targeting specific roles might improve / degrade the community functioning.
• It’s possible to predict who will become a leader using SNA + language + some behaviour.
LOSING INFLUENCE
Part four
Predicting movement between roles
• Multinomial logistic regression (RL only)
– Leader -> Leader (reference category)
– Leader -> Another role
– Leader -> Inactive
• Predictors (from the 6 months before the move):
– Network metrics (centrality, page rank)
– Some simple meta-data (e.g. URL posting)
– Language (Harvard Inquirer categories for influence)
• R²: .35 (SNA/meta-data), .43 (SNA/meta-data, language)
Results
• Influential -> Drop out
• Reduced betweenness, page rank
• Increased clustering
• Reduced thank rate (i.e. % thanks per post)
• Increased bi-directional conversations
• Increased question marks
• Increased use of: Weak language
• No other language differences
Results
• Influential -> Another role
• Reduced betweenness, pagerank
• Increased clustering
• Increased bi-directional conversations (for those who became joining conversationalists)
• Increased question marks
• Increased use of: weak language
• No other changes in language use
More on meta-data
RoI (thanks by num posts)
Summary
• Gaining influence is about:-
• being active, on topic, credible
• Maintaining influence:-
• Varied, active, useful, interested
• Losing influence:-
• Clustered, insular, doubtful
• Giving up altogether:-
• Lack of recognition?
• Next steps - the impact on behaviour
Thanks…Questions?
adam.Joinson@uwe.ac.uk
or @joinson

CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptx
 

Gaining, retaining and losing influence in online communities

  • 1. Gaining, Retaining and Losing Influence in Online Communities Professor Adam N. Joinson UWE Bristol www.joinson.com 28th Feb 2014, SIIA conference
  • 2. Acknowledgements • Researchers: Simon Jones and James Dove • Collaborators & advisors: Yorick Wilks, Louise Guthrie, Arthur Thomas, Martin Groen, Jan Noyes, Dawn Eubanks, Sam Hunter, John Horgan, Leon Watts, Andy Swarbrick, Niqi Cummings. • Sponsors
  • 4. What is online influence?
  • 7. Bath / UWE studies Who are influentials? How do they behave? Who becomes influential? Who loses influence? Why?
  • 8. Samples • RL – ca. 2 million posts – 35 subforums – 7,000 active users – 10 years archive • IA – ca. 500,000 posts – 30 subforums – 900 active users – 7 year archive • LH - ca. 21,000 posts - 20 subforums - 250 active members - 3 years archive • Enron - 520k emails, 150 people, 4 years
  • 11. Data collection • Developed our own ‘scraping’ system that works with circa 70% of forums on the Internet • Hosted on remote, offshore server • Now contains > 5 million posts • System runs via TOR encryption
  • 13. Meta-data collected & derived • Structural features: in degree, out degree • Reciprocity features: % of bi-directional neighbours, % of threads with reciprocal communication • Persistence features: average posts per thread, std. dev. of posts per thread, average post length, std. dev. of post length • Content features: % of quotes posted, % of posts containing ?’s, % of posts containing URLs • Additional meta-data: time since joining, tendency (RL only) • Initialisation features: % of threads initiated by user • Diversity features: % of threads participated in, % of sub forums participated in
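
A minimal sketch of how the structural and reciprocity features listed above might be derived from a reply network. The input format (a list of `(replier, replied_to)` pairs), the function name `reply_features`, and the choice to count distinct neighbours rather than individual replies are all assumptions made for illustration; the slides do not describe the actual pipeline.

```python
from collections import defaultdict


def reply_features(replies):
    """Compute per-user degree and reciprocity from (replier, replied_to) pairs.

    Hypothetical helper: degrees here count distinct neighbours, which is
    an assumption, not something the slides specify.
    """
    out_nbrs = defaultdict(set)  # users that each user has replied to
    in_nbrs = defaultdict(set)   # users who have replied to each user
    for src, dst in replies:
        if src == dst:
            continue  # ignore self-replies
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)

    features = {}
    for user in set(out_nbrs) | set(in_nbrs):
        neighbours = out_nbrs[user] | in_nbrs[user]
        mutual = out_nbrs[user] & in_nbrs[user]
        features[user] = {
            "in_degree": len(in_nbrs[user]),
            "out_degree": len(out_nbrs[user]),
            # share of neighbours with replies flowing in both directions
            "pct_bidirectional": 100.0 * len(mutual) / len(neighbours),
        }
    return features


# Toy reply log (invented data)
replies = [("ann", "bob"), ("bob", "ann"), ("cat", "ann"), ("ann", "dan")]
features = reply_features(replies)
```

In this toy log, `ann` has three neighbours but only exchanges replies in both directions with `bob`, so one third of her neighbours are bi-directional.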
  • 16. HOW DO ‘INFLUENTIALS’ USE LANGUAGE? Part one
  • 19. Sampling • All posters with a single post removed • Top 10% by rep power and reputation chosen • Stratified sample across remaining 90% matched • 245 leaders / non-leaders from IA, 353 leaders / non-leaders from RL
  • 20. Study 1: The language of ‘opinion leaders’ online • Three step regression equations to predict leader vs. non-leader in IA / RL using: – Linguistic markers (e.g. less 1st person, more past tense, more readability) – Meta-data (URLs, question marks, num. posts) • 90/10 split used for training / validating • Final prediction accuracy: 85% (IA) and 94% (RL)
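
One of the linguistic markers above is readability. As a rough illustration, the sketch below computes the Flesch Reading Ease score, 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words), where higher scores mean easier text. It assumes the slides' "Flesch Readability" refers to this variant, and the syllable counter is a crude vowel-group heuristic chosen for brevity, not the tooling used in the study.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of vowels (a rough heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher values indicate more readable text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


simple = flesch_reading_ease("The cat sat.")
dense = flesch_reading_ease("Organisational communication necessitates considerable deliberation.")
```

Short sentences of one-syllable words score highest, so `simple` comes out well above `dense`.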
  • 21. Table (columns: Group | Higher in opinion leaders | Lower in opinion leaders)
      Shared (RL & IA): Past tense; Number of posts (total); 2nd person (‘you’); Flesch Readability; Work words; 1st person singular (‘I’); Religion words; Ave. Word Count; Question marks
      RL only: Negative Emotion; Adverbs; Words 6ltrs or more; Assent words; 1st person plural (we); Positive Emotion
      IA only: Assent words; Non-fluencies; Fillers; URL links; Words 6ltrs or more
  • 22. Study 2: Language + Networks • SNA metrics: Centrality, Page Rank, Clustering • Meta-data: Activity levels • LIWC: all 83 features • Naïve Bayesian Classifier using 10-fold cross-validation (supervised machine learning) • IA and LH sample
  • 23. Background • ‘The masses do not now take their opinions from dignitaries in Church or State, from ostensible leaders, or from books. Their thinking is done for them by men much like themselves, addressing or speaking in their name, on the spur of the moment . . .’ (John Stuart Mill, On Liberty) • “. . . leadership at its simplest: it is casually exercised, sometimes unwitting and unbeknown, within the smallest groupings of friends, family members, and neighbors. It is not leadership on the high level of Churchill, nor of a local politico; it is the almost invisible, certainly inconspicuous form of leadership at the person-to-person level of ordinary, intimate, informal, everyday contact.” (Katz & Lazarsfeld, 1955)
  • 24. Influence and the 2-step model • Katz and Lazarsfeld (1955) – Messages are intercepted by ‘influentials’ – “from radio and print to opinion leaders and from them to less active sections of the population” (p.32)
  • 25. Accidental influencers: influence is due to position in the network (a contagion approach) ‘Cascades’ don’t differ in pattern, just speed and scale due to position (Watts & Dodds, 2007)
  • 26. SNA measures • Betweenness Centrality – The number of shortest paths that pass through a vertex divided by the number of shortest paths in the network – e.g. In a network of spies, who is the spy through whom most confidential information is likely to flow? • Eigenvector Centrality – A vertex’s eigenvector centrality is proportional to the sum of the eigenvector centralities of all vertices connected to it – e.g. In a network of citations, who is the author most cited by other well-cited authors? • PageRank – The rank value indicates the importance of a particular page. A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank of all pages that link to it • Clustering Coefficient – The local clustering coefficient of a node in a network graph quantifies how close its neighbors are to being a clique (complete graph).
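
Two of the measures above are compact enough to sketch directly. The code below computes the local clustering coefficient on an undirected graph and PageRank by power iteration on a directed graph. The graph encodings, the damping factor of 0.85, and the even redistribution of rank from nodes with no outgoing links are assumptions for illustration; betweenness and eigenvector centrality are omitted for brevity, and none of this is the study's own implementation.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank; out_links maps node -> list of targets.

    Every target must also appear as a key in out_links.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # dangling node: spread its rank evenly over all nodes
                for t in nodes:
                    new[t] += damping * rank[v] / n
        rank = new
    return rank


# Toy graphs (invented data)
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
out_links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(out_links)
```

In the directed toy graph, `a` receives links from both `c` and `d`, so it ends up with the highest rank; the ranks sum to 1.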
  • 28. Study 2: Classifier results
      Community | % correctly classified | % non-leaders incorrectly classified as leaders | % leaders incorrectly classified as non-leaders
      RL | 88.5% | 10.5% | 1%
      IA | 83.0% | 13.9% | 3.1%
  • 31. So… • Users with high reputation are characterised by: • high activity / posting level • highly connected, diverse networks • relatively few 1st person singular pronouns; 3rd person depends on the context • rhetorical flourishes • Aristotle (350BC): ethos, pathos and/or logos
  • 32. Study 3: Enron leadership • Can these techniques be used to identify leaders elsewhere? • Enron dataset: 150 users, 1.5m emails • Cleaned to ‘pre 1991’ and ---original message--- removed. • Job titles identified via existing sources, LinkedIn, court records.
  • 33. Study 3: Method • LIWC variables based on previous work (e.g. pronouns, tense, tentativeness, argumentation). • Entered into Logistic Regression – Leaders (CEO -> Manager: n=69) vs. Non-Leaders (Traders -> Employee: n=71). Lawyers removed from data set. • 74% accuracy
  • 34. Study 3: Predictors • Final model: R² = .41: 74% accuracy • Pronouns: Less 1st person (‘I’), more 2nd person (‘you’) • Argumentation: Less certainty, fewer ‘exclusion’ words, more tentativeness • Emotion: More anxiety words • Similar pattern then re: 1st person, negative emotion, ‘soft’ argumentation
  • 35. Summary, Conclusions • Possible to automate identification of opinion leaders online using: – Meta-data, language, SNA: upwards of 85% accuracy • Leadership more nuanced than expected: – Softer – more tentative, less ‘bossy’ – Credible: readability, knowledge • Methods might allow additional insights into the nature of opinion leadership and influence…
  • 37. Weakness of previous studies • Reputation and reputation power based on vBulletin algorithm that is unknown • You retain the highest level you’ve ever had - but people gain, and lose, position • So, we adopted a behavioural approach to studying influence, using social roles
  • 38. Study 4: Social roles • Over 100 years of research on the topic. Social roles: – Impose structure on interaction and organise behaviour – They are, “recognized, accepted, and used to accomplish pragmatic interaction goals in a community” (Callero, 1994, p. 232) – Most easily identified through behavioural regularities and patterns of relationships – a ‘structural signature’ (Gleave et al., 2009)
  • 39. Previous work (behaviours) • Brush et al. (2005) – Key contributor, Low volume replier, Questioner, Reader, Disengaged observer • Golder & Donath (2004) – Newbie, Celebrity, Lurker, Flamer, Troll, Ranter • Kim (2000) – Visitors, Novices, Regulars, Leaders, Elders • Chan & Hayes (2010) – Joining Conversationalists, Grunt, Taciturn, Popular Participants, Popular Initiator, Supporter, Ignored • Waters & Gasson (2005) – Initiator, Contributor, Facilitator, Knowledge-elicitor,
  • 41. Previous work (relations, SNA) • Usenet: answer people vs. discussion people • Wikipedia – substantive experts vs. technical editors
  • 42. Role = what you do + who you do it with, i.e. both behaviour and network
  • 43. Method • Expectation Maximization (EM) clustering based on all meta-data (introduced earlier) using Weka 3.5 • Identifies clusters modelled as Gaussians (mean and covariance matrices) • Assigns a probability distribution over clusters to each instance • Validated using 10 folds
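
To make the EM step concrete, here is a minimal sketch of Expectation Maximization for a one-dimensional Gaussian mixture. It is not the Weka implementation used in the study, which clusters on the full multi-dimensional meta-data. The function names, the fixed iteration count, the initial means, and the toy data are all assumptions for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, iterations=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means.

    Returns (means, variances, weights, responsibilities), where
    responsibilities[i][j] is the probability that point i belongs to
    cluster j.
    """
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: soft-assign each point to every cluster
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                    for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate parameters from the soft assignments
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var)  # guard against collapse
    return means, variances, weights, resp


# Toy feature values, e.g. average posts per thread (invented data)
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
means, variances, weights, resp = em_gmm_1d(data, means=[0.0, 6.0])
```

Each row of `resp` is the per-instance probability distribution over clusters mentioned on the slide; in this toy data the two groups separate cleanly around 1.0 and 5.0.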
  • 52. Classifier Results • RL Classifier - 92% correctly classified • IA Classifier - 88% correctly classified
  • 55. Study 5: Do these roles change?
  • 56. Active Population of IA Per Time Slice
  • 57. Role Composition of IA Per Time Slice (time slices ordered oldest -> newest)
  • 58. Active Population of RL Per Time Slice
  • 59. Role Composition of RL Per Time Slice (time slices ordered oldest -> newest)
  • 61. Conversationalist Popular Participant Supporter Reciprocating Popular Participant Information Provider Low Volume Supporter Newbie Questioner Taciturn Contributor Collaborator Leader
  • 62. Predicting movement between roles • Multinomial logistic regression (RL only) – Leader -> Leader – Collaborator -> Leader – Contributor -> Leader – Contributor / Collaborator -> Contributor / Collaborator (reference category) • IVs: – Network metrics (centrality, page rank) – Language (Harvard Inquirer categories for influence), vector similarity, readability – Some simple meta-data (e.g. URL posting)
  • 63. Results • Always a leader vs. never a leader – highly accurate (98%) – More central, influential ties (‘page rank’), active (word count), readable, more ‘thanks’, ‘strong’ / ‘political’ language; fewer URLs, less ‘submissive’ language and fewer personal pronouns. • Promotion from contributor (note: only 6% of contributors ever made this move) – SNA centrality / page rank, word count, ‘political’ words, readability. 88% accurate in matched sample. • Promotion from collaborator (around 70% accuracy) – SNA page rank / centrality, vector similarity to expert texts, readability, thank rate, ‘political’ words.
  • 64. Summary, Conclusions • Identifiable social roles within these communities • Most shared between the three, similar behavioural characteristics • Large churn – sample role composition – Suggests that a way of identifying the resilience of a community may be via the roles users inhabit – And, targeting specific roles might improve / degrade the community functioning. • It’s possible to predict who will become a leader using SNA + language + some behaviour.
  • 67. Predicting movement between roles • Multinomial logistic regression (RL only) – Leader -> Leader (reference category) – Leader -> Another role – Leader -> Inactive • IVs (from 6 months previous to move): – Network metrics (centrality, page rank) – Some simple meta-data (e.g. URL posting) – Language (Harvard Inquirer categories for influence) • R² = .35 (SNA/meta-data), .43 (SNA/meta-data, language)
  • 68. Results • Influential -> Drop out • Reduced betweenness, page rank • Increased clustering • Reduced thank rate (i.e. % thanks per post) • Increased bi-directional conversations • Increased question marks • Increased use of: Weak language • No other language differences
  • 69. Results • Influential -> Another role • Reduced betweenness, page rank • Increased clustering • Increased bi-directional conversations (for those who became joining conversationalists) • Increased question marks • Increased use of: weak language • No other changes in language use
  • 70. More on meta-data RoI (thanks by num posts)
  • 71. Summary • Gaining influence is about:- • being active, on topic, credible • Maintaining influence:- • Varied, active, useful, interested • Losing influence:- • Clustered, insular, doubtful • Giving up altogether:- • Lack of recognition? • Next steps - the impact on behaviour