Using Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
Presented at:
-Aston Business School, Birmingham, UK. 2011
-Keynote presentation at Detecting and Exploiting Cultural Diversity on the Social Web Workshop, 20th Annual Conference on Information and Knowledge Management 2011

Published in: Technology, Education, Business

  • Myriad social web systems exist; at the heart of such systems is the user, who drives action and content. These systems differ.
  • Solitary feature sets: content features produce the best predictive performance on both systems. All features: produces the best performance.
  • Twitter: time in day shows a no-reply zone; higher out-degree = more likely to get a reply. Boards.ie: higher referral counts correlated with non-seeds. Both: lower informativeness correlated with seeds (use language that is familiar to the platform’s users); readability lower for seeds, although harder to see.
  • Twitter: content quality once again important. Boards.ie: user features more important this time, unlike content before. SCN: similar to Boards.ie, user features most important. Digg: focus information of the user most important in this case. Best models: Twitter = content; Boards.ie = content and focus; SCN = user and content; Digg = content and focus, like Boards.ie.
  • In-degree ratio = concentration. Posts replied ratio = popularity. Thread initiation ratio = propensity to initiate discussions. Bi-directional threads ratio = reciprocity and interaction. Bi-directional neighbours ratio = reciprocity. Average posts per thread = level of discussion. SD of posts per thread = captures variance of discussions.
  • Maintain a mapping between each feature and a level (low, mid, high). Enables dynamic derivation of the feature levels.
  • Increase in Elitists and Participants is associated with increased activity: users who communicate often with other users. Increase in Taciturns and Ignored is associated with decreased activity: Taciturns contribute little.
  • Common patterns across all three forums analysed. Certain roles are more important than others in differing communities: Conversationalists are important in Commuting and Transport and in Rugby, but not in Mobile Phones and PDAs, where conversation is not a driving factor. Supporters found to negatively impact activity in forum 411, again because conversation is not a common action in the community: more interested in support.
  • Increase in Joining Conversationalists and Popular Participants correlates with increased activity. Decrease in Supporters and Ignored users correlates with increased activity. No Elitists or Grunts! Lower correlations: behaviour roles that fit well in one system are not the same in another. Different behaviour.
  • Activity increases as the composition reaches a relatively stable setting, i.e. little variation and fluctuation in the roles.
  • Composition stability is associated with increased activity in 246 and 411. Fluctuation in activity in the rugby forum correlated with variation in roles.
  • No Elitists or Grunts!
  • Best results for 246: steady increase in activity over time. Worst results for 388: fluctuation in composition and activity making it hard to perform predictions. Cross-community patterns are not reliable: idiosyncratic behaviour in each community.
  • 895 (Celebrity & Showbiz) and 452 (For Sale: Computer Hardware). Outliers: 7: After Hours; 47: Motors; 151: Soccer; 908: Beer Guts and Receding Hair.
  • Transcript

    • 1. Using Behaviour Analysis to Detect Cultural Aspects in Social Web Systems. Dr Matthew Rowe, Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom. http://people.kmi.open.ac.uk/rowe | http://www.matthew-rowe.com
    • 2. Web 1.0 • Web of documents • Web presence constrained to HTML ‘experts’ • Fixed categories • Static content http://www.flickr.com/photos/complexify/97303317/
    • 3. Web 2.0 • Data access through APIs • Collective Intelligence • User generated content • Web presence for all • Tagging http://www.flickr.com/photos/9119028@N05/591163479
    • 4. A Social Web: A Social Web System is an online platform that offers a useful service, normally for free, to users, through which they can interact and network http://mmt.me.uk/slides/deri20110401/images/walledgardens.jpg
    • 5. Example 1
    • 6. Example 2
    • 7. Δs of Social Web Systems • Social Web Systems differ in their: – Domain • Flickr = photos • Facebook = social networking • Twitter = microblogging – Audience • SAP Community Network = programmers • Slashdot = technology enthusiasts • How else do they differ? • What are the Δs?
    • 8. The Utility of Behaviour Analysis • WeGov – Investigating the role of social networks in eGovernment – Enabling: • Tracking of political discussions and topics • Injection of policy content to maximise exposure • ROBUST – Risk and opportunity management in online communities – Enabling: • Assessment of user churn in online communities • Community evolution prediction • Monitoring of community health • Behaviour analysis is required to understand: – What behaviour drives content creation – How behaviour is associated with community evolution
    • 9. Thesis: Microcultures. Social Web Systems contain micro-cultures that differ in terms of a) user behaviour b) how attention is generated c) role compositions in such systems
    • 10. Outline • Analysis 1: Generating Attention – Understanding Attention Factors – Approach – Experiments – Findings • Analysis 2: Behaviour Role Compositions – Analysing Community Evolution – Approach – Experiments – Findings • Microcultures: Evidence
    • 11. Analysis I: Generating Attention
    • 12. Shared Content • Social Web Systems are now used to: – Ask questions – Post opinions and ideas – Discuss events and current issues • Content analysis in online communities is attractive for: – Market analysis – Brand consensus and product opinion • Social network analytics in the US is predicted to reach $1 billion by 2014 (Forrester 2009) • Masses of data are now being published in social web systems: – Facebook has more than 60 million status updates per day (Facebook statistics 2010)
    • 13. (no transcribed text)
    • 14. The Need for Analysis • Analysts need to know which piece of content will generate the most activity – i.e. the most auspicious or influential – Helps focus the attention of human and computerised analysts • What to track? • Need to understand the effect features (community and content) have on attention to content. Which features are key to stimulating activity? How do these features influence activity length? How do Social Web Systems differ in how attention is generated?
    • 15. Approach: Attention Prediction • Two-stage approach to predict attention to content: 1. Identify seed posts • E.g. thread starters on a message board • Will a given post start a discussion? • What are the properties that seed posts exhibit? – What parameters tend to trigger a discussion? 2. Predict discussion activity levels • From the identified seed posts • What is the level of activity that a seed post will generate? • What features correlate with heightened activity?
    • 16. Features: Which features are key to stimulating activity? • For each post, model: a) the author, b) the content and c) the topical concentration of the author • F1: User Features – In-degree, out-degree: social network properties of the author – Post count, age, post rate: participation information of the author • F2: Content Features – Post length, referral count, time in day: surface features of the post – Complexity: cumulative entropy of terms in the post – Readability: Gunning Fog index of the post – Informativeness: TF-IDF measure of terms within the post – Polarity: average sentiment of terms in the post
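As a rough illustration of two of the content features listed on slide 16, the sketch below computes a term-entropy complexity score and the Gunning Fog readability index for a single post. The tokeniser, the vowel-group syllable heuristic and all function names are assumptions made for this example; the slides do not say how "cumulative entropy" or syllable counting were actually implemented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case word tokeniser (an assumption; the slides do not name one)."""
    return re.findall(r"[a-z']+", text.lower())


def term_entropy(text):
    """Shannon entropy (bits) of the term distribution within one post."""
    terms = tokenize(text)
    if not terms:
        return 0.0
    total = len(terms)
    counts = Counter(terms)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def count_syllables(word):
    """Crude syllable estimate: number of vowel groups, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def gunning_fog(text):
    """Gunning Fog index: 0.4 * (words per sentence + 100 * complex-word share)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        return 0.0
    # "Complex" words are approximated as words with three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))
```

A post that repeats one term has zero entropy, while a post with a wider vocabulary scores higher, which is the sense in which the findings slides speak of "wider vocabulary".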
    • 17. Features (2) • F3: Focus Features – Topic entropy: the concentration of the author across community forums • Higher entropy indicates a wider spread of forum activity • More random distribution, less concentrated – Topic Likelihood: the likelihood that a user posts in a specific forum given his post history • Measures the affinity that a user has with a given forum • Lower likelihood indicates a user posting on an unfamiliar topic
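The two focus features can be sketched as follows, assuming an author's history is simply the list of forum names they have posted in. The unsmoothed relative-frequency estimate for topic likelihood is an assumption; the slides do not state whether any smoothing was applied.

```python
import math
from collections import Counter


def topic_entropy(forum_history):
    """Entropy (bits) of an author's post distribution across forums."""
    total = len(forum_history)
    if total == 0:
        return 0.0
    counts = Counter(forum_history)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def topic_likelihood(forum_history, forum):
    """Share of the author's past posts made in `forum` (unsmoothed estimate)."""
    if not forum_history:
        return 0.0
    return forum_history.count(forum) / len(forum_history)


# Hypothetical author who posts mostly in one forum.
history = ["rugby", "rugby", "rugby", "motors"]
rugby_affinity = topic_likelihood(history, "rugby")    # high affinity
soccer_affinity = topic_likelihood(history, "soccer")  # unfamiliar forum
```

An author spread evenly over many forums gets a high entropy, while one who posts in a single forum gets zero, matching the "more random distribution, less concentrated" reading on the slide.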
    • 18. Social Web Systems: Datasets • Microblogging Platform: Twitter – Collected a random subset over 24-hour period – Attention measure: length of @reply chain • Community Message Board: Boards.ie – Analysed all posts and forums in 2006 – Attention measure: number of posts in a thread • Support Forum: SAP Community Network – Attention measure: number of replies to a question • News-sharing Platform: Digg – Used previous dataset of ‘popular’ stories – Attention measure: number of comments (and replies) to a story
    • 19. Experiments • Experiment 1: Identifying Seed Posts – Will this post yield a reply? – Experiment 1(a): Model Selection • Which model performs best? – Experiment 1(b): Feature Assessment • How do features correlate with seed posts? – Datasets: Twitter and Boards.ie • Experiment 2: Activity Level Prediction – What is the level of activity that seed posts yield? – Experiment 2(a): Model Selection – Experiment 2(b): Feature Assessment • How do features correlate with heightened attention? – Datasets: Twitter, Boards.ie, SCN and Digg
    • 20. Experiments • Experiment 1: Identifying Seed Posts – Will this post yield a reply? – Experiment 1(a): Model Selection • Which model performs best? – Experiment 1(b): Feature Assessment • How do features correlate with seed posts? – Datasets: Twitter and Boards.ie • Experiment 2: Activity Level Prediction – What is the level of activity that seed posts yield? – Experiment 2(a): Model Selection – Experiment 2(b): Feature Assessment • How do features correlate with heightened attention? – Datasets: Twitter, Boards.ie, SCN and Digg
    • 21. Results: 1(a) Model Selection • Which model performs best? [Panels: Twitter, Boards.ie]
    • 22. Results: 1(b) Feature Assessment • How do features correlate with seed posts?
    • 23. Results: 1(b) Feature Assessment [Panels: Twitter, Boards.ie]
    • 24. Experiments • Experiment 1: Identifying Seed Posts – Will this post yield a reply? – Experiment 1(a): Model Selection • Which model performs best? – Experiment 1(b): Feature Assessment • How do features correlate with seed posts? – Datasets: Twitter and Boards.ie • Experiment 2: Activity Level Prediction – What is the level of activity that seed posts yield? – Experiment 2(a): Model Selection – Experiment 2(b): Feature Assessment • How do features correlate with heightened attention? – Datasets: Twitter, Boards.ie, SCN and Digg
    • 25. Activity Distribution [Panels: Twitter, Boards.ie, SCN, Digg] 1. Predict a ranking 2. Compare ranking against ground truth 3. Measure using Normalised Discounted Cumulative Gain @ varying ranks (k) • k={1,5,10,20,50,100} 4. Best model: highest nDCG averaged over k
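The evaluation steps on slide 25 can be sketched as below. This assumes one common nDCG formulation (linear gain with a log2(rank + 1) discount); the slides do not say which variant was used, and the function names are illustrative only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(predicted_order, true_activity, k):
    """nDCG@k for a predicted ranking of post ids given observed activity."""
    gains = [true_activity[post] for post in predicted_order]
    # The ideal ranking orders posts by their observed activity, highest first.
    ideal = sorted(true_activity.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(predicted_order, true_activity, ks=(1, 5, 10, 20, 50, 100)):
    """Average nDCG over the rank cut-offs listed on the slide."""
    return sum(ndcg_at_k(predicted_order, true_activity, k) for k in ks) / len(ks)
```

Under this sketch, the "best model" of step 4 would be the one whose predicted ranking gives the highest `mean_ndcg` against the observed activity levels.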
    • 26. Results: 2(a) Model Selection • Which model performs best?
    • 27. Results: 2(b) Feature Assessment • How do features correlate with heightened attention?
    • 28. Results: 2(b) Feature Assessment • How do features correlate with heightened attention? • Heightened Activity on Twitter = • Shorter posts • Denser vocabulary • Fewer hyperlinks • Earlier in the day!
    • 29. Results: 2(b) Feature Assessment • How do features correlate with heightened attention? • Heightened Activity on Boards.ie = • Concentrated topics • Longer posts • Wider vocabulary • Fewer referrals • Negative sentiment
    • 30. Results: 2(b) Feature Assessment • How do features correlate with heightened attention? • Heightened Activity on SCN = • Less author participation • Contacted fewer people • User contacted by many people • Longer posts • Wider vocabulary • More hyperlinks
    • 31. Results: 2(b) Feature Assessment • How do features correlate with heightened attention? • Heightened Activity on Digg = • Concentrated topics • Longer posts • Later in the day • Familiar community terms
    • 32. Generating Attention: Findings. How do Social Web Systems differ in how attention is generated? • Commonalities – Fewer hyperlinks for microblogging platforms and discussion message boards – Use familiar language to the community – Negative content yields more activity – Activity distribution • Idiosyncrasies – More hyperlinks on support forums – Lower topic affinity on news-sharing system – Models differ: a) best performing, b) coefficients: • Content: Twitter • User: Boards.ie, SCN • Focus: Digg • What drives attention in one system is not the same as another. Reference: Anticipating Discussion Activity on Community Forums. M Rowe, S Angeletou and H Alani. The Third IEEE International Conference on Social Computing. Boston, USA. (2011)
    • 33. Analysis II: Behaviour Role Compositions
    • 34. Online Communities in Social Web Systems • Social Web Systems support online communities to function and grow, enabling: – Idea generation – Customer support – Problem solving • Managing and hosting communities can be – Expensive – Time-consuming • Social Web Systems have large investments, therefore they must: – flourish and remain active – remain… ‘healthy’
    • 35. Increased Community Activity: What did the community look like at the point?
    • 36. Decreased Community Activity: What were the conditions at this point?
    • 37. The Need to Assess Behaviour • How can we gauge community health? – Post Count? – Communication/Interaction? – Behaviour? • Domination of one behaviour could lead to churn – Preece, 2000 • Behaviour in online community is influenced by the roles that users assume – Preece, 2001 • To provide health insights we need to monitor behaviour over time – Combined with basic health metrics (e.g. post count) • Enabling detection of how behaviour differs between systems
    • 38. Modelling, Representing and Tracking Behaviour: How? • Users exhibit different behaviour in different contexts: – How can we model user behaviour and represent its change over time? • According to [Chan et al, 2010] users can be classified by their community role: – What behaviour correlates with community roles? – How can we label users as the system changes? • Communities evolve and change over time: – Is there a correlation between community composition and health? – Can we predict community changes based on composition data? How do Social Web Systems differ in terms of behaviour?
    • 39. Behaviour Ontology • How can we model user behaviour and represent its change over time? http://purl.org/net/oubo/0.3
    • 40. Behaviour Features • In-degree Ratio – Proportion of users that reply to user ui • Posts Replied Ratio – Proportion of posts by ui that yield a reply • Thread Initiation Ratio – Proportion of threads started by ui • Bi-directional Threads Ratio – Proportion of threads where ui is involved in a reciprocal action • Bi-directional Neighbours Ratio – Proportion of ui’s neighbours with whom a reciprocal action has taken place • Average Posts per Thread – Mean number of posts in the threads that ui has participated in • Standard Deviation of Posts per Thread – Standard deviation of posts in the threads that ui has posted in
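A minimal sketch of three of the behaviour features on slide 40, computed for one user over a window of posts. The `Post` record and the choice of denominators (all other users in the window; threads the user took part in) are assumptions for illustration, since the slides give only the one-line definitions.

```python
from collections import namedtuple

# Hypothetical post record: reply_to is the id of the parent post,
# or None for a thread starter.
Post = namedtuple("Post", "id author thread reply_to")


def behaviour_features(posts, user):
    """Compute three of the slide's ratios for `user` over a window of posts."""
    by_id = {p.id: p for p in posts}
    users = {p.author for p in posts}
    own_posts = [p for p in posts if p.author == user]

    # Distinct other users who replied to one of the user's posts.
    repliers = {p.author for p in posts
                if p.reply_to in by_id
                and by_id[p.reply_to].author == user
                and p.author != user}
    # Ids of the user's posts that received at least one reply.
    replied_to = {p.reply_to for p in posts
                  if p.reply_to in by_id and by_id[p.reply_to].author == user}

    threads_joined = {p.thread for p in own_posts}
    threads_started = {p.thread for p in own_posts if p.reply_to is None}

    others = len(users - {user})
    return {
        "in_degree_ratio": len(repliers) / others if others else 0.0,
        "posts_replied_ratio": (len(replied_to) / len(own_posts)
                                if own_posts else 0.0),
        "thread_initiation_ratio": (len(threads_started) / len(threads_joined)
                                    if threads_joined else 0.0),
    }
```

The remaining features (bi-directional ratios, mean and standard deviation of posts per thread) would follow the same pattern over the reply graph and thread sizes.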
    • 41. Behaviour Roles • Elitist • Grunt • Joining Conversationalist • Popular Initiator • Popular Participant • Supporter • Taciturn • Ignored. Source: Jeffrey Chan, Conor Hayes, and Elizabeth Daly. Decomposing discussion forums using common user roles. In Proc. Web Science Conf. (WebSci10), Raleigh, NC: US, 2010.
    • 42. Behaviour Roles (2) • What behaviour correlates with community roles? Table 1. Roles and the feature-to-level mappings:
        Role | Feature | Level
        Elitist | In-Degree Ratio | low
        Elitist | Bi-directional Threads Ratio | high
        Elitist | Bi-directional Neighbours Ratio | low
        Grunt | Bi-directional Threads Ratio | med
        Grunt | Bi-directional Neighbours Ratio | med
        Grunt | Average Posts per Thread | low
        Grunt | STD of Posts per Thread | low
        Joining Conversationalist | Thread Initiation Ratio | low
        Joining Conversationalist | Average Posts per Thread | high
        Joining Conversationalist | STD of Posts per Thread | high
        Popular Initiator | In-Degree Ratio | high
        Popular Initiator | Thread Initiation Ratio | high
        Popular Participants | In-Degree Ratio | high
        Popular Participants | Thread Initiation Ratio | low
        Popular Participants | Average Posts per Thread | med
        Popular Participants | STD of Posts per Thread | med
        Supporter | In-Degree Ratio | med
        Supporter | Bi-directional Threads Ratio | med
        Supporter | Bi-directional Neighbours Ratio | med
        Taciturn | Bi-directional Threads Ratio | low
        Taciturn | Bi-directional Neighbours Ratio | low
        Taciturn | Average Posts per Thread | low
        Taciturn | STD of Posts per Thread | low
        Ignored | Posts Replied Ratio | low
    • 43. Constructing and Applying Rules • How can we label users as the system changes? • Features: structural, social network, reciprocity, persistence, participation • Feature levels change with the dynamics of the community • Based on related work, we associate roles with a collection of feature-to-level mappings, e.g. in-degree -> high, out-degree -> high • Run rules over each user’s features and derive the community role composition
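The rule-based labelling described on slide 43 might be sketched like this. Tertile cut-offs are an assumed way of deriving the low/med/high levels from the community's current feature distribution (the slides only say the levels are derived dynamically), and only two of the Table 1 rules are encoded; the feature keys are hypothetical names.

```python
def level_cutoffs(values):
    """Tertile cut-offs for one feature across the community (assumed binning).

    Expects a non-empty list of values from the current time window.
    """
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def to_level(value, cutoffs):
    """Map a raw feature value to 'low', 'med' or 'high'."""
    low_cut, high_cut = cutoffs
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "med"
    return "high"


# Two rules taken from Table 1; the remaining roles follow the same pattern.
ROLE_RULES = {
    "Popular Initiator": {"in_degree_ratio": "high",
                          "thread_initiation_ratio": "high"},
    "Ignored": {"posts_replied_ratio": "low"},
}


def label_user(features, cutoffs_by_feature):
    """Return the roles whose feature-to-level constraints all hold for a user."""
    levels = {name: to_level(value, cutoffs_by_feature[name])
              for name, value in features.items()}
    return [role for role, rule in ROLE_RULES.items()
            if all(levels.get(f) == lvl for f, lvl in rule.items())]
```

Recomputing the cut-offs for every time window is what lets the same rule assign different roles to the same raw values as the community changes; counting the labels per window then gives the role composition.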
    • 44. Composition vs Activity • Is there a correlation between community composition and health? • Community Message board: Boards.ie – All posts used from 2004 – 2006 – Selected 3 forums for analysis • F246: Commuting and Transport • F388: Rugby • F411: Mobile Phones and PDAs • Support Forum: Tiddlywiki – Software development forum used by BT’s development team • Measured at 12-week increments: – Forum composition (% of roles) • E.g. 20% elitists, 10% grunts, etc – Number of posts
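The composition-versus-activity comparison can be illustrated with a per-role correlation against post counts, one observation per 12-week window. Pearson's coefficient is an assumption here (the slides do not name the correlation measure), and the numbers are made up for the example rather than taken from the talk.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, one entry per 12-week window
# (percentage of users in each role, and posts in that window).
composition = {
    "Popular Initiator": [10, 12, 15, 18],
    "Ignored": [30, 26, 22, 20],
}
post_counts = [400, 480, 600, 720]

correlations = {role: pearson(shares, post_counts)
                for role, shares in composition.items()}
```

In this toy data the share of initiators rises with activity and the share of ignored users falls, so the two roles come out with positive and negative coefficients respectively.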
    • 45. Correlation Results (1): Boards.ie Forum 246 – Commuting and Transport
    • 46. Correlation Results (2): Boards.ie [Panels: Forum 246 – Commuting and Transport; Forum 388 – Rugby; Forum 411 – Mobile Phones and PDAs]
    • 47. Correlation Results: Tiddlywiki
    • 48. Evolution Results (1): Boards.ie Forum 246 – Commuting and Transport
    • 49. Evolution Results (2): Boards.ie [Panels: Forum 246 – Commuting and Transport; Forum 388 – Rugby; Forum 411 – Mobile Phones and PDAs]
    • 50. Evolution Results: Tiddlywiki
    • 51. Predicting Community Health • Can we predict community changes based on composition data? 1. Activity Change Detection: – Predict either an increase or decrease in activity – Features: roles and percentages – Class label: increase/decrease – Performed 10-fold cross validation with J48 decision tree 2. Post Count Prediction: – Predict post count from role composition – Independent variables: roles and percentages – Dependent variable: post count – Induced linear regression model and assessed the model
    • 52. Activity Change Detection [Panels: Boards.ie, Tiddlywiki]
    • 53. Post Count Prediction [Panels: Boards.ie, Tiddlywiki]
    • 54. Post Count Prediction [Panels: Boards.ie, Tiddlywiki] • Increased Community Activity on Boards.ie = • More initiators • More participants • Fewer supporters • Fewer ignored
    • 55. Post Count Prediction [Panels: Boards.ie, Tiddlywiki] • Increased Community Activity on Tiddlywiki = • More conversationalists • More initiators • Fewer supporters • Fewer ignored
    • 56. Clustering Communities by Composition
    • 57. Behaviour Role Compositions: Findings • How do Social Web Systems differ in terms of behaviour? • Commonalities – No grunts in either system – Increase in ignored users and supporters decreases health – Increase in initiators increases activity • Idiosyncrasies – No elitists found on support-forum – Conversationalists improve activity in certain cases – Optimum behaviour compositions differ. Reference: Modelling and Analysis of User Behaviour in Online Communities. S Angeletou, M Rowe and H Alani. International Semantic Web Conference. Bonn, Germany. (2011)
    • 58. Thesis: Microcultures (Recap). Social Web Systems contain micro-cultures that differ in terms of a) user behaviour b) how attention is generated c) role compositions in such systems
    • 59. Microcultures: Evidence • Social Web Systems contain micro-cultures that differ in terms of – a) User behaviour • Non-existence of roles in certain communities • Conversation behaviour important in certain communities – b) How attention is generated • Differences in optimum prediction models • Factors differ in driving activity – E.g. referrals, topic affinity – c) Role compositions in such systems • Intra and inter composition differences
    • 60. Questions? Web: http://people.kmi.open.ac.uk/rowe http://www.matthew-rowe.com Email: m.c.rowe@open.ac.uk Twitter: @mattroweshow