Real-World Behavior Analysis through a Social Media Lens



In this paper, using a large amount of data collected from Twitter, the blogosphere, social networks, and news sources, we perform preliminary research to investigate whether human behavior in the real world can be understood by analyzing social media data. The goal of this research is twofold: (1) determining the relative effectiveness of a social media lens in analyzing and predicting real-world collective behavior, and (2) exploring the domains and situations in which social media can be a predictor of real-world behavior. We develop a four-step model: community selection, data collection, online behavior analysis, and behavior prediction. The results of this study show that in most cases social media is a good tool for estimating attitudes, but further research is needed before it can be used to predict social behavior.

Published in: Technology, Business

  • Let's see what we mean by a social media lens. The gap is between our analysis based on social media (SM) data and analysis based on real-world data. Is there any way we can get to a real-world analysis by using SM data? If so, there will be many interesting applications for social scientists, politicians, opinion miners, market researchers, ...
  • Social events, Arab Spring
  • To what extent can we predict the election results? How accurate is our prediction? Opinion mining and
  • Finance and markets are another interesting domain: stock market prediction, the rise and fall of the stock market. The best scenario would be “predicting the stock market”.
  • Can we predict the GOP candidate using social media data, e.g., numbers from Facebook and Twitter? Why not? Not all American voters are on Facebook and have liked their candidate. Not all of those on Facebook who liked a candidate are allowed to vote. Not even all eligible voters on Facebook who liked a specific candidate are going to vote for him! In this research we want to investigate the correlation between results from SM and real-world (RW) data: the same results, opposite results, or a relationship that is vague or hard to discover.
  • Active investigation. Passive investigation. We are looking for clues to discover the next collective behavior.
  • Does having a messy desk mean being disorganized or busy? Does what’s on your desk reveal what’s on your mind? Do those pictures on your walls tell true tales about you? And is your favorite outfit about to give you away? For the last ten years psychologist Sam Gosling has been studying how people project (and protect) their inner selves. By exploring our private worlds (desks, bedrooms, even our clothes and our cars), he shows not only how we showcase our personalities in unexpected (and unplanned) ways, but also how we create personality in the first place, communicate it to others, and interpret the world around us. Gosling, one of the field’s most innovative researchers, dispatches teams of scientific snoops to poke around dorm rooms and offices, to see what can be learned about people simply from looking at their stuff. What he has discovered is astonishing: when it comes to the most essential components of our personalities, from friendliness to flexibility, the things we own and the way we arrange them often say more about us than even our most intimate conversations. If you know what to look for, you can figure out how reliable a new boyfriend is by peeking into his medicine cabinet or whether an employee is committed to her job by analyzing her cubicle. Bottom line: The insights we gain can boost our understanding of ourselves and sharpen our perceptions of others. Packed with original research and fascinating stories, Snoop is a captivating guidebook to our not-so-secret lives.
  • Ali: I think there is a recent paper about a negative result for the second bullet (box office). We know that the first bullet is not really prediction.
  • To investigate this we propose a four-step model. Finding a good population in social media is the first step. We need representative groups both in the real world and in online social media (find a good mapping). We need to collect data from Egyptians, not from Americans here tweeting from Starbucks!
  • Frequency of words and sentences related to the event. Unigram, bigram, and n-gram analysis. Hashtag analysis.
  • An event suddenly happens and then creates lots of discussion in social media. There is a correlation between real-world events and social media conversations; we can observe it especially for big (nationwide) events. But this is not all: there are many more events in the real world without SM coverage, and many more that are not necessarily covered by SM.
  • It is challenging to find communities or groups that even partially represent a real-world group. For most political events, especially in non-democratic countries, it is extremely difficult to find representative real-world groups: people may not have access to social media; people do not want to express their true opinions in social media; and there are many paid spammers in social media, especially for political events.
  • Real-World Behavior Analysis through a Social Media Lens

    1. Real-World Behavior Analysis through a Social Media Lens. Mohammad-Ali Abbasi, Huan Liu (Computer Science and Engineering, Arizona State University); Sun-Ki Chai, Kiran Sagoo (Department of Sociology, University of Hawai`i). Ali2@asu.edu. Data Mining and Machine Learning Lab
    2. Real-World Behavior Analysis through a Social Media Lens. Real-world events/behavior
    3. Real-World Behavior Analysis through a Social Media Lens
    4. Real-World Behavior Analysis through a Social Media Lens
    5. Real-World Behavior Analysis through a Social Media Lens
    6. Any correlation between social media numbers and election results?
       • Mitt Romney: 1,520,000 and 370,000
       • Ron Paul: 900,000 and 260,000
       • Newt Gingrich: 295,000 and 1,447,000
       • Rick Santorum: 173,000 and 160,000
       • Barack Obama: 25,500,000 and 12,920,000
       Do we observe the same difference in the votes? Number of states carried?
    7. Objectives of the research
       • Studying the correlation between real-world collective behavior and social media data
       • Determining the relative effectiveness of a social media lens in analyzing and predicting real-world collective behavior
       • Exploring the domains and situations under which social media can be a predictor of real-world behavior
    8. Data collection
       • Active methods: experiments, surveys, field study
         – Expensive
         – Time consuming
         – Maybe dangerous
       • Social media
         – People leave many clues about themselves
         – Their interactions reveal much about people
       • Passive methods
         – We can passively observe people’s activities (by observing and analyzing): behavior, belongings, documents, ...
    9. Snooping. Experimental psychology suggests that a person may be understood by what happens around him
       • Does what's on your desk reveal what's on your mind?
       • Do those pictures on your walls tell true tales about your character?
    10. Using online data for opinion polling
       • From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series
       • O'Connor et al. analyzed the sentiment polarity of tweets and found a correlation of 80% with results from public opinion polls
    11. Some Existing Work
       • Stock market prediction using data collected from Twitter
       • Box-office revenue prediction for movies
       • Analyzing the Arab Spring using social media
       Most of the work in the field can be classified into two categories:
       • Behavior analysis and finding a correlation
       • Behavior prediction
    12. Our approach: A four-step model
       • Find equivalent groups in the real world and social media
       • Collect related online data from social media
       • Analyze online data (behavior)
       • Analyze the real-world behavior and find correlation
    13. Experimental settings
       • Find a group in the real world and social media
         – Select based on more stable characteristics: race, religion, primary language, and origin country/region
         – Arab Spring movement
       • Collect related online data from social media
         – Twitter to collect 35 million tweets related to the Arab Spring
         – Collect more than 1 million blog posts
         – 135,000 popular Facebook pages to collect data on posts, comments, and like behavior on Facebook
         – The data on real-world events has been collected from
       • Analyze online data (behavior)
         – Information retrieval techniques
         – Sentiment polarity analysis
       • Analyze the real-world behavior
         – Statistical methods
         – Correlational analysis
         – Multivariate regression analysis
    14. Correlation between online and real events. Time that the event happened in the real world
    15. Observations. Time that the event happened in the real world
    16. Observations
       • There could be correlations between real-world events and online discussions. However,
         – Correlation does not amount to prediction
         – Poor results for small events: many real-world events are left uncovered
         – Influence and cascade effects cause too much non-relevant discussion in social media
       • What we have experimented with
         – Finding influential people
         – Analyzing mood over the network
    17. What are people concerned about
    18. Challenges
       • Finding relevant communities
         – Analysis of Arab Spring tweets shows that 75 percent of the 1 million clicks on Libya-related tweets and 89 percent of the 3 million clicks on Egypt-related tweets came from outside of the Arab world [1]
         – The fallacy of millions of followers
       1-
    19. Challenges
       • Data collection
         – Sufficient coverage of the data
         – Source of data is unknown
         – Spam
         – Paid social media content
       • Online behavior analysis
         – Unstructured, noisy text data
         – Language ambiguity
    20. Observations. Real-world behavior prediction
         – Stark difference between a click and taking a real risk in the street
    21. Conclusions
       • Social media is helping us to understand real-world events but is not the sole source
       • More research and development is needed to make social media a reliable source for behavior analysis
       • Social event prediction using social media remains an open problem. More interdisciplinary research should be promoted.
    22. Thanks! Acknowledgments: This work is, in part, sponsored by ONR and AFOSR grants. We are grateful for the comments from anonymous reviewers and members of the DMML lab at ASU. Mohammad-Ali Abbasi
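
The last two steps of the four-step model (analyzing online data, then relating it to real-world behavior) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the sample posts, and the real-world series are all invented for the example, and a plain Pearson coefficient stands in for the "correlational analysis" named on the slides.

```python
import math
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")
WORD = re.compile(r"[a-z0-9']+")


def hashtags(text):
    """Return the lowercased hashtags found in one post."""
    return [tag.lower() for tag in HASHTAG.findall(text)]


def ngrams(text, n):
    """Return word n-grams (n=1 unigrams, n=2 bigrams, ...) of one post."""
    tokens = WORD.findall(text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def daily_mentions(posts, term, days):
    """Count, for each day, how many posts contain the given term."""
    counts = Counter(day for day, text in posts if term in text.lower())
    return [counts[day] for day in days]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (day, text) posts, invented purely for illustration.
posts = [
    (1, "Crowds gathering downtown #protest"),
    (2, "Huge #protest today, streets are full #Egypt"),
    (2, "Is the #protest still going on?"),
    (3, "Traffic is terrible this morning"),
    (3, "#Protest continues near the square"),
    (4, "Quiet evening in the city"),
]
days = [1, 2, 3, 4]

hashtag_counts = Counter(tag for _, text in posts for tag in hashtags(text))
bigram_counts = Counter(gram for _, text in posts for gram in ngrams(text, 2))

online = daily_mentions(posts, "#protest", days)
# Hypothetical real-world measure per day (e.g. reported events); made up.
real_world = [5, 20, 15, 0]
r = pearson(online, real_world)

print(hashtag_counts.most_common(2))
print(online, round(r, 3))
```

A high coefficient here would only indicate that online activity and the real-world measure moved together over the observed days; as the slides note, correlation does not amount to prediction, and the result depends heavily on whether the sampled accounts represent the real-world group.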