Analyzing events through the lens of social media

32nd Sunbelt Conference on Social Networks, organized by the International Network for Social Network Analysis.

Published in: News & Politics, Technology
Notes
  • Middle East and North Africa (MENA) region
  • Thailand, Philippines, Cambodia
  • In 2006, Time magazine selected “You” as its Person of the Year. By 2011, Web 2.0 and social media had enabled, or more precisely helped, people to topple decades-old authoritarian regimes in the MENA region. Essentially, what people had tried to do for the previous 40 years, social media helped them accomplish in 5.
  • Social media has irreversibly transformed how people communicate, organize, mobilize, and respond.
  • Formation of collective action, manifestation of social movements, etc.
  • Hundreds of millions of blogs, billions of tweets, several thousand YouTube videos, and a large number of region-specific sources. Strictly network-based approaches usually do not perform well.
  • The Georgian Cyber Campaign (2008) brought Internet traffic to a standstill in the Republic of Georgia. The attacks, which coincided with the Russian military’s invasion of Georgia, were carried out in large part by civilians and Russian crime gangs. They were significant in that they made it almost impossible for citizens and officials to communicate about what was happening on the ground during the military operation. According to a US Cyber Consequences Unit (US-CCU) special report on this cyber campaign, published in August 2009, social networking forums were the primary means used to recruit and arm the attackers. Social media has a key role in monitoring and tracking cyber-threats.
  • Chicken-and-egg problem: to identify a good event dictionary you need a good source, and to identify a good source you need an event dictionary.
  • ef-iEf is a generalization of the content-analysis measure tf-idf.
  • Alchemy API was used to extract entities. Our approach looks at the quality (closeness) of the entities and not just their quantity, so it is robust to the skewed distribution depicted above.
  • This also motivates the need for studying region-specific, often non-English-language, sources.
  • In order to select highly specific sources, we propose a novel ‘specificity’ measure, which estimates the unique information that a source ($S_k$) can offer vis-à-vis an event ($E_i$). It is important to note that a source’s specificity is always estimated with respect to a given event. The measure draws upon the theory of information gain and is denoted κ. Mathematically, $\kappa = IG(E_i, S_k) = H(E_i) - H(E_i, S_k)$, where $IG(E_i, S_k)$ denotes the information gain for a source related to an event $E_i$; $S_k$ is the k-th source in the set of sources for the event $E_i$; $E_i$ is the i-th event in the set of events selected for the study; $H(E_i)$ is the total entropy for the event $E_i$; and $H(E_i, S_k)$ is the total entropy of the source $S_k$ related to the event $E_i$. Since $H(E_i)$ is constant for a given event, $IG(E_i, S_k)$ varies directly with $-H(E_i, S_k)$, so we only calculate the values of $H(E_i, S_k)$ in order to find κ. The formulation of κ discussed above is generic. Section 6.3.1 explains a specific implementation.
  • Transcript of "Analyzing events through the lens of social media"

    1. Analyzing Events Through the Lens of Social Media
       Debanjan Mahata (dxmahata@ualr.edu), Nitin Agarwal (nxagarwal@ualr.edu)
       University of Arkansas at Little Rock
       This work is supported in part by grants from the US Office of Naval Research (ONR) and US National Science Foundation (NSF).
    2. Outline
       • Introduction
       • Motivation
       • Challenges
       • Proposed Framework
       • Data collection and processing
       • Experiments: Results and Analysis
       • Looking Ahead
    3. Introduction: Socio-Political Events
       • Tunisia Revolution
       • Egypt Revolution
       • Bahrain Protest
       • Libya Revolution
       • Morocco Protest
       • Algeria Protest
       • Yemen Protest
       • …, among others.
    4. Introduction: Economic Events
       • Spanish Indignants Movements (Spanish protests, 15-M)
       • #Occupy worldwide
    5. Introduction: Disaster-related Events
       • Japan Earthquake & Tsunami
       • Southeast Asia Floods (crocodile alerts)
       • Haiti Earthquake
    6. Social Media’s Influence: 2006, 2011
    7. Social Media’s Influence
       • Social media played a phenomenal role in organizing these events
       • Citizen journalism at its best
    8. Goals of the Research
       • We study how social media can be leveraged to analyze
         – Events and their characteristics
         – Coverage differences from mainstream media
         – Socio-demographic, socio-technical behavioral patterns
         – and explore further implications of the research
    9. Challenges
       • Identifying the right social media sources
       • Language barrier
       • Colloquial usage, misspellings, sparse links
       • Extracting relevant information from the sources
         – Entity extraction and resolution
       • Evaluation due to lack of benchmark datasets.
    10. Challenges
    11. Proposed Methodology
        • Identifying the right social media sources
        Specificity (κ) of a source ‘S’ for an event ‘E’:
        $$IG(E, S) = H(E) - H(E \mid S) = \sum_{e \in E} p(e) \log \frac{1}{p(e)} - \sum_{e \in E,\, s \in S} p(e, s) \log \frac{p(s)}{p(e, s)}$$
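The information-gain expression on this slide can be checked numerically. The following is a minimal sketch, assuming that `e` and `s` range over the outcomes of two discrete random variables and that their joint distribution is available as a dictionary. The function names and the toy probabilities are illustrative assumptions, not taken from the slides.

```python
import math


def entropy(p_e):
    """H(E) = sum_e p(e) * log2(1 / p(e)), skipping zero-probability outcomes."""
    return sum(p * math.log2(1.0 / p) for p in p_e.values() if p > 0)


def conditional_entropy(p_joint):
    """H(E | S) = sum_{e,s} p(e, s) * log2(p(s) / p(e, s))."""
    p_s = {}
    for (_, s), p in p_joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
    return sum(p * math.log2(p_s[s] / p) for (_, s), p in p_joint.items() if p > 0)


def information_gain(p_joint):
    """IG(E, S) = H(E) - H(E | S), computed from the joint distribution p(e, s)."""
    p_e = {}
    for (e, _), p in p_joint.items():
        p_e[e] = p_e.get(e, 0.0) + p
    return entropy(p_e) - conditional_entropy(p_joint)


# Toy joint distributions (made-up numbers, for illustration only).
independent = {("e1", "s1"): 0.25, ("e1", "s2"): 0.25,
               ("e2", "s1"): 0.25, ("e2", "s2"): 0.25}
dependent = {("e1", "s1"): 0.5, ("e2", "s2"): 0.5}

ig_independent = information_gain(independent)  # S tells us nothing about E
ig_dependent = information_gain(dependent)      # S determines E completely
```

In the first toy case the source carries no information about the event and the gain is zero; in the second the source fully determines the event and the gain equals H(E), which is 1 bit here.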
    12. Proposed Methodology
        • Identifying the right social media sources
        Closeness (τ) of a term/entity ‘e’ to an event ‘E’:
        $$\tau = P(e, E) = P(E)\,P(e \mid E), \qquad P(e \mid E) = \text{ef-iEf} = ef(e, E) \times iEf(e)$$
        • Creating event dictionaries
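The slide notes describe ef-iEf as a generalization of tf-idf, but the slides do not spell out how `ef` and `iEf` are computed. The sketch below therefore borrows the usual tf-idf form as an assumption: `ef(e, E)` is taken as the relative frequency of entity `e` among the mentions for event `E`, and `iEf(e)` as the log of the number of events divided by the number of events that mention `e`. The function names and the toy mentions are hypothetical.

```python
import math
from collections import Counter


def ef_ief(event_entities):
    """Score entities per event with a tf-idf-style ef * iEf product.

    event_entities maps an event name to the list of entity mentions
    extracted for that event. Returns {event: {entity: score}}.
    """
    n_events = len(event_entities)
    # Number of events in which each entity appears at least once.
    event_count = Counter()
    for mentions in event_entities.values():
        event_count.update(set(mentions))

    scores = {}
    for event, mentions in event_entities.items():
        counts = Counter(mentions)
        total = len(mentions)
        scores[event] = {
            entity: (count / total) * math.log(n_events / event_count[entity])
            for entity, count in counts.items()
        }
    return scores


def top_entities(scores, event, k=5):
    """Return the k highest-scoring entities for an event (ties broken by name)."""
    ranked = sorted(scores[event].items(), key=lambda item: (-item[1], item[0]))
    return [entity for entity, _ in ranked[:k]]


# Toy mentions (made-up counts, for illustration only).
mentions = {
    "egypt": ["Tahrir Square", "Tahrir Square", "Twitter", "Alexandria"],
    "libya": ["Tripoli", "Tripoli", "Twitter", "Chad"],
    "tunisia": ["Kasbah Square", "Twitter", "RCD"],
}
scores = ef_ief(mentions)
egypt_top = top_entities(scores, "egypt", k=1)
```

Under these assumptions an entity that occurs in every event, such as "Twitter" in the toy data, scores zero for each individual event, while an entity concentrated in one event rises to the top of that event's ordering.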
    13. Construction of Event Dictionaries
        • Reference point to construct event vocabulary
        • Independent of the sources
        • Globalvoicesonline.org
        • Extract entities from global voices online source
        • Use closeness measure to order the entities based on relevance to the event
          – Event-specific dictionary
          – Event category-specific dictionary
        Top 5 entities in the event-specific and event category-specific dictionaries:
        Dictionary | Top entities
        Egyptian revolution specific dictionary | Tahrir Square, Egyptian government, Gigi Ibrahim, Alexandria, Wael Abbas, …
        Libyan revolution specific dictionary | Tripoli, Muammar Al Gaddafi, North Atlantic Treaty Organization, Chad, United Kingdom, …
        Tunisian revolution specific dictionary | Tunisian government, Lin Ben Mhenni, Samir Feriani, Kasbah Square, RCD, …
        Socio-political (global) event dictionary | Twitter, Iranian Government, Tear gas devices, Facebook, Big Social network, …
    14. Data collection
        • Collected using Google Blog Search
        • From blogspot.com
        Event | Query Term | Number of Blogs | Dates
        Egyptian Revolution | “egyptian revolution” OR “egypt protest” | 579 | 25th January, 2011 – 7th December, 2011
        Libyan Revolution | “libyan revolution” OR “libya protest” | 600 | 15th February, 2011 – 7th December, 2011
        Tunisian Revolution | “tunisian revolution” OR “tunisia protest” | 484 | 17th December, 2010 – 7th December, 2011
    15. Data Description
        Attributes are grouped as blogger specific, blog post specific, and blog specific. Attributes listed: URL (in each group), Timestamp, Work information, Blogging tags, Text, Gender, Outlinks, Blogs followed, Topic Category, Blogs owned, Language
    16. Source-Entity Distribution: Egyptian Revolution
    17. Source-Entity Distribution: Libyan Revolution, Tunisian Revolution
    18. Validation: Egyptian Revolution
    19. Validation: Libyan Revolution, Tunisian Revolution
    20. Rank Comparison
        Blog Post URL | Specificity-based Ranking | Google Search Engine Ranking
        http://chinamatters.blogspot.com/2011/03/counterpunch-on-egyptian-revolution.html | 1 | 59
        http://happyarabnewsservice.blogspot.com/2011/02/orange-county-womans-role-in-egyptian.html | 2 | 286
        http://travel-and-immigration101.blogspot.com/2011/03/travel-news-egypt-tourism-revival.html | 3 | 400
        http://travel-and-immigration101.blogspot.com/2011/03/travel-news-egypt-tourism-revival.html | 4 | 277
        http://geniusofinsanityworld.blogspot.com/2011/02/egyptian-revolution-vs-iraqi-regime.html | 5 | 55
        http://jewssansfrontieres.blogspot.com/2011/03/egyptian-revolution-and-palestine.html | 6 | 202
        http://uprootedpalestinians.blogspot.com/2011/01/live-from-egyptian-revolution.html | 7 | 6
        http://mespectator.blogspot.com/2011/10/egyptian-revolution-between-citizens.html | 8 | 9
        http://yourheartsontheleft.blogspot.com/2011/06/egyptian-revolution-phase-one.html | 9 | 313
        http://sohabayoumi.blogspot.com/2011/02/egyptian-revolution-tuesday-february-1.html | 10 | 374
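One common way to quantify how far two rankings agree is Spearman's rank correlation. The slides do not say which statistic, if any, the authors used, so the following is only a sketch of one possible comparison. It uses the ten rank pairs listed on this slide and assumes there are no tied values; the helper names are illustrative.

```python
def ranks(values):
    """Return rank positions (1 = smallest value); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Rank pairs as listed on the slide for the Egyptian revolution posts.
specificity_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
google_rank = [59, 286, 400, 277, 55, 202, 6, 9, 313, 374]

rho = spearman_rho(specificity_rank, google_rank)
print(f"Spearman rho = {rho:.4f}")
```

For these ten posts rho comes out very close to zero (about -0.006), meaning the two orderings are essentially unrelated within this sample. That is in line with the later conclusion that popular sources may not be specific, although ten rows are too few to support a general claim.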
    21. Scatter plots: Tunisian revolution, Egyptian revolution, Libyan revolution
    22. Further Analysis: Source Specificity vs. Location (All Sources)
    23. Further Analysis: Source Specificity vs. Location (Sources localized to Egypt)
    24. Conclusions
        • Relevance of social media in various events
        • Methodology to analyze events via social media
        • Associated challenges
        • Proposed measures to identify specific sources with respect to atomic information units/entities
        • Evaluation framework
        • Popular sources may not be specific
        • Localized sources tend to be more specific
        • Expand the dataset, include more and various types of events
        • Use as apparatus to analyze social movements, collective actions, marketing research, etc.
    25. Thank You
    26. Observation
        • Socio-demographic
          – Location
          – Age
          – Gender
          – Profession (occupation, industry)
          – etc.
        • Socio-technical
          – Links
          – Devices
          – Other social media profiles
        • Network of bloggers from the extracted data
    27. Specificity
        $$\kappa = IG(E_i, S_k) = H(E_i) - H(E_i, S_k)$$
        $$\kappa = -H(E_i, S_k) = \frac{\sum_{i=1}^{n} f_i \tau_i}{\sum_{i=1}^{n} f_i}$$
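The second expression on this slide appears to be a ratio of two sums over i = 1..n, which reads as a weighted average of closeness values. The sketch below follows that reading under two assumptions the slide does not state: that `f_i` is how often entity i occurs in the source and that `tau_i` is the closeness of entity i to the event. The function name and the numbers are illustrative.

```python
def specificity(frequencies, closeness):
    """Frequency-weighted mean closeness: sum(f_i * tau_i) / sum(f_i)."""
    if len(frequencies) != len(closeness):
        raise ValueError("frequencies and closeness must have the same length")
    total = sum(frequencies)
    if total == 0:
        return 0.0  # a source with no extracted entities gets zero specificity
    return sum(f * t for f, t in zip(frequencies, closeness)) / total


# Source A: few mentions, but of entities close to the event (made-up values).
kappa_a = specificity([3, 2], [0.9, 0.8])

# Source B: many mentions of a distant entity, one mention of a close one.
kappa_b = specificity([50, 1], [0.1, 0.9])
```

With these toy values source A scores 0.86 and source B scores roughly 0.12, so a source that mentions many loosely related entities does not outrank one that mentions a few closely related ones. This matches the remark in the notes that the approach looks at the quality of entities and not just their quantity.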
    28. Blog Post URL | Our Ranking | Google Search Engine Ranking
        http://egyptianchronicles.blogspot.com/2011/08/libyan-revolution-mermaid-is-liberated.html | 1 | 13
        http://myblog-angeln.blogspot.com/2011/02/citys-old-market-square-taken-over-by.html | 2 | 329
        http://shadowlight9.blogspot.com/2011/10/revolution-and-democracy-xv-libyan_23.html | 3 | 9
        http://libyanconflict.blogspot.com/2011/03/were-libyan-rebels-in-zawiyah-defeated.html | 4 | 194
        http://egyptianchronicles.blogspot.com/2011/08/libyan-revolution-they-are-just-like.html | 5 | 24
        http://realworldrants.blogspot.com/2011/03/middle-eastafrican-revolutionprotest_21.html | 6 | 311
        http://usahmadawang.blogspot.com/2011/03/libyas-first-lady-owns-20-tons-of-gold.html | 7 | 364
        http://sincerelyours1.blogspot.com/2011/03/meddling-in-libya.html | 8 | 204
        http://redrave.blogspot.com/2011/08/after-collapse-of-gaddafi-regime-where.html | 9 | 374
        http://simonlaub.blogspot.com/2011/02/libyan-revolution-february-2011-youtube.html | 10 | 184
    29. Blog Post URL | Our Ranking | Google Search Engine Ranking
        http://mideasti.blogspot.com/2011/01/president-pm-quit-rcd-does-it-mean.html | 1 | 162
        http://israelmatzav.blogspot.com/2011/01/surprise-plo-walks-back-support-of.html | 2 | 40
        http://boienwitkowski.blogspot.com/2011/01/part-14.html | 3 | 420
        http://machonneuse.blogspot.com/ | 4 | 459
        http://thetunisianrevolution.blogspot.com/2011/02/situation-is-quite-fluid.html | 5 | 72
        http://harakaproject.blogspot.com/2011/04/force-majeure-how-can-we-curate.html | 6 | 181
        http://youchefayla.blogspot.com/2011/10/1789-reloaded.html | 7 | 152
        http://rose4hillary.blogspot.com/2011/02/hillary-clintonwheels-up-for-geneva.html | 8 | 440
        http://ibloga.blogspot.com/2011/01/qaradawi-hails-tunisian-revolution-says.html | 9 | 99
        http://hassanposts.blogspot.com/2011_01_01_archive.html | 10 | 174