BIG Data, Social Data: Targeted Harnessing of Transient Micro-Blogging Data

by Sreejata Chatterjee,
Social Media Lab, Dalhousie University, Halifax, Canada

Presentation Transcript

Sreejata Chatterjee (sreejata@cs.dal.ca)
Faculty of Computer Science, Dalhousie University, Halifax, Canada

Introduction

There are huge amounts of real-time social media data being created every moment. For example, ~230 million tweets are posted daily by Twitter’s 200 million users [1]. If harnessed, these data can provide a great wealth of insight into what people are thinking about and what they like or dislike. For instance, Twitter data has already proven to be useful in a number of different contexts, from monitoring elections [2] to predicting stock market trends [3] to conducting brand monitoring and PR campaigns [4].

However, social media data tend to be noisy and ephemeral. Furthermore, social media companies often limit the amount of data one can access automatically at any point of time, making this rich source of transient data difficult to collect.

Research Objectives

This work focuses on designing and developing automated methods and a web-based infrastructure that can help other researchers and developers to collect and process raw social media data by:

(1) Creating a Data Collector and Repository Tool for collecting and storing public Twitter data for a specified group of online users in an effective and efficient manner,
(2) Connecting open APIs via Web Services which process Twitter data to add value and richness to the Twitter data in our database, such as geo-coding or assigning “influence” scores to Tweeters,
(3) Creating an NLP (Natural Language Processing) Module that can conduct sentiment analysis on social media data,
(4) Providing a robust API that other developers can use to create and test innovative web applications with the data collected.

System Architecture for Handling Social Media Data

[Figure: system architecture diagram]

Sample API Calls

  • getAllTweets - Returns all the tweets by all the users
  • getUserTweets - Returns tweets posted by a specified user
  • getTimedUserTweets - Returns tweets within a time interval
  • getUserProfilePicUrl - Returns user’s profile picture
  • getUserDetails - Returns detailed user information
  • getUserTimeLineInfo - Returns basic user information

API calls are made via HTTP requests (see below). The output is formatted in JSON (JavaScript Object Notation).

1) Gets all tweets that have been posted between Feb 14 - April 14, 2012, by all of the users who follow “asist2011” and “asist_org”:

http://URL_BASE/tweetApiCalls.php?call=getAllTweets&seedUserList=asist2011,asist_org&startTime=2012-02-14&endTime=2012-04-14

2) Returns details about dalprof’s profile such as profile info, followers, friends, Klout score (influence score), and geocoded location (for easy and universal location identification):

http://URL_BASE/tweetApiCalls.php?call=getUserDetails&user=dalprof

Case Studies #1: AcademiaMap.com

The API developed as part of this project is currently being used in a few different applications for a system called AcademiaMap, an Online Influence Assessment App designed for scholars.

AcademiaMap-Dashboard App
AcademiaMap helps scholars to filter the “noise” from their Twitter streams using various "influence" metrics and provides them with an easy way to identify trending topics and interesting voices to follow on Twitter.
(Lead developer: Melissa Anez)

AcademiaMap-GeoVisualizer App
A Geo-based Visualization system that displays communication connections between scholarly users of Twitter from across the globe.
(Lead developer: Jamiur Rahman)

AcademiaMap - Twitter App
A Twitter app that automatically posts tweets about trending topics and re-posts tweets that are popular within a group of scholarly Twitter users.
(Lead developer: Sreejata Chatterjee)

Case Studies #2: Netlytic.org

As a proof of concept, the new NLP Module, based on the Natural Language ToolKit (NLTK), has been added to an existing web tool called Netlytic, giving it the ability to provide sentiment analysis. Netlytic is a system for automated discovery, analysis and visualization of information about online communities, being developed by Dr. Gruzd at the Dalhousie University Social Media Lab.

Example 1: A Visual Representation of the Sentiment Analysis made possible by the new NLP Module now available in Netlytic
[Figure: Sentiment Analysis of >70K Tweets about #OccupyWallStreet]
Conclusion: Overall, tweets about the Occupy Wall Street movement were more positive than negative.

Example 2: Tag Cloud of Top 30 Topics derived from Positive (left) and Negative (right) Tweets about #OccupyWallStreet
[Figure: tag clouds]

Footnotes

[1] Mashable Social Media: http://mashable.com/2011/09/08/twitter-has-100-million-active
[2] Social Media Lab: http://socialmedialab.ca/?p=1952
[3] Wired.com: http://www.wired.com/wiredscience/2010/10/twitter-crystal-ball
[4] Radian6: Social Media Monitoring and Engagement, Social CRM

Acknowledgements

I would like to thank Dr. Anatoliy Gruzd, Director of the Social Media Lab, for supervising this research. Additionally, I would like to thank Philip Mai, Research Manager at the Social Media Lab, for his valuable feedback.

GRAND Projects:
  • DINS - Digital Infrastructures: Access and Use in the Network Society
  • NAVEL - Network Assessment and Validation for Effective Leadership
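
The sample API calls on this poster are plain HTTP GET requests whose query string carries a `call` name plus its parameters. As a rough illustration only, the Python sketch below shows how a client might assemble those two URLs. `URL_BASE` is the poster's own placeholder and the real host is not given; the helper names `build_call_url` and `fetch_json` are hypothetical and not part of the project's API. Because the poster states only that the output is JSON, the sketch makes no assumption about field names in the response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder host exactly as written on the poster; the real host is not given.
URL_BASE = "http://URL_BASE"
ENDPOINT = "tweetApiCalls.php"


def build_call_url(call, **params):
    """Return the GET URL for one API call (hypothetical helper).

    `call` is one of the call names listed on the poster, e.g. "getUserDetails".
    Commas are left unescaped so a user list reads as it does on the poster.
    """
    query = urlencode({"call": call, **params}, safe=",")
    return f"{URL_BASE}/{ENDPOINT}?{query}"


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the JSON body (hypothetical helper).

    Not invoked here, because URL_BASE is only a placeholder.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Sample call 1: tweets from followers of two seed accounts within a date range.
all_tweets_url = build_call_url(
    "getAllTweets",
    seedUserList="asist2011,asist_org",
    startTime="2012-02-14",
    endTime="2012-04-14",
)

# Sample call 2: profile details for a single user.
user_details_url = build_call_url("getUserDetails", user="dalprof")

print(all_tweets_url)
print(user_details_url)
```

Building the query string with `urlencode` rather than by string concatenation keeps parameter values correctly escaped if a user name or date format ever contains reserved characters.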