SlideShare a Scribd company logo
CSE509: Introduction to Web Science and Technology Lecture 5: Social Network Analysis ArjumandYounus Web Science Research Group Institute of Business Administration (IBA)
Last Time… Web Data Explosion Part I MapReduce Basics MapReduce Example and Details MapReduce Case-Study: Web Crawler based on MapReduce Architecture Part II Large-Scale File Systems Google File System Case-Study August 06, 2011
Today Transition from Web 1.0 to Web 2.0 Social Media Characteristics Part I: Theoretical Aspects Social Networks as a Graph Properties of Social Networks Part II: Getting Hands-On Experience on Social Media Analytics Twitter Data Hacks Part III: Example Researches August 06, 2011
Quick Survey Do you have a Facebook, MySpace, Twitter, or LinkedIn account? Do you own a blog? Do you read blogs? Have you ever searched for something on Wikipedia? Have you ever submitted content to a social network? August 06, 2011
Web 1.0 vs. Web 2.0 August 06, 2011 Borrowed from SIGKDD 2008 tutorial slides of Professor Huan Liu and Professor Nitin Agarwal with permission
What is so Different about Web 2.0? User Generated Content Collaborative Environment: Participatory Web, Citizen Journalism User is the Driving Factor August 06, 2011 A Paradigm Shift rather than a Technology Shift
Top 20 Most Visited Web Sites Internet traffic report by Alexa on July 29th 2008 August 06, 2011 Borrowed from SIGKDD 2008 tutorial slides of Professor Huan Liu and Professor Nitin Agarwal with permission
Various forms of Social Media Blog: Wordpress, blogspot, LiveJournal Forum: Yahoo! Answers,  Epinions Media Sharing: Flickr, YouTube, Scribd Microblogging: Twitter, FourSquare Social Networking: Facebook, LinkedIn, Orkut Social Bookmarking: Del.icio.us, Diigo Wikis: Wikipedia, scholarpedia, AskDrWiki August 06, 2011
Characteristics of Social Media “Consumers” become “Producers” Rich User Interaction User-Generated Contents Collaborative environment Collective Wisdom Long Tail Broadcast Media Filter, then Publish Social Media Publish, then Filter August 06, 2011
August 06, 2011
PART I: Theoretical Aspects August 06, 2011
Networks and Representation Social Network:  A social structure made of nodes (individuals or organizations) and edges that connect nodes in various  relationships like friendship, kinship etc.  August 06, 2011 ,[object Object]
Matrix Representation,[object Object]
Scale-Free Distributions Degree distribution in large-scale networks often follows a power law.  A.k.a. long tail distribution, scale-free distribution August 06, 2011 Degrees        Nodes
Small-World Effect “Six Degrees of Separation” A famous experiment conducted by Travers and Milgram (1969) Subjects were asked to send a chain letter to his acquaintance in order to reach a target person  The average path length is around 5.5 Verified on a planetary-scale IM network of 180 million users (Leskovec and Horvitz 2008)  The average path length is 6.6 August 06, 2011
Small World Facebook Experiment by Yahoo! Labs Anyone in the world can get a message to anyone else in just "six degrees of separation" by passing it from friend to friend. Sociologists have tried to prove (or disprove) this claim for decades, but it is still unresolved. http://smallworld.sandbox.yahoo.com/ August 06, 2011
Community Structure Community: People in a group interact with each other more frequently than those outside the group ki = number of edges among node Ni’s neighbors Friends of a friend are likely to be friends as well Measured by clustering coefficient:  Density of connections among one’s friends August 06, 2011
Clustering Coefficient August 06, 2011 ,[object Object]
k6=4 as e(4,5), e(5,7), e(5,8), e(7,8)
C6 = 4/(4*3/2) = 2/3
Average clustering coefficientC = (C1 + C2 + … + Cn)/n ,[object Object]
In a random graph, the expected coefficient is  14/(9*8/2) = 0.19.,[object Object]
Social Computing Tasks Social Computing: a young and vibrant field Conferences: KDD, WSDM, WWW, ICML, AAAI/IJCAI, SocialCom, etc. Tasks Centrality Analysis and Influence Modeling Community Detection Classification and Recommendation Privacy, Spam and Security August 06, 2011
Centrality Analysis and Influence Modeling Centrality Analysis:  Identify the most important actors or edges E.g. PageRank in Google Various other criteria Influence modeling:  How is information diffused?  How does one influence each other?   Related Problems Viral marketing: word-of-mouth effect Influence maximization August 06, 2011
Community Detection A community is a set of nodes between which the interactions are (relatively) frequent A.k.a.,  group, cluster, cohesive subgroups, modules  Applications: Recommendation based communities, Network Compression, Visualization of a huge network  New lines of research in social media Community Detection in Heterogeneous Networks Community Evolution  in Dynamic Networks Scalable Community Detection in Large-Scale Networks August 06, 2011
Classification and Recommendation Common in social media applications Tag suggestion, Product/Friend/Group Recommendation August 06, 2011 Link prediction Network-Based Classification
Privacy, Spam and Security Privacy is a big concern in social media Facebook, Google buzz often appear in debates about privacy NetFlix Prize Sequel cancelled due to privacy concern Simple anonymization does not necessarily protect privacy Spam blog (splog), spam comments, fake identity, etc., all requires new techniques As private information is involved, a secure and trustable system is critical  Need to achieve a balance between sharing and privacy August 06, 2011
PART II: Practical SNA with Twittersphere Mining August 06, 2011
Pre-Requisites Expectation that Python is installed and you have some hands-on experience with it Dependencies easy_install networkx twitter (Twitter API for Python) For Windows users Install ActivePython: comes bundled with easy_install easy_installnetworkx easy_install twitter For Linux users sh setuptools-0.6c11-py2.6.egg sudoeasy_installnetworkx sudoeasy_install twitter August 06, 2011
Getting Tweets from Twitter Search API import twitter import json twitter_search=twitter.Twitter(domain="search.twitter.com") search_results=[] for page in range(1,6): search_results.append(twitter_search.search(q="pakistan",rpp=100,page=page)) print json.dumps(search_results, sort_keys=True, indent=1) tweets=[r['text'] for result in search_results for r in result['results']] print tweets August 06, 2011
Lexical Diversity for Tweets words=[] for t in tweets:     words+= [w for w in t.split()] lexical_diversity=1.0*len(set(words))/len(words) August 06, 2011
What People are Tweeting: Frequency Analysis freq_dist=nltk.FreqDist(words) freq_dist.keys()[:50] freq_dist.keys()[-50:] August 06, 2011
Extracting Relationships from Tweets (1/3) Step 1: Extracting Graph Data import networkx as nx import re g=nx.DiGraph() twitter_search=twitter.Twitter(domain="search.twitter.com") search_results=[] for page in range(1,6): search_results.append(twitter_search.search(q="pakistan",rpp=100,page=page)) all_tweets=[tweet for page in search_results for tweet in page["results"]] def get_rt_sources(tweet): rt_patterns=re.compile(r"(RT|via)((?:*@+)+)",re.IGNORECASE) return [source.strip() for tuple in rt_patterns.findall(tweet) for source in tuple if source not in ("RT", "via")] for tweet in all_tweets: rt_sources=get_rt_sources(tweet["text"]) if not rt_sources: continue for rt_source in rt_sources: g.add_edge(rt_source,tweet["from_user"],{"tweet_id":tweet["id"]}) August 06, 2011
Extracting Relationships from Tweets (2/3) Step 2: Generating DOT File OUT = "pakistan_search_results.dot“ dot=['"%s" -> "%s" [tweet_id=%s]' % (n1.encode('utf-8'), n2.encode('utf-8'), g[n1][n2]['tweet_id']) for n1, n2 in g.edges()] f=open(OUT, 'w') f.write('strict digraph {%s}' % (';'.join(dot),)) f.close() August 06, 2011
Extracting Relationships from Tweets (3/3) Step 3: Visualizing the Retweet Data in Graphical Form For Windows users For Linux users circo -Tpng -Osnl_search_results pakistan_search_results.dot August 06, 2011
PART III: Example Researches August 06, 2011
Million Follower Fallacy (New York Times) August 06, 2011
Twitter: More a News Medium than a Social Network (PC World) August 06, 2011
Twitter for World Peace (Business Week) August 06, 2011
SocialFlow: Social Media Optimization Social Media Optimization Platform Works in Domains of Viral and Word-of-Mouth Marketing Provides Services to Major Media Outlets  Recent study How different audiences consumed and rebroadcast messages news organizations were sending out: AlJazeera English, BBC News,  CNN, The Economist, Fox News and New York Times August 06, 2011
August 06, 2011 Twitter as a Real-Time News Analysis Service
Studying Ins and Outs of News Using Twitter to study hot news items people are heavily tweeting about August 06, 2011
Algorithm for Identification of Popular News August 06, 2011

More Related Content

What's hot

2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL
Marc Smith
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
Giorgos Cheliotis
 
New Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives MeetupNew Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives Meetup
Tatyana Kanzaveli
 
The Basics of Social Network Analysis
The Basics of Social Network AnalysisThe Basics of Social Network Analysis
The Basics of Social Network Analysis
Rory Sie
 
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
Marc Smith
 
Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...
Christoph Trattner
 
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Lauri Eloranta
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
Toronto Metropolitan University
 
Community detection in complex social networks
Community detection in complex social networksCommunity detection in complex social networks
Community detection in complex social networks
Aboul Ella Hassanien
 
Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...
ACMBangalore
 
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slidesMining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Michael Mathioudakis
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
Caleb Jones
 
Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020
Michael Mathioudakis
 
00 Introduction to SN&H: Key Concepts and Overview
00 Introduction to SN&H: Key Concepts and Overview00 Introduction to SN&H: Key Concepts and Overview
00 Introduction to SN&H: Key Concepts and Overview
Duke Network Analysis Center
 
Prof. Hendrik Speck - Social Network Analysis
Prof. Hendrik Speck - Social Network AnalysisProf. Hendrik Speck - Social Network Analysis
Prof. Hendrik Speck - Social Network Analysis
Hendrik Speck
 
2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow
Marc Smith
 
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBO
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBOA COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBO
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBO
ijaia
 
Conversation graphs in Online Social Media
Conversation graphs in Online Social MediaConversation graphs in Online Social Media
Conversation graphs in Online Social Media
Marco Brambilla
 
KASW'08 - Invited Talk
KASW'08 - Invited TalkKASW'08 - Invited Talk
KASW'08 - Invited Talk
Ralf Klamma
 
Social Network Visualization 101
Social Network Visualization 101Social Network Visualization 101
Social Network Visualization 101
librarianrafia
 

What's hot (20)

2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
 
New Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives MeetupNew Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives Meetup
 
The Basics of Social Network Analysis
The Basics of Social Network AnalysisThe Basics of Social Network Analysis
The Basics of Social Network Analysis
 
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
 
Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...
 
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 
Community detection in complex social networks
Community detection in complex social networksCommunity detection in complex social networks
Community detection in complex social networks
 
Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...
 
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slidesMining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020Mining the Social Web - Lecture 2 - T61.6020
Mining the Social Web - Lecture 2 - T61.6020
 
00 Introduction to SN&H: Key Concepts and Overview
00 Introduction to SN&H: Key Concepts and Overview00 Introduction to SN&H: Key Concepts and Overview
00 Introduction to SN&H: Key Concepts and Overview
 
Prof. Hendrik Speck - Social Network Analysis
Prof. Hendrik Speck - Social Network AnalysisProf. Hendrik Speck - Social Network Analysis
Prof. Hendrik Speck - Social Network Analysis
 
2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow
 
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBO
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBOA COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBO
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBO
 
Conversation graphs in Online Social Media
Conversation graphs in Online Social MediaConversation graphs in Online Social Media
Conversation graphs in Online Social Media
 
KASW'08 - Invited Talk
KASW'08 - Invited TalkKASW'08 - Invited Talk
KASW'08 - Invited Talk
 
Social Network Visualization 101
Social Network Visualization 101Social Network Visualization 101
Social Network Visualization 101
 

Similar to CSE509 Lecture 5

2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis
Marc Smith
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdf
BalasundaramSr
 
Q046049397
Q046049397Q046049397
Q046049397
IJERA Editor
 
20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …
Marc Smith
 
Requirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docx
Requirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docxRequirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docx
Requirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docx
heunice
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social media
rangesharp
 
Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...
Shalin Hai-Jew
 
Scaling Community Information Systems
Scaling Community Information SystemsScaling Community Information Systems
Scaling Community Information Systems
Ralf Klamma
 
GIJC19 - NodeXL Tutorial - Session 1
GIJC19 - NodeXL Tutorial - Session 1GIJC19 - NodeXL Tutorial - Session 1
GIJC19 - NodeXL Tutorial - Session 1
Harald Meier
 
Social Software and Community Information Systems
Social Software and Community Information SystemsSocial Software and Community Information Systems
Social Software and Community Information Systems
Ralf Klamma
 
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIBig Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Ruchika Sharma
 
Cloudengine at SEDA 2011
Cloudengine at SEDA 2011Cloudengine at SEDA 2011
On managing social data for enabling socially-aware applications and services
On managing social data for enabling socially-aware applications and servicesOn managing social data for enabling socially-aware applications and services
On managing social data for enabling socially-aware applications and services
Nicolas Kourtellis
 
Understanding User-Community Engagement by Multi-faceted Features: A Case ...
Understanding User-Community Engagement by Multi-faceted Features: A Case ...Understanding User-Community Engagement by Multi-faceted Features: A Case ...
Understanding User-Community Engagement by Multi-faceted Features: A Case ...
Artificial Intelligence Institute at UofSC
 
Self-modeling and self-reflection of E-learning communities
Self-modeling and self-reflection of E-learning communitiesSelf-modeling and self-reflection of E-learning communities
Self-modeling and self-reflection of E-learning communities
Zina Petrushyna
 
Notes on mining social media updated
Notes on mining social media updatedNotes on mining social media updated
Notes on mining social media updated
Gary Myers KMb Unit, York University
 
Embracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital LibrariesEmbracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital Libraries
Akhmad Riza Faizal
 
Ongoing activities october 15th 2010
Ongoing activities october 15th 2010Ongoing activities october 15th 2010
Ongoing activities october 15th 2010
ictseserv
 
Analyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsAnalyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogs
Stefan Sommer
 
Overview of Social Networks
Overview of Social NetworksOverview of Social Networks
Overview of Social Networks
Paolo Nesi
 

Similar to CSE509 Lecture 5 (20)

2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdf
 
Q046049397
Q046049397Q046049397
Q046049397
 
20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …
 
Requirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docx
Requirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docxRequirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docx
Requirements1) Due 6 Oct2) APA 6th Ed format3) 6 Pages.docx
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social media
 
Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...
 
Scaling Community Information Systems
Scaling Community Information SystemsScaling Community Information Systems
Scaling Community Information Systems
 
GIJC19 - NodeXL Tutorial - Session 1
GIJC19 - NodeXL Tutorial - Session 1GIJC19 - NodeXL Tutorial - Session 1
GIJC19 - NodeXL Tutorial - Session 1
 
Social Software and Community Information Systems
Social Software and Community Information SystemsSocial Software and Community Information Systems
Social Software and Community Information Systems
 
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIBig Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
 
Cloudengine at SEDA 2011
Cloudengine at SEDA 2011Cloudengine at SEDA 2011
Cloudengine at SEDA 2011
 
On managing social data for enabling socially-aware applications and services
On managing social data for enabling socially-aware applications and servicesOn managing social data for enabling socially-aware applications and services
On managing social data for enabling socially-aware applications and services
 
Understanding User-Community Engagement by Multi-faceted Features: A Case ...
Understanding User-Community Engagement by Multi-faceted Features: A Case ...Understanding User-Community Engagement by Multi-faceted Features: A Case ...
Understanding User-Community Engagement by Multi-faceted Features: A Case ...
 
Self-modeling and self-reflection of E-learning communities
Self-modeling and self-reflection of E-learning communitiesSelf-modeling and self-reflection of E-learning communities
Self-modeling and self-reflection of E-learning communities
 
Notes on mining social media updated
Notes on mining social media updatedNotes on mining social media updated
Notes on mining social media updated
 
Embracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital LibrariesEmbracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital Libraries
 
Ongoing activities october 15th 2010
Ongoing activities october 15th 2010Ongoing activities october 15th 2010
Ongoing activities october 15th 2010
 
Analyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsAnalyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogs
 
Overview of Social Networks
Overview of Social NetworksOverview of Social Networks
Overview of Social Networks
 

More from Web Science Research Group at Institute of Business Administration, Karachi, Pakistan

ReThinking CS Curriculum for Pakistan
ReThinking CS Curriculum for PakistanReThinking CS Curriculum for Pakistan
Information Retrieval
Information RetrievalInformation Retrieval
Social Media Mining and Analytics
Social Media Mining and AnalyticsSocial Media Mining and Analytics
CSE509 Lecture 6
CSE509 Lecture 6CSE509 Lecture 6
CSE509 Lecture 4
CSE509 Lecture 4CSE509 Lecture 4
CSE509 Lecture 3
CSE509 Lecture 3CSE509 Lecture 3
CSE509 Lecture 2
CSE509 Lecture 2CSE509 Lecture 2
CSE509 Lecture 1
CSE509 Lecture 1CSE509 Lecture 1

More from Web Science Research Group at Institute of Business Administration, Karachi, Pakistan (8)

ReThinking CS Curriculum for Pakistan
ReThinking CS Curriculum for PakistanReThinking CS Curriculum for Pakistan
ReThinking CS Curriculum for Pakistan
 
Information Retrieval
Information RetrievalInformation Retrieval
Information Retrieval
 
Social Media Mining and Analytics
Social Media Mining and AnalyticsSocial Media Mining and Analytics
Social Media Mining and Analytics
 
CSE509 Lecture 6
CSE509 Lecture 6CSE509 Lecture 6
CSE509 Lecture 6
 
CSE509 Lecture 4
CSE509 Lecture 4CSE509 Lecture 4
CSE509 Lecture 4
 
CSE509 Lecture 3
CSE509 Lecture 3CSE509 Lecture 3
CSE509 Lecture 3
 
CSE509 Lecture 2
CSE509 Lecture 2CSE509 Lecture 2
CSE509 Lecture 2
 
CSE509 Lecture 1
CSE509 Lecture 1CSE509 Lecture 1
CSE509 Lecture 1
 

Recently uploaded

Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
jpupo2018
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 

Recently uploaded (20)

Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 

CSE509 Lecture 5

  • 1. CSE509: Introduction to Web Science and Technology Lecture 5: Social Network Analysis ArjumandYounus Web Science Research Group Institute of Business Administration (IBA)
  • 2. Last Time… Web Data Explosion Part I MapReduce Basics MapReduce Example and Details MapReduce Case-Study: Web Crawler based on MapReduce Architecture Part II Large-Scale File Systems Google File System Case-Study August 06, 2011
  • 3. Today Transition from Web 1.0 to Web 2.0 Social Media Characteristics Part I: Theoretical Aspects Social Networks as a Graph Properties of Social Networks Part II: Getting Hands-On Experience on Social Media Analytics Twitter Data Hacks Part III: Example Researches August 06, 2011
  • 4. Quick Survey Do you have a Facebook, MySpace, Twitter, or LinkedIn account? Do you own a blog? Do you read blogs? Have you ever searched for something on Wikipedia? Have you ever submitted content to a social network? August 06, 2011
  • 5. Web 1.0 vs. Web 2.0 August 06, 2011 Borrowed from SIGKDD 2008 tutorial slides of Professor Huan Liu and Professor Nitin Agarwal with permission
  • 6. What is so Different about Web 2.0? User Generated Content Collaborative Environment: Participatory Web, Citizen Journalism User is the Driving Factor August 06, 2011 A Paradigm Shift rather than a Technology Shift
  • 7. Top 20 Most Visited Web Sites Internet traffic report by Alexa on July 29th 2008 August 06, 2011 Borrowed from SIGKDD 2008 tutorial slides of Professor Huan Liu and Professor Nitin Agarwal with permission
  • 8. Various forms of Social Media Blog: Wordpress, blogspot, LiveJournal Forum: Yahoo! Answers, Epinions Media Sharing: Flickr, YouTube, Scribd Microblogging: Twitter, FourSquare Social Networking: Facebook, LinkedIn, Orkut Social Bookmarking: Del.icio.us, Diigo Wikis: Wikipedia, scholarpedia, AskDrWiki August 06, 2011
  • 9. Characteristics of Social Media “Consumers” become “Producers” Rich User Interaction User-Generated Contents Collaborative environment Collective Wisdom Long Tail Broadcast Media Filter, then Publish Social Media Publish, then Filter August 06, 2011
  • 11. PART I: Theoretical Aspects August 06, 2011
  • 12.
  • 13.
  • 14. Scale-Free Distributions Degree distribution in large-scale networks often follows a power law. A.k.a. long tail distribution, scale-free distribution August 06, 2011 Degrees Nodes
  • 15. Small-World Effect “Six Degrees of Separation” A famous experiment conducted by Travers and Milgram (1969) Subjects were asked to send a chain letter to his acquaintance in order to reach a target person The average path length is around 5.5 Verified on a planetary-scale IM network of 180 million users (Leskovec and Horvitz 2008) The average path length is 6.6 August 06, 2011
  • 16. Small World Facebook Experiment by Yahoo! Labs Anyone in the world can get a message to anyone else in just "six degrees of separation" by passing it from friend to friend. Sociologists have tried to prove (or disprove) this claim for decades, but it is still unresolved. http://smallworld.sandbox.yahoo.com/ August 06, 2011
  • 17. Community Structure Community: People in a group interact with each other more frequently than those outside the group ki = number of edges among node Ni’s neighbors Friends of a friend are likely to be friends as well Measured by clustering coefficient: Density of connections among one’s friends August 06, 2011
  • 18.
  • 19. k6=4 as e(4,5), e(5,7), e(5,8), e(7,8)
  • 21.
  • 22.
  • 23. Social Computing Tasks Social Computing: a young and vibrant field Conferences: KDD, WSDM, WWW, ICML, AAAI/IJCAI, SocialCom, etc. Tasks Centrality Analysis and Influence Modeling Community Detection Classification and Recommendation Privacy, Spam and Security August 06, 2011
  • 24. Centrality Analysis and Influence Modeling Centrality Analysis: Identify the most important actors or edges E.g. PageRank in Google Various other criteria Influence modeling: How is information diffused? How does one influence each other? Related Problems Viral marketing: word-of-mouth effect Influence maximization August 06, 2011
  • 25. Community Detection A community is a set of nodes between which the interactions are (relatively) frequent A.k.a., group, cluster, cohesive subgroups, modules Applications: Recommendation based communities, Network Compression, Visualization of a huge network New lines of research in social media Community Detection in Heterogeneous Networks Community Evolution in Dynamic Networks Scalable Community Detection in Large-Scale Networks August 06, 2011
  • 26. Classification and Recommendation Common in social media applications Tag suggestion, Product/Friend/Group Recommendation August 06, 2011 Link prediction Network-Based Classification
  • 27. Privacy, Spam and Security Privacy is a big concern in social media Facebook, Google buzz often appear in debates about privacy NetFlix Prize Sequel cancelled due to privacy concern Simple anonymization does not necessarily protect privacy Spam blog (splog), spam comments, fake identity, etc., all requires new techniques As private information is involved, a secure and trustable system is critical Need to achieve a balance between sharing and privacy August 06, 2011
  • 28. PART II: Practical SNA with Twittersphere Mining August 06, 2011
  • 29. Pre-Requisites Expectation that Python is installed and you have some hands-on experience with it Dependencies easy_install networkx twitter (Twitter API for Python) For Windows users Install ActivePython: comes bundled with easy_install easy_installnetworkx easy_install twitter For Linux users sh setuptools-0.6c11-py2.6.egg sudoeasy_installnetworkx sudoeasy_install twitter August 06, 2011
  • 30. Getting Tweets from Twitter Search API import twitter import json twitter_search=twitter.Twitter(domain="search.twitter.com") search_results=[] for page in range(1,6): search_results.append(twitter_search.search(q="pakistan",rpp=100,page=page)) print json.dumps(search_results, sort_keys=True, indent=1) tweets=[r['text'] for result in search_results for r in result['results']] print tweets August 06, 2011
  • 31. Lexical Diversity for Tweets words=[] for t in tweets: words+= [w for w in t.split()] lexical_diversity=1.0*len(set(words))/len(words) August 06, 2011
  • 32. What People are Tweeting: Frequency Analysis freq_dist=nltk.FreqDist(words) freq_dist.keys()[:50] freq_dist.keys()[-50:] August 06, 2011
  • 33. Extracting Relationships from Tweets (1/3) Step 1: Extracting Graph Data import networkx as nx import re g=nx.DiGraph() twitter_search=twitter.Twitter(domain="search.twitter.com") search_results=[] for page in range(1,6): search_results.append(twitter_search.search(q="pakistan",rpp=100,page=page)) all_tweets=[tweet for page in search_results for tweet in page["results"]] def get_rt_sources(tweet): rt_patterns=re.compile(r"(RT|via)((?:*@+)+)",re.IGNORECASE) return [source.strip() for tuple in rt_patterns.findall(tweet) for source in tuple if source not in ("RT", "via")] for tweet in all_tweets: rt_sources=get_rt_sources(tweet["text"]) if not rt_sources: continue for rt_source in rt_sources: g.add_edge(rt_source,tweet["from_user"],{"tweet_id":tweet["id"]}) August 06, 2011
  • 34. Extracting Relationships from Tweets (2/3) Step 2: Generating DOT File OUT = "pakistan_search_results.dot“ dot=['"%s" -> "%s" [tweet_id=%s]' % (n1.encode('utf-8'), n2.encode('utf-8'), g[n1][n2]['tweet_id']) for n1, n2 in g.edges()] f=open(OUT, 'w') f.write('strict digraph {%s}' % (';'.join(dot),)) f.close() August 06, 2011
  • 35. Extracting Relationships from Tweets (3/3) Step 3: Visualizing the Retweet Data in Graphical Form For Windows users For Linux users circo -Tpng -Osnl_search_results pakistan_search_results.dot August 06, 2011
  • 36. PART III: Example Researches August 06, 2011
  • 37. Million Follower Fallacy (New York Times) August 06, 2011
  • 38. Twitter: More a News Medium than a Social Network (PC World) August 06, 2011
  • 39. Twitter for World Peace (Business Week) August 06, 2011
  • 40. SocialFlow: Social Media Optimization Social Media Optimization Platform Works in Domains of Viral and Word-of-Mouth Marketing Provides Services to Major Media Outlets Recent study How different audiences consumed and rebroadcast messages news organizations were sending out: AlJazeera English, BBC News, CNN, The Economist, Fox News and New York Times August 06, 2011
  • 41. August 06, 2011 Twitter as a Real-Time News Analysis Service
  • 42. Studying Ins and Outs of News Using Twitter to study hot news items people are heavily tweeting about August 06, 2011
  • 43. Algorithm for Identification of Popular News August 06, 2011
  • 45. Observations (1/3) August 06, 2011 Percentage of news in tweets per day greater than 50% for all days except one day
  • 46. Observations (2/3) August 06, 2011 Highest Number of Recorded Tweets per Day

Editor's Notes

  1. The past decade has witnessed the emergence of participatory Web and social media, bringing peopletogether in many creative ways. Millions of users are playing, tagging, working, and socializingonline, demonstrating new forms of collaboration, communication, and intelligence that were hardlyimaginable just a short time ago. Social media also helps reshape business models, sway opinions andemotions, and opens up numerous possibilities to study human interaction and collective behavior inan unparalleled scale. This lecture, from a data mining perspective, introduces characteristics of socialmedia, reviews representative tasks of computing with social media, and illustrates associated challenges.
  2. In traditional media such as TV, radio, movies, and newspapers, it is only a small numberof “authorities” or “experts” who decide which information should be produced and how it is distributed.The majority of users are consumers who are separated from the production process. Thecommunication pattern in the traditional media is one-way traffic, from a centralized producer towidespread consumers.This new type of mass publication enables the production of timely news and grassrootsinformation and leads to mountains of user-generated contents, forming the wisdom of crowds
  3. Twitter: a directed graphFacebook: an undirected graphIn Twitter, for example, one user x follows another user y, but user y does not necessarily follow user x. In this case, the follower-followee network is directed and asymmetrical
  4. a linear relationship between the logarithms of the variables
  5. the number of connections between one’s friends over the total number of possible connections among them
  6. Previously: email communication networks, instant messaging networks, mobile call networks, friendshipNetworks. Other forms of complex networks, like coauthorship or citation networks, biological networks, metabolic pathways, genetic regulatory networks and food webThese large-scale networks combined with unique characteristics of social media present novelchallenges for mining social media.In reality, multiple relationships can exist between individuals. Two personscan be friends and colleagues at the same time. Thus, a variety of interactions exist betweenthe same set of actors in a network. Multiple types of entities can also be involved in onenetwork. For many social bookmarking and media sharing sites, users, tags and content areintertwined with each other, leading to heterogeneous entities in one network. Analysis ofthese heterogeneous networks involving heterogeneous entities or interactions requires newtheories and tools.Social media emphasizes timeliness. For example, in content sharing sites andblogosphere, people quickly lose their interest in most shared contents and blog posts. Thisdiffers fromclassical web mining.Newusers join in,newconnections establish between existingmembers, and senior users become dormant or simply leave.How can we capture the dynamicsof individuals in networks? Can we find the die-hard members that are the backbone ofcommunities? Can they determine the rise and fall of their communities?In social media, people tend to share their connections. The wisdomof crowds, in forms of tags, comments, reviews, and ratings, is often accessible. The metainformation, in conjunction with user interactions, might be useful for many applications.It remains a challenge to effectively employ social connectivity information and collectiveintelligence to build social computing applications.A research barrier concerning mining social media is evaluation. In traditionaldata mining, we are so used to the training-testing model of evaluation. It differs in socialmedia. Since many social media sites are required to protect user privacy information, limitedbenchmark data is available. Another frequently encountered problem is the lack of groundtruth for many social computing tasks, which further hinders some comparative study ofdifferent works.Without ground truth, how can we conduct fair comparison and evaluation?Slide 7-11