Social Data Mining
Mahesh J. Meniya
Akash M. Rangani
Data, Information, Knowledge(1)
Data

Facts and statistics collected together for reference or analysis.
The quantities, characters, or symbols on which operations
are performed by a computer, being stored and transmitted.
Information

The patterns, associations, or relationships among all this data
can provide information. For example, analysis of retail point
of sale transaction data can yield information on which
products are selling and when.
Data, Information, Knowledge(2)
Knowledge

Information can be converted into knowledge about historical
patterns and future trends. For example, summary
information on retail supermarket sales can be analyzed in
light of promotional efforts to provide knowledge of
consumer buying behavior. Thus, a manufacturer or retailer
could determine which items are most susceptible to
promotional efforts.
What is Data Mining?
Finding unknown, useful information in a large data set.

The overall goal of the data mining process is to extract
information from a data set and transform it into an
understandable structure for further use.
The process of collecting, searching through, and analyzing a
large amount of data in a database so as to discover patterns or
relationships.
What is Social Data Mining?
Social media is defined as a group of Internet-based
applications that build on the ideological and technological
foundations of Web 2.0 and that allow the creation and
exchange of user-generated content.
Vast amounts of user-generated content are created on social
media sites every day, e.g. Facebook, Twitter, Google+.
Social data mining is the systematic analysis of social media
to extract valuable information.
Social media data are largely user-generated content, which is
vast, noisy, distributed, unstructured, and dynamic.
Social Data Mining
Social Media Platform
Blogging
Microblogs
Community-based Question Answer (CQA)
Emails and Chat
Hybrid Applications
Wikis
Social news
Social bookmarking
Media sharing
Opinion, reviews, and ratings
Why Important?
The WWW is vast
People share more data
Advertising and marketing
Products are more customized
More devices produce more data
Market Research
Customer Experience
Brand Loyalties
Product development and design
Communication, Marketing
Structures in Social Media
Social structures represent social relationships between
community members. Accordingly, social applications are often
designed to systematically support these properties.
For example, in online forums, a useful
criterion provided by a social structure is whether or not a
member is an expert in a specific topic.
Types of Social Media Structure
Hierarchical Structure
Objects used in social data mining often possess a natural
hierarchical structure. For example, even a short document comprises a
number of sentences. Accordingly, hierarchical structures have been
frequently addressed in information representation.

Conversational Structure
We can identify conversational structures explicitly or implicitly
in most social platforms involving interactions between users. For
example, in emails and forums, conversational structures are formed by
replies.
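A conversational structure formed by replies can be sketched as a tree. The example below is a minimal illustration, assuming each message carries a hypothetical `parent_id` field that points to the message it replies to; the field names and sample data are not taken from any particular platform.

```python
from collections import defaultdict

# Hypothetical messages: (message_id, parent_id, author).
# parent_id is None for the message that starts a thread.
messages = [
    (1, None, "alice"),
    (2, 1, "bob"),
    (3, 1, "carol"),
    (4, 2, "alice"),
]

def build_reply_tree(messages):
    """Map each message id to the ids of its direct replies."""
    children = defaultdict(list)
    for message_id, parent_id, _author in messages:
        if parent_id is not None:
            children[parent_id].append(message_id)
    return dict(children)

def thread_depth(children, root):
    """Length of the longest reply chain starting at root (root alone counts as 1)."""
    replies = children.get(root, [])
    return 1 + max((thread_depth(children, reply) for reply in replies), default=0)

tree = build_reply_tree(messages)
print(tree)
print(thread_depth(tree, 1))
```

Once the tree is available, measures such as thread depth or the number of replies per author can be computed from it.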
Data Mining Techniques for Social Media
Graph Mining
Graphs (or networks) constitute a dominant data structure and
appear in essentially all forms of information. Examples include the
Web graph and social networks. Typically, communities correspond to
groups of nodes, where nodes within the same community (or cluster)
tend to be highly similar and share common features, while
nodes of different communities show low similarity.
Graph mining extracts useful knowledge (patterns, outliers, etc.) from
structured data that can be represented as a graph.
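The notion that nodes in the same community are highly similar can be illustrated with a small sketch. It assumes one possible similarity measure, the Jaccard overlap of two nodes' neighborhoods, and a made-up graph of two triangles joined by one edge; real community detection methods are more elaborate.

```python
def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def neighborhood_similarity(graph, u, v):
    """Jaccard similarity of the closed neighborhoods (node plus neighbors) of u and v."""
    neighbors_u = graph[u] | {u}
    neighbors_v = graph[v] | {v}
    return len(neighbors_u & neighbors_v) / len(neighbors_u | neighbors_v)

# Hypothetical graph: triangle a-b-c and triangle d-e-f, linked by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
graph = build_graph(edges)

print(neighborhood_similarity(graph, "a", "b"))  # same triangle
print(neighborhood_similarity(graph, "a", "e"))  # different triangles
```

Nodes inside one triangle score high and nodes in different triangles score low, which is the pattern a community detection algorithm exploits.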
Graph Mining usage
Google uses page rank as one of many predictors for the relevance of
a web page. The link structure in the world-wide-web network
provides valuable contextual information about which pages are
deemed most relevant by the web page creators—this contextual link
structure is then used to predict relevance for a user’s query.
Useful for understanding relationships as well as content (text, images).
A social media host looks at certain online groups and tries to predict
whether each group will flourish or disband.
Graph Mining usage cont.
A phone provider looks at cell phone call records to determine
whether an account is a result of identity theft.
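The page rank usage mentioned above relies on link structure alone. The following is a minimal sketch of the textbook PageRank power iteration, not Google's production system; the damping factor of 0.85, the iteration count, and the three-page link graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over a {page: [linked pages]} map."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical link structure of a tiny web site.
links = {"home": ["about", "blog"], "about": ["home"], "blog": ["home", "about"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Pages that receive links from many well-ranked pages end up with the highest scores, without any look at page content.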

Facebook Graph Search
Query examples
Searching people: “friends of friends who are single female in Rajkot”
Searching interests: “movies my friends like”, “TV shows my friends
like”, “Videos by TV shows liked by my friends”.
Searching places: “Restaurant in Rajkot liked by friends”
Sample query for Facebook Graph Search
Result of Facebook Graph Search
Text Mining
Text mining is an emerging technology that attempts to extract
meaningful information from unstructured textual data. Text mining
is an extension of data mining to textual data. Social networks contain
a lot of text in the nodes in various forms. For example, social
networks may contain links to posts, blogs or other news articles.
Usage of text mining (1)
Automatic processing of messages and emails
A common application of text mining is to aid in the automatic
classification of texts. For example, it is possible to automatically
"filter" out most undesirable "junk email" based on certain
terms or words that are not likely to appear in legitimate messages.

Investigating competitors by crawling their web sites
Another type of potentially very useful application is to
automatically process the contents of Web pages in a particular
domain. For example, you could go to a Web page, and begin
"crawling" the links you find there to process all Web pages that
are referenced.
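The crawling step described above starts by collecting the links on a page. The sketch below shows only that step, using Python's standard `html.parser`; the sample HTML and the `example.com` URLs are placeholders, and a real crawler would also fetch pages, respect robots.txt, and avoid revisiting URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the URL of the page.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

# Placeholder page content standing in for a downloaded Web page.
sample_html = (
    '<p><a href="/products">Products</a> '
    '<a href="https://example.org/news">News</a></p>'
)
links = extract_links(sample_html, "https://example.com/index.html")
print(links)
```

Each extracted link would then be queued for download, and the text of every fetched page passed on to the text mining stage.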
Usage of text mining (2)
Medical
Mining medical records to improve patient care

Security applications
Many text mining software packages are marketed for security
applications, especially monitoring and analysis of online plain
text sources such as Internet news, blogs, etc. for national security
purposes.
Text Mining Process
Generic Process of Social Data Mining
[Figure: Web 2.0 data source, Data Collection, Data Modeling, Mining Methods, used in application]
Mining Methods
• Cluster and community detection
• Static analysis
• Classification
Text Mining Process stages (1)
Data Collection
The data collector module continuously downloads data from one or
more social platforms and stores the raw data in a database
(e.g. a Big Data store or a conventional database). The parameters of
the API calls are specified according to the application type.

Data Modeling
Data modeling is a process used to define and analyze the data
requirements needed to support the application processes within the
scope of the corresponding application. In the data modeling stage, data
is modeled in one of various data models, depending on the nature of the application.
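The data collection stage can be sketched as storing each downloaded item unmodified, so that modeling can happen later. This is a minimal illustration using an in-memory SQLite database; the table name `raw_posts`, its columns, and the sample posts are assumptions, and the download itself is left out.

```python
import json
import sqlite3

def store_raw_posts(connection, platform, posts):
    """Store each post as raw JSON text, tagged with the platform it came from."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS raw_posts ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "platform TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    connection.executemany(
        "INSERT INTO raw_posts (platform, payload) VALUES (?, ?)",
        [(platform, json.dumps(post)) for post in posts],
    )
    connection.commit()

def count_posts(connection, platform):
    """Number of stored posts for one platform."""
    row = connection.execute(
        "SELECT COUNT(*) FROM raw_posts WHERE platform = ?", (platform,)
    ).fetchone()
    return row[0]

# Hypothetical posts standing in for data returned by a platform API.
connection = sqlite3.connect(":memory:")
store_raw_posts(connection, "twitter", [
    {"user": "alice", "text": "hello"},
    {"user": "bob", "text": "data mining"},
])
print(count_posts(connection, "twitter"))
```

Keeping the payload as raw JSON defers the choice of data model to the modeling stage, where the application decides which fields matter.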
Text Mining Process stages (2)
Mining Methods
Cluster analysis
The automatic or semi-automatic analysis of large quantities of data to
extract previously unknown, interesting patterns such as groups of
data records (clusters).
Anomaly detection
The search for items or events that do not conform to an
expected pattern.
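Anomaly detection can be illustrated with one simple rule: flag values that lie far from the mean, measured in standard deviations. This is a sketch of a z-score test, one of many possible methods; the threshold of 2.0 and the daily post counts are assumptions for the example.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [value for value in values if abs(value - mu) / sigma > threshold]

# Hypothetical number of posts per day for one account.
daily_posts = [12, 15, 11, 14, 13, 12, 90, 14, 13, 15]
print(find_anomalies(daily_posts))
```

The day with 90 posts is the only value that departs from the expected pattern of roughly 11 to 15 posts per day.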
Text Mining Process stages (3)
Static analysis
Analysis of historical business activities, stored as static data in data
warehouse databases, to reveal hidden patterns and trends.
Business uses of data mining range from
performing market analysis to finding the root cause of
manufacturing problems.
It can be used to assist in discovering previously unknown strategic
business information.
Prevent customer attrition and acquire new customers.
Cross-sell to existing customers.
Manage customers with more accuracy.
OAuth 2.0
OAuth is an open standard for authorization
It provides a process for end-users to authorize third-party access to
their server resources without sharing their credentials (typically, a
username and password pair), using user-agent redirections.
An open authorization protocol that enables applications to access
each other's data on a user's behalf.
Authorization flow
Authorization flow steps(1)
First, the user accesses the client web application. This web app has a
button saying "Login via Facebook" (or some other system like
Google or Twitter).
Second, when the user clicks the login button, the user is redirected
to the authenticating application (e.g. Facebook). The user then logs
into the authenticating application and is asked whether they want to
grant the client application access to their data in the
authenticating application. The user accepts.
Third, the authenticating application redirects the user to a redirect
URI, which the client app has provided to the authenticating app.
Providing this redirect URI is normally done by registering the client
application with the authenticating application.
Authorization flow steps(2)
Fourth, the user accesses the page located at the redirect URI in the
client application. In the background the client application contacts
the authenticating application and sends its client credentials together
with the authorization code returned with the redirect, and receives an
access token in return.
Once the client application has obtained an access token, this access
token can be sent to Facebook, Google, Twitter, etc. to access
resources in these systems related to the user who logged in.
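The client side of the flow above can be sketched as three small steps. The sketch assumes the standard OAuth 2.0 authorization code grant, in which the redirect back to the client carries a `code` parameter; the endpoint URL, client ID, secret, and redirect URI are placeholders, and no network request is made.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint; each provider documents its own.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"

def build_authorization_url(client_id, redirect_uri, scope, state):
    """URL the user is redirected to after clicking the login button."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}"

def read_authorization_code(redirect_url, expected_state):
    """Extract the code from the redirect back to the client, checking state."""
    params = parse_qs(urlparse(redirect_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch, possible forged redirect")
    return params["code"][0]

def build_token_request(code, client_id, client_secret, redirect_uri):
    """Form fields the client would POST to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    }

state = secrets.token_urlsafe(16)
redirect_uri = "https://client.example.com/callback"
url = build_authorization_url("my-client-id", redirect_uri, "profile", state)
# Simulated redirect from the authenticating application back to the client.
code = read_authorization_code(f"{redirect_uri}?code=abc123&state={state}", state)
token_request = build_token_request(code, "my-client-id", "my-secret", redirect_uri)
print(url)
print(token_request["grant_type"], token_request["code"])
```

The `state` value is generated by the client and checked on return so that a redirect not initiated by this client is rejected.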
Roles of users and applications in OAuth 2.0 (1)
Roles of users and applications in OAuth 2.0 (2)
Resource Owner
The resource owner is the person or application that owns the data that is
to be shared. For instance, a user on Facebook or Google could be a
resource owner.

Resource Server
The resource server is the server hosting the resource owned by the
resource owner. For instance, Facebook or Google is a resource server.

Client Application
The client application is the application requesting access to the resources
stored on the resource server. These resources are owned by the
resource owner. A client application could be a game requesting access to a
user's Facebook account.
Roles of users and applications in OAuth 2.0 (3)
Authorization Server
The authorization server is the server authorizing the client
application to access the resources of the resource owner.
The authorization server and the resource server can be the same
server
Big data
Big data is the term for a collection of data sets so large and complex
that it becomes difficult to process using on-hand database
management tools or traditional data processing applications. The
challenges include capture, storage, search, sharing, transfer, analysis
and visualization.
Some Examples :
Facebook has more than 1.15 billion active users generating social
interaction data.
More than 5 billion people are calling, texting, tweeting and
browsing websites on mobile phones
Scientific instruments generate large amounts of data.
Characteristics of Big Data
Applications of Big Data
Google Flu Trends uses search terms to predict the spread of the flu
virus.
MIT is using mobile phone data to establish how people's locations
and traffic patterns can be used for urban planning.
Statistician Nate Silver predicted the outcome of the US election
down to each individual state in 2012.
Big Data can bring the intelligence of online shopping into the retail
environment
Tools used in Big data (1)
NoSQL databases
NoSQL means a non-relational or non-SQL database. There are
several database types that fit into this category, such as key-value
stores and document stores, which focus on the storage and retrieval
of large volumes of unstructured, semi-structured, or even structured
data.
MapReduce by Google
This is a programming paradigm that allows for massive job
execution scalability against thousands of servers or clusters of
servers.
The "Map" task, where an input dataset is converted into a different set of
key/value pairs, or tuples
The "Reduce" task, where several of the outputs of the "Map" task are
combined to form a reduced set of tuples
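The Map and Reduce tasks can be illustrated with the usual word count example. This is a single-process sketch of the paradigm only, assuming a made-up pair of documents; a real MapReduce framework distributes the same three phases across many servers.

```python
from collections import defaultdict

def map_phase(document):
    """Map: turn one document into (word, 1) key/value pairs."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group all values by key, as the framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single tuple."""
    return key, sum(values)

def word_count(documents):
    mapped = [pair for document in documents for pair in map_phase(document)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["social data mining", "mining social media"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both phases can run in parallel on different machines.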
Tools used in Big data (2)
Hadoop by Apache
Hadoop is by far the most popular implementation of MapReduce,
being an entirely open source platform for handling Big Data. It is
flexible enough to work with multiple data sources, for example by
aggregating multiple sources of data in order to do large-scale
processing.
Access Data from Twitter (1)
Twitter is an online social networking and microblogging service
that enables users to send and read "tweets", which are text messages
limited to 140 characters.
Twitter provides various APIs that allow developers to build upon
and extend their applications in new and creative ways.
Twitter for Websites
Twitter for Websites is a suite of products that enables websites to easily
integrate Twitter. It is ideal for site developers looking to quickly and easily
integrate very basic Twitter functions.
Access Data from Twitter (2)
Search API
The Search API is designed for products looking to allow a user to query
for Twitter content. This may include finding a set of tweets with specific
keywords, finding tweets referencing a specific user, or finding tweets
from a particular user.

REST API
The REST API enables developers to access some of the core primitives
of Twitter including timelines, status updates, and user information. If
you're building an application that leverages core Twitter objects, then this
is the API to use.
Twitter REST API calls
Access Data from Twitter (3)
Streaming API
Streaming APIs offered by Twitter give developers low latency access to
Twitter's global stream of Tweet data. This API is for those developers
with data-intensive needs. If you want to build a data mining product or are
interested in analytics research, the Streaming API is the most suitable
choice.
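Consuming a stream for mining can be sketched as reading messages one at a time and keeping only the relevant ones. The example assumes the stream delivers one JSON object per line, with blank keep-alive lines in between; the `user` and `text` fields and the sample lines are simplified placeholders, not Twitter's actual message schema, and no connection to Twitter is made.

```python
import json

def filter_stream(lines, keyword):
    """Yield the messages from a line-delimited JSON stream whose text mentions keyword."""
    keyword = keyword.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        message = json.loads(line)
        if keyword in message.get("text", "").lower():
            yield message

# In-memory stand-in for the lines of a live stream.
sample_stream = [
    '{"user": "alice", "text": "Learning data mining today"}',
    "",
    '{"user": "bob", "text": "Lunch time"}',
    '{"user": "carol", "text": "Graph MINING is fun"}',
]
matches = list(filter_stream(sample_stream, "mining"))
print([message["user"] for message in matches])
```

Writing the filter as a generator means messages are processed as they arrive, which suits a stream that never ends.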
Twitter Streaming API calls
Access Data from Facebook
The Facebook platform provides various APIs and SDKs for developing applications
that access Facebook data. The Facebook SDK provides a fast,
native Facebook integration, using the exact same implementation,
regardless of which environment you're deploying to.

For mobile platforms, Facebook provides SDKs for two platforms:
iOS platform
Android platform

For Web development, SDKs are provided by both Facebook and the
community:
PHP
JavaScript
Ruby
Node.js
C#
Facebook APIs (1)
Graph API
The Graph API is a simple HTTP-based API that gives access to the
Facebook social graph, uniformly representing objects in the graph and the
connections between them. Most other APIs at Facebook are based on the
Graph API.

FQL
Facebook Query Language, or FQL, enables you to use a SQL-style
interface to query the data exposed by the Graph API.

Dialogs
Facebook offers a number of dialogs for Facebook Login, posting to a
person's timeline or sending requests
Facebook APIs (2)
Chat
One can integrate Facebook Chat into Web-based, desktop, or mobile
instant messaging products.

Ads API
The Ads API allows you to build your own app as a customized alternative
to the Facebook Ads.

Public Feed API
The Public Feed API lets you read the stream of public comments as they
are posted to Facebook.
Friend Locator - Facebook App
A Facebook application that displays friends' current locations
and home towns on a Google map using jQuery, the Google Maps
API, and the Facebook platform.
It uses OAuth and FQL to access the client data from
Facebook.
Request Permission for Application
Friend Map on Google Map
List of Friends in selected city
Example of Mining Social Media
The core principle in mining social sites is the attribute-value pair, which is
gathered by applying various algorithms. Attributes of any social
networking site can be categorized into two types:
Individual Attributes
Community Attributes

Individual attributes describe personal information about the
person, such as gender, birth date, address, phone number, and email
address.
Community attributes include the friend list, tagged pictures, and followers.
Consider Facebook as an example. Facebook users can control photo
tagging and the sharing of their friend list with the public, and can
share a status with specific people or groups. However, users still cannot
prevent friends from sharing their own friend lists or from uploading photos
of them to the public.
By collecting and assessing the vast amount of Facebook user data, one
can obtain the general behavior of a user. Facebook provides a
sharing option for phone numbers and personal information; if a
user discloses this sensitive information in their profile, the user
becomes more vulnerable to becoming a victim.
Conclusion
Valuable information is hidden in vast amounts of social media
data, presenting ample opportunities for social media mining to discover
actionable knowledge that is otherwise difficult to find. Social media data
are vast, noisy, distributed, unstructured, and dynamic, which poses novel
challenges for data mining. In this paper, we offer a brief introduction to
mining social media, use illustrative examples to show that burgeoning
social media mining is spearheading the social media research, and
demonstrate its invaluable contributions to real-world applications.
 

Similar to Social Data Mining (20)

Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
 
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT ToolsIntroduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools
 
Data Mining
Data MiningData Mining
Data Mining
 
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
 
Introduction abstract
Introduction abstractIntroduction abstract
Introduction abstract
 
Big Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-CommerceBig Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-Commerce
 
Identical Users in Different Social Media Provides Uniform Network Structure ...
Identical Users in Different Social Media Provides Uniform Network Structure ...Identical Users in Different Social Media Provides Uniform Network Structure ...
Identical Users in Different Social Media Provides Uniform Network Structure ...
 
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
 
O017148084
O017148084O017148084
O017148084
 
A Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataA Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media Data
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Jx2517481755
Jx2517481755Jx2517481755
Jx2517481755
 
Jx2517481755
Jx2517481755Jx2517481755
Jx2517481755
 
APPLICATION OF SENTIMENT ANALYSIS IN WEB DATA ANALYTICS
APPLICATION OF SENTIMENT ANALYSIS IN WEB DATA ANALYTICSAPPLICATION OF SENTIMENT ANALYSIS IN WEB DATA ANALYTICS
APPLICATION OF SENTIMENT ANALYSIS IN WEB DATA ANALYTICS
 
Social Media Data Analysis and Visualization Tools
Social Media Data Analysis and Visualization ToolsSocial Media Data Analysis and Visualization Tools
Social Media Data Analysis and Visualization Tools
 
Recommendation System Using Social Networking
Recommendation System Using Social Networking Recommendation System Using Social Networking
Recommendation System Using Social Networking
 
IRJET- Big Data Driven Information Diffusion Analytics and Control on Social ...
IRJET- Big Data Driven Information Diffusion Analytics and Control on Social ...IRJET- Big Data Driven Information Diffusion Analytics and Control on Social ...
IRJET- Big Data Driven Information Diffusion Analytics and Control on Social ...
 
Odam an optimized distributed association rule mining algorithm (synopsis)
Odam an optimized distributed association rule mining algorithm (synopsis)Odam an optimized distributed association rule mining algorithm (synopsis)
Odam an optimized distributed association rule mining algorithm (synopsis)
 
Ac02411221125
Ac02411221125Ac02411221125
Ac02411221125
 
Proposal.docx
Proposal.docxProposal.docx
Proposal.docx
 

Social Data Mining

  • 1. Social Data Mining Mahesh J. Meniya Akash M. Rangani
  • 2. Data, Information, Knowledge (1) Data: Facts and statistics collected together for reference or analysis; the quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted. Information: The patterns, associations, or relationships among all this data can provide information. For example, analysis of retail point-of-sale transaction data can yield information on which products are selling and when.
  • 3. Data, Information, Knowledge (2) Knowledge: Information can be converted into knowledge about historical patterns and future trends. For example, summary information on retail supermarket sales can be analyzed in light of promotional efforts to provide knowledge of consumer buying behavior. Thus, a manufacturer or retailer could determine which items are most susceptible to promotional efforts.
  • 4. What is Data Mining? From a large dataset, find the unknown, useful information. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. It is the process of collecting, searching through, and analyzing a large amount of data in a database so as to discover patterns or relationships.
  • 5. What is Social Data Mining? Social media is defined as a group of Internet-based applications that build on the ideological and technological foundations of Web 2.0 and that allow the creation and exchange of user-generated content. Vast amounts of user-generated content are created every day on social media sites such as Facebook, Twitter, and Google+. Systematically analyzing the valuable information in social media is social data mining. Social media data are largely user-generated content, which is vast, noisy, distributed, unstructured, and dynamic.
  • 7. Social Media Platforms: blogging; microblogs; community-based question answering (CQA); emails and chat; hybrid applications; wikis; social news; social bookmarking; media sharing; opinion, reviews, and ratings.
  • 8. Why Important? The WWW is vast; people share more data; advertising and marketing; products are more customized; more devices produce more data; market research; customer experience; brand loyalties; product development and design; communication and marketing.
  • 9. Structures in Social Media Social structures represent social relationships between community members. Accordingly, social applications are often designed to systematically support these properties. For example, in online forums, a useful criterion provided by a social structure is whether or not a member is an expert in a specific topic.
  • 10. Types of Social Media Structure Hierarchical structure: Objects used in social data mining often possess a natural hierarchical structure. For example, even a short document comprises a number of sentences. Accordingly, hierarchical structures have been frequently addressed in information representation. Conversational structure: We can identify conversational structures, explicitly or implicitly, in most social platforms that involve interactions between users. For example, in emails and forums, conversational structures are formed by replies.
  • 11. Data Mining Techniques for Social Media Graph mining: Graphs (or networks) constitute a dominant data structure and appear in essentially all forms of information. Examples include the Web graph and social networks. Typically, communities correspond to groups of nodes, where nodes within the same community (or cluster) tend to be highly similar and share common features, while nodes of different communities show low similarity. Graph mining extracts useful knowledge (patterns, outliers, etc.) from structured data that can be represented as a graph.
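The idea of grouping nodes into communities can be illustrated with a deliberately crude stand-in: treating each connected component of a friendship graph as one group. This is only a sketch; real community detection also splits densely and sparsely connected regions inside one component. The graph and the names in it are hypothetical.

```python
from collections import deque


def connected_components(graph):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # A breadth-first search from an unvisited node collects one component.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical undirected friendship graph stored as an adjacency dict.
friendships = {
    "ana": {"bo", "cy"},
    "bo": {"ana", "cy"},
    "cy": {"ana", "bo"},
    "dee": {"eli"},
    "eli": {"dee"},
}
groups = connected_components(friendships)
```

Here `groups` holds two sets of users: one with ana, bo, and cy, and one with dee and eli.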
  • 12. Graph Mining usage Google uses PageRank as one of many predictors for the relevance of a web page. The link structure of the World Wide Web provides valuable contextual information about which pages are deemed most relevant by web page creators; this contextual link structure is then used to predict relevance for a user's query. Graph mining is useful for understanding relationships as well as content (text, images). A social media host may look at certain online groups and try to predict whether a group will flourish or disband.
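How link structure can be turned into a relevance score may be sketched with a small power-iteration version of PageRank. This is a textbook-style simplification, not Google's production ranking; the four-page web, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `links` maps each page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "c", nothing links to "d".
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(web)
best_page = max(scores, key=scores.get)
```

In this toy graph the page with the most incoming links, `"c"`, ends up with the highest score, and the page nobody links to, `"d"`, ends up with the lowest.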
  • 13. Graph Mining usage cont. A phone provider looks at cell phone call records to determine whether an account is a result of identity theft. Facebook Graph Search query examples: Searching people: “friends of friends who are single female in Rajkot”. Searching interests: “movies my friends like”, “TV shows my friends like”, “Videos by TV shows liked by my friends”. Searching places: “Restaurant in Rajkot liked by friends”.
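Queries such as "friends of friends" or "movies my friends like" amount to short traversals of the social graph. The sketch below shows that idea on an in-memory adjacency dict; it is not how Facebook Graph Search is implemented, and the users, likes, and function names are all hypothetical.

```python
def friends_of_friends(graph, person):
    """People exactly two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    result = set()
    for friend in direct:
        result |= graph.get(friend, set())
    return result - direct - {person}


def liked_by_friends(graph, likes, person):
    """Count how many direct friends like each item."""
    counts = {}
    for friend in graph.get(person, set()):
        for item in likes.get(friend, set()):
            counts[item] = counts.get(item, 0) + 1
    return counts


# Hypothetical friendship graph and per-user movie likes.
graph = {
    "ana": {"bo", "cy"},
    "bo": {"ana", "dee"},
    "cy": {"ana", "dee", "eli"},
    "dee": {"bo", "cy"},
    "eli": {"cy"},
}
likes = {"bo": {"Movie A", "Movie B"}, "cy": {"Movie B"}, "dee": {"Movie C"}}

second_degree = friends_of_friends(graph, "ana")
movie_counts = liked_by_friends(graph, likes, "ana")
```

Further filters such as city or relationship status would be extra predicates applied to the nodes returned by the traversal.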
  • 14. Sample query for Facebook Graph search
  • 16. Text Mining Text mining is an emerging technology that attempts to extract meaningful information from unstructured textual data. Text mining is an extension of data mining to textual data. Social networks contain a lot of text in the nodes in various forms. For example, social networks may contain links to posts, blogs or other news articles.
  • 17. Usage of text mining (1) Automatic processing of messages and emails: A common application of text mining is to aid in the automatic classification of texts. For example, it is possible to automatically "filter" out most undesirable "junk email" based on certain terms or words that are not likely to appear in legitimate messages. Investigating competitors by crawling their web sites: Another potentially very useful application is to automatically process the contents of Web pages in a particular domain. For example, you could go to a Web page and begin "crawling" the links you find there to process all Web pages that are referenced.
  • 18. Usage of text mining (2) Medical: Mining medical records to improve patient care. Security applications: Many text mining software packages are marketed for security applications, especially monitoring and analysis of online plain text sources such as Internet news, blogs, etc. for national security purposes.
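The junk-email use case, filtering on terms that rarely appear in legitimate messages, can be sketched as a simple keyword score. The term list and the 0.2 threshold are assumptions made up for this example; practical filters usually learn term weights from labeled mail rather than using a fixed list.

```python
import re

# Hypothetical terms assumed to be rare in legitimate messages.
SPAM_TERMS = {"lottery", "winner", "prize", "free"}


def spam_score(message, spam_terms=SPAM_TERMS):
    """Fraction of the words in the message that are listed spam terms."""
    words = re.findall(r"[a-z']+", message.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in spam_terms)
    return hits / len(words)


def is_junk(message, threshold=0.2):
    """Flag a message when its share of spam terms reaches the threshold."""
    return spam_score(message) >= threshold


junk = is_junk("You are a lottery winner claim your free prize")
legit = is_junk("Meeting moved to Monday at ten")
```

The first message is flagged because four of its nine words are listed terms; the second contains none and passes.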
  • 20. Generic Process of Social Data Mining: Web 2.0 data source; data collection; data modeling; used in application; mining methods (cluster and community detection, static analysis, classification).
  • 21. Text Mining Process stages (1) Data collection: The data collector module continuously downloads data from one or more social platforms and stores the raw data in a database (e.g. a big data store) or a conventional database. The parameters of each API call are specified according to the application type. Data modeling: Data modeling is the process used to define and analyze the data requirements needed to support the application processes within the scope of the corresponding application. In the data modeling stage, data is modeled in one of several data models, depending on the nature of the application.
  • 22. Text Mining Process stages (2) Mining methods. Cluster analysis: The automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records. Anomaly detection: The search for items or events that do not conform to an expected pattern.
  • 23. Text Mining Process stages (3) Static analysis: Analysis of historical business activities, stored as static data in data warehouse databases, to reveal hidden patterns and trends. Business uses of data mining range from performing market analysis to finding the root cause of manufacturing problems. It can be used to assist in discovering previously unknown strategic business information, to prevent customer attrition and acquire new customers, to cross-sell to existing customers, and to manage customers with more accuracy.
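Among the mining methods listed in these stages, anomaly detection is the easiest to show in a few lines. The sketch below flags values that lie more than a chosen number of standard deviations from the mean; the z-score rule, the threshold of 2.0, and the daily post counts are assumptions for illustration, and other detectors may suit real social data better.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates from the pattern.
        return []
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical number of posts per day for one account.
daily_posts = [12, 15, 11, 14, 13, 12, 95, 14, 13]
anomalies = find_anomalies(daily_posts)
```

Only the day with 95 posts (index 6) is reported, since it is the one value that does not conform to the pattern of the rest.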
  • 24. OAuth 2.0 OAuth is an open standard for authorization. It provides a process for end users to authorize third-party access to their server resources without sharing their credentials (typically a username and password pair), using user-agent redirections. It is an open authorization protocol that enables applications to access each other's data.
  • 26. Authorization flow steps (1) First, the user accesses the client web application. In this web app there is a button saying "Login via Facebook" (or some other system such as Google or Twitter). Second, when the user clicks the login button, the user is redirected to the authenticating application (e.g. Facebook). The user then logs into the authenticating application and is asked whether to grant the client application access to the user's data in the authenticating application. The user accepts. Third, the authenticating application redirects the user to a redirect URI, which the client app has provided to the authenticating app. Providing this redirect URI is normally done by registering the client application with the authenticating application.
  • 27. Authorization flow steps (2) Fourth, the user accesses the page located at the redirect URI in the client application. In the background, the client application contacts the authenticating application and exchanges the authorization code it received in the redirect for an access token. Once the client application has obtained an access token, this access token can be sent to Facebook, Google, Twitter, etc. to access resources in these systems related to the user who logged in.
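The redirect-based flow in these steps corresponds to the OAuth 2.0 authorization code grant, and the data that moves between the parties can be sketched without any network calls. The endpoint, client ID, client secret, and redirect URI below are placeholders, not real values for any provider; the parameter names (`response_type`, `client_id`, `redirect_uri`, `state`, `code`, `grant_type`) are the standard ones from the OAuth 2.0 specification.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder values; a real client gets these when registering with a provider.
AUTHORIZE_URL = "https://auth.example.com/oauth/authorize"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"
REDIRECT_URI = "https://client.example.com/callback"


def build_authorization_url(state):
    """Step 2: the URL the login button sends the user to."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": state,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def extract_code(callback_url, expected_state):
    """Steps 3 and 4: read the authorization code from the redirect back."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged redirect")
    return query["code"][0]


def build_token_request(code):
    """Form fields the client would POST to the provider's token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }


state = secrets.token_urlsafe(16)
login_url = build_authorization_url(state)
# Simulated redirect back from the provider with a made-up code.
callback = REDIRECT_URI + "?" + urlencode({"code": "abc123", "state": state})
token_request = build_token_request(extract_code(callback, state))
```

The `state` value is a random token the client checks on return so that a redirect it did not initiate is rejected; the provider's response to the token request would carry the access token.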
  • 28. Roles of users and applications in OAuth 2.0 (1)
  • 29. Roles of users and applications in OAuth 2.0 (2) Resource owner: The resource owner is the person or application that owns the data that is to be shared. For instance, a user on Facebook or Google could be a resource owner. Resource server: The resource server is the server hosting the resources owned by the resource owner. For instance, Facebook or Google is a resource server. Client application: The client application is the application requesting access to the resources that are stored on the resource server and owned by the resource owner. A client application could be a game requesting access to a user's Facebook account.
  • 30. Roles of users and applications in OAuth 2.0 (3) Authorization server: The authorization server is the server authorizing the client application to access the resources of the resource owner. The authorization server and the resource server can be the same server.
  • 31. Big data Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, storage, search, sharing, transfer, analysis, and visualization. Some examples: Facebook has more than 1.15 billion active users generating social interaction data. More than 5 billion people are calling, texting, tweeting, and browsing websites on mobile phones. Scientific instruments generate large amounts of data.
  • 33. Applications of Big Data Google Flu Trends uses search terms to predict the spread of the flu virus. MIT is using mobile phone data to establish how people's locations and traffic patterns can be used for urban planning. Statistician Nate Silver predicted the outcome of the 2012 US election down to each individual state. Big data can bring the intelligence of online shopping into the retail environment.
  • 34. Tools used in Big data (1) NoSQL databases: NoSQL means non-relational or non-SQL database. Several database types fit into this category, such as key-value stores and document stores, which focus on the storage and retrieval of large volumes of unstructured, semi-structured, or even structured data. MapReduce by Google: This is a programming paradigm that allows massive job execution scalability across thousands of servers or clusters of servers. In the "Map" task, an input dataset is converted into a different set of key/value pairs, or tuples. In the "Reduce" task, several of the outputs of the "Map" task are combined to form a reduced set of tuples.
  • 35. Tools used in Big data (2) Hadoop by Apache: Hadoop is by far the most popular implementation of MapReduce, being an entirely open source platform for handling big data. It is flexible enough to work with multiple data sources, for example by aggregating multiple sources of data in order to do large-scale processing.
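The "Map" and "Reduce" tasks described for these tools can be imitated in a single process with the classic word-count example. This sketch only shows the shape of the paradigm; it does not use Hadoop or any distributed runtime, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def map_task(document):
    """Map: turn one input record into (key, value) pairs."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between the two tasks."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce: combine the values of one key into a single tuple."""
    return key, sum(values)


# Hypothetical input records; on a cluster each could live on a different node.
documents = ["big data big tools", "data mining tools"]
mapped = [pair for doc in documents for pair in map_task(doc)]
word_counts = dict(reduce_task(key, values) for key, values in shuffle(mapped).items())
```

Because each `map_task` call depends only on its own record and each `reduce_task` call only on one key, both phases can be spread across many machines, which is where the scalability comes from.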
  • 36. Access Data from Twitter (1) Twitter is an online social networking and microblogging service that enables users to send and read "tweets", which are text messages limited to 140 characters. Twitter provides various APIs that allow developers to build upon and extend their applications in new and creative ways. Twitter for Websites: Twitter for Websites is a suite of products that enables websites to easily integrate Twitter. It is ideal for site developers looking to quickly and easily integrate very basic Twitter functions.
  • 37. Access Data from Twitter (2) Search API: The Search API is designed for products that let a user query for Twitter content. This may include finding a set of tweets with specific keywords, finding tweets referencing a specific user, or finding tweets from a particular user. REST API: The REST API enables developers to access some of the core primitives of Twitter, including timelines, status updates, and user information. If you're building an application that leverages core Twitter objects, this is the API to use.
  • 39. Access Data from Twitter (3) Streaming API: The Streaming APIs offered by Twitter give developers low-latency access to Twitter's global stream of Tweet data. This API is for developers with data-intensive needs. If you want to build a data mining product or are interested in analytics research, the Streaming API is the most suitable choice.
  • 41. Access Data from Facebook The Facebook platform provides various APIs and SDKs for developing applications that access Facebook data. The Facebook SDK provides a fast, native Facebook integration, using the exact same implementation regardless of which environment you're deploying to. For mobile, Facebook provides SDKs for two platforms: iOS and Android. For web development, SDKs are provided by both Facebook and the community for PHP, JavaScript, Ruby, Node.js, and C#.
  • 42. Facebook APIs (1) Graph API: The Graph API is a simple HTTP-based API that gives access to the Facebook social graph, uniformly representing objects in the graph and the connections between them. Most other APIs at Facebook are based on the Graph API. FQL: Facebook Query Language, or FQL, enables you to use a SQL-style interface to query the data exposed by the Graph API. Dialogs: Facebook offers a number of dialogs for Facebook Login, posting to a person's timeline, or sending requests.
  • 43. Facebook APIs (2) Chat: One can integrate Facebook Chat into Web-based, desktop, or mobile instant messaging products. Ads API: The Ads API allows you to build your own app as a customized alternative to the Facebook Ads. Public Feed API: The Public Feed API lets you read the stream of public comments as they are posted to Facebook.
  • 44. Friend Locator - Facebook App A Facebook application that displays friends' current locations and home towns on a Google map, using jQuery, the Google Maps API, and the Facebook platform. It uses OAuth and FQL to access the client data from Facebook.
  • 46. Friend Map on Google Map
  • 47. List of Friends in selected city
  • 48. Example of Mining Social Media The core principle in mining social sites is the attribute-value pairs that are gathered by applying various algorithms. The attributes of any social networking site can be categorized into two groups: individual attributes and community attributes. Individual attributes describe personal information about a person, such as gender, birth date, address, phone number, and email address. Community attributes include the friend list, tagged pictures, and followers.
  • 49. Consider the example of Facebook. Facebook users can now control photo tagging and the sharing of their friend list with the public, and can share a status with specific people or groups. However, a user still cannot stop friends from sharing their own friend lists or from uploading photos of the user from their profiles to the public. By collecting and assessing the vast amount of Facebook user data, one can infer the general behavior of a user. Facebook provides sharing options for phone numbers and personal information; if a user discloses this sensitive information in their profile, the user becomes more vulnerable to becoming a victim.
  • 50. Conclusion Valuable information is hidden in vast amounts of social media data, presenting ample opportunities for social media mining to discover actionable knowledge that is otherwise difficult to find. Social media data are vast, noisy, distributed, unstructured, and dynamic, which poses novel challenges for data mining. In this paper, we offer a brief introduction to mining social media, use illustrative examples to show that burgeoning social media mining is spearheading social media research, and demonstrate its invaluable contributions to real-world applications.
  • 51. References [1] Pritam Gundecha, Huan Liu, "Mining Social Media: A Brief Introduction", ISBN 978-0-9843378-3-5. [2] Brian Amento, Loren Terveen, Will Hill, "Experiments in Social Data Mining". [3] Roosevelt C. Mosley Jr., FCAS, MAAA, "Social Media Analytics: Data Mining Applied to Insurance Twitter Posts". [4] Facebook Development - https://developers.facebook.com/ [5] Twitter Development - https://dev.twitter.com/ [6] Social Networking Statistics & Facts - http://visual.ly/100-socialnetworking-statistics-facts-2012