Text Analytics in Social Media
Content
• Introduction to text mining in relation with social media
• Unique features of texts in social media
• Applying Text analytics in social media
• Example of text analytics in social media
Text Mining and Social Media
The picture here shows the ten sites that generate the most traffic.
The majority of them fall under the social media umbrella.
Social media can therefore be described as a medium through which
information and communication can be accessed, shared, and discussed.
Text Mining and Social Media
Category              Representative sites
Wiki                  Wikipedia, Scholarpedia
Blogging              Blogger, LiveJournal, WordPress
Social news           Digg, Briefing, Mixx, Slashdot
Microblogging         Twitter, Google Buzz
Opinion & reviews     ePinions, Yelp
Question answering    Stack Overflow, Yahoo! Answers, Quora
Media sharing         Flickr, YouTube
Social bookmarking    Delicious, CiteULike
Social networking     Facebook, LinkedIn, MySpace
The table shows the categories into which social media sites can be
classified.
These sites offer many types of services, which results in many kinds
of data formats.
On most social media sites, however, the information is in text format.
Text Mining and Social Media
• Given the current trend of applying data mining techniques and business
intelligence to data, a question arises for social media:
“How can I get valuable information from the texts on
social media platforms?”
Unique features of texts in social media
• Because there are different kinds of social media, their texts have
distinct characteristics in what they contain and how they occur.
• Text analytics describes a set of linguistic, statistical, and machine learning
techniques that model and structure the information content of textual
sources for business intelligence, exploratory data analysis, research, or
investigation.
• This section gives us a hint on how to answer our previous question.
Unique features of texts in social media
• Text preprocessing makes the input more consistent to facilitate text
representation. Common preprocessing methods include stop-word removal and
stemming.
• Feature generation / text representation: the most common approach is to
transform texts into numeric vectors. This representation is called the
bag-of-words (BOW) model or vector space model (VSM).
• Knowledge discovery: machine learning or data mining methods are applied
to discover patterns or insights.
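The first two stages above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the slides: the stop-word list, the crude suffix-stripping `naive_stem` function, and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

def naive_stem(word):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors

docs = ["Users are sharing photos", "The users shared a photo"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # vocabulary after stop-word removal and stemming
print(vectors)  # one count vector per document
```

After preprocessing, both sample sentences map to the same count vector, which is the kind of consistency that preprocessing is meant to provide before knowledge discovery.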
Unique features of texts in social media
• Time Sensitivity
An important and common feature of many social media services is their
real-time nature. Bloggers may update their posts every few days, but most
networking sites receive updates every few minutes.
Because of this sensitivity and timeliness, text in social media can no
longer be treated as independent and identically distributed (i.i.d.) data.
Unique features of texts in social media
• Short Length
Short messages encourage users to participate on social media sites, but they
pose a great challenge for mining with clustering or classification: short
texts do not provide sufficient context information for effective similarity
measures, which are the basis of many text processing methods.
Example: Twitter is limited to 140 characters and Windows Live Messenger to
512 characters, while Facebook allows 63,026 characters.
Unique features of texts in social media
• Unstructured Phrases
The main challenge posed by content on social media sites is that the
distribution of quality has high variance, from very high-quality items to
low-quality ones. This can be attributed to people’s attitudes when posting a
microblogging message or answering a question in a forum.
The difficulty here is how to accurately identify the semantic meaning of
text in which more than one word has been abbreviated.
Applying Text analytics in social media
• Event detection
• Event Detection aims to monitor a data source and detect the occurrence of an event
that is captured within that source
• Collaborative Question Answering:
• Analyzing the differences between conversational questions and informational
questions
Illustrative Example.
• This example illustrates how to use text analytics to solve the problems
identified in its application to social media.
• We want to improve the quality of short text representation by integrating
semantic knowledge resources, which have been found useful in dealing with
the semantic gap.
• This has 3 steps:
• Seed Phrase Extraction
• Semantic Feature Generation
• Feature Space Construction
Seed Phrase Extraction
• Problem Statement
• Given a sentence-level feature T = {t1, t2, …, tn} and the phrase-level
features ti contained in T, the similarity between ti and the other features
in {t1, t2, …, tn} is given by:
\[ \mathrm{InfoScore}(t_i) = \sum_{j=1,\, j \neq i}^{n} \mathrm{sem}(t_i, t_j) \]
\[ t^{*} = \arg\max_{t_i \in \{t_1, t_2, \ldots, t_n\}} \mathrm{InfoScore}(t_i) \]
where t* denotes the phrase-level feature with the highest InfoScore.
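The InfoScore computation can be sketched as follows. The slides do not define sem(), so the word-overlap (Jaccard) similarity `jaccard_sem` used here is purely an assumption for illustration, as are the function names and the sample phrases.

```python
def jaccard_sem(a, b):
    """Hypothetical sem(): Jaccard overlap of the words in two phrases."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def info_score(i, phrases, sem=jaccard_sem):
    """InfoScore(t_i): sum of sem(t_i, t_j) over all j != i."""
    return sum(sem(phrases[i], phrases[j])
               for j in range(len(phrases)) if j != i)

def select_seed(phrases, sem=jaccard_sem):
    """Return t*, the phrase with the highest InfoScore."""
    best = max(range(len(phrases)), key=lambda i: info_score(i, phrases, sem))
    return phrases[best]

phrases = ["text mining", "social media text", "media"]
seed = select_seed(phrases)
print(seed)
```

Any other similarity measure with the same two-argument shape could be passed in through the `sem` parameter.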
Semantic Feature Generation
• The seed phrases have now been extracted in the first step.
• This step aims to generate semantic features from the seed phrases. The
seed phrases have helped us obtain an informative and effective basic
representation of the input text.
• We use Wikipedia as our target social media resource.
Algorithm
Problem Statement:
Given a set of seed phrases from a
text corpus that has already been
preprocessed, generate the semantic
features from the text.
Feature Space Construction
• To maintain data quality and effectiveness while preserving valuable
original information, we carry out two more basic steps in this process:
• Feature filtering, to remove meaningless features
• Feature selection, to avoid aggravating the “curse of dimensionality”
Feature Space Construction
• Feature Filtering
For the Wikipedia example, we formulate rules to refine the unstructured
features. Some possible rules:
Remove features generated from seed phrases that are too general.
Transform features, e.g. “List of hotels” → “hotels”.
Remove features related to chronology.
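The three rules above could be applied roughly as in the sketch below. The set of "too general" seed phrases, the year-matching regular expression used to detect chronology, and the sample features are hypothetical choices for illustration; the slides do not specify how each rule is implemented.

```python
import re

# Hypothetical set of seed phrases considered too general to be informative.
TOO_GENERAL_SEEDS = {"thing", "people", "time"}

# Assumed chronology test: a 4-digit year, optionally a decade such as "1990s".
CHRONOLOGY = re.compile(r"\b\d{4}s?\b")

def filter_features(features_by_seed):
    """features_by_seed maps each seed phrase to its generated feature strings."""
    cleaned = {}
    for seed, features in features_by_seed.items():
        # Rule 1: drop everything generated from a too-general seed phrase.
        if seed.lower() in TOO_GENERAL_SEEDS:
            continue
        kept = []
        for feature in features:
            # Rule 3: drop features related to chronology.
            if CHRONOLOGY.search(feature):
                continue
            # Rule 2: transform "List of hotels" into "hotels".
            feature = re.sub(r"^list of\s+", "", feature, flags=re.IGNORECASE)
            kept.append(feature)
        cleaned[seed] = kept
    return cleaned

raw = {
    "hotel": ["List of hotels", "Hotel rating", "2008 in tourism"],
    "time": ["Time zone"],
}
filtered = filter_features(raw)
print(filtered)
```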
Feature Space Construction
• Feature Selection
• We need to select semantic features to construct feature space for various
tasks.
• The number of needed features is determined by specific tasks.
Feature Space Construction
• First, we calculate the tf-idf weights of all generated features. Term
frequency–inverse document frequency (tf-idf) is a numerical statistic
intended to reflect how important a word is to a document in a collection
or corpus.
• One seed phrase may generate k semantic features, denoted by {fi1, fi2, …, fik}.
• The selection rule here is: one seed phrase, one feature.
\[ f_i^{*} = \arg\max_{f_{ij} \in \{f_{i1}, f_{i2}, \ldots, f_{ik}\}} \mathrm{tf\_idf}(f_{ij}) \]
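A minimal sketch of this "one seed phrase, one feature" selection follows. The slides write tf_idf with a single argument, so the sketch assumes a corpus-level weight (total count of the feature times log(N / document frequency)); that aggregation, the function names, and the toy data are assumptions rather than the original method.

```python
import math
from collections import Counter

def tf_idf_weights(docs):
    """docs: list of feature lists, one per document.

    Returns a corpus-level weight per feature:
    total count * log(N / document frequency).
    (Aggregating tf over the whole corpus is an assumption of this sketch.)
    """
    n_docs = len(docs)
    tf = Counter(f for doc in docs for f in doc)
    df = Counter(f for doc in docs for f in set(doc))
    return {f: tf[f] * math.log(n_docs / df[f]) for f in tf}

def select_one_per_seed(candidates, weights):
    """For each seed phrase, keep the candidate feature with the highest weight."""
    return {seed: max(feats, key=lambda f: weights.get(f, 0.0))
            for seed, feats in candidates.items() if feats}

docs = [
    ["hotels", "tourism", "hotels"],
    ["tourism", "travel"],
    ["tourism", "hotels"],
]
weights = tf_idf_weights(docs)
candidates = {"hotel": ["hotels", "tourism"], "trip": ["travel", "tourism"]}
selected = select_one_per_seed(candidates, weights)
print(selected)
```

In this toy corpus "tourism" appears in every document, so its weight is zero and it loses to the more specific feature for both seed phrases.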
Feature Space Construction
• Second, the top n features are extracted from the remaining semantic
features based on their frequency.
• These frequently appearing features, together with the m features from the
first step, make up the m+n semantic features of the feature space.
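Combining the two selection steps might look like the sketch below. The function name `build_feature_space`, its arguments, and the sample data are hypothetical; the sketch simply takes the features already chosen in the first step and appends the n most frequent of the remaining ones.

```python
from collections import Counter

def build_feature_space(selected, all_generated, n):
    """Combine first-step features with the n most frequent remaining features.

    selected: features chosen in the first step (one per seed phrase).
    all_generated: every generated feature occurrence, with repeats.
    """
    first = list(dict.fromkeys(selected))  # de-duplicate, keep order
    chosen = set(first)
    remaining = Counter(f for f in all_generated if f not in chosen)
    top_n = [feature for feature, _ in remaining.most_common(n)]
    return first + top_n

selected = ["hotels", "travel"]
all_generated = ["hotels", "tourism", "tourism", "travel",
                 "resorts", "tourism", "resorts", "airlines"]
space = build_feature_space(selected, all_generated, n=2)
print(space)
```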
Finally
• With all of these steps complete and the feature space generated, we can
then apply text clustering or any other text analytics method.
• In conclusion, although research on this subject is still active, this
short presentation has shown how text analytics can be applied to social
media resources.
Reference: Aggarwal, C. and Zhai, C. (eds.), Mining Text Data, Ch. 12

More Related Content

What's hot

Association rule mining
Association rule miningAssociation rule mining
Association rule miningAcad
 
Machine learning with Big Data power point presentation
Machine learning with Big Data power point presentationMachine learning with Big Data power point presentation
Machine learning with Big Data power point presentationDavid Raj Kanthi
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and workAmr Abd El Latief
 
Introduction to Object Oriented databases
Introduction to Object Oriented databasesIntroduction to Object Oriented databases
Introduction to Object Oriented databasesDr. C.V. Suresh Babu
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series dataKrish_ver2
 
2.3 bayesian classification
2.3 bayesian classification2.3 bayesian classification
2.3 bayesian classificationKrish_ver2
 
Knowledge Representation, Semantic Web
Knowledge Representation, Semantic WebKnowledge Representation, Semantic Web
Knowledge Representation, Semantic WebSerendipity Seraph
 
CS6010 Social Network Analysis Unit II
CS6010 Social Network Analysis   Unit IICS6010 Social Network Analysis   Unit II
CS6010 Social Network Analysis Unit IIpkaviya
 
Relational database
Relational database Relational database
Relational database Megha Sharma
 
Web ontology language (owl)
Web ontology language (owl)Web ontology language (owl)
Web ontology language (owl)Ameer Sameer
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksFrancesco Collova'
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebMarina Santini
 

What's hot (20)

03 data mining : data warehouse
03 data mining : data warehouse03 data mining : data warehouse
03 data mining : data warehouse
 
Association rule mining
Association rule miningAssociation rule mining
Association rule mining
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
CS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I PPT IN PDF
CS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I  PPT  IN PDFCS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I  PPT  IN PDF
CS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I PPT IN PDF
 
Machine learning with Big Data power point presentation
Machine learning with Big Data power point presentationMachine learning with Big Data power point presentation
Machine learning with Big Data power point presentation
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
 
Introduction to Object Oriented databases
Introduction to Object Oriented databasesIntroduction to Object Oriented databases
Introduction to Object Oriented databases
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
 
2.3 bayesian classification
2.3 bayesian classification2.3 bayesian classification
2.3 bayesian classification
 
Knowledge Representation, Semantic Web
Knowledge Representation, Semantic WebKnowledge Representation, Semantic Web
Knowledge Representation, Semantic Web
 
Text MIning
Text MIningText MIning
Text MIning
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
CS6010 Social Network Analysis Unit II
CS6010 Social Network Analysis   Unit IICS6010 Social Network Analysis   Unit II
CS6010 Social Network Analysis Unit II
 
Relational database
Relational database Relational database
Relational database
 
Clustering in Data Mining
Clustering in Data MiningClustering in Data Mining
Clustering in Data Mining
 
Frames
FramesFrames
Frames
 
Web ontology language (owl)
Web ontology language (owl)Web ontology language (owl)
Web ontology language (owl)
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural Networks
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic Web
 
Web content mining
Web content miningWeb content mining
Web content mining
 

Viewers also liked

Data, Text and Web Mining
Data, Text and Web Mining Data, Text and Web Mining
Data, Text and Web Mining Jeremiah Fadugba
 
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...guest5b1607
 
Dialogue-Earth:-Mining-Social-Media
Dialogue-Earth:-Mining-Social-MediaDialogue-Earth:-Mining-Social-Media
Dialogue-Earth:-Mining-Social-MediaTom Masterman
 
Detecting insults in social media conversations
Detecting insults in social media conversationsDetecting insults in social media conversations
Detecting insults in social media conversationsraj
 
Representing and Reasoning with Modular Ontologies (2007)
Representing and Reasoning with Modular Ontologies (2007)Representing and Reasoning with Modular Ontologies (2007)
Representing and Reasoning with Modular Ontologies (2007)Jie Bao
 
Social Media Text Analytics: Mining Value From Predictive Insights
Social Media Text Analytics: Mining Value From Predictive InsightsSocial Media Text Analytics: Mining Value From Predictive Insights
Social Media Text Analytics: Mining Value From Predictive InsightsJohn Blossom
 
The Creative Animal Goes Online (Part B)
The Creative Animal Goes Online (Part B)The Creative Animal Goes Online (Part B)
The Creative Animal Goes Online (Part B)Mitch Goodwin
 
Data mining on Social Media
Data mining on Social MediaData mining on Social Media
Data mining on Social Mediahome
 
Text mining of Social Network Data for Business Intelligence - iLabs camp
Text mining of Social Network Data for Business Intelligence - iLabs campText mining of Social Network Data for Business Intelligence - iLabs camp
Text mining of Social Network Data for Business Intelligence - iLabs campAnkit Sharma
 
Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering BigML, Inc
 
Opinion mining for social media
Opinion mining for social mediaOpinion mining for social media
Opinion mining for social mediaDiana Maynard
 
Media and Information Literacy (MIL) - 5. Media and Information Sources
Media and Information Literacy (MIL) - 5. Media and Information SourcesMedia and Information Literacy (MIL) - 5. Media and Information Sources
Media and Information Literacy (MIL) - 5. Media and Information SourcesArniel Ping
 
Text Mining in Social Media
Text Mining in Social MediaText Mining in Social Media
Text Mining in Social MediaManas Ranjan Kar
 

Viewers also liked (17)

Data, Text and Web Mining
Data, Text and Web Mining Data, Text and Web Mining
Data, Text and Web Mining
 
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
 
Dialogue-Earth:-Mining-Social-Media
Dialogue-Earth:-Mining-Social-MediaDialogue-Earth:-Mining-Social-Media
Dialogue-Earth:-Mining-Social-Media
 
Detecting insults in social media conversations
Detecting insults in social media conversationsDetecting insults in social media conversations
Detecting insults in social media conversations
 
Representing and Reasoning with Modular Ontologies (2007)
Representing and Reasoning with Modular Ontologies (2007)Representing and Reasoning with Modular Ontologies (2007)
Representing and Reasoning with Modular Ontologies (2007)
 
Social Media Text Analytics: Mining Value From Predictive Insights
Social Media Text Analytics: Mining Value From Predictive InsightsSocial Media Text Analytics: Mining Value From Predictive Insights
Social Media Text Analytics: Mining Value From Predictive Insights
 
The Creative Animal Goes Online (Part B)
The Creative Animal Goes Online (Part B)The Creative Animal Goes Online (Part B)
The Creative Animal Goes Online (Part B)
 
Social Media Mining and Analytics
Social Media Mining and AnalyticsSocial Media Mining and Analytics
Social Media Mining and Analytics
 
Data mining on Social Media
Data mining on Social MediaData mining on Social Media
Data mining on Social Media
 
Social Media Mining and Retrieval
Social Media Mining and RetrievalSocial Media Mining and Retrieval
Social Media Mining and Retrieval
 
Text mining of Social Network Data for Business Intelligence - iLabs camp
Text mining of Social Network Data for Business Intelligence - iLabs campText mining of Social Network Data for Business Intelligence - iLabs camp
Text mining of Social Network Data for Business Intelligence - iLabs camp
 
Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering
 
Opinion mining for social media
Opinion mining for social mediaOpinion mining for social media
Opinion mining for social media
 
Media and Information Literacy (MIL) - 5. Media and Information Sources
Media and Information Literacy (MIL) - 5. Media and Information SourcesMedia and Information Literacy (MIL) - 5. Media and Information Sources
Media and Information Literacy (MIL) - 5. Media and Information Sources
 
Social Data Mining
Social Data MiningSocial Data Mining
Social Data Mining
 
Honey Pot
Honey PotHoney Pot
Honey Pot
 
Text Mining in Social Media
Text Mining in Social MediaText Mining in Social Media
Text Mining in Social Media
 

Similar to Text analytics in social media

Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using Rsantoshi mangalgi
 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsIRJET Journal
 
Prediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social NetworksPrediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social NetworksMohamed El-Geish
 
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter DataApplying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter Dataijbuiiir1
 
Conversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignConversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignCommunitySense
 
A Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataA Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataIOSR Journals
 
Machine_learning_presentation_on_movie_recomendation_system.pptx
Machine_learning_presentation_on_movie_recomendation_system.pptxMachine_learning_presentation_on_movie_recomendation_system.pptx
Machine_learning_presentation_on_movie_recomendation_system.pptxarunchoubeybxr
 
Measuring the Topical Specificity of Online Communities
Measuring the Topical Specificity of Online CommunitiesMeasuring the Topical Specificity of Online Communities
Measuring the Topical Specificity of Online CommunitiesMatthew Rowe
 
Filtering out improper user accounts from twitter user accounts for discoveri...
Filtering out improper user accounts from twitter user accounts for discoveri...Filtering out improper user accounts from twitter user accounts for discoveri...
Filtering out improper user accounts from twitter user accounts for discoveri...siramatu-lab
 
The Revolution Of Cloud Computing
The Revolution Of Cloud ComputingThe Revolution Of Cloud Computing
The Revolution Of Cloud ComputingCarmen Sanborn
 
BEA 2015 Generating Metadata by Machine
BEA 2015 Generating Metadata by MachineBEA 2015 Generating Metadata by Machine
BEA 2015 Generating Metadata by MachineBowker
 
BEA 2015 Generating Metadata by Machine Final
BEA 2015 Generating Metadata by Machine FinalBEA 2015 Generating Metadata by Machine Final
BEA 2015 Generating Metadata by Machine FinalS. M. Hassan Zaidi
 
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence Marina Santini
 
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...Jonathan Ralton
 

Similar to Text analytics in social media (20)

Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using R
 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining Applications
 
Prediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social NetworksPrediction of Reaction towards Textual Posts in Social Networks
Prediction of Reaction towards Textual Posts in Social Networks
 
Text analytics
Text analyticsText analytics
Text analytics
 
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter DataApplying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
 
Conversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignConversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems Design
 
A Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataA Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media Data
 
O017148084
O017148084O017148084
O017148084
 
Text Mining
Text MiningText Mining
Text Mining
 
Machine_learning_presentation_on_movie_recomendation_system.pptx
Machine_learning_presentation_on_movie_recomendation_system.pptxMachine_learning_presentation_on_movie_recomendation_system.pptx
Machine_learning_presentation_on_movie_recomendation_system.pptx
 
Measuring the Topical Specificity of Online Communities
Measuring the Topical Specificity of Online CommunitiesMeasuring the Topical Specificity of Online Communities
Measuring the Topical Specificity of Online Communities
 
Tldr
TldrTldr
Tldr
 
Filtering out improper user accounts from twitter user accounts for discoveri...
Filtering out improper user accounts from twitter user accounts for discoveri...Filtering out improper user accounts from twitter user accounts for discoveri...
Filtering out improper user accounts from twitter user accounts for discoveri...
 
The Revolution Of Cloud Computing
The Revolution Of Cloud ComputingThe Revolution Of Cloud Computing
The Revolution Of Cloud Computing
 
BEA 2015 Generating Metadata by Machine
BEA 2015 Generating Metadata by MachineBEA 2015 Generating Metadata by Machine
BEA 2015 Generating Metadata by Machine
 
Text mining
Text miningText mining
Text mining
 
BEA 2015 Generating Metadata by Machine Final
BEA 2015 Generating Metadata by Machine FinalBEA 2015 Generating Metadata by Machine Final
BEA 2015 Generating Metadata by Machine Final
 
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service...
 

Recently uploaded

VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 

Recently uploaded (20)

VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
9654467111 Call Girls In Munirka Hotel And Home Service

Text analytics in social media

  • 1. Text Analytics In Social Media
  • 2. Content • Introduction to text mining in relation to social media • Unique features of texts in social media • Applying text analytics in social media • Example of text analytics in social media
  • 3. Text Mining and Social Media The picture here shows the top 10 sites by traffic; the majority fall under the social media umbrella. Social media can therefore be described as a medium through which information and communication can be accessed, shared, and discussed.
  • 4. Text Mining and Social Media
      Category             Representative sites
      Wiki                 Wikipedia, Scholarpedia
      Blogging             Blogger, LiveJournal, Wordpress
      Social news          Digg, Briefing, Mixx, Slashdot
      Micro Blogging       Twitter, Google Buzz
      Opinion & Reviews    ePinions, Yelp
      Question Answering   Stack Overflow, Yahoo! Answers, Quora
      Media Sharing        Flickr, Youtube
      Social Bookmarking   Delicious, CiteULike
      Social Networking    Facebook, LinkedIn, MySpace
    The table shows the categories into which social media can be classified. These categories cover different types of services, which results in various data formats. The information on most social media sites is in text format.
  • 5. Text Mining and Social Media • Given the current trend toward data mining techniques and data-driven business intelligence, a question arises for social media: “How can I get valuable information from the texts on social media platforms?”
  • 6. Unique features of texts in social media • Because there are different kinds of social media, their texts have distinct characteristics in form and in how they occur. • Text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation. • This section gives a hint on how to answer the previous question.
  • 7.
  • 8. Unique features of texts in social media • Text preprocessing: making the input more consistent to facilitate text representation. Preprocessing methods include stop word removal and stemming. • Feature generation / text representation: the most common approach is to transform texts into numeric vectors. This representation is called bag of words (BOW) or the vector space model (VSM). • Knowledge discovery: applying machine learning or data mining methods to discover patterns or insights.
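The preprocessing and bag-of-words steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the stop word list is a tiny made-up sample, and `naive_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "in", "on", "of", "and", "to"}

def naive_stem(word):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

def bag_of_words(docs):
    """Return (vocabulary, vectors) with one term-count vector per document."""
    processed = [preprocess(d) for d in docs]
    vocab = sorted({tok for doc in processed for tok in doc})
    vectors = []
    for doc in processed:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

docs = ["Users are posting short messages", "The user posted a message"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['message', 'post', 'short', 'user']
print(vectors)  # [[1, 1, 1, 1], [1, 1, 0, 1]]
```

After preprocessing, the two sentences differ only in the "short" dimension, which is the consistency the preprocessing step is meant to provide.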
  • 9. Unique features of texts in social media • Time Sensitivity. An important and common feature of many social media services is their real-time nature. Bloggers may update their posts every few days, but most social networking sites receive updates within minutes. Because of this sensitivity to time, the text in social media can no longer be treated as independent and identically distributed data.
  • 10. Unique features of texts in social media • Short Length. Short messages enhance user participation on social media sites, but they pose a great challenge for mining with clustering or classification: unlike long texts, they do not provide sufficient context information for an effective similarity measure, which is the basis of many text processing methods. Example: Twitter is limited to 140 characters and Windows Live Messenger to 512 characters, while Facebook allows 63,026 characters.
  • 11. Unique features of texts in social media • Unstructured Phrases. The main challenge posed by content on social media sites is that the distribution of quality has high variance, from very high-quality items to low-quality ones. This can be attributed to people’s attitudes when posting a microblogging message or answering a question in a forum. The difficulty is accurately identifying the semantic meaning of a phrase in which more than one word has been abbreviated.
  • 12. Applying Text analytics in social media • Event detection • Event Detection aims to monitor a data source and detect the occurrence of an event that is captured within that source • Collaborative Question Answering: • Analyzing the differences between conversational questions and informational questions
  • 13. Illustrative Example • This example illustrates how to use text analytics to solve the problems identified in its application to social media. • We want to improve the quality of short text representation by integrating semantic knowledge resources, which have been found useful in dealing with the semantic gap. • This has 3 steps: • Seed Phrase Extraction • Semantic Features Generation • Feature Space Construction
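  • 14. Seed Phrase Extraction • Problem Statement • Given a sentence-level feature $T = \{t_1, t_2, \ldots, t_n\}$, where the $t_i$ are the phrase-level features contained in $T$, the similarity between $t_i$ and the other phrases in $\{t_1, t_2, \ldots, t_n\}$ is given by: $\mathrm{InfoScore}(t_i) = \sum_{j=1,\, j \neq i}^{n} \mathrm{sem}(t_i, t_j)$ and $t^* = \arg\max_{t_i \in \{t_1, t_2, \ldots, t_n\}} \mathrm{InfoScore}(t_i)$, where $t^*$ denotes the selected phrase-level feature.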
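The InfoScore selection above can be sketched as follows. The slide does not define the similarity function sem, so the `sem` below is a hypothetical stand-in (Jaccard overlap of the words in two phrases); any pairwise semantic similarity could be substituted.

```python
def sem(a, b):
    """Hypothetical similarity: Jaccard overlap of the two phrases' word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def info_score(i, phrases):
    """Sum of sem(t_i, t_j) over all j != i."""
    return sum(sem(phrases[i], phrases[j])
               for j in range(len(phrases)) if j != i)

def seed_phrase(phrases):
    """Return the phrase with the highest InfoScore, plus all scores."""
    scores = [info_score(i, phrases) for i in range(len(phrases))]
    best = max(range(len(phrases)), key=scores.__getitem__)
    return phrases[best], scores

phrases = ["text mining", "social media text", "media"]
seed, scores = seed_phrase(phrases)
print(seed)  # social media text
```

Here "social media text" wins because it overlaps with both of the other phrases, while each of those overlaps with only one.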
  • 15. Semantic Features Generation • The seed phrases have now been extracted in the first step. • This step aims to generate semantic features from the seed phrases. The seed phrases help us obtain an informative and effective basic representation of the input text. • We use Wikipedia as our target social media.
  • 16. Algorithm Problem Statement: Given a set of seed phrases from an already preprocessed text corpus, generate the semantic features from the text.
  • 17. Feature Space Construction • To preserve data quality, effectiveness, and valuable original information, we carry out 2 further basic steps in this process: • Feature filtering, to remove meaningless features • Feature selection, to avoid aggravating the “curse of dimensionality”
  • 18. Feature Space Construction • Feature Filtering. For the Wikipedia example, we formulate rules to refine the unstructured features. Some possible rules: • Remove features generated from seed phrases that are too general. • Transform features, e.g. "List of hotels" >>> "hotels". • Remove features related to chronology.
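The three filtering rules on this slide can be sketched as a small rule-based filter. The set of "too general" seed phrases and the chronology pattern below are illustrative assumptions; the slide does not specify them.

```python
import re

# Hypothetical rule data, chosen only for illustration.
GENERAL_SEEDS = {"thing", "people", "time"}
CHRONOLOGY = re.compile(r"\b\d{4}\b|\bcentury\b", re.IGNORECASE)
LIST_PREFIX = re.compile(r"^list of\s+", re.IGNORECASE)

def filter_features(pairs):
    """pairs: (seed_phrase, feature) tuples. Returns the kept feature strings."""
    kept = []
    for seed, feature in pairs:
        if seed.lower() in GENERAL_SEEDS:
            continue  # rule 1: the seed phrase is too general
        if CHRONOLOGY.search(feature):
            continue  # rule 3: the feature is related to chronology
        kept.append(LIST_PREFIX.sub("", feature))  # rule 2: "List of X" -> "X"
    return kept

pairs = [
    ("hotel", "List of hotels"),
    ("time", "Time management"),
    ("hotel", "2008 in tourism"),
    ("hotel", "Hospitality industry"),
]
filtered = filter_features(pairs)
print(filtered)  # ['hotels', 'Hospitality industry']
```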
  • 19. Feature Space Construction • Feature Selection • We need to select semantic features to construct feature space for various tasks. • The number of needed features is determined by specific tasks.
  • 20. Feature Space Construction • First, we calculate the tf-idf weights of all generated features. Term frequency-inverse document frequency (tf-idf) is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus. • One seed phrase may generate k semantic features, denoted by $\{f_{i1}, f_{i2}, \ldots, f_{ik}\}$. • The selection rule here is one seed phrase, one feature: $f_i^* = \arg\max_{f_{ij} \in \{f_{i1}, f_{i2}, \ldots, f_{ik}\}} \mathrm{tf\_idf}(f_{ij})$
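A minimal sketch of the "one seed phrase, one feature" rule follows. The slide does not say exactly how tf-idf is computed over features, so this assumes a simple corpus-level variant: tf is a feature's total count across all texts and idf is log(N / df). The data and function names are hypothetical.

```python
import math
from collections import Counter

def tf_idf_weights(docs):
    """docs: one list of generated features per text. Returns {feature: weight}."""
    n_docs = len(docs)
    tf = Counter(f for doc in docs for f in doc)       # total occurrences
    df = Counter(f for doc in docs for f in set(doc))  # texts containing f
    return {f: tf[f] * math.log(n_docs / df[f]) for f in tf}

def select_per_seed(seed_to_features, weights):
    """For each seed phrase keep only its highest-weighted feature."""
    return {
        seed: max(feats, key=lambda f: weights.get(f, 0.0))
        for seed, feats in seed_to_features.items()
        if feats
    }

docs = [
    ["hotel", "tourism"],
    ["hotel", "hospitality", "hospitality"],
    ["hotel", "tourism", "travel"],
]
weights = tf_idf_weights(docs)
seed_to_features = {
    "hotel": ["hotel", "tourism", "hospitality"],
    "trip": ["travel", "tourism"],
}
selected = select_per_seed(seed_to_features, weights)
print(selected)  # {'hotel': 'hospitality', 'trip': 'travel'}
```

Note that "hotel" appears in every text, so its idf is zero and it is never selected, which is the behaviour tf-idf is meant to give for uninformative features.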
  • 21. Feature Space Construction • Second, the top n features are extracted from the remaining semantic features based on their frequency. • These frequently appearing features, together with the m features from the first step, are used to construct the m+n semantic features.
  • 22. Finally • With all these processes complete and the feature space generated, we can apply text clustering or any other text analytics method. • In conclusion, although research on this subject is still active, this short presentation has shown how text analytics can be applied to social media resources.
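The combination step can be sketched as below: keep the features already selected per seed phrase, then add the n most frequent of the remaining features. The function name and sample data are hypothetical, and ties between equally frequent features are broken by first appearance.

```python
from collections import Counter

def build_feature_space(selected, all_features, n):
    """selected: features chosen in the first step (one per seed phrase).
    all_features: every generated feature occurrence in the corpus.
    Returns the first-step features followed by the top-n remaining ones."""
    chosen = list(dict.fromkeys(selected))  # drop duplicates, keep order
    chosen_set = set(chosen)
    remaining = Counter(f for f in all_features if f not in chosen_set)
    top_n = [f for f, _ in remaining.most_common(n)]
    return chosen + top_n

selected = ["hospitality", "travel"]
all_features = ["hotel", "tourism", "hotel", "hospitality", "hospitality",
                "hotel", "tourism", "travel", "resort"]
space = build_feature_space(selected, all_features, n=2)
print(space)  # ['hospitality', 'travel', 'hotel', 'tourism']
```

If two seed phrases select the same feature, the deduplication means the result holds fewer than m+n entries.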