A Study on the Spatio-Temporal Trend
of Brand Index using Twitter Messages
Sentiment Analysis
Abstract
[Figure: Twitter Data and Sentiment Analysis, surrounded by the fields Social Science, Human, Art, Medical, Economy]
Introduction
• Twitter Crawling
• Data Pre-processing
• Korean Morphology Analysis
• Twitter Opinion Mining
• Sentiment Dictionary
• Evaluating performance of candidate classifiers
• Sentiment Classification
• Visualize Associative Relationship of Terms
• Relationship with Brand Index
Twitter Crawling
Twitter API
• Streaming API: get 1% of all Twitter data in real time
• REST API (Search API): get Twitter data matching a keyword
Collection period: 2013.9.9 (Mon) 9:35 pm ~ now
About 10,000 ~ 15,000 tweets per day
Total 1,220,000 tweets (as of 2013.11.2, Sat)
Data Pre-Processing
• Keep only tweets that contain more than 3 Korean characters and were posted within a 500 km radius of Seoul, Korea.
• Remove foreign-language text and special characters.
• Remove tweets that contain only location information.
• Remove retweets.
Examples of removed tweets (a foreign-language tweet and a location-only check-in):
‫ويتكلم‬ ‫نهائيا‬ ‫السمع‬ ‫فقد‬ ‫متعب‬ ‫ابو‬ ‫الملك‬ ‫ان‬ ‫خبر‬ ‫اكد‬ ‫المستوى‬ ‫رفيع‬ ‫وامير‬ ‫موثوق‬ ‫صدر‬
‫مفهوم‬ ‫وغير‬ ‫مترابط‬ ‫غير‬ ‫كالم‬((‫تخريف‬::)) Sat Oct 12 00:06:37 KST 2013
I'm at Club ELLUI - @ellui_seoul (서울특별시) w/ 2
others http://t.co/zhcrncosKH::Sat Oct 12 00:02:06 KST 2013
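The filtering rules on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Seoul coordinates, the "RT @" retweet prefix, the "I'm at " check-in prefix, and the function names `haversine_km` and `keep_tweet` are all assumptions made for the example.

```python
import math
import re

# Assumed coordinates for central Seoul (latitude, longitude).
SEOUL_LAT, SEOUL_LON = 37.5665, 126.9780
# Precomposed Hangul syllables occupy U+AC00..U+D7A3.
HANGUL_SYLLABLE = re.compile("[\uac00-\ud7a3]")


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def keep_tweet(text, lat, lon, min_hangul=4, radius_km=500.0):
    """Return True if a tweet passes the filters described on the slide."""
    # Hypothetical heuristic: retweets start with "RT @".
    if text.startswith("RT @"):
        return False
    # Hypothetical heuristic: location-only check-ins start with "I'm at ".
    if text.startswith("I'm at "):
        return False
    # "More than 3 Korean characters" means at least 4 Hangul syllables.
    if len(HANGUL_SYLLABLE.findall(text)) < min_hangul:
        return False
    # Keep only tweets posted within the radius around Seoul.
    return haversine_km(lat, lon, SEOUL_LAT, SEOUL_LON) <= radius_km
```

A tweet is kept only if it passes every check, so the cheap string tests run before the distance computation.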
Korean Morpheme Analyzer
• 꼬꼬마 Korean Morpheme Analyzer
• 한나눔 Korean Morpheme Analyzer
• Komoran Korean Morpheme Analyzer
• Lucene Korean Analyzer
• 은전한닢 Korean Morpheme Analyzer
• Performance of the analyzer
• Foreign language and slang tagging
• Sentiment-related word tagging (slang, verb, emoticon)
• It has a good dictionary.
• No need to worry about word spacing.
• However, it cannot recognize many emoticons, or metaphor, sarcasm, and irony.
Korean Morpheme Analyzer
Example: analyzer output for the sentence "I went to the hospital because my stomach hurt."
> 배가 아파서 병원에 갔다.
배 NN,F,배,*,*,*,*,*
가 JKS,F,가,*,*,*,*,*
아파서 VA+EC,F,아파서,Inflect,VA,EC,아프/VA+ㅏ서/EC,*
병원 NN,T,병원,*,*,*,*,*
에 JKB,F,에,*,*,*,*,*
갔 VV+EP,T,갔,Inflect,VV,EP,가/VV+ㅏㅆ/EP,*
다 EF,F,다,*,*,*,*,*
. SF,*,*,*,*,*,*,*
EOS
Extracted categories: Noun, Verb, Adjective, Adverb, Root
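Output in this format can be turned into (surface, tag) pairs and filtered down to the categories listed above. The sketch below is only illustrative: the mapping of tag prefixes (NN, VV, VA, MA, XR) to noun, verb, adjective, adverb, and root is an assumption based on common Korean tagset conventions, and the names `parse_analyzer_output` and `content_words` are hypothetical.

```python
SAMPLE_OUTPUT = """\
배 NN,F,배,*,*,*,*,*
가 JKS,F,가,*,*,*,*,*
아파서 VA+EC,F,아파서,Inflect,VA,EC,아프/VA+ㅏ서/EC,*
병원 NN,T,병원,*,*,*,*,*
에 JKB,F,에,*,*,*,*,*
갔 VV+EP,T,갔,Inflect,VV,EP,가/VV+ㅏㅆ/EP,*
다 EF,F,다,*,*,*,*,*
. SF,*,*,*,*,*,*,*
EOS
"""

# Assumed tag prefixes: noun, verb, adjective, adverb, root.
CONTENT_TAG_PREFIXES = ("NN", "VV", "VA", "MA", "XR")


def parse_analyzer_output(output):
    """Parse 'surface features' lines into (surface, tag) pairs."""
    tokens = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line == "EOS":
            continue  # skip blank lines and the end-of-sentence marker
        surface, features = line.split(None, 1)
        tag = features.split(",")[0]  # first feature field is the POS tag
        tokens.append((surface, tag))
    return tokens


def content_words(tokens):
    """Keep tokens whose tag contains a noun/verb/adjective/adverb/root part."""
    kept = []
    for surface, tag in tokens:
        parts = tag.split("+")  # compound tags such as 'VA+EC'
        if any(part.startswith(CONTENT_TAG_PREFIXES) for part in parts):
            kept.append((surface, tag))
    return kept


if __name__ == "__main__":
    for surface, tag in content_words(parse_analyzer_output(SAMPLE_OUTPUT)):
        print(surface, tag)
```

Particles, endings, and punctuation (JKS, JKB, EF, SF in the sample) are dropped, which leaves the words most likely to carry sentiment.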
Building Sentiment Dictionary
1. Manually labeled Twitter data
• 6 days of Twitter data (2013.9.9, 9.16, 9.23, 9.30, 10.7, 10.14)
• Positive and negative sets labeled for each of Noun, Adjective, Verb, Root (8 sets in total)
• Labeled by 4 people
2. Naver Movie review data with ratings
• 20,000 reviews from 2 movies
• 545 positive, 545 negative, and 545 neutral reviews
[Figure: review counts by rating (1 to 10) for Movie 1 and Movie 2; legend: positive, negative]
Sentiment Classification
• SVM classifier
1. Training set: 150 positive, 150 negative (Twitter data)
2. Test set: 545 positive, 545 negative (movie review data)
Accuracy = 70.64220183486239% (770/1090) (classification)
Mean squared error = 1.1743119266055047 (regression)
Squared correlation coefficient = 0.18400994471523438 (regression)
• Naïve Bayes classifier
• SO-PMI classifier
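The reported SVM figures can be checked with a few lines of arithmetic. The sketch assumes the two classes are encoded as +1 and -1, which the slides do not state; under that assumption every misclassified review contributes a squared error of 4, and the reported mean squared error follows from the 320 errors. The function name `summarize` is hypothetical.

```python
def summarize(correct, total):
    """Recompute accuracy and mean squared error from the confusion counts."""
    errors = total - correct
    accuracy_pct = 100.0 * correct / total
    # Assumption: labels are +1 / -1, so each error has squared error (2)^2 = 4
    # and each correct prediction has squared error 0.
    mean_squared_error = 4.0 * errors / total
    return errors, accuracy_pct, mean_squared_error


if __name__ == "__main__":
    errors, accuracy_pct, mse = summarize(correct=770, total=1090)
    print("errors:", errors)
    print("accuracy (%):", accuracy_pct)
    print("mean squared error:", mse)
```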
Building Sentiment Dictionary
Unlabeled & labeled data set
Ternary classifiers: Naïve Bayes, SO-PMI, SVM
[Figure: each of the three classifiers (SO-PMI, SVM, Naïve Bayes) splits the data into a positive set, a negative set, and a neutral set]
Sentiment of Brand Index
Brand (keyword): Samsung Galaxy S2
Related nouns (attributes): Battery, LCD, Price, …
Words correlated with each attribute: Adjective, Verb, Noun, Adverb, …
Positive words: good, nice, pretty, lovely, …
Negative words: bad, terrible, …
Determining objectivity: PMI(word, pword) + PMI(word, nword)
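A PMI score between a word and a positive or negative seed word can be estimated from co-occurrence counts. The sketch below uses a toy corpus, additive smoothing, and tweet-level co-occurrence; all three are assumptions for illustration. It shows both the sum used on this slide and the difference PMI(word, pword) - PMI(word, nword), which is the usual SO-PMI orientation score. The function names are hypothetical.

```python
import math


def pmi(word_a, word_b, docs, smoothing=0.01):
    """PMI = log2( p(a & b) / (p(a) * p(b)) ), estimated over a list of token sets."""
    n = len(docs)
    count_a = sum(1 for d in docs if word_a in d)
    count_b = sum(1 for d in docs if word_b in d)
    count_ab = sum(1 for d in docs if word_a in d and word_b in d)
    # Additive smoothing (an assumption) avoids log(0) for unseen pairs.
    p_a = (count_a + smoothing) / n
    p_b = (count_b + smoothing) / n
    p_ab = (count_ab + smoothing) / n
    return math.log2(p_ab / (p_a * p_b))


def so_pmi(word, pword, nword, docs):
    """Semantic orientation: positive association minus negative association."""
    return pmi(word, pword, docs) - pmi(word, nword, docs)


def pmi_sum(word, pword, nword, docs):
    """The sum shown on the slide, used there for determining objectivity."""
    return pmi(word, pword, docs) + pmi(word, nword, docs)


# Toy corpus: each tweet reduced to a set of tokens (illustrative only).
TWEETS = [
    {"battery", "good"},
    {"battery", "good"},
    {"price", "bad"},
    {"price", "bad"},
    {"lcd"},
]

if __name__ == "__main__":
    for attribute in ("battery", "price", "lcd"):
        print(attribute,
              so_pmi(attribute, "good", "bad", TWEETS),
              pmi_sum(attribute, "good", "bad", TWEETS))
```

In this toy corpus "battery" leans positive, "price" leans negative, and "lcd" co-occurs with neither seed word, so its orientation is zero.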
Scenario


Editor's Notes

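  1. The rise and spread of SNS (social network services) has produced personal big data, and data mining on that big data has come to the fore. A new field, twitterology, has emerged: a coinage meaning "the study of Twitter", formed from the social network service Twitter plus the suffix -logy ("study of"). Real-time information from Twitter is used in research in sociology, economics, medicine, linguistics, and other fields.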
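  2. Implemented the Streaming API (real time) and the REST API (420 requests per 15 minutes; requesting every 15 minutes returns 420 items) using the Twitter4J library. Only 1% of all data can be received (Seungwoo's presentation). Collection started on September 9 at 9:35 pm and is still running, averaging 10,000 to 15,000 tweets per day; 1.22 million tweets had been accumulated as of 2013.11.2.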
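  3. Tweets with 3 or fewer Korean characters are not collected (special characters, English, and Japanese are all dropped). Information such as location data and imap is removed. Only data within a 500 km radius of Seoul is collected (otherwise tweets from all over the world come in; this keeps only tweets from Korea).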
  4. 은전한닢 morpheme analyzer, integrated with Java on Linux.
  5. 1. Training set: positive = 150 of the results of a DB search for '좋' ("good"); negative = 150 of the results of a DB search for '싫' ("dislike").
     2. Test set: positive = 545 movie reviews; negative = 545 movie reviews.
     Result when movie reviews that matched nothing in the dictionary are also included:
     optimization finished, #iter = 73
     nu = 0.16326140616206591
     obj = -32.23746306073249, rho = 0.11723225832508417
     nSV = 61, nBSV = 38
     Total nSV = 61
     Accuracy = 70.64220183486239% (770/1090) (classification)
     Mean squared error = 1.1743119266055047 (regression)
     Squared correlation coefficient = 0.18400994471523438 (regression)
  6. PMI(word1, word2) = log2( p(word1 & word2) / (p(word1) p(word2)) ), where p(word1 & word2) is the probability that word1 and word2 co-occur. The ratio measures the degree of statistical dependence between the words. The log of the ratio corresponds to a form of correlation.
  7. Scenario: a company that published a clarification article after malicious press coverage.