Twitter Timeline Sentiment Analysis with Natural Language Processing
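Project goals
・Quantify the sentiment level of posted timelines
・Analyze casual posts such as those on Twitter
・Use NLP to create an evaluation metric for the sentiment level of a post
・Build a sentiment analysis algorithm and improve the accuracy of its scores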
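Project overview
[Pipeline diagram: Twitter API → CSV → preprocessing → morphological analysis → scoring. Scoring methods shown: Google Natural Language, sentiment polarity dictionary (compiled into a dictionary), FastText (extension), review data (extension).]
・The scores from methods ① to ④ are compared with each other to verify their effectiveness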
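Google Natural Language
Twitter API → Preprocessing → GCP → Score
・Search for a keyword with the Twitter API → fetch 7 days of timeline posts
・Save the result as a CSV file
・Google Natural Language API → scoring
・Emoji removal: function
・Link removal: regular expression
・Timestamp conversion
・Data loading
・Score range: $-1 \le s \le 1$
・Each post is scored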
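The preprocessing steps listed above could be sketched as follows. This is a minimal illustration, not the deck's actual code: the emoji ranges are a small, non-exhaustive assumption, the function names are hypothetical, and the timestamp format is assumed to be the classic Twitter `created_at` style.

```python
import re
from datetime import datetime

# Assumption: links are plain http(s) URLs followed by whitespace.
URL_RE = re.compile(r"https?://\S+")
# Assumption: a few common emoji and pictograph ranges; not exhaustive.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")


def clean_text(text: str) -> str:
    """Remove links and emoji, then collapse runs of whitespace."""
    text = URL_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    return " ".join(text.split())


def parse_created_at(value: str) -> datetime:
    """Parse a timestamp such as 'Wed Oct 10 20:19:24 +0000 2018'.

    Assumption: day and month names are in English (C locale).
    """
    return datetime.strptime(value, "%a %b %d %H:%M:%S %z %Y")


cleaned = clean_text("最高 😀 https://t.co/abc123 です")
posted_at = parse_created_at("Wed Oct 10 20:19:24 +0000 2018")
```

Because Japanese text often has no space after a link, the `\S+` pattern may swallow characters that directly follow a URL; a stricter pattern would be needed in that case.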
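Example sentence: うざいと感じる菅総理 ("Prime Minister Suga, whom I find annoying")
Tokenization before extending the dictionary:
・う: 感動詞 (interjection), *, …
・ざいと: 名詞 (noun), 一般 (general), …
・感じる: 動詞 (verb), 自立 (independent), …
・菅: 名詞 (noun), 固有名詞 (proper noun), …
・総理: 名詞 (noun), 一般 (general), …
Dictionary extension: Neologd and userdic.csv
・Proper nouns
・Slang
Tokenization after extending the dictionary:
・うざい: 形容詞 (adjective), 一般 (general), …
・と: 助詞 (particle), 格助詞 (case particle), …
・感じる: 動詞 (verb), 自立 (independent), …
・菅総理: 名詞 (noun), 固有名詞 (proper noun), …
Extension entry: うざい, 形容詞 (adjective), …, ウザイ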
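As a rough sketch of how an entry like the one above might be written to userdic.csv, the snippet below builds a row in the 13-field IPADIC-style layout used by MeCab user dictionaries. The helper names, the cost value, and the `*` placeholders are assumptions for illustration; the context IDs are left blank on the assumption that the dictionary compiler assigns them.

```python
import csv
import io

# Field order of an IPADIC-style MeCab user dictionary row.
FIELDS = [
    "surface", "left_id", "right_id", "cost",
    "pos", "pos1", "pos2", "pos3",
    "conj_type", "conj_form", "base", "reading", "pronunciation",
]


def userdic_row(surface, pos, pos1, reading, cost=5000):
    """Build one dictionary row (hypothetical helper).

    cost=5000 is an arbitrary placeholder; unknown features are "*".
    """
    return [surface, "", "", str(cost), pos, pos1,
            "*", "*", "*", "*", surface, reading, reading]


def to_csv(rows):
    """Serialize rows to CSV text, one entry per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


rows = [
    userdic_row("うざい", "形容詞", "一般", "ウザイ"),
    userdic_row("菅総理", "名詞", "固有名詞", "スガソウリ"),
]
print(to_csv(rows), end="")
```

The CSV still has to be compiled into a binary dictionary and registered with the tokenizer before it affects segmentation.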
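Sentiment polarity dictionary
・The text is matched against the sentiment polarity dictionary
・Particles, adnominals, and auxiliary verbs are excluded
・The score is the mean of the polarity values of the morphemes
Ex)
・彼 (he): NaN
・は (topic particle): NaN
・面白い (interesting): 0.989199
・賢い (smart): 0.999486
Note) Morphemes with "NaN" are not counted when taking the mean
$$\mathrm{score} = \frac{0.989199 + 0.999486}{2} = 0.9943425$$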
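The averaging rule in this example can be expressed as a short function. This is a sketch under assumptions: tokens are given as (surface, part-of-speech) pairs, the lexicon is a plain dict, and the names `polarity_score` and `EXCLUDED_POS` are hypothetical.

```python
import math

# Parts of speech excluded before matching: particle, adnominal, auxiliary verb.
EXCLUDED_POS = {"助詞", "連体詞", "助動詞"}


def polarity_score(tokens, lexicon):
    """Mean polarity of the tokens found in the lexicon.

    tokens: list of (surface, part_of_speech) pairs.
    Tokens missing from the lexicon (NaN) are not counted.
    Returns NaN when no token has a polarity value.
    """
    values = []
    for surface, pos in tokens:
        if pos in EXCLUDED_POS:
            continue
        value = lexicon.get(surface, math.nan)
        if not math.isnan(value):
            values.append(value)
    return sum(values) / len(values) if values else math.nan


lexicon = {"面白い": 0.989199, "賢い": 0.999486}
tokens = [("彼", "名詞"), ("は", "助詞"), ("面白い", "形容詞"), ("賢い", "形容詞")]
score = polarity_score(tokens, lexicon)  # mean of the two known values
```

Dividing by the number of matched morphemes rather than by the sentence length keeps unknown words from pulling the score toward zero.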
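Problems with the sentiment polarity dictionary
・Casual words are not covered
→ Slang and colloquial expressions are not scored
Approaches examined:
・Distributed representations from FastText
・Importance scoring based on product reviews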
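Scoring with FastText
・FastText: a model that learns word vectors so that the similarity between words can be expressed numerically (a Word2Vec-style model)
Wikipedia → text extraction and word segmentation → FastText training → trained model
・Compute the negative/positive degree
Ex) 笑った ("laughed"): 0.83442
Ex) ワロタ (slang for "laughed")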
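The slides do not say how the negative/positive degree is derived from the word vectors. One possible approach, shown here purely as an assumption, is to give a slang word the similarity-weighted mean polarity of its nearest seed words. The 3-dimensional vectors and the polarity of 怒った below are made-up toy values, not output of a trained FastText model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def estimate_polarity(word, vectors, seeds, top_k=2):
    """Similarity-weighted mean polarity of the top_k most similar seeds.

    Seeds with non-positive similarity are ignored.
    Returns NaN when no seed is similar to the word.
    """
    scored = sorted(
        ((cosine(vectors[word], vectors[s]), p)
         for s, p in seeds.items() if s in vectors),
        reverse=True,
    )[:top_k]
    scored = [(sim, p) for sim, p in scored if sim > 0]
    total = sum(sim for sim, _ in scored)
    return sum(sim * p for sim, p in scored) / total if total else math.nan


# Toy vectors (hypothetical); a real model would supply these.
vectors = {
    "笑った": [0.9, 0.1, 0.0],
    "ワロタ": [0.8, 0.2, 0.1],
    "怒った": [-0.7, 0.3, 0.2],
}
# 0.83442 is the value on the slide; -0.6 is a made-up seed polarity.
seeds = {"笑った": 0.83442, "怒った": -0.6}
estimate = estimate_polarity("ワロタ", vectors, seeds)
```

With these toy values only 笑った is positively similar to ワロタ, so the estimate simply inherits its polarity.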
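Importance scoring from review sites
$$s = n_{weight} \times \log(df_t)$$
$df_t$: frequency of the word in the text
$n_{weight}$: rating on the review site ($-5 \le n_{weight} \le 5$)
Score polarity
・Frequency of a word in the text → polarity of the word = basis for the score
・Number of stars in the review (rating) → classification as negative (minus) or positive (plus)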
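The formula above can be tried out with a few lines of code. The sketch assumes a natural logarithm, since the slide does not state the base, and the function name `review_score` is hypothetical.

```python
import math


def review_score(n_weight: float, df_t: int) -> float:
    """Compute s = n_weight * log(df_t).

    n_weight: review rating, expected in [-5, 5].
    df_t: number of occurrences of the word, at least 1.
    Assumption: natural logarithm.
    """
    if not -5 <= n_weight <= 5:
        raise ValueError("n_weight must be between -5 and 5")
    if df_t < 1:
        raise ValueError("df_t must be at least 1")
    return n_weight * math.log(df_t)


positive = review_score(4, 20)   # frequent word in highly rated reviews
negative = review_score(-5, 10)  # frequent word in poorly rated reviews
single = review_score(4, 1)      # log(1) = 0, so the score is 0
```

Note that a word seen only once scores 0 regardless of the rating, which may be one reason the definition of the review score is worth revisiting.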
Project results
・Implementation with Google Natural Language
→ Evaluated with the keyword 『トランプ大統領』 ("President Trump")
Project results
・Scores obtained for 『トランプ大統領』 ("President Trump")
Project evaluation
・Google Natural Language ・Sentiment polarity dictionary
The word 「ウザイ」 ("annoying") is not in the database
A clearly negative post receives a positive score
Project evaluation
・FastText ・Review data
The score averages out to -0.521852
The Wikipedia articles used for training have no entry for 「うざい」 ("annoying")
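Project discussion
・Formal posts can be scored reasonably well with the API.
・Using review data made it possible to evaluate slang and similar expressions appropriately.
(The definition of the review score still needs to be revisited.)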
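Remaining issues
・Scores could be refined further by taking context into account in the analysis.
・Extending the sentiment polarity dictionary with deep learning or a logistic regression model
・Analyzing text such as emoji, which is not prose but still carries sentiment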

Rule-Based Twitter Timeline Sentiment Analysis