SlideShare a Scribd company logo
1 of 28
Download to read offline
Sentiment Analysis
State of the Art in Research and Industry
16.9.2016 – SDS 2016 – Zurich
Mark Cieliebak
Zurich University of Applied Sciences
Mark Cieliebak
+ PhD in Theoretical Computer Science
+ IT Consultant in Major Swiss Bank
+ CIO at Netbreeze (bought by Microsoft)
+ >30 Publications
Lecturer Conference ChairCEO
SwissText
3
Mark Cieliebak, 16.9.2016ZHAW
Sentiment Analysis
Goal: Decide whether a text
expresses positive or
negative emotion.
" This is a nice conference! "
4
Mark Cieliebak, 16.9.2016ZHAW
Insights for Marketing and Sales
Sentiment Analysis can identify trends in Social Media
5
Mark Cieliebak, 16.9.2016ZHAW
Characteristics of Sentiment Analysis
Labels:
• Positive
• Negative
• Neutral
• Mixed
• (unknown)
Tasks:
• Single sentence
• Complete document
• Specific aspect/target
• Quantification
6
Mark Cieliebak, 16.9.2016ZHAW
Sentiment-Analysis sounds easy
…but it isn't
@francesco_con40 2nd worst
QB. DEFINITELY Tony Romo.
The man who likes to share
the ball with everyone.
Including the other team
Tim Tebow may be availible !
Wow Jerry , what the heck you
waiting for !
http://t.co/a7z9FBL4
@prodnose is this one of your
little jokes like Elvis playing at
the Marquee next Tuesday?
#YouCantDateMe if u still sag ur
pants super hard...dat shit is
played the fuck out!!!
7
Mark Cieliebak, 16.9.2016ZHAW
A Remark about Tool Quality
"They all suck…and we suck, too."
CEO of a sentiment analysis company (2013)
8
Mark Cieliebak, 16.9.2016ZHAW
Evaluation of Commercial Sentiment
Analysis Tools in 2013
7 Text Corpora
• Single statements
• Various media types (tweet, news,
reviews, speech transcripts etc.)
• Total: 28'653 texts
9 Commercial APIs
• Stand-alone
• Free for this evaluation
• English text
9
Mark Cieliebak, 16.9.2016ZHAW
Quality of Commercial Tools in 2013
0.0000
0.1000
0.2000
0.3000
0.4000
0.5000
0.6000
0.7000
Best Tool per
Corpus
Overall Best Tool
(Sentigem)
Source: M. Cieliebak et al.: Potential and Limitations of Commercial Sentiment Detection Tools, ESSEM 2013.
F1-Score
10
Mark Cieliebak, 16.9.2016ZHAW
0.0000
0.1000
0.2000
0.3000
0.4000
0.5000
0.6000
0.7000
Worst Tool per
Corpus
Best Tool per
Corpus
Overall Best Tool
(Sentigem)
Source: M. Cieliebak et al.: Potential and Limitations of Commercial Sentiment Detection Tools, ESSEM 2013.
F1-Score
Quality of Commercial Tools in 2013
11
Mark Cieliebak, 16.9.2016ZHAW
SemEval: International Competition for
Sentiment Analysis
Year Winning Team F1-Score Winning Technology Remarks
2013 NRC Canada 69.02 Features + large
dictionaries
First run of the competition
2014 TeamX 72.12 Similar approach as in 2013 First two participants using
deep learning
2015 Webis 64.84 Ensemble of 4 approaches
from previous years
2016 SwissCheese 63.30 CNN+Distant Supervision 30'000 new tweets
Dominance of deep learning
among submissions
Task: Build a system for sentiment analysis (pos, neg,
neutral) on tweets in English
12
Mark Cieliebak, 16.9.2016ZHAW
F1-Score
60
65
70
75
80
2013 2014 2015 2016
Winner of the Respective Year
Did Sentiment Technology Improve?
13
Mark Cieliebak, 16.9.2016ZHAW
Did Sentiment Technology Improve?
F1-Score
60
65
70
75
80
2013 2014 2015 2016
Winner of the Respective Year Winner of 2016
Red line: performance of SemEval winner from 2016 (SwissCheese)
if only trained on training data for each year
14
Mark Cieliebak, 16.9.2016ZHAW
A Shallow Dive into
Technology
15
Mark Cieliebak, 16.9.2016ZHAW
SwissCheese: 3-Phase Training with Distant Supervision
Twitter
cat 0.1 0.9 0.3
cats 0.3 0.2 0.7
cute 0.2 0.3 0.1
…
Word Embeddings
Adapted Word Emb.
:-)
:-(
word2vec
GloVe
Distant
Supervision
2-Layer
CNN
Raw Tweets
(200M)
Smiley
Tweets
(90M) .
Annotated
Tweets
(18k)
Unknown
Tweet
cat 0.1 0.9 0.3
cats 0.3 0.2 0.7
cute 0.2 0.3 0.1
…
Predictive
Model
3-PhaseTrainingApplication
16
Mark Cieliebak, 16.9.2016ZHAW
2-Layer Convolutional Neural Network
for Sentiment Analysis
17
Mark Cieliebak, 16.9.2016ZHAW
Distant Phase rearranges Word Embeddings
Before the Distant Phase After the Distant Phase
18
Mark Cieliebak, 16.9.2016ZHAW
The More Data, The Better!
Number of annotated
tweets
Number of tweets in distant
phase
0.4
0.45
0.5
0.55
0.6
0.65
0.7
0.75
test2013-task-B test2014-task-B
test2015-task-B test2016-task-A
19
Mark Cieliebak, 16.9.2016ZHAW
Learn on Tweets, Classify News?
test
train
SemEval'13_tweets MPQ_reviews DIL_reviews DAI_tweets Union of All
Test Data
SemEval'13_tweets 72.4 45.8 53.1 62.2 63.9
MPQ_reviews 62.2 54.1 40.9 57.8 58.7
DIL_reviews 57.3 36.8 55.1 48.5 52.9
DAI_tweets 67.9 37.7 50.4 70.8 60.4
Union of All
Training Data 73.0 50.8 49.9 76.6 66.6
Measured in F1 score
Cross-Domain Performance of SemEval Winner 2016
20
Mark Cieliebak, 16.9.2016ZHAW
Sentiment for other Languages
Language Available Data Best Know Result
(F1 Score)
Reference
German 10'000 Tweets 64.19 Deriu et al., 2016,
WSDM (submitted)
Spanish 68'000 Tweets 71.1
(precision)
Villena-Roman et
al., 2013,
Procesamiento del
Lenguaje Natural
Italian 7'000 Tweets 65.87 Deriu et al., 2016,
WSDM (submitted)
Dutch 1'100 Tweets
(labeled pos/neg)
88.33 Deriu et al., 2016,
WSDM (submitted)
Arabic 1'100 Tweets 73.5 Salab et al., 2015,
ANLP
21
Mark Cieliebak, 16.9.2016ZHAW
We did it: Theory is over!
22
Mark Cieliebak, 16.9.2016ZHAW
Do It Yourself:
Sentiment Analysis Tools and APIs
Big Players
• Google Prediction API
• IBM AlchemyAPI
• Microsoft Azure Text Analytics API
NLP Specialists
• RapidMiner
• Repustate
• Semantria
• SentiStrength
• SpinningBytes
Development Toolkits
• Natural Language ToolKit NLTK (Python)
• StanfordNLP (Java)
23
Mark Cieliebak, 16.9.2016ZHAW
Understand Customer Reviews
Source: http://blog.aylien.com/aspect-based-sentiment-analysis-now-available-in/
Example: Aspect-based Sentiment Analysis for Hotel Reviews
24
Mark Cieliebak, 16.9.2016ZHAW
Use Twitter to predict Heart Disease
Mortality
Source: Eichstaedt et al., 2015: Psychological Language on Twitter Predicts County-Level Heart
Disease Mortality
25
Mark Cieliebak, 16.9.2016ZHAW
"Cleantechness" of Company Products
Air and Environment
Disaster Prevention
Energy Production
Energy Transportation
Energy Efficiency
Mobility
Company Website
Cleantech Topics
Automatic Classifier
26
Mark Cieliebak, 16.9.2016ZHAW
Age and Gender of "Anonymous" Users
Goal: Predict age (18-24, 25-34, 35-49, 50+)
and gender (male/female) of Twitter users
Results PAN 2015:
Age: 86%
Gender: 84%
Source: Rangel et al., 2015: Overview of the 3rd Author Profiling Task at PAN 2015
27
Mark Cieliebak, 16.9.2016ZHAW
Talk in Short!
Sentiment Analysis
 approx. 70% F1-score
 the more data – the better
 has important application
28
Mark Cieliebak, 16.9.2016ZHAW
Thanks!!
Mark Cieliebak
Zurich University of Applied Sciences (ZHAW)
Email: ciel@zhaw.ch, Website: www.zhaw.ch/~ciel
This presentation is based on
joint work with:
• Aurelien Lucchi, ETH
• Dominic Egger, ZHAW
• Fatih Uzdilli, ZHAW
• Jan Deriu, ZHAW
• Leon Derczynski, Univ. of
Sheffield
• Martin Jaggi, EPFL
• Maurice Gonzenbach, ZHAW
• Valeria de Luca, ETH

More Related Content

Similar to Sentiment Analysis - State of the Art in Research and Industry - SDS 2016

ICO Media - Pan European PR agency
ICO Media - Pan European PR agencyICO Media - Pan European PR agency
ICO Media - Pan European PR agencyICO Partners
 
ios1-RecordOfAchievement
ios1-RecordOfAchievementios1-RecordOfAchievement
ios1-RecordOfAchievementNagaraju Annam
 
Romero: What can Visualization do for You?
Romero: What can Visualization do for You?Romero: What can Visualization do for You?
Romero: What can Visualization do for You?Mario Romero, Ph.D.
 
In that case, we have an OWASP Top 10 opportunity...
In that case, we have an OWASP Top 10 opportunity...In that case, we have an OWASP Top 10 opportunity...
In that case, we have an OWASP Top 10 opportunity...Josh Grossman
 
Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”
Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”
Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”Lviv Startup Club
 
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...Fouad Saeidi
 
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Pres...
App Store Optimization Nerdy Little Secrets - App Growth Summit  Seattle Pres...App Store Optimization Nerdy Little Secrets - App Growth Summit  Seattle Pres...
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Pres...Fouad Saeidi
 
Marketing Tactics for Next-Generation HR Teams
Marketing Tactics for Next-Generation HR TeamsMarketing Tactics for Next-Generation HR Teams
Marketing Tactics for Next-Generation HR TeamsHuman Capital Media
 
Cédric Thomas, OW2 CEO presentation at Net Futures 2016
Cédric Thomas, OW2 CEO presentation at Net Futures 2016Cédric Thomas, OW2 CEO presentation at Net Futures 2016
Cédric Thomas, OW2 CEO presentation at Net Futures 2016OW2
 
sap fiori for ios certificate
sap fiori for ios certificatesap fiori for ios certificate
sap fiori for ios certificatekrishna ravi
 
BAQMaR - Conference Evening
BAQMaR - Conference EveningBAQMaR - Conference Evening
BAQMaR - Conference EveningBAQMaR
 
Market Research Meets Big Data Analytics for Business Transformation
Market Research Meets Big Data Analytics  for Business Transformation Market Research Meets Big Data Analytics  for Business Transformation
Market Research Meets Big Data Analytics for Business Transformation Sally Sadosky
 
dfnd1-2-RecordOfAchievement
dfnd1-2-RecordOfAchievementdfnd1-2-RecordOfAchievement
dfnd1-2-RecordOfAchievementFrederico Rocha
 
ios1-RecordOfAchievement
ios1-RecordOfAchievementios1-RecordOfAchievement
ios1-RecordOfAchievementOvers Mudaka
 
Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...
Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...
Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...Design for Context
 
2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...
2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...
2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...Susanna Frazier
 
CETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project Successful
CETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project SuccessfulCETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project Successful
CETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project SuccessfulChicago eLearning & Technology Showcase
 
Kevin Gray Festival of NewMR 2016
Kevin Gray Festival of NewMR 2016Kevin Gray Festival of NewMR 2016
Kevin Gray Festival of NewMR 2016Ray Poynter
 

Similar to Sentiment Analysis - State of the Art in Research and Industry - SDS 2016 (20)

Knowledge-Driven NGS Solutions
Knowledge-Driven NGS SolutionsKnowledge-Driven NGS Solutions
Knowledge-Driven NGS Solutions
 
ICO Media - Pan European PR agency
ICO Media - Pan European PR agencyICO Media - Pan European PR agency
ICO Media - Pan European PR agency
 
ios1-RecordOfAchievement
ios1-RecordOfAchievementios1-RecordOfAchievement
ios1-RecordOfAchievement
 
Romero: What can Visualization do for You?
Romero: What can Visualization do for You?Romero: What can Visualization do for You?
Romero: What can Visualization do for You?
 
In that case, we have an OWASP Top 10 opportunity...
In that case, we have an OWASP Top 10 opportunity...In that case, we have an OWASP Top 10 opportunity...
In that case, we have an OWASP Top 10 opportunity...
 
Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”
Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”
Андрій Безверхий “Практика стартапу з кібербезпеки: bootstrap & go global!”
 
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Prese...
 
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Pres...
App Store Optimization Nerdy Little Secrets - App Growth Summit  Seattle Pres...App Store Optimization Nerdy Little Secrets - App Growth Summit  Seattle Pres...
App Store Optimization Nerdy Little Secrets - App Growth Summit Seattle Pres...
 
Marketing Tactics for Next-Generation HR Teams
Marketing Tactics for Next-Generation HR TeamsMarketing Tactics for Next-Generation HR Teams
Marketing Tactics for Next-Generation HR Teams
 
Cédric Thomas, OW2 CEO presentation at Net Futures 2016
Cédric Thomas, OW2 CEO presentation at Net Futures 2016Cédric Thomas, OW2 CEO presentation at Net Futures 2016
Cédric Thomas, OW2 CEO presentation at Net Futures 2016
 
sap fiori for ios certificate
sap fiori for ios certificatesap fiori for ios certificate
sap fiori for ios certificate
 
BAQMaR - Conference Evening
BAQMaR - Conference EveningBAQMaR - Conference Evening
BAQMaR - Conference Evening
 
Market Research Meets Big Data Analytics for Business Transformation
Market Research Meets Big Data Analytics  for Business Transformation Market Research Meets Big Data Analytics  for Business Transformation
Market Research Meets Big Data Analytics for Business Transformation
 
dfnd1-2-RecordOfAchievement
dfnd1-2-RecordOfAchievementdfnd1-2-RecordOfAchievement
dfnd1-2-RecordOfAchievement
 
ios1-RecordOfAchievement
ios1-RecordOfAchievementios1-RecordOfAchievement
ios1-RecordOfAchievement
 
Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...
Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...
Aligning Your Organization's Strategic Direction, Roadmaps, and Technology, A...
 
2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...
2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...
2017 Spring SourceCon: Key Presentation Slides - Compiled by Susanna Conway @...
 
CETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project Successful
CETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project SuccessfulCETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project Successful
CETS 2011, Mark Steiner, Top 10 Ways to Make Your eLearning Project Successful
 
From open source labs to ceo methods and advice by sysfera
From open source labs to ceo methods and advice by sysferaFrom open source labs to ceo methods and advice by sysfera
From open source labs to ceo methods and advice by sysfera
 
Kevin Gray Festival of NewMR 2016
Kevin Gray Festival of NewMR 2016Kevin Gray Festival of NewMR 2016
Kevin Gray Festival of NewMR 2016
 

More from Mark Cieliebak

Natural Language Processing - A brief survey of technologies and applications
Natural Language Processing - A brief survey of technologies and applicationsNatural Language Processing - A brief survey of technologies and applications
Natural Language Processing - A brief survey of technologies and applicationsMark Cieliebak
 
Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...
Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...
Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...Mark Cieliebak
 
Chatbots and Natural Language Generation - A Bird Eyes View
Chatbots and Natural Language Generation - A Bird Eyes ViewChatbots and Natural Language Generation - A Bird Eyes View
Chatbots and Natural Language Generation - A Bird Eyes ViewMark Cieliebak
 
Machine Intelligence - Wie Systeme lernen und unseren Alltag verändern
Machine Intelligence - Wie Systeme lernen und unseren Alltag verändernMachine Intelligence - Wie Systeme lernen und unseren Alltag verändern
Machine Intelligence - Wie Systeme lernen und unseren Alltag verändernMark Cieliebak
 
Can Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis ProblemCan Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis ProblemMark Cieliebak
 
#like or #fail - How Can Computers Tell the Difference?
#like or #fail - How Can Computers Tell the Difference? #like or #fail - How Can Computers Tell the Difference?
#like or #fail - How Can Computers Tell the Difference? Mark Cieliebak
 

More from Mark Cieliebak (6)

Natural Language Processing - A brief survey of technologies and applications
Natural Language Processing - A brief survey of technologies and applicationsNatural Language Processing - A brief survey of technologies and applications
Natural Language Processing - A brief survey of technologies and applications
 
Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...
Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...
Chatbots: Technology and Applications - Mark Cieliebak - Swiss ICT Symposium ...
 
Chatbots and Natural Language Generation - A Bird Eyes View
Chatbots and Natural Language Generation - A Bird Eyes ViewChatbots and Natural Language Generation - A Bird Eyes View
Chatbots and Natural Language Generation - A Bird Eyes View
 
Machine Intelligence - Wie Systeme lernen und unseren Alltag verändern
Machine Intelligence - Wie Systeme lernen und unseren Alltag verändernMachine Intelligence - Wie Systeme lernen und unseren Alltag verändern
Machine Intelligence - Wie Systeme lernen und unseren Alltag verändern
 
Can Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis ProblemCan Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis Problem
 
#like or #fail - How Can Computers Tell the Difference?
#like or #fail - How Can Computers Tell the Difference? #like or #fail - How Can Computers Tell the Difference?
#like or #fail - How Can Computers Tell the Difference?
 

Recently uploaded

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookmanojkuma9823
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 

Recently uploaded (20)

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 

Sentiment Analysis - State of the Art in Research and Industry - SDS 2016

  • 1. Sentiment Analysis State of the Art in Research and Industry 16.9.2016 – SDS 2016 – Zurich Mark Cieliebak Zurich University of Applied Sciences
  • 2. Mark Cieliebak + PhD in Theoretical Computer Science + IT Consultant in Major Swiss Bank + CIO at Netbreeze (bought by Microsoft) + >30 Publications Lecturer Conference ChairCEO SwissText
  • 3. 3 Mark Cieliebak, 16.9.2016ZHAW Sentiment Analysis Goal: Decide whether a text expresses positive or negative emotion. " This is a nice conference! "
  • 4. 4 Mark Cieliebak, 16.9.2016ZHAW Insights for Marketing and Sales Sentiment Analysis can identify trends in Social Media
  • 5. 5 Mark Cieliebak, 16.9.2016ZHAW Characteristics of Sentiment Analysis Labels: • Positive • Negative • Neutral • Mixed • (unknown) Tasks: • Single sentence • Complete document • Specific aspect/target • Quantification
  • 6. 6 Mark Cieliebak, 16.9.2016ZHAW Sentiment-Analysis sounds easy …but it isn't @francesco_con40 2nd worst QB. DEFINITELY Tony Romo. The man who likes to share the ball with everyone. Including the other team Tim Tebow may be availible ! Wow Jerry , what the heck you waiting for ! http://t.co/a7z9FBL4 @prodnose is this one of your little jokes like Elvis playing at the Marquee next Tuesday? #YouCantDateMe if u still sag ur pants super hard...dat shit is played the fuck out!!!
  • 7. 7 Mark Cieliebak, 16.9.2016ZHAW A Remark about Tool Quality "They all suck…and we suck, too." CEO of a sentiment analysis company (2013)
  • 8. 8 Mark Cieliebak, 16.9.2016ZHAW Evaluation of Commercial Sentiment Analysis Tools in 2013 7 Text Corpora • Single statements • Various media types (tweet, news, reviews, speech transcripts etc.) • Total: 28'653 texts 9 Commercial APIs • Stand-alone • Free for this evaluation • English text
  • 9. 9 Mark Cieliebak, 16.9.2016ZHAW Quality of Commercial Tools in 2013 0.0000 0.1000 0.2000 0.3000 0.4000 0.5000 0.6000 0.7000 Best Tool per Corpus Overall Best Tool (Sentigem) Source: M. Cieliebak et al.: Potential and Limitations of Commercial Sentiment Detection Tools, ESSEM 2013. F1-Score
  • 10. 10 Mark Cieliebak, 16.9.2016ZHAW 0.0000 0.1000 0.2000 0.3000 0.4000 0.5000 0.6000 0.7000 Worst Tool per Corpus Best Tool per Corpus Overall Best Tool (Sentigem) Source: M. Cieliebak et al.: Potential and Limitations of Commercial Sentiment Detection Tools, ESSEM 2013. F1-Score Quality of Commercial Tools in 2013
  • 11. 11 Mark Cieliebak, 16.9.2016ZHAW SemEval: International Competition for Sentiment Analysis Year Winning Team F1-Score Winning Technology Remarks 2013 NRC Canada 69.02 Features + large dictionaries First run of the competition 2014 TeamX 72.12 Similar approach as in 2013 First two participants using deep learning 2015 Webis 64.84 Ensemble of 4 approaches from previous years 2016 SwissCheese 63.30 CNN+Distant Supervision 30'000 new tweets Dominance of deep learning among submissions Task: Build a system for sentiment analysis (pos, neg, neutral) on tweets in English
  • 12. 12 Mark Cieliebak, 16.9.2016ZHAW F1-Score 60 65 70 75 80 2013 2014 2015 2016 Winner of the Respective Year Did Sentiment Technology Improve?
  • 13. 13 Mark Cieliebak, 16.9.2016ZHAW Did Sentiment Technology Improve? F1-Score 60 65 70 75 80 2013 2014 2015 2016 Winner of the Respective Year Winner of 2016 Red line: performance of SemEval winner from 2016 (SwissCheese) if only trained on training data for each year
  • 14. 14 Mark Cieliebak, 16.9.2016ZHAW A Shallow Dive into Technology
  • 15. 15 Mark Cieliebak, 16.9.2016ZHAW SwissCheese: 3-Phase Training with Distant Supervision Twitter cat 0.1 0.9 0.3 cats 0.3 0.2 0.7 cute 0.2 0.3 0.1 … Word Embeddings Adapted Word Emb. :-) :-( word2vec GloVe Distant Supervision 2-Layer CNN Raw Tweets (200M) Smiley Tweets (90M) . Annotated Tweets (18k) Unknown Tweet cat 0.1 0.9 0.3 cats 0.3 0.2 0.7 cute 0.2 0.3 0.1 … Predictive Model 3-PhaseTrainingApplication
  • 16. 16 Mark Cieliebak, 16.9.2016ZHAW 2-Layer Convolutional Neural Network for Sentiment Analysis
  • 17. 17 Mark Cieliebak, 16.9.2016ZHAW Distant Phase rearranges Word Embeddings Before the Distant Phase After the Distant Phase
  • 18. 18 Mark Cieliebak, 16.9.2016ZHAW The More Data, The Better! Number of annotated tweets Number of tweets in distant phase 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 test2013-task-B test2014-task-B test2015-task-B test2016-task-A
  • 19. 19 Mark Cieliebak, 16.9.2016ZHAW Learn on Tweets, Classify News? test train SemEval'13_tweets MPQ_reviews DIL_reviews DAI_tweets Union of All Test Data SemEval'13_tweets 72.4 45.8 53.1 62.2 63.9 MPQ_reviews 62.2 54.1 40.9 57.8 58.7 DIL_reviews 57.3 36.8 55.1 48.5 52.9 DAI_tweets 67.9 37.7 50.4 70.8 60.4 Union of All Training Data 73.0 50.8 49.9 76.6 66.6 Measured in F1 score Cross-Domain Performance of SemEval Winner 2016
  • 20. 20 Mark Cieliebak, 16.9.2016ZHAW Sentiment for other Languages Language Available Data Best Know Result (F1 Score) Reference German 10'000 Tweets 64.19 Deriu et al., 2016, WSDM (submitted) Spanish 68'000 Tweets 71.1 (precision) Villena-Roman et al., 2013, Procesamiento del Lenguaje Natural Italian 7'000 Tweets 65.87 Deriu et al., 2016, WSDM (submitted) Dutch 1'100 Tweets (labeled pos/neg) 88.33 Deriu et al., 2016, WSDM (submitted) Arabic 1'100 Tweets 73.5 Salab et al., 2015, ANLP
  • 21. 21 Mark Cieliebak, 16.9.2016ZHAW We did it: Theory is over!
  • 22. 22 Mark Cieliebak, 16.9.2016ZHAW Do It Yourself: Sentiment Analysis Tools and APIs Big Players • Google Prediction API • IBM AlchemyAPI • Microsoft Azure Text Analytics API NLP Specialists • RapidMiner • Repustate • Semantria • SentiStrength • SpinningBytes Development Toolkits • Natural Language ToolKit NLTK (Python) • StanfordNLP (Java)
  • 23. 23 Mark Cieliebak, 16.9.2016ZHAW Understand Customer Reviews Source: http://blog.aylien.com/aspect-based-sentiment-analysis-now-available-in/ Example: Aspect-based Sentiment Analysis for Hotel Reviews
  • 24. 24 Mark Cieliebak, 16.9.2016ZHAW Use Twitter to predict Heart Disease Mortality Source: Eichstaedt et al., 2015: Psychological Language on Twitter Predicts County-Level Heart Disease Mortality
  • 25. 25 Mark Cieliebak, 16.9.2016ZHAW "Cleantechness" of Company Products Air and Environment Disaster Prevention Energy Production Energy Transportation Energy Efficiency Mobility Company Website Cleantech Topics Automatic Classifier
  • 26. 26 Mark Cieliebak, 16.9.2016ZHAW Age and Gender of "Anonymous" Users Goal: Predict age (18-24, 25-34, 35-49, 50+) and gender (male/female) of Twitter users Results PAN 2015: Age: 86% Gender: 84% Source: Rangel et al., 2015: Overview of the 3rd Author Profiling Task at PAN 2015
  • 27. 27 Mark Cieliebak, 16.9.2016ZHAW Talk in Short! Sentiment Analysis  approx. 70% F1-score  the more data – the better  has important application
  • 28. 28 Mark Cieliebak, 16.9.2016ZHAW Thanks!! Mark Cieliebak Zurich University of Applied Sciences (ZHAW) Email: ciel@zhaw.ch, Website: www.zhaw.ch/~ciel This presentation is based on joint work with: • Aurelien Lucchi, ETH • Dominic Egger, ZHAW • Fatih Uzdilli, ZHAW • Jan Deriu, ZHAW • Leon Derczynski, Univ. of Sheffield • Martin Jaggi, EPFL • Maurice Gonzenbach, ZHAW • Valeria de Luca, ETH