Zurich University
of Applied Sciences

Potential and Limitations of
Commercial Sentiment Detection Tools

Fatih Uzdilli
joint work with Mark Cieliebak and Oliver Dürr

03.12.2013 @ ESSEM’13

About Me

Fatih Uzdilli
Institute of Applied Information Technology (InIT)
ZHAW, Winterthur, Switzerland
More about me: home.zhaw.ch/~uzdi

Research Interest
Information Retrieval, Machine Learning, Sentiment Analysis
Background
Software Engineer, Social Media Monitoring, Search Technologies

Abstract

Evaluation of 9 commercial sentiment tools on approx. 30'000 short texts.
The best commercial tools reach an accuracy of only about 60%.

Combining all tools using Random Forest improved the
accuracy.

Motivation

• Scientific results for sentiment detection:
«very good performance: > 80% accuracy»

• Blog posts about commercial tools:
«very poor quality, unusable»

How good is commercial Sentiment Detection?
source: http://www.commute.com/images/schools_evaluation.jpg

Is there potential for improvement?
source: http://3.bp.blogspot.com/-u3acK_WjaLU/ULYv51mHEhI/AAAAAAAAARY/DIZqOfxuswc/s1600/IcebergQ1.jpg

Evaluation Setup

• 9 Commercial APIs
– Stand-alone
– Free for this evaluation
– Arbitrary Text

• 7 Public Text Corpora
– Single Statements
– Different Media Types
• Tweet, News, Review, Speech Transcript
– Total: 28653 Texts

Sentiment classes: POSITIVE / NEGATIVE / OTHER (neutral / mixed)

Tool Accuracy

Chart: accuracy (y-axis, 0.2 to 0.8) for the following series, with the average (Avg.) of each:

Best Tool per Corpus: 61%
Average of All Tools: 52%
Worst Tool per Corpus: 40%
Overall Best Tool: 59%
Overall Worst Tool: 45%
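As a rough illustration of how such accuracy figures can be obtained: accuracy is the fraction of texts for which a tool's label equals the corpus annotation. The sketch below is not the authors' evaluation code; the tool names, corpus names and labels are hypothetical toy data.

```python
def accuracy(predicted, annotated):
    """Fraction of texts whose predicted label equals the corpus annotation."""
    if not annotated or len(predicted) != len(annotated):
        raise ValueError("need two non-empty label lists of equal length")
    hits = sum(p == a for p, a in zip(predicted, annotated))
    return hits / len(annotated)


# Hypothetical toy data: corpus annotations and each tool's labels.
# "+" = positive, "-" = negative, "o" = other (neutral / mixed).
annotations = {
    "tweets": ["+", "-", "o", "+"],
    "news": ["o", "o", "-", "+"],
}
tool_labels = {
    "api1": {"tweets": ["+", "-", "+", "+"], "news": ["o", "-", "-", "+"]},
    "api2": {"tweets": ["o", "-", "o", "-"], "news": ["+", "+", "-", "o"]},
}

# Accuracy of every tool on every corpus.
scores = {
    corpus: {tool: accuracy(labels[corpus], gold) for tool, labels in tool_labels.items()}
    for corpus, gold in annotations.items()
}

# Best, average and worst tool per corpus, as in the chart legend.
for corpus, by_tool in scores.items():
    best = max(by_tool.values())
    worst = min(by_tool.values())
    average = sum(by_tool.values()) / len(by_tool)
    print(f"{corpus}: best={best:.2f} average={average:.2f} worst={worst:.2f}")
```

Averaging the per-corpus best, average and worst values over all corpora would then give one summary number per series.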

Further Findings

• Longer texts are hard to classify

• Corpus annotations might be erroneous

Can a Meta-Classifier do better?
• 1st Approach: Majority Classifier
– Sentiment with most votes chosen
Illustration: table with one row per text (Text 1 ... Text n) and one column per tool (api1 ... api7). Each cell holds that tool's vote (+, o or -), and a final "Majority" column holds the most frequent vote in the row.
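The majority vote can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the vote rows are hypothetical, and falling back to "o" on a tie is an assumption, since the slides do not say how ties are resolved.

```python
from collections import Counter


def majority_vote(votes, tie_label="o"):
    """Return the sentiment with the most votes.

    tie_label is returned when several sentiments share the top count;
    this tie rule is an assumption, not taken from the slides.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    top_count = ranked[0][1]
    winners = [label for label, count in ranked if count == top_count]
    return winners[0] if len(winners) == 1 else tie_label


# Hypothetical votes of seven tools (api1 ... api7) for three texts.
rows = {
    "Text 1": ["+", "+", "o", "+", "o", "+", "-"],
    "Text 2": ["o", "o", "-", "o", "+", "o", "o"],
    "Text 3": ["+", "+", "+", "-", "-", "-", "o"],  # tie between + and -
}

majority = {text: majority_vote(votes) for text, votes in rows.items()}
for text, label in majority.items():
    print(text, label)
```

Unlike a trained meta-classifier, this needs no annotated data, but it also weighs every tool equally regardless of how reliable it is.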

Majority Classifier beats Average

Chart: accuracy (y-axis, 0.2 to 0.8) for Best Tool per Corpus, Average of All Tools, Worst Tool per Corpus and Majority Classifier.

2nd Approach: Random-Forest
Illustration: table with one row per text (Text 1 ... Text 9) and the columns api1, api2, api3, … api n and annotation. Each api cell holds that tool's vote (+, o or -). Rows with a known annotation (Text 1 to Text 6) are used to train the Random Forest Classifier; for rows whose annotation is unknown (Text 7 to Text 9), the classifier predicts the sentiment from the tools' votes.
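The train/predict split from the illustration can be sketched with scikit-learn's `RandomForestClassifier`. This is a sketch under assumptions: the slides do not state which library, vote encoding or forest settings the authors used, and all votes and annotations below are hypothetical toy data.

```python
from sklearn.ensemble import RandomForestClassifier

# Assumed encoding: each tool's vote becomes one numeric feature.
VOTE_TO_INT = {"-": -1, "o": 0, "+": 1}


def encode(rows):
    """Turn rows of '+', 'o', '-' votes into numeric feature vectors."""
    return [[VOTE_TO_INT[vote] for vote in row] for row in rows]


# Hypothetical training rows: votes of three tools plus the corpus annotation.
train_votes = [
    ["+", "+", "o"],
    ["+", "o", "+"],
    ["o", "-", "+"],
    ["+", "o", "+"],
    ["-", "-", "o"],
    ["o", "o", "o"],
]
train_annotations = ["+", "+", "o", "+", "-", "o"]

# Hypothetical rows whose annotation is unknown.
unknown_votes = [
    ["+", "+", "+"],
    ["-", "o", "-"],
    ["o", "o", "+"],
]

# n_estimators and random_state are illustrative choices, not from the slides.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(encode(train_votes), train_annotations)

predicted = forest.predict(encode(unknown_votes))
for votes, label in zip(unknown_votes, predicted):
    print(votes, "->", label)
```

Because the forest learns from annotated examples, it can weigh reliable tools more heavily than a plain majority vote does. To measure its accuracy fairly, the prediction rows would have to be texts that were not used for training.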

Random Forest Beats Best Single Tool

Chart: accuracy (y-axis, 0.2 to 0.8) for Best Tool per Corpus, Average of All Tools, Worst Tool per Corpus, Majority Classifier and Random Forest Classifier.

Summary

• Best Tool: 59% Accuracy
• Random Forest combination: Up to 9% improvement

