Top 10 popular sentiment analysis datasets
Sentiment analysis is now applied in many fields, where it helps companies assess and learn
from what their customers or clients say. It is increasingly used for social media monitoring,
brand monitoring, voice of the customer (VoC), customer service, and market research.
Sentiment analysis uses rule-based, machine-learning-based, or hybrid NLP methods and
algorithms that learn from datasets. The data needed for sentiment analysis must be
specialized, and it is needed in large quantities.
The hardest part of training a sentiment analysis model is not finding large amounts of
data but finding relevant datasets. These datasets should cover a wide range of sentiment
analysis use cases.
Below are some of the most popular datasets for sentiment analysis.
Newsdata.io news dataset
Newsdata.io provides news datasets that contain raw news data in CSV, Excel, and JSON
formats. The dataset contains historical news data exactly as it was posted by the news sources,
along with metadata such as the title of each article, its URL, date and time, publisher,
and more. You can request historical news data by filling out the request form on Newsdata.io.
The price of the Newsdata.io Historical News dataset starts at $50 and depends on the
number of historical articles you want and the length of the time period. It is a one-time cost
that covers a single report.
Amazon product data
Amazon Product Data is a subset of a large dataset of 142.8 million Amazon reviews that was
made available by Julian McAuley (UC San Diego).
This sentiment analysis dataset contains reviews from May 1996 through July 2014. It
includes reviews (ratings, text, helpfulness votes) and product metadata (descriptions,
category information, price, brand, and image features).
IMDB Movie Reviews Dataset
This large movie dataset contains a collection of approximately 50,000 IMDB movie reviews.
Only highly polarized opinions are included in this dataset. The numbers of positive and
negative reviews are equal; negative reviews are rated ≤ 4 out of 10 and positive reviews are
rated ≥ 7 out of 10.
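
The rating thresholds above can be sketched as a small labeling function. This is only an illustration of the rule as stated here; the function name `polarity_label` and the sample ratings are assumptions, not part of the dataset's own tooling.

```python
from typing import Optional


def polarity_label(rating: int) -> Optional[str]:
    """Map a 1-10 star rating to a polarity label.

    Ratings of 4 or lower count as negative and 7 or higher as positive.
    The middle band (5-6) is not polarized enough and is excluded.
    """
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be between 1 and 10, got {rating}")
    if rating <= 4:
        return "negative"
    if rating >= 7:
        return "positive"
    return None  # weakly polarized review, left out


# Hypothetical ratings, used only to exercise the rule.
ratings = [1, 4, 5, 6, 7, 10]
labels = [polarity_label(r) for r in ratings]
print(labels)
```

Reviews that map to `None` would simply be filtered out before training.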
Stanford Sentiment Treebank
This dataset contains just over 10,000 sentences extracted from Rotten Tomatoes HTML files.
Sentiment is rated on a scale from 1 to 25, where 1 is the most negative and 25 is the most
positive.
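
A 25-point scale is often too fine for a classifier, so one option is to bucket it into a few classes. The sketch below assumes five equal-width buckets and hypothetical class names; the treebank's own class boundaries may be defined differently.

```python
CLASS_NAMES = ["very negative", "negative", "neutral", "positive", "very positive"]


def bucket_sentiment(score: int, low: int = 1, high: int = 25) -> str:
    """Bucket a score on the low..high scale into equal-width classes."""
    if not low <= score <= high:
        raise ValueError(f"score must be between {low} and {high}, got {score}")
    # With 25 possible values and 5 classes, each bucket spans 5 values.
    width = (high - low + 1) / len(CLASS_NAMES)
    index = int((score - low) // width)
    return CLASS_NAMES[index]


for score in (1, 5, 6, 13, 25):
    print(score, bucket_sentiment(score))
```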
Stanford’s deep learning model represents each sentence according to its structure
instead of scoring it only by counting positive and negative words.
Multi-Domain Sentiment Dataset
This dataset contains positive and negative review files for thousands of Amazon products.
Although the reviews are for older products, the dataset is still useful. The data comes from
the Computer Science Department at Johns Hopkins University.
Reviews carry 1 to 5-star ratings, which can be converted to binary labels as needed.
Download original data:
unprocessed.tar.gz
processed_acl.tar.gz
processed_stars.tar.gz
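
Converting star ratings to binary labels, as mentioned above, can be sketched as follows. The cut-off used here (above 3 stars is positive, below 3 is negative, exactly 3 is dropped as ambiguous) is an assumption for illustration, and the sample reviews are made up.

```python
from typing import List, Optional, Tuple


def star_to_binary(stars: float) -> Optional[int]:
    """Return 1 for positive, 0 for negative, None for ambiguous 3-star reviews."""
    if not 1 <= stars <= 5:
        raise ValueError(f"stars must be between 1 and 5, got {stars}")
    if stars > 3:
        return 1
    if stars < 3:
        return 0
    return None


# Hypothetical (text, stars) pairs standing in for parsed review files.
reviews: List[Tuple[str, float]] = [
    ("Great blender", 5),
    ("Broke in a week", 1),
    ("It is okay", 3),
    ("Works well", 4),
]

# Keep only reviews that receive a clear binary label.
labeled = [
    (text, star_to_binary(stars))
    for text, stars in reviews
    if star_to_binary(stars) is not None
]
print(labeled)
```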
Sentiment140
Sentiment140 is used to discover the sentiment around a brand, product, or topic on the
social media platform Twitter. Rather than using a keyword-based approach, which
may give higher precision but lower recall, Sentiment140 uses classifiers built with
machine learning algorithms.
Sentiment140 shows the classification result for each individual tweet, not only the
aggregated metrics that such tools traditionally display. It is used for brand management,
polling, and planning a purchase.
Paper Reviews Data Set
The Paper Reviews dataset contains reviews, in English and Spanish, of papers submitted to
a computer science conference. It is intended for predicting the opinion expressed in reviews
of academic papers.
Most of the reviews are written in Spanish. The dataset has a total of N = 405 instances rated
on a 5-point scale: -2: very negative, -1: negative, 0: neutral, 1: positive, 2: very positive.
The distribution of the original scores is fairly uniform, and there is a difference between
the way a paper is scored and the text of the review written by the original reviewer.
Twitter US Airline Sentiment
This sentiment analysis dataset contains tweets from February 2015 about each of the major
U.S. airlines. Each tweet is classified as positive, negative, or neutral.
Features include the tweet ID, sentiment, sentiment confidence score, negative reason,
airline name, retweet count, user name, tweet text, tweet coordinates, the date and time
of the tweet, and the location of the tweet.
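
A first look at a dataset like this is usually a count of sentiment classes per airline. The sketch below uses only the standard library and an in-memory sample; the column names (`airline_sentiment`, `airline`, and so on) and the rows are assumptions for illustration and should be checked against the actual file.

```python
import csv
import io
from collections import Counter, defaultdict

# Made-up rows; the column names are assumed, not taken from the real file.
SAMPLE_CSV = """tweet_id,airline_sentiment,airline,retweet_count,text
1,negative,United,0,Flight delayed again
2,positive,Delta,1,Great crew today
3,neutral,United,0,What time does boarding start?
4,negative,United,2,Lost my bag
"""


def sentiment_counts_by_airline(csv_file):
    """Count tweets per sentiment class for each airline."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        counts[row["airline"]][row["airline_sentiment"]] += 1
    return counts


counts = sentiment_counts_by_airline(io.StringIO(SAMPLE_CSV))
for airline, sentiment_counter in counts.items():
    print(airline, dict(sentiment_counter))
```

To run it on a downloaded file, pass an open file handle instead of the `io.StringIO` sample.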
Sentiment Lexicons For 81 Languages
Sentiment Lexicons for 81 Languages covers languages ranging from Afrikaans to Yiddish.
The data include both positive and negative sentiment lexicons for each of the 81
languages.
These lexicons were generated by graph propagation over a knowledge graph, which is a
graph representation of real-world objects and the relationships between them.
The general idea is that words closely linked on a knowledge graph tend to have similar
polarities. The lexicons were built starting from English sentiment lexicons.
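
The idea that linked words share polarity can be illustrated with a toy propagation loop. This is a deliberately simplified sketch, not the procedure used to build the lexicons: the graph, the seed words, the damping factor, and the function name `propagate_polarity` are all assumptions.

```python
from typing import Dict, List


def propagate_polarity(
    graph: Dict[str, List[str]],
    seeds: Dict[str, float],
    iterations: int = 10,
    damping: float = 0.5,
) -> Dict[str, float]:
    """Spread seed polarities (+1 / -1) to neighboring words.

    Each non-seed word takes the damped average of its neighbors' scores,
    so polarity fades with distance from the seed words.
    """
    scores = {word: seeds.get(word, 0.0) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbors in graph.items():
            if word in seeds:
                updated[word] = seeds[word]  # seeds keep their known polarity
            elif neighbors:
                average = sum(scores[n] for n in neighbors) / len(neighbors)
                updated[word] = damping * average
            else:
                updated[word] = 0.0
        scores = updated
    return scores


# Toy undirected graph: every word appears as a key and lists its neighbors.
graph = {
    "good": ["great", "nice"],
    "great": ["good"],
    "nice": ["good"],
    "bad": ["awful"],
    "awful": ["bad", "terrible"],
    "terrible": ["awful"],
}
seeds = {"good": 1.0, "bad": -1.0}

scores = propagate_polarity(graph, seeds)
for word, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{word:10s} {score:+.3f}")
```

Words directly linked to a seed end up with a stronger score than words two steps away, which mirrors the intuition in the paragraph above.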
OpinRank Review Dataset
The OpinRank Review Dataset contains full reviews of cars and hotels. It includes
approximately 259,000 hotel reviews and 42,230 car reviews collected from
TripAdvisor and Edmunds, respectively.
The car dataset covers model years 2007, 2008, and 2009, with approximately 140 to 250
cars for each model year. Fields include dates, favorites, author names, and full-text reviews.
The hotel dataset covers 10 different cities, including Dubai, Beijing, Las Vegas, and
San Francisco, with about 80 to 700 hotels in each city. The fields include
the review date, the review title, and the full review.
Lexicoder Sentiment Dictionary
This sentiment analysis dictionary is designed for use with Lexicoder, which performs content
analysis. It consists of 2,858 negative sentiment words and 1,709 positive sentiment
words.
In addition, 2,860 negations of negative words and 1,721 negations of positive words are
included. The developers suggest that users who want to account for negation subtract the
negated positive words from the positive count and the negated negative words from the
negative count.
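
The subtraction rule above can be sketched with a tiny stand-in dictionary. The word lists, the negator set, and the "negator immediately before the word" rule are assumptions for illustration; the real dictionary defines its own negated entries.

```python
import re
from typing import Dict

# Tiny hypothetical word lists, not entries from the actual dictionary.
POSITIVE = {"good", "happy"}
NEGATIVE = {"bad", "sad"}
NEGATORS = {"not", "never", "no"}


def sentiment_counts(text: str) -> Dict[str, int]:
    """Count positive and negative words, discounting negated ones."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = neg = negated_pos = negated_neg = 0
    for i, token in enumerate(tokens):
        negated = i > 0 and tokens[i - 1] in NEGATORS
        if token in POSITIVE:
            pos += 1
            negated_pos += negated
        elif token in NEGATIVE:
            neg += 1
            negated_neg += negated
    # Subtract negated positives from the positive count and
    # negated negatives from the negative count.
    return {"positive": pos - negated_pos, "negative": neg - negated_neg}


result = sentiment_counts(
    "The service was not good, but the food was never bad and I left happy."
)
print(result)
```

Here "not good" and "never bad" are discounted, so only "happy" contributes to the final counts.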
