An Approach For Sentiment Analysis On
Big Social Data Using Spark
Presented by
Name of Student
(Roll No)
Under the Guidance of
Dr. Chiranjeevi Manike, Associate Professor
Department of Computer Science & Engineering
B V Raju Institute of Technology, Narsapur
Abstract
Introduction
Literature Review
Problem Statement
Proposed System
References
Abstract
• Collecting public opinion by analyzing big social data has
attracted a great deal of attention because of its interactive,
real-time nature.
• To this end, recent studies have relied on both Social Media
and Sentiment Analysis to follow big events by tracking
people’s behavior.
• The proposed system provides an adaptable approach to
Sentiment Analysis that analyzes social media posts and
extracts users’ opinions in real time.
• The approach consists of two steps.
• The first step is to build a dynamic dictionary of word
polarity based on a chosen set of hashtags related to a given
subject.
• The second step is to classify the posts under multiple
subjects by introducing new features that further refine the
polarity level of a post.
• Conversations on Twitter, Facebook and other social media can
be mined for sentiment data to learn about the competition.
• Social media blogs help reveal what the public is currently
discussing.
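The two steps can be illustrated with a minimal Python sketch. It assumes each seed hashtag has been hand-labelled as positive (+1) or negative (-1). A word's polarity is taken as the mean label of the seed-hashtag posts it appears in, and a post's polarity as the mean over its known words. The hashtags, the sample posts and the names `build_polarity_dictionary` and `classify_post` are hypothetical. The slides do not give the actual scoring formula or the additional features used in the second step, so this is only one plausible reading.

```python
import re
from collections import defaultdict

# Hypothetical seed hashtags, hand-labelled with a polarity (+1 / -1).
SEED_HASHTAGS = {"#love": 1, "#happy": 1, "#fail": -1, "#angry": -1}

TOKEN_RE = re.compile(r"#?\w+")


def tokenize(text):
    """Lower-case a post and split it into words and hashtags."""
    return TOKEN_RE.findall(text.lower())


def build_polarity_dictionary(posts):
    """Step 1: average the seed-hashtag label over every post a word occurs in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for post in posts:
        tokens = tokenize(post)
        labels = [SEED_HASHTAGS[t] for t in tokens if t in SEED_HASHTAGS]
        if not labels:
            continue  # posts without a seed hashtag carry no label
        post_label = sum(labels) / len(labels)
        for word in set(tokens):
            if word.startswith("#"):
                continue  # hashtags are labels here, not dictionary entries
            totals[word] += post_label
            counts[word] += 1
    return {w: totals[w] / counts[w] for w in totals}


def classify_post(post, dictionary, threshold=0.1):
    """Step 2 (simplified): mean polarity of known words, mapped to a class."""
    scores = [dictionary[t] for t in tokenize(post) if t in dictionary]
    if not scores:
        return "neutral"
    mean = sum(scores) / len(scores)
    if mean > threshold:
        return "positive"
    if mean < -threshold:
        return "negative"
    return "neutral"


posts = [
    "Great battery life #love",
    "Battery died again #fail",
    "Great camera, great screen #happy",
    "Screen cracked on day one #angry",
]
dictionary = build_polarity_dictionary(posts)
print(classify_post("great phone", dictionary))
print(classify_post("cracked and died", dictionary))
```

Because the dictionary is rebuilt from whatever posts carry the seed hashtags, it adapts as the vocabulary around a subject changes, which is the sense in which it is dynamic.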
• The information obtained can be used to make focused,
real-time decisions that boost market share.
• Spark is used because it helps in streaming real-time data
from various sources such as Twitter, stock exchanges and
Geographical Information Systems.
Introduction
• Today, millions of people around the world are able to express
their viewpoints and spread them widely.
• Social Media has served this purpose for many years.
• Social Networking websites and applications let users show
their opinions by reacting (liking or disliking) to the content
posted.
• Users may also post content of their own to express their
intentions or feelings about one or more subjects.
• The data accumulated and the activities performed on social
media exhibit Volume, Variety, Value, Variability and Veracity on
a large scale, and can therefore be called Big Social Data.
• Data of this kind usually consists of numerous sets of
opinions that can be processed to gauge public inclination on
digital platforms.
• Many research methods, such as Text Analysis, are used to
process this kind of data.
• Most internet data, almost 80 percent of it, is text.
• That is why Text Analysis has become an important factor in
Public Sentiment and Opinion Extraction.
• Sentiment Analysis is also known as Opinion Mining.
• It targets people’s sentiment regarding a subject by analyzing
their posts and related actions on social media.
• It then classifies the posts to determine their polarity,
giving results such as positive, negative and so on.
• The ‘Sentiment’ in each statement/tweet can be extracted using
two popular approaches:
– Lexicon Analysis
– Machine Learning
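As an illustration of the first approach, a lexicon-based classifier can be sketched in a few lines of Python. The tiny `LEXICON` and the function name `lexicon_polarity` are assumptions made for this example. Real systems rely on much larger curated word lists, and the slides do not say which lexicon is used.

```python
import re

# Hypothetical toy lexicon; real systems use far larger curated word lists.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def lexicon_polarity(text, lexicon=LEXICON):
    """Sum the lexicon scores of the words in `text` and map the sum to a class."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(w, 0) for w in words)
    if score > 0:
        return "positive", score
    if score < 0:
        return "negative", score
    return "neutral", score


for tweet in ["I love this great phone",
              "Awful service, I hate it",
              "The meeting is at noon"]:
    print(tweet, "->", lexicon_polarity(tweet))
```

A Machine Learning approach would instead learn the word weights from labelled examples rather than reading them from a fixed dictionary.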
• Objective texts are also part of Sentiment Analysis, as they
represent the ‘Neutral’ category of polarity.
• Eastern emojis, which are combinations of special characters,
are now used less often because many people do not understand
them.
• The emojis in use today therefore play a crucial role in the
sentiment of text, especially in tweets.
Shrug expression examples
Eastern emoji: ¯\_(ツ)_/¯
Present-day emoji: 🤷
• The two methods of analysis have often been applied to big
social data to gather public opinion and assess users' satisfaction
with a subject (services, products, events, topics or persons) in
several domains, including Politics, Marketing, Health and Travel.
• However, the results vary and do not always reach a reasonable
degree of accuracy.
• Failures are generally caused by the challenges of opinion
mining, such as the semantic orientation of a word, which can
change with the context.
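Negation is one simple case in which context changes a word's orientation. The sketch below is an illustration, not the method used in these slides. It flips the score of a sentiment word when a negator appears within the two preceding tokens. The lexicon, the negator list and the window size are all assumptions chosen for the example.

```python
import re

# Hypothetical toy lexicon and negator list, for illustration only.
LEXICON = {"good": 1, "happy": 1, "bad": -1, "slow": -1}
NEGATORS = {"not", "no", "never", "isn't", "wasn't", "don't"}


def score_with_negation(text, window=2):
    """Sum word scores, flipping a word's sign if a negator closely precedes it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0
    for i, token in enumerate(tokens):
        value = LEXICON.get(token, 0)
        if value == 0:
            continue
        preceding = tokens[max(0, i - window):i]
        if any(p in NEGATORS for p in preceding):
            value = -value
        total += value
    return total


print(score_with_negation("the service was good"))
print(score_with_negation("the service was not good"))
```

A context-free lexicon would give both sentences the same score. The window rule separates them, though it still misses sarcasm and longer-range dependencies.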
Literature Review
[01] Garg, K. and Kaur, D., 2019. Sentiment Analysis on Twitter Data using
Apache Hadoop and Performance Evaluation on Hadoop MapReduce and
Apache Spark. In Proceedings on the International Conference on Artificial
Intelligence (ICAI) (pp. 233-238). The Steering Committee of The World
Congress in Computer Science, Computer Engineering and Applied
Computing (WorldComp).
• Objective of Paper: To analyze real-time streaming Twitter data and
identify the sentiment expressed in each tweet using Cloudera.
• Approach/Algorithm/Framework: Hadoop MapReduce framework
• Pros: Apache Spark turned out to be considerably faster, almost 2x in
terms of execution time on a single node.
• Cons: The correlation between user influence and the sentiment of the
author is not computed effectively using Hadoop.
[03] Al-Saqqa, S., Al-Naymat, G. and Awajan, A., 2018. A Large-Scale
Sentiment Data Classification for Online Reviews Under Apache Spark.
Procedia Computer Science, 141, pp.183-189.
• Objective of Paper: To present new evaluation experiments of sentiment
analysis for a large-scale dataset of online customer reviews under the
Apache Spark data processing system.
• Approach/Algorithm/Framework: Spark MLlib classifiers: Naïve Bayes,
Support Vector Machine and Logistic Regression
• Pros: According to the experimental results, the Support Vector Machine
classifier performs better than the Naïve Bayes and Logistic Regression
classifiers.
• Cons: Experiments with different feature sets and n-gram models (bi-gram
and tri-gram) that might improve classification performance are not
conducted.
[04] Ranganathan, J., Irudayaraj, A.S. and Tzacheva, A.A., 2017, November.
Action rules for sentiment analysis on twitter data using spark. In 2017 IEEE
International Conference on Data Mining Workshops (ICDMW) (pp. 51-60).
IEEE.
• Objective of Paper: To implement a new optimized system that is more
promising in terms of speed and efficiency, generating meta-actions by
implementing the Specific Action Rule discovery based on Grabbing Strategy
(SARGS) algorithm.
• Approach/Algorithm/Framework: Action Rule mining algorithm
• Pros: According to the results, the Spark system shows faster computation
time than Hadoop MapReduce when implementing the meta-action generation
methods.
• Cons: The system is not tested with larger real-time data, such as the NPS
dataset, to assess and improve its scalability and feasibility.
[07] Adib, P., Alirezazadeh, S. and Nezarat, A., 2017, October. Enhancing trust
accuracy among online social network users utilizing data text mining
techniques in apache spark. In 2017 7th International Conference on
Computer and Knowledge Engineering (ICCKE) (pp. 283-288). IEEE.
• Objective of Paper: To find malicious users and analyze their behavior in
order to establish more accurate trust, using distributed execution in a
Spark environment to provide quicker results.
• Approach/Algorithm/Framework: Stochastic gradient descent (SGD)
• Pros: The proposed model achieves high diagnostic accuracy and
outperforms SGD by 38%.
• Cons: Reverse malicious words are not used in the dictionary to make the
detection of malicious users through their tweets more accurate.
[08] Podhoranyi, M. and Vojacek, L., 2019, September. Social Media Data
Processing Infrastructure by Using Apache Spark Big Data Platform: Twitter
Data Analysis. In Proceedings of the 2019 4th International Conference on
Cloud Computing and Internet of Things (pp. 1-6).
• Objective of Paper: To develop an architecture and a workflow that can
process Twitter data in near real-time so that tweets on a defined topic
(floods) are analyzed.
• Approach/Algorithm/Framework: Apache Flume, Hadoop Distributed File
System (HDFS), Hive with HiveQL, YARN, Spark.
• Pros: The Word Frequency method (n-grams) is effective in revealing the
tweets’ content, and the tweets proved highly informative in terms of data
quality and quantity.
• Cons: Text analysis methods focused on geo-name extraction are not
applied.
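The word-frequency (n-gram) idea mentioned for [08] can be sketched as follows. This is a generic illustration rather than the paper's implementation. The function name `ngram_counts` and the sample tweets are made up for the example.

```python
from collections import Counter


def ngram_counts(texts, n=2):
    """Count word n-grams across a collection of texts."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        # zip over n shifted copies of the word list yields consecutive n-grams
        counts.update(zip(*(words[i:] for i in range(n))))
    return counts


tweets = [
    "flood warning issued for the river",
    "river flood warning remains in place",
    "heavy rain and flood warning tonight",
]
print(ngram_counts(tweets, n=2).most_common(1))
```

Frequent n-grams such as ("flood", "warning") give a quick picture of what a collection of tweets is about without any labelled training data.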
[09] Yeruva, V.K., Junaid, S. and Lee, Y., 2017, November. Exploring social
contextual influences on healthy eating using big data analytics. In 2017
IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
(pp. 1507-1514). IEEE.
• Objective of Paper: To implement a Big Data Analytics framework that
explores social contextual influences on healthy eating.
• Approach/Algorithm/Framework: BiDAF – Big Data Analytics
Framework for Smart Society
• Pros: The results indicate that the BiDAF framework is effective in the
classification and sentiment analysis of food tweets and shows potential
for promoting healthy eating.
• Cons: BiDAF might not be very suitable for building a highly customized
model required by a client, as it is a front-end framework.
[10] Moise, I., Gaere, E., Merz, R., Koch, S. and Pournaras, E., 2016, December.
Tracking language mobility in the twitter landscape. In 2016 IEEE 16th
International Conference on Data Mining Workshops (ICDMW)
(pp. 663-670). IEEE.
• Objective of Paper: To examine the mobility of languages as captured by
the Twitter signal and to extract value from Twitter data.
• Approach/Algorithm/Framework: Density-based Clustering and
Self-Organizing Maps techniques
• Pros: The analysis enabled the detection of tourism trends and real-world
events, as seen through the Twitter lens, based on country-language
coupling.
• Cons: Methods that identify location from the text of tweets are not
explored, and the same analytical steps are not applied to other countries
and languages.
[11] Bharill, N., Tiwari, A. and Malviya, A., 2016. Fuzzy based scalable
clustering algorithms for handling big data using apache spark. IEEE
Transactions on Big Data, 2(4), pp.339-352.
• Objective of Paper: To implement partitional clustering algorithms on
Apache Spark that are suited to clustering large datasets because of their
low computational requirements.
• Approach/Algorithm/Framework: Scalable Random Sampling with Iterative
Optimization Fuzzy c-Means algorithm (SRSIO-FCM)
• Pros: Comparative results on time and space complexity, run time and
clustering quality reveal that SRSIO-FCM runs in much less time without
compromising clustering quality.
• Cons: Well-known cluster validity measures, extended in a similar way for
use on Big Data, are not presented.
[12] Rodrigues, A.P. and Chiplunkar, N.N., 2018. Real-time Twitter data analysis
using Hadoop ecosystem. Cogent Engineering, 5(1), p.1534519.
• Objective of Paper: To compare executed tweets with real-time tweets, and
to compare performance in terms of execution time when analyzing real-time
tweets using Pig and Hive.
• Approach/Algorithm/Framework: Hadoop Ecosystem
• Pros: The experimental results show that Pig is more efficient than Hive,
as it takes less time to execute.
• Cons: Only large-scale business organizations that generate big data can
make use of Hadoop; it does not perform efficiently in small-scale data
environments.
[13] Swe, T.T., Phyu, P. and Thein, S.P.P., 2019. Weather Prediction Model using
Random Forest Algorithm and Apache Spark. Weather, 3(6).
• Objective of Paper: To analyze big data algorithms suitable for weather
prediction, focusing on the performance analysis of Random Forest
algorithms.
• Approach/Algorithm/Framework: Apache Spark
• Pros: Experimental results indicate notable advantages of Random Forest
over the other algorithms in terms of classification accuracy, performance
and scalability.
• Cons: An incremental parallel random forest algorithm for data streams in
a cloud environment, and a task scheduling mechanism for the algorithm, are
not implemented.
[14] Wang, Y., Wang, M. and Xu, W., 2018. A sentiment-enhanced hybrid
recommender system for movie recommendation: a big data analytics
framework. Wireless Communications and Mobile Computing, 2018.
• Objective of Paper: To propose a hybrid recommendation model that
improves the accuracy and timeliness of a mobile movie recommender system
based on sentiment analysis.
• Approach/Algorithm/Framework: Apache Spark, Content-based
recommender system, Collaborative filtering recommender system, Hybrid
recommender system
• Pros: The implemented method makes it convenient and fast for users to
receive useful movie suggestions.
• Cons: The individual characteristics hidden in users’ text descriptions
are not eliminated.
[16] Omar, H.K. and Jumaa, A.K., 2019. Big Data Analysis Using Apache Spark
MLlib and Hadoop HDFS with Scala and Java. Kurdistan Journal of Applied
Research, 4(1), pp.7-14.
• Objective of Paper: To analyze big data with more suitable programming
languages and, as a consequence, gain better performance.
• Approach/Algorithm/Framework: Decision Tree Regression algorithm,
Clustering algorithm
• Pros: It is observed that Scala on Spark speeds up the algorithms and
completes them in less time than Java.
• Cons: Moving from a single-node cluster to a multi-node cluster, which
would give better performance and the capability to process larger
datasets, is not accomplished.
Problem Statement
To design a Sentiment Analysis System (framework) in which
real-time (or near real-time) sentiments are gathered for
catastrophe management, utility modification and core
marketing.
• Data preprocessing (fetching raw tweets and cleaning).
• Classification of posts/tweets.
• Displaying the top trending topics on the dashboard.
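The cleaning part of the preprocessing step might look like the following Python sketch. The specific rules (dropping retweet markers, URLs, mentions and punctuation while keeping hashtags) and the name `clean_tweet` are assumptions, since the slides do not list the cleaning rules. This simple cleaner also discards emojis, which a real pipeline would probably keep or map to sentiment tokens given their role in tweets.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
RETWEET_RE = re.compile(r"^rt\s+", re.IGNORECASE)
NON_TEXT_RE = re.compile(r"[^a-z0-9#\s]")


def clean_tweet(raw):
    """Strip retweet markers, URLs, mentions and punctuation; keep hashtags."""
    text = RETWEET_RE.sub("", raw.strip())
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text.lower())
    return " ".join(text.split())  # collapse repeated whitespace


print(clean_tweet("RT @user: Loving the new #Spark release!! https://example.com/x"))
```

Hashtags are kept because later steps use them both as subject markers and as trending-topic keys.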
Proposed System
System Architecture
Flume Architecture
A Dashboard showing the Trend of a Twitter topic
• First, Apache Flume is used to connect to Twitter and fetch
tweets.
• Then, Apache Spark Streaming is used to obtain a stream of
real-time tweets.
• Hive is employed to query the tweets stored in HDFS.
• Tweets are enriched with information on Sentiment and related
entities derived from the post.
• Statistics about the data are shown in continuously updated Live
Charts, which display the Trending topics on the Dashboard.
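The trending-topics statistic behind the dashboard can be sketched as a sliding-window hashtag count. In the proposed system this would run on Spark Streaming micro-batches. The plain-Python class below only mimics that behavior so the example stays self-contained, and the class name `TrendingTopics` and the window size are hypothetical.

```python
import re
from collections import Counter, deque

HASHTAG_RE = re.compile(r"#\w+")


class TrendingTopics:
    """Keep hashtag counts over the most recent `window` micro-batches."""

    def __init__(self, window=3):
        # deque with maxlen drops the oldest batch automatically
        self.batches = deque(maxlen=window)

    def add_batch(self, tweets):
        """Record one micro-batch of tweet texts."""
        counts = Counter()
        for tweet in tweets:
            counts.update(tag.lower() for tag in HASHTAG_RE.findall(tweet))
        self.batches.append(counts)

    def top(self, k=3):
        """Return the k most frequent hashtags in the current window."""
        total = Counter()
        for counts in self.batches:
            total.update(counts)
        return total.most_common(k)


trends = TrendingTopics(window=2)
trends.add_batch(["#Spark is fast", "Learning #spark and #hive"])
trends.add_batch(["#Hive query done", "#spark streaming demo"])
print(trends.top(2))
```

Calling `top` after each batch gives the values a live chart would redraw. Old batches fall out of the window, so the ranking follows recent activity rather than all-time totals.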
References
[01] Garg, K. and Kaur, D., 2019. Sentiment Analysis on Twitter Data using Apache Hadoop and
Performance Evaluation on Hadoop MapReduce and Apache Spark. In Proceedings on the
International Conference on Artificial Intelligence (ICAI) (pp. 233-238). The Steering
Committee of The World Congress in Computer Science, Computer Engineering and Applied
Computing (WorldComp).
[02] Svyatkovskiy, A., Imai, K., Kroeger, M. and Shiraito, Y., 2016, December. Large-scale text
processing pipeline with Apache Spark. In 2016 IEEE International Conference on Big Data
(Big Data) (pp. 3928-3935). IEEE.
[03] Al-Saqqa, S., Al-Naymat, G. and Awajan, A., 2018. A Large-Scale Sentiment Data Classification
for Online Reviews Under Apache Spark. Procedia Computer Science, 141, pp.183-189.
[04] Ranganathan, J., Irudayaraj, A.S. and Tzacheva, A.A., 2017, November. Action rules for
sentiment analysis on twitter data using spark. In 2017 IEEE International Conference on Data
Mining Workshops (ICDMW) (pp. 51-60). IEEE.
[05] Elzayady, H., Badran, K.M. and Salama, G.I., 2018, December. Sentiment Analysis on Twitter
Data using Apache Spark Framework. In 2018 13th International Conference on Computer
Engineering and Systems (ICCES) (pp. 171-176). IEEE.
[06] Das, S., Behera, R.K. and Rath, S.K., 2018. Real-time sentiment analysis of Twitter streaming
data for stock prediction. Procedia computer science, 132, pp.956-964.
[07] Adib, P., Alirezazadeh, S. and Nezarat, A., 2017, October. Enhancing trust accuracy among
online social network users utilizing data text mining techniques in apache spark. In 2017 7th
International Conference on Computer and Knowledge Engineering (ICCKE) (pp. 283-288).
IEEE.
[08] Podhoranyi, M. and Vojacek, L., 2019, September. Social Media Data Processing Infrastructure
by Using Apache Spark Big Data Platform: Twitter Data Analysis. In Proceedings of the 2019
4th International Conference on Cloud Computing and Internet of Things (pp. 1-6).
[09] Yeruva, V.K., Junaid, S. and Lee, Y., 2017, November. Exploring social contextual influences on
healthy eating using big data analytics. In 2017 IEEE International Conference on
Bioinformatics and Biomedicine (BIBM) (pp. 1507-1514). IEEE.
[10] Moise, I., Gaere, E., Merz, R., Koch, S. and Pournaras, E., 2016, December. Tracking language
mobility in the twitter landscape. In 2016 IEEE 16th International Conference on Data Mining
Workshops (ICDMW) (pp. 663-670). IEEE.
[11] Bharill, N., Tiwari, A. and Malviya, A., 2016. Fuzzy based scalable clustering algorithms for
handling big data using apache spark. IEEE Transactions on Big Data, 2(4), pp.339-352.
[12] Rodrigues, A.P. and Chiplunkar, N.N., 2018. Real-time Twitter data analysis using Hadoop
ecosystem. Cogent Engineering, 5(1), p.1534519.
[13] Swe, T.T., Phyu, P. and Thein, S.P.P., 2019. Weather Prediction Model using Random Forest
Algorithm and Apache Spark. Weather, 3(6).
[14] Wang, Y., Wang, M. and Xu, W., 2018. A sentiment-enhanced hybrid recommender system for
movie recommendation: a big data analytics framework. Wireless Communications and Mobile
Computing, 2018.
[15] Alparslan, E. and Karahoca, A., 2016. Detecting similar opinion holders for massive sentiment
analysis. Global Journal of Information Technology: Emerging Technologies, 6(1), pp.65-71.
[16] Omar, H.K. and Jumaa, A.K., 2019. Big Data Analysis Using Apache Spark MLlib and Hadoop
HDFS with Scala and Java. Kurdistan Journal of Applied Research, 4(1), pp.7-14.
Q and A?

Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacingjaychoudhary37
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 

Recently uploaded (20)

Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacing
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 

SampleLiteratureReviewTemplate_IVBTechIISEM_MajorProject.pptx

  • 1. An Approach for Sentiment Analysis on Big Social Data Using Spark
      Presented by Name of Student (Roll No)
      Under the guidance of Dr. Chiranjeevi Manike, Associate Professor
      Department of Computer Science & Engineering
      B V Raju Institute of Technology, Narsapur
  • 2. Abstract
      • Collecting public opinion by analyzing big social data has attracted a large amount of attention because of its interactive and real-time nature.
      • Recent studies have relied on both Social Media and Sentiment Analysis to follow major events by tracking people's behavior.
      • The proposed system provides an adaptable approach to Sentiment Analysis that analyzes social media posts and extracts users' opinions in real time.
      • The approach consists of two steps.
  • 3. Abstract
      • The first step is to build a dynamic dictionary of word polarity based on a chosen set of hashtags related to a given subject.
      • The second step is to classify posts under many subjects by introducing new features that further refine the polarity level of a post.
      • Conversations on Twitter, Facebook and other social media can be mined for sentiment data to learn about the competition.
      • Social media blogs help reveal what the public is currently discussing.
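
The first step above can be sketched in plain Python. This is only an illustration under stated assumptions, not the presenters' implementation: the seed hashtags, the function names and the scoring formula (positive minus negative co-occurrences, divided by their total) are all hypothetical.

```python
from collections import Counter, defaultdict
import re

# Hypothetical seed hashtags; the slides do not list the ones actually used.
POSITIVE_TAGS = {"#love", "#great"}
NEGATIVE_TAGS = {"#fail", "#worst"}


def tokenize(text):
    """Lowercase a post and split it into hashtags and plain words."""
    return re.findall(r"#?\w+", text.lower())


def build_polarity_dictionary(posts):
    """Score each word by how often it co-occurs with positive vs. negative seed hashtags."""
    counts = defaultdict(Counter)
    for post in posts:
        tokens = tokenize(post)
        tags = {t for t in tokens if t.startswith("#")}
        if tags & POSITIVE_TAGS and not tags & NEGATIVE_TAGS:
            label = "pos"
        elif tags & NEGATIVE_TAGS and not tags & POSITIVE_TAGS:
            label = "neg"
        else:
            continue  # no seed hashtag, or mixed signals: skip the post
        for word in tokens:
            if not word.startswith("#"):
                counts[word][label] += 1
    # Polarity lies in [-1, 1]: +1 means the word was seen only with positive tags.
    return {
        word: (c["pos"] - c["neg"]) / (c["pos"] + c["neg"])
        for word, c in counts.items()
    }
```

Because the dictionary is rebuilt from whatever posts carry the chosen hashtags, it can be refreshed as new posts arrive, which is one way to read "dynamic" here.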
  • 4. Abstract
      • The obtained information can be used to make focused, real-time decisions that boost market share.
      • Spark is used because it helps in streaming real-time data from various sources, including social networks such as Twitter, stock exchanges, and geographical information systems.
  • 5. Introduction
      • Millions of people around the world are now able to express their viewpoints and spread them.
      • Social Media has been very helpful for this purpose for many years.
      • Social networking websites and applications let users show their opinions by responding to (liking or disliking) the content posted.
      • Users may also post content of their own to express their intentions or feelings about one or more subjects.
  • 6. Introduction
      • The data accumulated and the activities performed on social media exhibit Volume, Variety, Value, Variability and Veracity on a large scale, and can therefore be called Big Social Data.
      • Data of this kind usually consists of numerous sets of opinions that can be processed to learn public inclination on digital platforms.
      • Many research methods, such as Text Analysis, are used to process this kind of data.
      • Most internet data, almost 80 percent of it, is text.
      • That is why Text Analysis has emerged as an important factor in Public Sentiment and Opinion Extraction.
  • 7. Introduction
      • Sentiment Analysis is also known as Opinion Mining.
      • It targets people's sentiment about a subject by analyzing their posts and related actions on social media.
      • It then classifies the posts to determine their polarity and gives results such as positive, negative and so on.
      • The 'Sentiment' in each statement/tweet can be extracted using two popular approaches:
          – Lexicon Analysis
          – Machine Learning
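
As a rough illustration of the Lexicon Analysis approach named above, the sketch below sums per-word polarity scores and maps the total to a class. The lexicon entries, the `classify` name and the threshold value are assumptions made for this example; real lexicons are far larger.

```python
import re

# Tiny illustrative lexicon (assumed values, not taken from the slides).
LEXICON = {
    "good": 1.0, "great": 1.0, "happy": 0.5,
    "bad": -1.0, "awful": -1.0, "slow": -0.5,
}


def classify(post, lexicon=LEXICON, threshold=0.25):
    """Return 'positive', 'negative' or 'neutral' from the summed word scores."""
    words = re.findall(r"\w+", post.lower())
    score = sum(lexicon.get(word, 0.0) for word in words)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"  # objective text with no opinion words lands here
```

A Machine Learning approach would replace the fixed lexicon with a classifier trained on labeled posts. The lexicon version is shown only because it needs no training data.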
  • 8. Introduction
      • Objective texts are also part of Sentiment Analysis, as they represent the 'Neutral' category of polarity.
      • Eastern emojis, which are combinations of special characters, are no longer commonly used because some people do not understand them.
      • The emojis in use today therefore play a crucial role in the sentiment of text, especially in tweets.
  • 9. Introduction
      Shrug expression examples
      • Eastern emoji: ¯\_(ツ)_/¯
      • Present-day emoji: 🤷
  • 10. Introduction
      • These two methods of analysis are often applied to big social data to gather public opinion and assess users' satisfaction with a subject (services, products, events, topics or persons) in several domains, including Politics, Marketing, Health and Travel.
      • However, the accuracy of the results can vary.
      • Failures are generally caused by the challenges of opinion mining, such as the semantic orientation of a word changing with its context.
  • 11. Literature Review
      [01] Garg, K. and Kaur, D., 2019. Sentiment Analysis on Twitter Data using Apache Hadoop and Performance Evaluation on Hadoop MapReduce and Apache Spark. In Proceedings on the International Conference on Artificial Intelligence (ICAI) (pp. 233-238). The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp).
      – Objective of Paper: Analyze real-time streaming Twitter data to identify the sentiment expressed in each tweet using Cloudera.
      – Approach/Algorithm/Framework: Hadoop MapReduce framework
      – Pros: Apache Spark performed considerably better, running almost 2x faster on a single node.
      – Cons: The correlation between user influence and the author's sentiment is not computed effectively using Hadoop.
  • 12. Literature Review
      [03] Al-Saqqa, S., Al-Naymat, G. and Awajan, A., 2018. A Large-Scale Sentiment Data Classification for Online Reviews Under Apache Spark. Procedia Computer Science, 141, pp.183-189.
      – Objective of Paper: To present new evaluation experiments of sentiment analysis on a large-scale dataset of online customer reviews under the Apache Spark data processing system.
      – Approach/Algorithm/Framework: Spark MLlib classifiers: Naive Bayes, Support Vector Machine and Logistic Regression
      – Pros: According to the experimental results, the Support Vector Machine classifier performs better than the Naive Bayes and Logistic Regression classifiers.
      – Cons: Experiments with different feature sets and n-gram models (bi-gram and tri-gram), which may improve classification performance, are not conducted.
  • 13. Literature Review
      [04] Ranganathan, J., Irudayaraj, A.S. and Tzacheva, A.A., 2017, November. Action rules for sentiment analysis on twitter data using spark. In 2017 IEEE International Conference on Data Mining Workshops (ICDMW) (pp. 51-60). IEEE.
      – Objective of Paper: To implement a new, optimized system that is more promising in terms of speed and efficiency, generating meta-actions with the Specific Action Rule discovery based on Grabbing Strategy (SARGS) algorithm.
      – Approach/Algorithm/Framework: Action Rule mining algorithm
      – Pros: According to the results, the Spark system shows faster computation time than Hadoop MapReduce when implementing the meta-action generation methods.
      – Cons: The system is not tested with larger real-time data, such as the NPS dataset, to assess and improve its scalability and feasibility.
  • 14. Literature Review
      [07] Adib, P., Alirezazadeh, S. and Nezarat, A., 2017, October. Enhancing trust accuracy among online social network users utilizing data text mining techniques in apache spark. In 2017 7th International Conference on Computer and Knowledge Engineering (ICCKE) (pp. 283-288). IEEE.
      – Objective of Paper: To find malicious users and analyze their behavior in order to establish more accurate trust, using distributed execution in a Spark environment to reach a decision more quickly.
      – Approach/Algorithm/Framework: Stochastic gradient descent (SGD)
      – Pros: The proposed model benefits from high diagnostic accuracy and outperforms SGD by 38%.
      – Cons: Reverse malicious words are not used in the dictionary to detect malicious users more accurately through their tweets.
  • 15. Literature Review
      [08] Podhoranyi, M. and Vojacek, L., 2019, September. Social Media Data Processing Infrastructure by Using Apache Spark Big Data Platform: Twitter Data Analysis. In Proceedings of the 2019 4th International Conference on Cloud Computing and Internet of Things (pp. 1-6).
      – Objective of Paper: To develop an architecture and a workflow that can process Twitter data in near real-time so that tweets on a defined topic (floods) are analyzed.
      – Approach/Algorithm/Framework: Apache Flume, Hadoop Distributed File System (HDFS), Hive with HiveQL, YARN, Spark.
      – Pros: The Word Frequency method (n-grams) is effective in revealing the content of the tweets and demonstrates their high informative potential in terms of data quality and quantity.
      – Cons: Text analysis methods focused on geo-name extraction are not applied.
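
The Word Frequency (n-gram) method mentioned above can be illustrated with a few lines of plain Python. This is a generic sketch of n-gram counting, not the pipeline from the cited paper; the function name and the simple `\w+` tokenizer are assumptions.

```python
from collections import Counter
import re


def ngram_counts(texts, n=2):
    """Count word n-grams (as tuples) across a collection of texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"\w+", text.lower())
        # zip over n shifted copies of the word list yields consecutive n-grams
        counts.update(zip(*(words[i:] for i in range(n))))
    return counts
```

For example, `ngram_counts(tweets, 2).most_common(10)` would list the ten most frequent bi-grams, which gives a quick picture of what a set of tweets is about.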
  • 16. Literature Review
      [09] Yeruva, V.K., Junaid, S. and Lee, Y., 2017, November. Exploring social contextual influences on healthy eating using big data analytics. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1507-1514). IEEE.
      – Objective of Paper: To implement a Big Data Analytics framework that aims to explore social contextual influences on healthy eating.
      – Approach/Algorithm/Framework: BiDAF (Big Data Analytics Framework for Smart Society)
      – Pros: The results indicate that the BiDAF framework is effective in the classification and sentiment analysis of food tweets and show its potential for supporting healthy eating.
      – Cons: BiDAF might not be well suited to building a highly customized model required by a client, as it is a front-end framework.
  • 17. Literature Review
      [10] Moise, I., Gaere, E., Merz, R., Koch, S. and Pournaras, E., 2016, December. Tracking language mobility in the twitter landscape. In 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (pp. 663-670). IEEE.
      – Objective of Paper: To examine the mobility of languages as captured by the Twitter signal and to extract value from Twitter data.
      – Approach/Algorithm/Framework: Density-based Clustering and Self-Organizing Maps techniques
      – Pros: The analysis enabled the detection of tourism trends and real-world events, as seen through the Twitter lens based on country-language coupling.
      – Cons: Methods that identify location from the text of tweets are not explored, and the same analytical steps are not applied to other countries and languages.
  • 18. Literature Review
      [11] Bharill, N., Tiwari, A. and Malviya, A., 2016. Fuzzy based scalable clustering algorithms for handling big data using apache spark. IEEE Transactions on Big Data, 2(4), pp.339-352.
      – Objective of Paper: To implement partition-based clustering algorithms on Apache Spark, which suit the clustering of large datasets because of their low computational requirements.
      – Approach/Algorithm/Framework: Scalable Random Sampling with Iterative Optimization Fuzzy c-Means algorithm (SRSIO-FCM)
      – Pros: The results include comparative reports on time and space complexity, run time and clustering quality, showing that SRSIO-FCM runs in much less time without compromising clustering quality.
      – Cons: Well-known cluster validity measures, extended in a similar way for use on Big Data, are not presented.
  • 19. Literature Review
      [12] Rodrigues, A.P. and Chiplunkar, N.N., 2018. Real-time Twitter data analysis using Hadoop ecosystem. Cogent Engineering, 5(1), p.1534519.
      – Objective of Paper: To compare executed tweets with real-time tweets, and to compare performance in terms of execution time when analyzing real-time tweets using Pig and Hive.
      – Approach/Algorithm/Framework: Hadoop Ecosystem
      – Pros: The experimental results show that Pig is more efficient than Hive, as Pig takes less time to execute.
      – Cons: Only large-scale business organizations that generate big data can make use of Hadoop, and it does not perform efficiently in small-scale data environments.
  • 20. Literature Review
      [13] Swe, T.T., Phyu, P. and Thein, S.P.P., 2019. Weather Prediction Model using Random Forest Algorithm and Apache Spark. Weather, 3(6).
      – Objective of Paper: To analyze big data algorithms suitable for weather prediction, focusing on performance analysis with Random Forest algorithms.
      – Approach/Algorithm/Framework: Apache Spark
      – Pros: Experimental results indicate notable advantages of Random Forest over the other algorithms in terms of classification accuracy, performance, and scalability.
      – Cons: An incremental parallel random forest algorithm for data streams in a cloud environment, and a task scheduling mechanism for the algorithm, are not implemented.
  • 21. Literature Review
      [14] Wang, Y., Wang, M. and Xu, W., 2018. A sentiment-enhanced hybrid recommender system for movie recommendation: a big data analytics framework. Wireless Communications and Mobile Computing, 2018.
      – Objective of Paper: To build a hybrid recommendation model, based on sentiment analysis, that improves the accuracy and timeliness of a mobile movie recommender system.
      – Approach/Algorithm/Framework: Apache Spark, Content-based recommender system, Collaborative filtering recommender system, Hybrid recommender system
      – Pros: The implemented method makes it convenient and fast for users to receive useful movie suggestions.
      – Cons: Eliminating the individual characteristics hidden in users' text descriptions is not performed.
  • 22. Literature Review
      [16] Omar, H.K. and Jumaa, A.K., 2019. Big Data Analysis Using Apache Spark MLlib and Hadoop HDFS with Scala and Java. Kurdistan Journal of Applied Research, 4(1), pp.7-14.
      – Objective of Paper: To analyze big data with more suitable programming languages and consequently gain better performance.
      – Approach/Algorithm/Framework: Decision Tree Regression algorithm, Clustering algorithm
      – Pros: Scala on Spark is observed to speed up the algorithms and complete them in less time than Java.
      – Cons: The environment is not changed from a single-node cluster to a multi-node cluster, which would give better performance and the ability to process larger data sets.
  • 23. Problem Statement
      To design a Sentiment Analysis System (framework) where real-time (or near real-time) sentiments are gathered for Catastrophe management, Utility modification and Core marketing.
      • Data preprocessing (fetching raw tweets and cleaning).
      • Classification of posts/tweets.
      • Displaying the top trending topics on the dashboard.
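
The cleaning part of the preprocessing step could look like the sketch below. The slides do not specify the cleaning rules, so the choices here (dropping URLs, mentions, a leading retweet marker and ASCII punctuation, while keeping emojis because of their role in sentiment) are assumptions, as is the `clean_tweet` name.

```python
import re
import string

# Character class matching ASCII punctuation only, so emojis are left in place.
_PUNCTUATION = re.compile("[" + re.escape(string.punctuation) + "]")


def clean_tweet(text):
    """Normalize a raw tweet into lowercase, space-separated tokens."""
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"^\s*RT\b", " ", text)      # drop a leading retweet marker
    text = _PUNCTUATION.sub(" ", text)         # drop punctuation, including '#'
    return " ".join(text.lower().split())      # lowercase and collapse whitespace
```

Hashtag words are kept (only the `#` is removed), so they remain available to the classification step.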
  • 24. Proposed System
      Figure: System Architecture
  • 25. Proposed System
      Figure: Flume Architecture
  • 26. Proposed System
      Figure: A Dashboard showing the Trend of a Twitter topic
  • 27. Proposed System
      • First, Apache Flume is used to connect to Twitter and fetch tweets.
      • Then, Apache Spark Streaming is used to obtain a stream of real-time tweets.
      • Hive is used to query the tweets stored in HDFS.
      • Tweets are enriched with information on sentiment and related entities derived from the post.
      • Statistics about the data are shown in continuously updated live charts, which display the trending topics on the dashboard.
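
The trending-topics statistic in the last step can be illustrated with a sliding-window hashtag counter. The sketch below is plain Python standing in for the windowed counts a streaming engine would compute. It does not use the Spark, Flume or Hive APIs, and the class name, window length and timestamp format are assumptions. Timestamps are assumed to arrive in non-decreasing order.

```python
from collections import Counter, deque
import re


class TrendingTopics:
    """Keep hashtag counts for tweets seen within the last `window_seconds`."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = deque()    # (timestamp, hashtag) pairs in arrival order
        self.counts = Counter()  # current count per hashtag inside the window

    def add(self, timestamp, tweet):
        """Record the hashtags of one tweet, then expire old events."""
        for tag in re.findall(r"#\w+", tweet.lower()):
            self.events.append((timestamp, tag))
            self.counts[tag] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Remove events that have fallen out of the window ending at `now`.
        while self.events and self.events[0][0] <= now - self.window:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, k=3):
        """Return the k most frequent hashtags in the current window."""
        return self.counts.most_common(k)
```

A dashboard could call `top()` at a fixed interval and redraw its chart from the result.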
  • 28. References
      [01] Garg, K. and Kaur, D., 2019. Sentiment Analysis on Twitter Data using Apache Hadoop and Performance Evaluation on Hadoop MapReduce and Apache Spark. In Proceedings on the International Conference on Artificial Intelligence (ICAI) (pp. 233-238). The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp).
      [02] Svyatkovskiy, A., Imai, K., Kroeger, M. and Shiraito, Y., 2016, December. Large-scale text processing pipeline with Apache Spark. In 2016 IEEE International Conference on Big Data (Big Data) (pp. 3928-3935). IEEE.
      [03] Al-Saqqa, S., Al-Naymat, G. and Awajan, A., 2018. A Large-Scale Sentiment Data Classification for Online Reviews Under Apache Spark. Procedia Computer Science, 141, pp.183-189.
      [04] Ranganathan, J., Irudayaraj, A.S. and Tzacheva, A.A., 2017, November. Action rules for sentiment analysis on twitter data using spark. In 2017 IEEE International Conference on Data Mining Workshops (ICDMW) (pp. 51-60). IEEE.
      [05] Elzayady, H., Badran, K.M. and Salama, G.I., 2018, December. Sentiment Analysis on Twitter Data using Apache Spark Framework. In 2018 13th International Conference on Computer Engineering and Systems (ICCES) (pp. 171-176). IEEE.
  • 29. References
      [06] Das, S., Behera, R.K. and Rath, S.K., 2018. Real-time sentiment analysis of Twitter streaming data for stock prediction. Procedia Computer Science, 132, pp.956-964.
      [07] Adib, P., Alirezazadeh, S. and Nezarat, A., 2017, October. Enhancing trust accuracy among online social network users utilizing data text mining techniques in apache spark. In 2017 7th International Conference on Computer and Knowledge Engineering (ICCKE) (pp. 283-288). IEEE.
      [08] Podhoranyi, M. and Vojacek, L., 2019, September. Social Media Data Processing Infrastructure by Using Apache Spark Big Data Platform: Twitter Data Analysis. In Proceedings of the 2019 4th International Conference on Cloud Computing and Internet of Things (pp. 1-6).
      [09] Yeruva, V.K., Junaid, S. and Lee, Y., 2017, November. Exploring social contextual influences on healthy eating using big data analytics. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1507-1514). IEEE.
      [10] Moise, I., Gaere, E., Merz, R., Koch, S. and Pournaras, E., 2016, December. Tracking language mobility in the twitter landscape. In 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (pp. 663-670). IEEE.
  • 30. References
      [11] Bharill, N., Tiwari, A. and Malviya, A., 2016. Fuzzy based scalable clustering algorithms for handling big data using apache spark. IEEE Transactions on Big Data, 2(4), pp.339-352.
      [12] Rodrigues, A.P. and Chiplunkar, N.N., 2018. Real-time Twitter data analysis using Hadoop ecosystem. Cogent Engineering, 5(1), p.1534519.
      [13] Swe, T.T., Phyu, P. and Thein, S.P.P., 2019. Weather Prediction Model using Random Forest Algorithm and Apache Spark. Weather, 3(6).
      [14] Wang, Y., Wang, M. and Xu, W., 2018. A sentiment-enhanced hybrid recommender system for movie recommendation: a big data analytics framework. Wireless Communications and Mobile Computing, 2018.
      [15] Alparslan, E. and Karahoca, A., 2016. Detecting similar opinion holders for massive sentiment analysis. Global Journal of Information Technology: Emerging Technologies, 6(1), pp.65-71.
      [16] Omar, H.K. and Jumaa, A.K., 2019. Big Data Analysis Using Apache Spark MLlib and Hadoop HDFS with Scala and Java. Kurdistan Journal of Applied Research, 4(1), pp.7-14.