Big Data + Sentiment Analysis = Awesome
Adel Rahimi
Sharif University of Technology
TABLE OF CONTENTS
• Introduction to big data and its usage
• Sentiment analysis and its use in NLP
• How to big data?!
• Tools to use
• Further study
INTRODUCING BIG DATA
WHAT IS BIG DATA?
• Big data is a term denoting the storage and use of
vast amounts of data, either structured or
unstructured, often on the cloud.
USAGES OF BIG DATA
• Internet Search
• Finance
• Business Informatics
SPECIFICATIONS OF BIG DATA
• Volume: big data doesn't sample; it just
observes and tracks what happens
• Velocity: big data is often available in real-time
• Variety: big data draws from text, images, audio,
video; plus it completes missing pieces through
data fusion
• Machine learning: big data often doesn't ask
why and simply detects patterns
• Digital footprint: big data is often a cost-free
byproduct of digital interaction
COMPANIES WHO USE BIG DATA
• eBay.com uses two data warehouses at 7.5 petabytes and 40PB as well as a 40PB
Hadoop cluster for search, consumer recommendations, and merchandising.
• Amazon.com handles millions of back-end operations every day, as well as
queries from more than half a million third-party sellers. The core technology
that keeps Amazon running is Linux-based and as of 2005 they had the world's
three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB.
• Facebook handles 50 billion photos from its user base.
• Google was handling roughly 100 billion searches per month as of August 2012.
• Oracle NoSQL Database has been tested past the 1M ops/sec mark with 8
shards and went on to hit 1.2M ops/sec with 10 shards.
APPLICATIONS OF BIG DATA (I)
APPLICATIONS OF BIG DATA (II)
• Fashion Trends 2016: Google Data
Shows What Shoppers Want
In April, searches for bomber jackets grew
297% YoY in the U.K. and 612% YoY in the
U.S.
APPLICATIONS OF BIG DATA (III)
• IN CASE YOU WERE WONDERING WHAT EXACTLY A
“BOMBER JACKET” IS!
HOW BIG IS BIG DATA?
ADVANTAGES OF BIG DATA
• Cheap mass storage
• Faster processors
• Cheap open source platforms such as Hadoop
• Cloud computing is a huge advancement in the field
when dealing with Big Data
• Parallel processing, large grid environments and high
connectivity
HOW WILL BIG DATA HELP US?
• Predict what customers want before they ask for it
• Get customers excited about their own data
• Improve customer service interactions
• Identify customer pain points and solve them
• Reduce health care costs and improve treatment
SENTIMENT ANALYSIS
WHAT IS SENTIMENT ANALYSIS?
• The movie was awesome 👍
• The movie was awful 👎
• The movie was long 😕
SENTIMENT ANALYSIS WORKFLOW
Input → Tokenization → Stop-word filtering → Negation handling → Stemming → Classification → Sentiment analysis
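The workflow above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the word lists, the `NOT_` marking scheme, the crude suffix-stripping stemmer, and the counting classifier are all hypothetical stand-ins, not the method used in the talk.

```python
import re

# Tiny illustrative word lists (assumptions); real systems use far larger resources.
# Negation words are deliberately kept out of the stop-word list.
STOP_WORDS = {"the", "was", "a", "an", "is", "it", "this"}
NEGATIONS = {"not", "no", "never"}
POSITIVE = {"awesome", "good", "great"}
NEGATIVE = {"awful", "bad", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop very common words that carry no sentiment."""
    return [t for t in tokens if t not in STOP_WORDS]


def handle_negation(tokens):
    """Mark the token that follows a negation word with a 'NOT_' prefix."""
    result, negate = [], False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        result.append("NOT_" + token if negate else token)
        negate = False
    return result


def stem(token):
    """Crude suffix stripping, standing in for a real stemmer."""
    prefix = "NOT_" if token.startswith("NOT_") else ""
    word = token[len(prefix):]
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[: -len(suffix)]
            break
    return prefix + word


def classify(tokens):
    """Count positive and negative words; a NOT_ prefix flips the polarity."""
    score = 0
    for token in tokens:
        negated = token.startswith("NOT_")
        word = token[4:] if negated else token
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        score += -polarity if negated else polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(text):
    """Run the full pipeline in the order shown on the slide."""
    tokens = handle_negation(remove_stop_words(tokenize(text)))
    return classify([stem(t) for t in tokens])


print(analyze("The movie was awesome"))  # positive
print(analyze("The movie was long"))     # neutral
```

With these toy lists, "The movie was not good" comes out as negative, because the negation step flips the polarity of "good".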
TWITTER SENTIMENT ANALYSIS WORKFLOW
Tweet → Tokenization → Part-of-Speech Tagging → WordNet WSD → SentiWordNet Interpretation → Sentiment Orientation → Tweet Classified
PREPROCESSING
• Removing non-English Tweets
• Replacing Emoticons by their polarity
• Remove URL, Target Mentions, Hashtags, Numbers
• Replace Negative Mentions
• Replace Sequences of Repeated Characters, e.g.
‘coooooooool’ by ‘coool’
• Remove Nouns and Prepositions
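Some of these preprocessing steps can be sketched with regular expressions. The emoticon table and the function name `preprocess` are assumptions made for illustration. Language filtering, negative-mention replacement, and part-of-speech based removal are left out because they need extra resources.

```python
import re

# Hypothetical emoticon-to-polarity table; a real one would be much larger.
EMOTICONS = {
    ":)": "positive",
    ":-)": "positive",
    ":(": "negative",
    ":-(": "negative",
}


def preprocess(tweet):
    """Clean a tweet following some of the steps listed above."""
    tweet = re.sub(r"https?://\S+", " ", tweet)      # remove URLs
    tweet = re.sub(r"[@#]\w+", " ", tweet)           # remove mentions and hashtags
    for emoticon, polarity in EMOTICONS.items():     # emoticons -> polarity word
        tweet = tweet.replace(emoticon, f" {polarity} ")
    tweet = re.sub(r"\d+", " ", tweet)               # remove numbers
    tweet = re.sub(r"(\w)\1{3,}", r"\1\1\1", tweet)  # 4+ repeated chars -> 3
    return " ".join(tweet.split())                   # normalize whitespace


print(preprocess("coooooooool http://t.co/abc #fun 2016"))  # coool
```

URLs are removed before emoticons are replaced so that characters inside a link are never mistaken for an emoticon.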
EXAMPLE OF TWITTER SENTIMENT ANALYSIS
• Tweet: @BonksMullet @chet_sellers This is very accurate and hilarious. Well
done :)
• WSD (synset): accurate#1: conforming exactly or almost exactly to fact or to a
standard or performing with total accuracy; "an accurate reproduction"; "the
accounting was accurate"; "accurate measurements"; "an accurate scale"
• SentiWordNet score:
Pos_score   Neg_score   Obj_score
0.5         0           0.5
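As a rough illustration of how such scores could be turned into a sentiment orientation, the sketch below compares positive and negative scores. In SentiWordNet the three scores of a synset sum to 1, so the objectivity score follows from the other two. The aggregation rule for a whole tweet (summing over its words) is an assumption for illustration, not necessarily the rule used in the talk.

```python
def objectivity(pos_score, neg_score):
    """Objectivity is what remains once positivity and negativity are removed."""
    return 1.0 - (pos_score + neg_score)


def orientation(pos_score, neg_score):
    """Label a single synset by comparing its positive and negative scores."""
    if pos_score > neg_score:
        return "positive"
    if neg_score > pos_score:
        return "negative"
    return "neutral"


def tweet_orientation(word_scores):
    """Assumed aggregation: sum (pos, neg) pairs over all scored words."""
    total_pos = sum(pos for pos, _ in word_scores)
    total_neg = sum(neg for _, neg in word_scores)
    return orientation(total_pos, total_neg)


# Scores for accurate#1 as shown on the slide.
print(objectivity(0.5, 0.0))              # 0.5
print(tweet_orientation([(0.5, 0.0)]))    # positive
```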
PREPROCESSING DATASETS
WORDNET
• WordNet is a lexical database of English that groups
words into sets of synonyms (synsets) and records the
relations between them.
• The Persian equivalent of WordNet is FarsNet,
available from Shahid Beheshti University.
http://dadegan.ir/catalog/farsnet
SENTIWORDNET
• SentiWordNet is an extension of WordNet that
assigns each synset three sentiment scores: positivity,
negativity, and objectivity.
AFINN
• AFINN is a list of English words rated by their
sentiment with an integer from -5 (negative) to +5 (positive).
• AFINN-111 contains 2477 words.
• Examples:
Abilities 2
Ability 2
Aboard 1
Absentee -1
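A lexicon like this can be used by summing the ratings of the words found in a text. The sketch below uses only the four example entries above as a stand-in for the full AFINN-111 file. The function name `afinn_score` and the simple tokenizer are assumptions for illustration.

```python
import re

# Only the four sample entries from the slide; the real list has 2477 entries.
AFINN_SAMPLE = {
    "abilities": 2,
    "ability": 2,
    "aboard": 1,
    "absentee": -1,
}


def afinn_score(text, lexicon=AFINN_SAMPLE):
    """Sum the ratings of all known words; unknown words count as 0."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


print(afinn_score("Welcome aboard, we value your ability"))  # 3
```

A positive total suggests positive sentiment, a negative total suggests negative sentiment, and a total of zero is treated as neutral or unknown.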
GETTING TO KNOW SOME
TOOLS
APACHE NUTCH
• We use Apache Nutch as the web crawler because it
is fast and scales to large crawls.
ELASTICSEARCH DATABASE
• Elasticsearch is a fast distributed search engine and
document store; indexing and querying the data with
it helps speed up the process.
APACHE HADOOP
• Hadoop uses the MapReduce programming model for
batch processing of large datasets across a cluster; it
is scalable and fault-tolerant.
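The MapReduce idea can be illustrated with a word count written in plain Python. This is a single-process sketch of the map, shuffle, and reduce phases under simplified assumptions, not Hadoop's actual API. On a real cluster each phase runs in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.lower().split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["the movie was awesome", "the movie was awful"])
print(counts["movie"])  # 2
```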
APACHE SPARK
• Apache Spark is a fast and general engine for big data
processing, with built-in modules for streaming, SQL,
machine learning and graph processing.
APACHE CASSANDRA
• The Apache Cassandra database is the right choice
when you need scalability and high availability without
compromising performance.
REFERENCES AND FURTHER STUDY
• What Is Big Data? | SAS. (n.d.). Retrieved from
http://www.sas.com/en_us
• 5 ways companies are using big data to help their
customers | VentureBeat | Business | by
brianabillingham. (n.d.). Retrieved from
http://venturebeat.com/2014/04/21/5-ways-big-data-is-helping-companies-help-their-customers/
• http://sentiwordnet.isti.cnr.it/
• SentiWordNet 3.0: An Enhanced Lexical Resource for
Sentiment Analysis and Opinion Mining
• https://github.com/linkTDP/BigDataAnalysis_TweetSentiment
• AFINN-111 -
http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010
• Reviews Classification Using SentiWordNet Lexicon -
http://www.academia.edu/1336655/Reviews_Classification_Using_SentiWordNet_Lexicon
• Using SentiWordNet and Sentiment Analysis for
Detecting Radical Content on Web Forums -
http://www.jeremyellman.com/jeremy_unn/pdfs/1_____Chalothorn_Ellman_SKIMA_2012.pdf
• From tweets to polls: Linking text sentiment to public
opinion time series -
http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1536/1842


Editor's Notes

  1. It allows businesses to track:
         - Flame detection (bad rants)
         - New product perception
         - Brand perception
         - Reputation management
     Further applications:
         - Identifying child-suitability of videos based on comments
         - Bias identification in news sources
         - Identifying (in)appropriate content for ad placement
     Question: “Why aren't consumers buying our laptop?”
         - We know the concrete data: price, specs, competition, etc.
         - We want to know subjective data: “the design is tacky,” “customer service was condescending”
         - Misperceptions are also important, e.g. “updated drivers aren't available” (even though they are)