Introduction to R
Ali Ghods
Aix-Marseille Université
What is R?
R is a system for statistical analysis and graphics
R is free software, distributed under the terms of the GNU General Public License
To install R, visit:
https://cran.r-project.org/
R has more than 15,000 packages, plus documentation
2 Ali Ghods | Aix-Marseille Université 2018-12-07
R Popularity
The number of scholarly articles found in each year by Google Scholar:
(source: http://r4stats.com/articles/popularity/)
Advantages
R may seem complex to beginners, but it is not: its main feature is flexibility
New methods are available sooner
Vast selection of analytic and graphical methods
It runs on any computer
Powerful object-oriented language
Vast selection of input/output formats, accessibility from Excel, SPSS, SAS, etc.
RStudio
Find and Install a package
Find packages:
Comprehensive R Archive Network (CRAN)
Statistical Data Analysis
https://awesome-r.com/
Install packages:
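installed.packages()                        # list the packages already installed
install.packages("package name")            # install a package from CRAN
update.packages(oldPkgs = "package name")   # update a package (no argument: update all)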
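The three commands above can be combined to install only what is missing. The following is a minimal sketch, not part of the original slides: the helper name missing_packages and the list of wanted packages are assumptions chosen for illustration.

```r
# Hypothetical helper: return the requested packages that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages used later in this deck (adjust to your needs)
wanted <- c("dplyr", "readr", "tidytext", "ggplot2")
to_install <- missing_packages(wanted)
print(to_install)

# Install only in an interactive session, and only if something is missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Guarding the call with interactive() keeps a script from trying to download packages when it runs unattended.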
What are the most Popular Packages in R?
To manipulate data: dplyr, tidyr
To visualize data: ggplot2, plotly
To report results: shiny, xtable, rmarkdown
To analyze data: psych, pls
Network Analysis: igraph
Text-mining: tm, tidytext
Data sets
Your data set from your research
Public data sets
Google: https://toolbox.google.com/datasetsearch
https://github.com/awesomedata/awesome-public-datasets
Mock data: e.g. https://www.mockaroo.com/
Where to learn R?
R help, package support documents
R for Beginners
Introduction to Probability and Statistics Using R
Introduction à la programmation en R
https://www.r-bloggers.com/
https://rdrr.io/
https://www.rdocumentation.org/
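The built-in R help mentioned first in this list can be reached from the console. A small sketch using base functions; read.csv is only an example topic, chosen here for illustration.

```r
# Find objects on the search path whose names start with "read.csv"
matches <- apropos("^read\\.csv")
print(matches)

# Help pages open in a viewer, so only call them in an interactive session
if (interactive()) {
  help("read.csv")          # documentation for one function (same as ?read.csv)
  help.search("sentiment")  # search installed documentation by keyword
}
```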
Example: Text analysis, sentiment analysis
Data: 3,150 Amazon customer reviews of Alexa devices (Echo, Firestick, Echo Dot, etc.)
Source: https://www.kaggle.com/sid321axn/amazon-alexa-reviews
Objective:
1. Find the most frequent words
2. Find the most positive and negative words
Example
# Required packages (library() loads one package per call)
library(dplyr)
library(readr)
library(tidytext)
library(ggplot2)
# Read the data
data <- read_tsv("amazon_alexa.tsv")
# Tokenization: tokenize_words() returns a list, which anti_join() cannot use,
# so unnest_tokens() from tidytext is used here to get one lowercase word per row
comments_token <- data %>% unnest_tokens(word, verified_reviews, strip_numeric = TRUE)
# The most frequent words, after removing stop words
comments_token <- comments_token %>% anti_join(stop_words, by = "word") %>% count(word, sort = TRUE)
g <- ggplot(comments_token[1:10,], aes(x = reorder(word, -n), y = n)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = n), position = position_dodge(0.9), vjust = 0)
plot(g)
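The counting step can also be sketched in base R on a few toy sentences, which makes the logic easy to check by hand. Everything below is an assumption for illustration: the three reviews are invented (they are not from the Kaggle data set), and the stop-word list is a tiny hand-picked one, far smaller than the stop_words table in tidytext.

```r
# Toy reviews, invented for illustration
reviews <- c("Love my Echo, love the sound",
             "The sound is great",
             "Great device, love it")

# Tiny hand-picked stop-word list (assumption)
stop_words_small <- c("my", "the", "is", "it")

# Lowercase, split on anything that is not a letter, drop empty strings
tokens <- unlist(strsplit(tolower(reviews), "[^a-z]+"))
tokens <- tokens[nzchar(tokens)]

# Remove stop words, then count occurrences, most frequent first
tokens <- tokens[!tokens %in% stop_words_small]
word_counts <- sort(table(tokens), decreasing = TRUE)
print(word_counts)
```

Here "love" appears three times, and "sound" and "great" twice each.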
Example
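# Sentiment analysis
# bing_word_counts is not defined on the slides; presumably it was built along
# these lines, using the Bing lexicon shipped with tidytext
bing_word_counts <- data %>%
  unnest_tokens(word, verified_reviews) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(word, sentiment, sort = TRUE)
# Keep words seen more than 50 times; negative words get negative counts
bing_word_counts %>%
  filter(n > 50) %>%
  mutate(n = ifelse(sentiment == 'negative', -n, n)) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_bar(stat = 'identity') +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ylab('Contribution to sentiment') + ggtitle('Most common positive and negative words')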
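The filter and sign-flip steps of this pipeline can be checked on a small table in base R. This is only a sketch: the words, counts, and sentiment labels below are made up for illustration and do not come from the review data or from the Bing lexicon.

```r
# Hypothetical word counts with a sentiment label (invented values)
word_counts <- data.frame(
  word      = c("love", "great", "problem", "bad"),
  n         = c(120, 95, 60, 40),
  sentiment = c("positive", "positive", "negative", "negative")
)

# Keep words seen more than 50 times, then make negative counts negative
signed <- subset(word_counts, n > 50)
signed <- transform(signed, n = ifelse(sentiment == "negative", -n, n))

# Order from most negative to most positive, as reorder(word, n) does for the plot
signed <- signed[order(signed$n), ]
print(signed)
```

Flipping the sign puts negative words below the axis and positive words above it in a single bar chart.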
Questions?
