Data Analytics using R with Yelp Dataset

Cédric Poottaren
Software Developer, Founder of J.C.P Laboratory
Text Analytics on Dataset
#DevConMru
Data Science: processes and systems to extract knowledge or insights from data
Big Data: large and complex data that has been collected over several years
What is Yelp?
Dataset
yelp_academic_dataset_business – Business Information
yelp_academic_dataset_review – User Reviews
Combine the 2 using
DEMO
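The slide does not say how the two files are combined, so the following is only a minimal sketch. It assumes the join key is the `business_id` column that both files carry, and it uses small made-up data frames (hypothetical values) in place of the real files.

```r
# Toy stand-ins for the two Yelp files; the values are invented for
# illustration, and the real files would first be read from disk.
business <- data.frame(
  business_id = c("b1", "b2"),
  name        = c("Cafe Luna", "Pizza Hub"),
  stringsAsFactors = FALSE
)
review <- data.frame(
  review_id   = c("r1", "r2", "r3"),
  business_id = c("b1", "b1", "b2"),
  stars       = c(5, 3, 4),
  text        = c("Great coffee", "Slow service", "Tasty pizza"),
  stringsAsFactors = FALSE
)

# Inner join on the assumed key: each review row gains the columns
# of the business it refers to.
reviews_with_business <- merge(review, business, by = "business_id")
print(reviews_with_business)
```

Base R `merge` performs an inner join by default; passing `all.x = TRUE` would keep reviews whose business is missing from the business file.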
Text Analytics Methodologies
Natural Language Processing (NLP)
Part 1: Natural Language Processing (NLP), Microsoft Text Analytics
Sentiment Analysis
Keyword Extraction
Topic Detection
Language Detection
DEMO
Stemming
Reduce each word to its root (stem) form, for example "waiting" to "wait"
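To make the idea concrete, here is a deliberately naive sketch in base R. It is not the stemmer used in the demo and not the Porter algorithm; `naive_stem` is a hypothetical helper that strips a few common English suffixes.

```r
# Naive suffix stripping: NOT the Porter algorithm, only an illustration.
naive_stem <- function(words) {
  words <- tolower(words)
  # Strip one common suffix, keeping at least three leading characters.
  sub("^(.{3,}?)(ing|ed|ly|es|s)$", "\\1", words, perl = TRUE)
}

stems <- naive_stem(c("Waiting", "waited", "waits", "wait"))
print(stems)
```

All four inputs collapse to the same stem, which is the effect stemming is meant to achieve. For real work a proper stemmer is preferable, for instance `wordStem` from the SnowballC package.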
Stemming (Results)
Common Words Removal
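Removing common words (stop words) can be sketched as below. The stop-word list and the `remove_stop_words` helper are hypothetical and kept tiny on purpose; a real analysis would use a fuller list.

```r
# Hypothetical, deliberately tiny stop-word list for illustration.
stop_words <- c("the", "a", "an", "is", "was", "and", "of", "to", "in")

remove_stop_words <- function(text, stop_words) {
  # Lower-case, then split on anything that is not a letter or apostrophe.
  tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  # Keep only the tokens that are not in the stop-word list.
  tokens[!tokens %in% stop_words]
}

kept <- remove_stop_words("The pizza was tasty and the service is fast", stop_words)
print(kept)
```

Only the content-bearing words remain, which keeps later frequency counts from being dominated by words such as "the" and "is".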
DEMO
Part 2: Natural Language Processing (NLP), Stanford CoreNLP
Part of Speech
POS Tag  Description                             Example
CC       coordinating conjunction                and
CD       cardinal number                         1, third
DT       determiner                              the
EX       existential there                       there is
FW       foreign word                            d’oeuvre
IN       preposition/subordinating conjunction   in, of, like
JJ       adjective                               big
JJR      adjective, comparative                  bigger
JJS      adjective, superlative                  biggest
DEMO
Part 2: Natural Language Processing (NLP), Stanford CoreNLP
DEMO
Term Document Matrix
Describes the frequency of terms that occur in a collection of documents
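A term-document matrix can be built by hand in base R, as in the sketch below. The three short documents are invented for illustration, and whitespace tokenisation is a simplifying assumption.

```r
# Three made-up documents, named d1..d3.
docs <- c(
  d1 = "great pizza great service",
  d2 = "slow service",
  d3 = "great coffee"
)

# Tokenise each document on whitespace and collect the vocabulary.
tokens <- strsplit(docs, "\\s+")
names(tokens) <- names(docs)
terms <- sort(unique(unlist(tokens)))

# Rows are terms, columns are documents, cells are raw counts.
tdm <- vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = terms))),
  integer(length(terms))
)
rownames(tdm) <- terms
print(tdm)
```

Packages such as tm provide a ready-made `TermDocumentMatrix`; the manual version is shown only to make the structure visible.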
Term Document Matrix: term frequency and weighting
TF-IDF weighting: gives higher weight to terms that are frequent in a document but rare across the collection of documents
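The weighting can be illustrated with one common TF-IDF variant: term frequency normalised per document, multiplied by `log(N / df)`. The count matrix is made up, and other variants (smoothing, different logarithms) are equally valid, so treat this as a sketch rather than the formula used in the demo.

```r
# Small made-up term-document count matrix (rows = terms, columns = documents).
tdm <- matrix(
  c(2, 0, 1,
    1, 1, 1,
    0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("great", "service", "slow"), c("d1", "d2", "d3"))
)

n_docs <- ncol(tdm)
df     <- rowSums(tdm > 0)                   # documents containing each term
idf    <- log(n_docs / df)                   # rare terms get a larger idf
tf     <- sweep(tdm, 2, colSums(tdm), "/")   # counts normalised per document
tfidf  <- tf * idf                           # one idf value per row (term)
print(round(tfidf, 3))
```

Here "service" appears in every document, so its weight is zero everywhere, while "slow" appears in a single document and receives the largest weight.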
Unsupervised Learning
K-means Clustering
“Art of finding groups in data” – Kaufman, Rousseeuw
• No clear picture of what is within the document
• Natural pair of groupings
• Simple to run
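The "simple to run" point can be seen with base R's `kmeans`. The data below are synthetic two-dimensional points standing in for document vectors, and the choice of two clusters is an assumption made for the example.

```r
set.seed(42)  # k-means starts from random centres, so fix the seed

# Synthetic data: two well-separated groups of ten points each.
x <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)

# Ask for two clusters; nstart = 10 keeps the best of ten random starts.
fit <- kmeans(x, centers = 2, nstart = 10)
print(fit$cluster)  # cluster label assigned to each row
print(fit$centers)  # coordinates of the two cluster centres
```

On real review data the rows would be documents from a weighted term-document matrix, and the number of clusters would have to be chosen by inspection or a heuristic.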
K-means Clustering
DEMO
Conclusion
1. Data Manipulation in R
2. Natural Language Processing
3. Machine Learning
4. Visualization
Thank you.