SlideShare a Scribd company logo
1 of 11
Download to read offline
Presented By: Sarfaraz Hussain
Software Consultant
Knoldus Inc
Text Mining using R
Agenda
01 Introduction to R
02 Data Structures in R
03 Machine Learning
04 Data Visualization
05 Text Mining
Introduction to R
◾ R is a programming language and software environment for statistical analysis, graphics representation
and reporting.
◾ It is made by statisticians and data miners for statistical analysis and graphical representation for
statistical computation.
◾ It was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in the
year 1993 and is currently developed by the R Development Core Team.
◾ It is an interpreted language.
◾ It allows integration with the procedures written in the C, C++, .Net, Python or FORTRAN languages for
efficiency.
R vs. Python
2013 2014 201 2016 2017 2018
◾ Speed – Winner = R
◾ Memory management – Winner = Python
◾ Visualization – Winner = R
◾ Deep Learning support – Winner = Python
◾ R is great for statistical analysis and RStudio is a big advantage for ease of task.
◾ Python is great for deep learning task.
Data Structures in R
Machine Learning
Linear Regression:
◾ Regression analysis is a form of predictive modeling technique which investigates the relationship
between a dependent (target) and independent variable(s) (predictor).
◾ It falls under Supervised Learning technique.
◾ Here, we fit a curve / line to the data points, in such a manner that the differences between the
distances of data points from the curve or line is minimized.
◾ In this technique, the dependent variable is continuous, independent variable(s) can be continuous or
discrete, and nature of regression line is linear.
◾ Linear Regression establishes a relationship between dependent variable (Y) and one or more
independent variables (X) using a best fit straight line (also known as regression line).
Machine Learning (contd.)
Data Visualization
Using package “plotrix”, we will see the following visualizations in action:
- 2D Pie Chart
- 3D Pie Chart
- Bar Chart
Text Mining
◾ Text Mining or Text Analytics is one of the branch of Data Analytics where we specifically look at
the textual data.
◾ It is the process of extracting meaning insights from text (unstructured).
◾ We can analyze words, clusters of words used in documents using various
algorithms/packages.
◾ In the most general terms, text mining will “turn text into numbers”.
Text Mining (contd.)
◾ CORPUS: A text corpus is a large and unstructured set of texts.
◾ Term Document Matrix and Document Term Matrix:
Thank You !

More Related Content

What's hot

Text Mining Infrastructure in R
Text Mining Infrastructure in RText Mining Infrastructure in R
Text Mining Infrastructure in RAshraf Uddin
 
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Sean Golliher
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackBhaskar Mitra
 
Image Similarity Detection at Scale Using LSH and Tensorflow with Andrey Gusev
Image Similarity Detection at Scale Using LSH and Tensorflow with Andrey GusevImage Similarity Detection at Scale Using LSH and Tensorflow with Andrey Gusev
Image Similarity Detection at Scale Using LSH and Tensorflow with Andrey GusevDatabricks
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalBhaskar Mitra
 
Scalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large FoldersScalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large Foldersfeiwin
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligencevini89
 
Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Mehwish Alam
 
Framester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data HubFramester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data HubMehwish Alam
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Bhaskar Mitra
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information RetrievalBhaskar Mitra
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document RankingBhaskar Mitra
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisMehwish Alam
 
Python and R for quantitative finance
Python and R for quantitative financePython and R for quantitative finance
Python and R for quantitative financeLuca Sbardella
 
K anonymity for crowdsourcing database
K anonymity for crowdsourcing databaseK anonymity for crowdsourcing database
K anonymity for crowdsourcing databaseLeMeniz Infotech
 

What's hot (20)

Text Mining Analytics 101
Text Mining Analytics 101Text Mining Analytics 101
Text Mining Analytics 101
 
Text Mining Infrastructure in R
Text Mining Infrastructure in RText Mining Infrastructure in R
Text Mining Infrastructure in R
 
Ir 02
Ir   02Ir   02
Ir 02
 
inteSearch: An Intelligent Linked Data Information Access Framework
inteSearch: An Intelligent Linked Data Information Access FrameworkinteSearch: An Intelligent Linked Data Information Access Framework
inteSearch: An Intelligent Linked Data Information Access Framework
 
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
 
Ju3517011704
Ju3517011704Ju3517011704
Ju3517011704
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning Track
 
Image Similarity Detection at Scale Using LSH and Tensorflow with Andrey Gusev
Image Similarity Detection at Scale Using LSH and Tensorflow with Andrey GusevImage Similarity Detection at Scale Using LSH and Tensorflow with Andrey Gusev
Image Similarity Detection at Scale Using LSH and Tensorflow with Andrey Gusev
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrieval
 
Scalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large FoldersScalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large Folders
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.
 
Framester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data HubFramester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data Hub
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document Ranking
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept Analysis
 
Python and R for quantitative finance
Python and R for quantitative financePython and R for quantitative finance
Python and R for quantitative finance
 
The Duet model
The Duet modelThe Duet model
The Duet model
 
K anonymity for crowdsourcing database
K anonymity for crowdsourcing databaseK anonymity for crowdsourcing database
K anonymity for crowdsourcing database
 

Similar to Text Mining Using R

GET STARTED WITH R FOR DATA SCIENCE
GET STARTED WITH R FOR DATA SCIENCEGET STARTED WITH R FOR DATA SCIENCE
GET STARTED WITH R FOR DATA SCIENCEUSDSI
 
Linear Regression with R programming.pptx
Linear Regression with R programming.pptxLinear Regression with R programming.pptx
Linear Regression with R programming.pptxanshikagoel52
 
employee turnover prediction document.docx
employee turnover prediction document.docxemployee turnover prediction document.docx
employee turnover prediction document.docxrohithprabhas1
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptxranapoonam1
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningEditor IJCATR
 
SHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxSHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxShahbazKhan77289
 
Adarsh_Masekar(2GP19CS003).pptx
Adarsh_Masekar(2GP19CS003).pptxAdarsh_Masekar(2GP19CS003).pptx
Adarsh_Masekar(2GP19CS003).pptxhkabir55
 
IRJET- Automatic Text Summarization using Text Rank
IRJET- Automatic Text Summarization using Text RankIRJET- Automatic Text Summarization using Text Rank
IRJET- Automatic Text Summarization using Text RankIRJET Journal
 
Running Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docx
Running Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docxRunning Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docx
Running Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docxjeanettehully
 
Data analytics using R programming
Data analytics using R programmingData analytics using R programming
Data analytics using R programmingUmang Singh
 
R Programming - part 1.pdf
R Programming - part 1.pdfR Programming - part 1.pdf
R Programming - part 1.pdfRohanBorgalli
 
DATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxDATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxmyworld93
 
IRJET - House Price Predictor using ML through Artificial Neural Network
IRJET - House Price Predictor using ML through Artificial Neural NetworkIRJET - House Price Predictor using ML through Artificial Neural Network
IRJET - House Price Predictor using ML through Artificial Neural NetworkIRJET Journal
 
IRJET- Information Retrieval & Text Analytics using Artificial Intelligence
IRJET- Information Retrieval & Text Analytics using Artificial IntelligenceIRJET- Information Retrieval & Text Analytics using Artificial Intelligence
IRJET- Information Retrieval & Text Analytics using Artificial IntelligenceIRJET Journal
 

Similar to Text Mining Using R (20)

GET STARTED WITH R FOR DATA SCIENCE
GET STARTED WITH R FOR DATA SCIENCEGET STARTED WITH R FOR DATA SCIENCE
GET STARTED WITH R FOR DATA SCIENCE
 
DataScience_RoadMap_2023.pdf
DataScience_RoadMap_2023.pdfDataScience_RoadMap_2023.pdf
DataScience_RoadMap_2023.pdf
 
Linear Regression with R programming.pptx
Linear Regression with R programming.pptxLinear Regression with R programming.pptx
Linear Regression with R programming.pptx
 
employee turnover prediction document.docx
employee turnover prediction document.docxemployee turnover prediction document.docx
employee turnover prediction document.docx
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptx
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data Mining
 
SHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxSHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docx
 
bda ppt.pptx
bda ppt.pptxbda ppt.pptx
bda ppt.pptx
 
Adarsh_Masekar(2GP19CS003).pptx
Adarsh_Masekar(2GP19CS003).pptxAdarsh_Masekar(2GP19CS003).pptx
Adarsh_Masekar(2GP19CS003).pptx
 
IRJET- Automatic Text Summarization using Text Rank
IRJET- Automatic Text Summarization using Text RankIRJET- Automatic Text Summarization using Text Rank
IRJET- Automatic Text Summarization using Text Rank
 
R tutorial
R tutorialR tutorial
R tutorial
 
Running Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docx
Running Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docxRunning Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docx
Running Head E-GRANT REQUIREMENTS2E-GRANT REQUIREMENTS .docx
 
LSESU a Taste of R Language Workshop
LSESU a Taste of R Language WorkshopLSESU a Taste of R Language Workshop
LSESU a Taste of R Language Workshop
 
Text Mining in R
Text Mining in RText Mining in R
Text Mining in R
 
Data analytics using R programming
Data analytics using R programmingData analytics using R programming
Data analytics using R programming
 
resume_MH
resume_MHresume_MH
resume_MH
 
R Programming - part 1.pdf
R Programming - part 1.pdfR Programming - part 1.pdf
R Programming - part 1.pdf
 
DATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxDATA MINING USING R (1).pptx
DATA MINING USING R (1).pptx
 
IRJET - House Price Predictor using ML through Artificial Neural Network
IRJET - House Price Predictor using ML through Artificial Neural NetworkIRJET - House Price Predictor using ML through Artificial Neural Network
IRJET - House Price Predictor using ML through Artificial Neural Network
 
IRJET- Information Retrieval & Text Analytics using Artificial Intelligence
IRJET- Information Retrieval & Text Analytics using Artificial IntelligenceIRJET- Information Retrieval & Text Analytics using Artificial Intelligence
IRJET- Information Retrieval & Text Analytics using Artificial Intelligence
 

More from Knoldus Inc.

Robusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxRobusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxKnoldus Inc.
 
Optimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxOptimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxKnoldus Inc.
 
Azure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxAzure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxKnoldus Inc.
 
CQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxCQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxKnoldus Inc.
 
ETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationKnoldus Inc.
 
Scripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationScripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationKnoldus Inc.
 
Getting started with dotnet core Web APIs
Getting started with dotnet core Web APIsGetting started with dotnet core Web APIs
Getting started with dotnet core Web APIsKnoldus Inc.
 
Introduction To Rust part II Presentation
Introduction To Rust part II PresentationIntroduction To Rust part II Presentation
Introduction To Rust part II PresentationKnoldus Inc.
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Configuring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAConfiguring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAKnoldus Inc.
 
Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)Knoldus Inc.
 
Azure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptxAzure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptxKnoldus Inc.
 
The Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and KotlinThe Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and KotlinKnoldus Inc.
 
Data Engineering with Databricks Presentation
Data Engineering with Databricks PresentationData Engineering with Databricks Presentation
Data Engineering with Databricks PresentationKnoldus Inc.
 
Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Knoldus Inc.
 
NoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptxNoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptxKnoldus Inc.
 
Mastering Distributed Performance Testing
Mastering Distributed Performance TestingMastering Distributed Performance Testing
Mastering Distributed Performance TestingKnoldus Inc.
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxKnoldus Inc.
 
Introduction to Ansible Tower Presentation
Introduction to Ansible Tower PresentationIntroduction to Ansible Tower Presentation
Introduction to Ansible Tower PresentationKnoldus Inc.
 
CQRS with dot net services presentation.
CQRS with dot net services presentation.CQRS with dot net services presentation.
CQRS with dot net services presentation.Knoldus Inc.
 

More from Knoldus Inc. (20)

Robusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxRobusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptx
 
Optimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxOptimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptx
 
Azure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxAzure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptx
 
CQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxCQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptx
 
ETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake Presentation
 
Scripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationScripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics Presentation
 
Getting started with dotnet core Web APIs
Getting started with dotnet core Web APIsGetting started with dotnet core Web APIs
Getting started with dotnet core Web APIs
 
Introduction To Rust part II Presentation
Introduction To Rust part II PresentationIntroduction To Rust part II Presentation
Introduction To Rust part II Presentation
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Configuring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAConfiguring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRA
 
Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)
 
Azure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptxAzure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptx
 
The Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and KotlinThe Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and Kotlin
 
Data Engineering with Databricks Presentation
Data Engineering with Databricks PresentationData Engineering with Databricks Presentation
Data Engineering with Databricks Presentation
 
Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)
 
NoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptxNoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptx
 
Mastering Distributed Performance Testing
Mastering Distributed Performance TestingMastering Distributed Performance Testing
Mastering Distributed Performance Testing
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptx
 
Introduction to Ansible Tower Presentation
Introduction to Ansible Tower PresentationIntroduction to Ansible Tower Presentation
Introduction to Ansible Tower Presentation
 
CQRS with dot net services presentation.
CQRS with dot net services presentation.CQRS with dot net services presentation.
CQRS with dot net services presentation.
 

Recently uploaded

GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 

Recently uploaded (20)

GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 

Text Mining Using R

  • 1. Presented By: Sarfaraz Hussain Software Consultant Knoldus Inc Text Mining using R
  • 2. Agenda 01 Introduction to R 02 Data Structures in R 03 Machine Learning 04 Data Visualization 05 Text Mining
  • 3. Introduction to R ◾ R is a programming language and software environment for statistical analysis, graphics representation and reporting. ◾ It is made by statisticians and data miners for statistical analysis and graphical representation for statistical computation. ◾ It was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in the year 1993 and is currently developed by the R Development Core Team. ◾ It is an interpreted language. ◾ It allows integration with the procedures written in the C, C++, .Net, Python or FORTRAN languages for efficiency.
  • 4. R vs. Python 2013 2014 201 2016 2017 2018 ◾ Speed – Winner = R ◾ Memory management – Winner = Python ◾ Visualization – Winner = R ◾ Deep Learning support – Winner = Python ◾ R is great for statistical analysis and RStudio is a big advantage for ease of task. ◾ Python is great for deep learning task.
  • 6. Machine Learning Linear Regression: ◾ Regression analysis is a form of predictive modeling technique which investigates the relationship between a dependent (target) and independent variable(s) (predictor). ◾ It falls under Supervised Learning technique. ◾ Here, we fit a curve / line to the data points, in such a manner that the differences between the distances of data points from the curve or line is minimized. ◾ In this technique, the dependent variable is continuous, independent variable(s) can be continuous or discrete, and nature of regression line is linear. ◾ Linear Regression establishes a relationship between dependent variable (Y) and one or more independent variables (X) using a best fit straight line (also known as regression line).
  • 8. Data Visualization Using package “plotrix”, we will see the following visualizations in action: - 2D Pie Chart - 3D Pie Chart - Bar Chart
  • 9. Text Mining ◾ Text Mining or Text Analytics is one of the branch of Data Analytics where we specifically look at the textual data. ◾ It is the process of extracting meaning insights from text (unstructured). ◾ We can analyze words, clusters of words used in documents using various algorithms/packages. ◾ In the most general terms, text mining will “turn text into numbers”.
  • 10. Text Mining (contd.) ◾ CORPUS: A text corpus is a large and unstructured set of texts. ◾ Term Document Matrix and Document Term Matrix: