TEXT MINING
Presented By:
Prakhyath Rai
Asst. Professor, Dept. of ISE,
SCEM, Mangaluru
Outline
• Introduction
• Data Mining vs. Text Mining
• Motivation for Text Mining
• I/O Model for Text Mining
• Steps for Text Mining
• Key Terms in Text Mining
• Text Mining Frameworks
• Merits of Text Mining
• Applications of Text Mining
• Demerits of Text Mining
• References
Prakhyath Rai, Asst. Professor, Dept. of ISE, SCEM, Mangaluru-575007
Introduction
Text mining is a discovery process: it uncovers previously unknown information in text.
Text mining is also referred to as Text Data Mining (TDM) and Knowledge Discovery in Textual Databases (KDT).
Text mining is used to extract relevant information, knowledge, or patterns from sources that are in unstructured or semi-structured form.
Introduction (Cont.)
Extract and discover knowledge hidden in text automatically.
Aid domain experts by automatically:
• identifying concepts
• extracting facts and relations
• discovering implicit links
• generating hypotheses
Data Mining vs. Text Mining
Data Mining | Text Mining
Processes data directly | Requires linguistic processing or natural language processing (NLP)
Identifies causal relationships | Discovers heretofore unknown information
Structured data | Semi-structured and unstructured data (text)
Structured numeric transaction data residing in a relational data warehouse | Applications deal with much more diverse and eclectic collections of systems and formats
Motivation for Text Mining
Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation).
Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
Input-Output Model for Text Mining
Input: Documents
Process: Text Mining Technique
Output: Patterns, Connections, Trends
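The input-output model above can be illustrated with a minimal sketch. It assumes that a "pattern" is a term appearing in many documents and a "connection" is a pair of terms appearing in the same document; these definitions, the function name `mine`, and the sample documents are illustrative assumptions, not part of the original slides. Trends are left out because they would need time-stamped documents.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def mine(documents, top_n=3):
    """Map input documents to simple 'patterns' and 'connections' (assumed definitions)."""
    term_counts = Counter()   # number of documents each term appears in
    pair_counts = Counter()   # number of documents each term pair co-occurs in
    for doc in documents:
        terms = sorted(set(tokenize(doc)))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    return {
        "patterns": term_counts.most_common(top_n),
        "connections": pair_counts.most_common(top_n),
    }

# Hypothetical input documents
documents = [
    "text mining finds patterns",
    "text mining finds trends",
    "data mining finds patterns",
]
result = mine(documents)
print(result["patterns"])
print(result["connections"])
```

In this sketch the most frequent connection is the pair ("finds", "mining"), which occurs in all three sample documents.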
Steps for Text Mining
1. Pre-Processing the Text
2. Applying Text Mining Techniques
• Summarization
• Classification
• Clustering
• Visualization
• Information Extraction
3. Analyzing the Text
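The steps above can be sketched end to end for one of the listed techniques, summarization. This is a minimal sketch assuming a frequency-based extractive summarizer: the stop-word list, the scoring rule, and the function names are illustrative assumptions rather than the method used in the slides.

```python
import re
from collections import Counter

# Assumed, illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "and", "is", "in", "of", "the", "to", "from"}

def preprocess(text):
    """Step 1: lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def summarize(text, max_sentences=1):
    """Step 2: keep the sentences whose terms are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(preprocess(text))

    def score(sentence):
        return sum(freq[t] for t in preprocess(sentence))

    ranked = sorted(sentences, key=score, reverse=True)
    chosen = set(ranked[:max_sentences])
    return [s for s in sentences if s in chosen]  # keep original sentence order

# Hypothetical input text
text = (
    "Text mining extracts knowledge from text. "
    "The weather was pleasant. "
    "Mining text requires pre-processing the text."
)
summary = summarize(text)

# Step 3: analyze the result, here simply by inspecting it.
print(summary)
```

The off-topic sentence about the weather scores lowest because its terms are rare in the text, so it is never selected first.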
Key Terms in Text Mining
• Information Retrieval (IR)
The science of searching for information in documents, for documents themselves, for metadata which describe documents, or for text, sound, images or data within databases, whether relational stand-alone databases or hypertext networked databases such as the Internet or intranets.
• Artificial Intelligence (AI)
Artificial intelligence (AI) is a branch of computer science and engineering that deals with intelligent behavior, learning, and adaptation in machines.
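Searching for documents, as described under the first key term above, is commonly implemented with an inverted index that maps each term to the documents containing it. The following is a minimal sketch under that assumption; the document identifiers, sample texts, and function names are hypothetical.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z]+", text.lower()):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = re.findall(r"[a-z]+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical document collection
documents = {
    "d1": "Text mining discovers knowledge in text",
    "d2": "Data mining works on structured data",
    "d3": "Knowledge discovery in textual databases",
}
index = build_index(documents)
hits = search(index, "mining knowledge")
print(hits)
```

Only "d1" contains both query terms, so it is the only hit; a query with an unknown term returns an empty set.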
Merits of Text Mining
A database is limited to storing a small amount of information, whereas text mining overcomes this limitation.
Extraction of relevant information and relationships from natural-language documents.
Extraction of information from unstructured or semi-structured documents.
Applications of Text Mining
Analysis of Market Trends
• Classification Technique
• Information Extraction Technique
Analysis and Screening of Junk Emails
• Classification on the basis of pre-defined frequently occurring items
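Junk email screening on the basis of pre-defined frequently occurring items can be sketched as a simple term-matching classifier. The term list, the scoring rule, and the threshold below are hypothetical choices made for illustration; practical filters typically learn such items from labeled mail.

```python
import re

# Hypothetical pre-defined terms that occur frequently in junk mail.
JUNK_TERMS = {"free", "winner", "prize", "urgent", "offer"}

def junk_score(message):
    """Return the fraction of words in the message that are pre-defined junk terms."""
    words = re.findall(r"[a-z]+", message.lower())
    if not words:
        return 0.0
    return sum(w in JUNK_TERMS for w in words) / len(words)

def is_junk(message, threshold=0.2):
    """Classify a message as junk when its score reaches the assumed threshold."""
    return junk_score(message) >= threshold

print(is_junk("Urgent offer: claim your free prize"))
print(is_junk("Meeting notes for the project review"))
```

The first message is flagged because four of its six words are junk terms, while the second contains none.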
Demerits of Text Mining
Requires an initial learned information system for the initial extraction.
Suitable programs have not yet been defined to analyze text for mining knowledge or information.
References
[1] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval”, ACM Press, New York, 1999.
[2] Ning Zhong, Yuefeng Li and Sheng-Tang Wu, “Effective Pattern Discovery for Text Mining”, IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 1, January 2012.
[3] Raymond J. Mooney and Un Yong Nahm, “Text Mining with Information Extraction”, Proceedings of the 4th International MIDP Colloquium, pages 141-160, Van Schaik Pub., South Africa, 2005.
[4] M. E. Califf and R. J. Mooney, “Relational Learning of Pattern-Match Rules for Information Extraction”, Proceedings of the 16th National Conference on Artificial Intelligence (AAAI-99), pages 328-334, Orlando, FL, July 1999.
[5] D. Freitag and N. Kushmerick, “Boosted Wrapper Induction”, Proceedings of the 17th National Conference on Artificial Intelligence (AAAI-2000), pages 577-583, Austin, TX, July 2000.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 

Text Mining

  • 1. TEXT MINING. Presented by: Prakhyath Rai, Asst. Professor, Dept. of ISE, SCEM, Mangaluru
  • 2. Outline: Introduction; Data Mining vs. Text Mining; Motivation for Text Mining; I/O Model for Text Mining; Steps for Text Mining; Key Terms in Text Mining; Text Mining Frameworks; Merits of Text Mining; Applications of Text Mining; Demerits of Text Mining; References
  • 3. Introduction Text Mining is a Discovery Text Mining is also referred as Text Data Mining (TDM) and Knowledge Discovery in Textual Database (KDT). Text Mining is used to extract relevant information or knowledge or pattern from different sources that are in unstructured or semi-structured form. Prakhyath Rai, Asst. Professor, Dept. of ISE, SCEM, Mangaluru-575007
  • 4. Introduction (cont.): Text mining extracts and discovers knowledge hidden in text automatically. It aids domain experts by automatically identifying concepts, extracting facts and relations, discovering implicit links, and generating hypotheses.
  • 5. Data Mining vs. Text Mining
      Data Mining | Text Mining
      Processes data directly | Requires linguistic processing or natural language processing (NLP)
      Identifies causal relationships | Discovers heretofore unknown information
      Structured data | Semi-structured and unstructured data (text)
      Structured numeric transaction data residing in a relational data warehouse | Applications deal with much more diverse and eclectic collections of systems and formats
  • 6. Motivation for Text Mining: Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation). Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
  • 7. Input-Output Model for Text Mining: Input (documents) → Text mining technique → Output (patterns, connections, trends)
  • 8. Steps for Text Mining: (1) pre-processing the text; (2) applying text mining techniques (summarization, classification, clustering, visualization, information extraction); (3) analyzing the text.
  • 9. Key Terms in Text Mining
      Information Extraction (IE): the science of searching for information in documents, for the documents themselves, for metadata that describes documents, or for text, sound, images, or data within databases, whether relational stand-alone databases or hypertext networked databases such as the Internet or intranets.
      Artificial Intelligence (AI): a branch of computer science and engineering that deals with intelligent behavior, learning, and adaptation in machines.
  • 10. Merits of Text Mining: A database is limited to storing less information, whereas text mining overcomes this limitation. Text mining extracts relevant information and relationships from natural-language documents, and it extracts information from unstructured or semi-structured documents.
  • 11. Applications of Text Mining
      Analysis of market trends: classification technique; information extraction technique.
      Analysis and screening of junk emails: classification on the basis of pre-defined, frequently occurring items.
  • 12. Demerits of Text Mining: Requires an initially trained information system for the initial extraction. Suitable programs have not yet been defined to analyze text for mining knowledge or information.
  • 13. References
      [1] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval,” ACM Press, New York, 1999.
      [2] Ning Zhong, Yuefeng Li, and Sheng-Tang Wu, “Effective Pattern Discovery for Text Mining,” IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 1, January 2012.
      [3] Raymond J. Mooney and Un Yong Nahm, “Text Mining with Information Extraction,” Proceedings of the 4th International MIDP Colloquium, pages 141-160, Van Schaik Pub., South Africa, 2005.
      [4] M. E. Califf and R. J. Mooney, “Relational Learning of Pattern-Match Rules for Information Extraction,” Proceedings of the 16th National Conference on Artificial Intelligence (AAAI-99), pages 328-334, Orlando, FL, July 1999.
      [5] D. Freitag and N. Kushmerick, “Boosted Wrapper Induction,” Proceedings of the 17th National Conference on Artificial Intelligence (AAAI-2000), pages 577-583, Austin, TX, July 2000.
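
The pre-processing step on slide 8 and the junk-email screening on slide 11 ("classification on the basis of pre-defined frequently occurring items") can be illustrated with a small sketch. This is not the presenter's method: the stop-word list, the junk-term list, the function names, and the threshold of 2 are all hypothetical choices made only for this example, and a real system would typically learn such terms from labelled mail.

```python
import re
from collections import Counter

# Hypothetical stop words and junk terms, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "in", "for", "you", "your", "at"}
JUNK_TERMS = {"free", "winner", "prize", "offer", "click"}


def preprocess(text):
    """Pre-processing: lowercase, split into word tokens, drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def is_junk(text, threshold=2):
    """Classification: flag a message as junk when the pre-defined junk
    terms occur at least `threshold` times in total (assumed cut-off)."""
    counts = term_frequencies(preprocess(text))
    hits = sum(counts[term] for term in JUNK_TERMS)
    return hits >= threshold


if __name__ == "__main__":
    messages = [
        "Click now to claim your free prize",
        "The meeting is moved to Monday at ten",
    ]
    for message in messages:
        print(is_junk(message), "-", message)
```

The fixed term list keeps the sketch short and makes the link to "pre-defined items" explicit; raising or lowering `threshold` trades missed junk against false alarms.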