Knowledge Discovery through Data Mining
C. Devakumar, Indian Council of Agricultural Research, New Delhi-110 012
Introduction
Why Data Mining?
Evolution of Database Technology
What is Data Mining?
What is not Data Mining?
Origins of Data Mining: Machine Learning/Pattern Recognition, Statistics/AI, Database systems
Data + Interestingness criteria = Hidden patterns
Type of data + Type of interestingness criteria = Type of patterns
Type of Data
Type of Interestingness
Statistics: Conceptual Model (Hypothesis) → Statistical Reasoning → "Proof" (Validation of Hypothesis)
Data mining: Data → Mining Algorithm Based on Interestingness → Pattern (model, rule, hypothesis) discovery
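The flow on this slide, in which a mining algorithm applies an interestingness criterion to data and returns patterns, can be sketched in a few lines. This is a minimal illustration and not taken from the slides: it assumes minimum support (the fraction of transactions that contain an item pair) as the interestingness criterion, and the names `frequent_pairs`, `baskets`, and the 0.5 threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support is at least min_support.

    Support is the fraction of transactions containing both items;
    it plays the role of the interestingness criterion here (an assumption).
    """
    counts = Counter()
    for items in transactions:
        # Count each unordered pair at most once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical toy data: each list is one shopping basket.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "milk", "eggs"],
    ["butter", "eggs"],
]

patterns = frequent_pairs(baskets, min_support=0.5)
print(patterns)
```

Unlike the statistics flow, no hypothesis is stated up front: the pair that survives the threshold is discovered from the data itself. Changing the criterion (for example, raising `min_support`) changes which patterns count as interesting.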
Explores your data, finds patterns, performs predictions
[Figure: business insight (presentation, exploration, discovery) versus role of software (passive, interactive, proactive), spanning canned reporting, ad-hoc reporting, OLAP, and data mining (predictive analysis)]
Data Mining Tasks
Data Mining Tasks (continued)
Challenges of Data Mining
Data Mining and Business Intelligence
[Figure: pyramid with increasing potential to support business decisions toward the top. Layers, top to bottom:]
Decision Making
Data Presentation: Visualization Techniques
Data Mining: Information Discovery
Data Exploration: Statistical Summary, Querying, and Reporting
Data Preprocessing/Integration, Data Warehouses
Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems
Roles shown alongside the pyramid, top to bottom: End User, Business Analyst, Data Analyst, DBA
Data Mining: Confluence of Multiple Disciplines
[Figure: Data Mining at the center, surrounded by Database Technology, Statistics, Machine Learning, Pattern Recognition, Algorithm, Visualization, and Other Disciplines]
Architecture: Typical Data Mining System
[Figure: components, top to bottom: Graphical User Interface; Pattern Evaluation; Data Mining Engine; Database or Data Warehouse Server; data cleaning, integration, and selection; sources (Database, Data Warehouse, World-Wide Web, Other Info Repositories). A Knowledge-Base is also shown.]
Multi-Dimensional View of Data Mining
Data Mining: Classification Schemes
Data Mining Functionalities
Data Mining Functionalities (2)
Major Issues in Data Mining
Bioinformatics, Computational Biology, Data Mining
Problems in Bioinformatics Domain
Tokenization splits a text document into a stream of words by removing all punctuation marks and by replacing tabs and other non-text characters with single white spaces.
Filtering methods remove words such as articles, conjunctions, and prepositions.
Lemmatization methods try to map verb forms to the infinitive and nouns to their singular form.
Stemming methods attempt to build the basic forms of words, for example by stripping the plural 's' from nouns, the 'ing' from verbs, or other affixes.
Additional linguistic preprocessing:
N-gram identification finds generic n-word sequences that do not necessarily correspond to an idiomatic use.
Anaphora resolution identifies the relationship between a linguistic expression (the anaphora) and its preceding phrase, thereby determining the corresponding reference.
Part-of-speech (POS) tagging determines the part-of-speech tag (noun, verb, adjective, etc.) for each term.
Text chunking aims at grouping adjacent words in a sentence.
Word Sense Disambiguation (WSD) tries to resolve ambiguity in the meaning of single words or phrases.
Parsing produces a full parse tree of a sentence (subject, object, etc.).
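The first preprocessing steps described above (tokenization, filtering, stemming, and n-gram extraction) can be sketched with the Python standard library alone. This is a rough illustration under stated assumptions, not the pipeline used in the cited work: the stop-word list is a tiny hypothetical sample, lowercasing is an added convenience, and `naive_stem` is a crude stand-in for a real stemmer that only strips 'ing' and a plural 's'.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are far longer).
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "in", "on", "to", "with"}


def tokenize(text):
    """Replace every run of non-letter characters with one space, then split."""
    return re.sub(r"[^A-Za-z]+", " ", text).lower().split()


def filter_stop_words(tokens):
    """Drop articles, conjunctions, and prepositions found in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip a trailing 'ing' or plural 's' (hypothetical, deliberately crude)."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def ngrams(tokens, n):
    """Return all sequences of n adjacent tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "Binding of the proteins to the receptors triggers signals in the cells."
tokens = tokenize(text)
stems = [naive_stem(t) for t in filter_stop_words(tokens)]
bigrams = ngrams(stems, 2)
print(stems)
print(bigrams)
```

Lemmatization, anaphora resolution, POS tagging, chunking, WSD, and parsing need linguistic resources or trained models, so they are left out of this sketch.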
Castellano, M. et al. A bioinformatics knowledge discovery in text application for grid computing. BMC Bioinformatics 2009, 10(Suppl 6):S23.
BIOINFORMATICS ARCHITECTURE
The layered architecture consists of the GATE 4.0 toolkit for text mining, a middleware solution written with the Java API, the grid infrastructure middleware, and a physical layer running a GNU/Linux operating system. GATE, an integrated development environment, was used for the text mining process. It operated on a collection of full-text scientific publications available on MedLine/PubMed (in PDF format).
Technology Platform
What is New in SQL Server 2008? Data Mining Enhancements
Microsoft DM Competitors
 
Types of Herbal Cosmetics its standardization.
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
 

Knowledge discovery thru data mining

  • 1. Knowledge Discovery through Data Mining. C. Devakumar, Indian Council of Agricultural Research, New Delhi-110 012
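  • 2. Introduction
  • 3. Why Data Mining?
  • 4. Evolution of Database Technology
  • 5. What is Data Mining?
  • 6. What is not Data Mining?
  • 7. Origins of Data Mining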
  • 8. Data + Interestingness criteria = Hidden patterns
  • 9. Data + Interestingness criteria = Hidden patterns (Type of Patterns)
  • 10. Data (Type of data) + Interestingness criteria (Type of Interestingness criteria) = Hidden patterns
  • 11. Type of Data
  • 12. Type of Interestingness
  • 13. Statistics: Conceptual Model (Hypothesis) → Statistical Reasoning → “Proof” (Validation of Hypothesis)
  • 14. Data mining: Data → Mining Algorithm Based on Interestingness → Pattern (model, rule, hypothesis) discovery
  • 15. Explores your data; finds patterns; performs predictions
  • 16. Role of Software: Passive (Presentation), Interactive (Exploration), Proactive (Discovery). Tools: canned reporting, ad-hoc reporting, OLAP, data mining. Outcomes: business insight, predictive analysis.
  • 17. Data Mining Tasks
  • 18.
  • 19.
  • 20. Data Mining and Business Intelligence. Layers, from highest to lowest potential to support business decisions: Decision Making; Data Presentation (visualization techniques); Data Mining (information discovery); Data Exploration (statistical summary, querying, and reporting); Data Preprocessing/Integration, Data Warehouses; Data Sources (paper, files, Web documents, scientific experiments, database systems). Roles: End User, Business Analyst, Data Analyst, DBA.
  • 21. Data Mining: Confluence of Multiple Disciplines. Database Technology, Statistics, Machine Learning, Pattern Recognition, Algorithm, Visualization, Other Disciplines
  • 22. Architecture: Typical Data Mining System. Components: Graphical User Interface; Pattern Evaluation; Data Mining Engine; Database or Data Warehouse Server; Knowledge-Base. Data cleaning, integration, and selection operate on the underlying sources: Database, Data Warehouse, World-Wide Web, Other Info Repositories.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30. Tokenization splits a text document into a stream of words by removing all punctuation marks and by replacing tabs and other non-text characters with single white spaces.
      - Filtering methods remove words such as articles, conjunctions, and prepositions.
      - Lemmatization methods try to map verb forms to the infinitive and nouns to their singular form.
      - Stemming methods attempt to build the basic forms of words, for example by stripping the plural 's' from nouns, the 'ing' from verbs, or other affixes.
    Additional linguistic preprocessing:
      - N-gram identification extracts generic n-word sequences that do not necessarily correspond to an idiomatic use.
      - Anaphora resolution identifies the relationship between a linguistic expression (the anaphor) and its preceding phrase, thereby determining the corresponding reference.
      - Part-of-speech (POS) tagging determines the part-of-speech tag (noun, verb, adjective, etc.) for each term.
      - Text chunking aims at grouping adjacent words in a sentence.
      - Word Sense Disambiguation (WSD) tries to resolve the ambiguity in the meaning of single words or phrases.
      - Parsing produces a full parse tree of a sentence (subject, object, etc.).
  • 31. Castellano, M. et al. A bioinformatics knowledge discovery in text application for grid computing. BMC Bioinformatics 2009, 10(Suppl 6):S23.
  • 32. Bioinformatics Architecture. The layered architecture consists of the GATE 4.0 toolkit for text mining, a middleware solution written using the Java API, the grid infrastructure middleware, and a physical layer consisting of a GNU/Linux operating system. GATE, an integrated development environment, was used for the text mining process. It operated on a collection of full-text scientific publications available on MedLine/PubMed (in PDF format).
  • 34.
  • 35.
  • 36.  
  • 37.
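
The preprocessing steps listed on slide 30 (tokenization, filtering, stemming, and n-gram extraction) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not the method used in the slides or in GATE. The stop-word list, the suffix rules, the lowercasing step, and all function names are assumptions introduced here for the example.

```python
import re

# Hypothetical, deliberately tiny stop-word list (articles, conjunctions,
# prepositions). A real filter would use a much larger list.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on", "to", "with", "for"}


def tokenize(text):
    """Replace punctuation, tabs and other non-alphanumeric characters with
    single spaces, then split into words. Lowercasing is an added assumption."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", " ", text)
    return cleaned.lower().split()


def filter_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Crude suffix stripping: remove 'ing', or a plural 's'.
    Assumed rules for illustration only; not a real stemming algorithm."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def ngrams(tokens, n):
    """Return all contiguous n-word sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(text):
    """Tokenize, filter stop words, then stem each remaining token."""
    return [naive_stem(t) for t in filter_stop_words(tokenize(text))]


if __name__ == "__main__":
    terms = preprocess("Finding patterns, in the\tdatabases and files!")
    print(terms)             # ['find', 'pattern', 'database', 'file']
    print(ngrams(terms, 2))  # [('find', 'pattern'), ('pattern', 'database'), ('database', 'file')]
```

The suffix rules are intentionally simplistic and will mangle many words (for example, "mining" becomes "min"). Lemmatization, part-of-speech tagging, anaphora resolution, and parsing need linguistic resources and are not attempted here.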