06/04/16 07:10 PM1
DATA MINING IN AGRICULTURE
PRESENTED BY:-
SIBANANDA KHATAI
Contents
• HISTORY
• INTRODUCTION
• DATA VS. INFORMATION
• DATA MINING- CONCEPT
• TECHNIQUES USED IN DATA MINING
• ALGORITHMS USED IN DATA MINING
• ROLE OF DATA MINING
• REVIEW OF LITERATURE
• CONCLUSION
INTRODUCTION
• Agriculture is a business with risk
• It depends on climate, geography, and political and economic factors
• Some risks can be quantified using mathematical and statistical methods and advanced computing
• The challenge is to extract information from agricultural databases
• Data mining is a technology that can bring this knowledge to agricultural development
HISTORY
EVOLUTION OF DATA SCIENCE
Evolutionary Steps | Enabling Technologies | Product Providers
Data Collection (1960s) | Computers, tapes | IBM
Data Access (1980s) | Relational databases | Oracle, Informix, IBM, Microsoft
Data Warehousing | OLAP, multidimensional databases, data warehouses | Pilot, Microstrategy
Data Mining (Emerging Today) | Advanced algorithms, multiprocessor computers | Pilot, IBM, others
DATA VS. INFORMATION
MEASUREMENTS OF DATA
RAPID GROWTH OF DATA
• Databases today are huge:
– More than 1,000,000 entities/records/rows
– From 10 to 10,000 fields/attributes/variables
– Gigabytes and terabytes, and now petabytes and exabytes
• Databases are growing at an unprecedented rate
• The corporate world is a cut-throat world
– Decisions must be made rapidly
– Decisions must be made with maximum knowledge
KNOWLEDGE DISCOVERY IN DATABASES (KDD)
1. Databases
2. Data Cleaning and Data Integration
3. Data Warehouse
4. Selection of Task-relevant Data
5. Data Mining
6. Pattern Evaluation
7. KNOWLEDGE
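The KDD stages listed on this slide can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions: the record fields (`district`, `rainfall_mm`, `yield_t_ha`), the function names, the threshold `min_gap`, and every value are hypothetical and do not come from the slides.

```python
from statistics import mean

# Hypothetical raw records from two sources; all values are made up for illustration.
source_a = [
    {"district": "A", "rainfall_mm": 820, "yield_t_ha": 2.9},
    {"district": "B", "rainfall_mm": None, "yield_t_ha": 2.1},  # missing value
]
source_b = [
    {"district": "C", "rainfall_mm": 450, "yield_t_ha": 1.6},
    {"district": "D", "rainfall_mm": 910, "yield_t_ha": 3.2},
]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [record for source in sources for record in source]

def clean(records):
    """Data cleaning: drop records that have a missing field."""
    return [r for r in records if all(v is not None for v in r.values())]

def select(records, fields):
    """Selection: keep only the task-relevant fields."""
    return [{f: r[f] for f in fields} for r in records]

def mine(records):
    """Data mining: compare mean yield of wetter and drier records."""
    threshold = mean(r["rainfall_mm"] for r in records)
    wet = [r["yield_t_ha"] for r in records if r["rainfall_mm"] >= threshold]
    dry = [r["yield_t_ha"] for r in records if r["rainfall_mm"] < threshold]
    return {"wet_mean_yield": mean(wet), "dry_mean_yield": mean(dry)}

def evaluate(pattern, min_gap=0.5):
    """Pattern evaluation: keep the pattern only if the yield gap is large enough."""
    return pattern["wet_mean_yield"] - pattern["dry_mean_yield"] >= min_gap

warehouse = clean(integrate(source_a, source_b))
task_data = select(warehouse, ["rainfall_mm", "yield_t_ha"])
pattern = mine(task_data)
is_interesting = evaluate(pattern)
```

Each function stands in for one stage, so a real project would replace the body of a stage (for example, a proper mining algorithm in `mine`) without changing the overall flow.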
DATA PRESENTATION
• Graphical: bar charts, pie charts, histograms
• Geometric: scatter plot
• Icon-based: using colors and figures as icons
• Pixel-based: data as colored pixels
WHAT IS DATA MINING?
DEFINITION
The process of discovering interesting and useful patterns and relationships in large volumes of data.
Encyclopaedia Britannica
• Other synonyms for data mining are knowledge extraction, pattern analysis, and data archaeology
GOALS OF DATA MINING
• Prediction: how certain attributes within the data will behave in the future.
• Identification: identify the existence of an item, an event, or an activity.
• Classification: partition the data into categories.
• Optimization: optimize the use of limited resources.
Why Mine Data?
• Lots of data is being collected and warehoused
– Web data, e-commerce
– Purchases at department/grocery stores
– Bank/Credit Card transactions
– Crop databases
• Computers have become cheaper and more powerful
Why Mine Data? Scientific Viewpoint
• Data is collected and stored at enormous speeds (GB/hour)
– Remote sensors on a satellite
– Telescopes scanning the skies
– Microarrays generating gene expression data
– Scientific simulations generating terabytes of data
• Traditional techniques are infeasible for raw data
• Data mining may help scientists
– In classifying and segmenting data
– In hypothesis formation
Sources of Data for Mining
• Data warehouses
• Transactional databases
• Spatial and temporal
• Time-series
• Multimedia, text
What Are the Techniques Used in Data Mining?
1. STATISTICS
2. MACHINE LEARNING
3. DATABASE SYSTEMS
4, 5, 6, 7… LET'S SEE
1. Statistics
2. Machine Learning
The ability to automatically learn to recognize complex patterns and make intelligent decisions based on data.
3. Database Systems and Data Warehouses
• A repository of information collected from multiple sources, stored under a unified schema, and usually residing at a single site.
4. Information Retrieval (IR)
• The activity of obtaining information resources relevant to an information need from a collection of information resources.
• Textual data mining and multimedia mining, integrated with information retrieval methods, are of great importance.
5. Algorithms
6. Pattern Recognition
• A branch of machine learning that focuses on the recognition of patterns and regularities in data
7. High Performance Computing
• The use of parallel processing to run advanced application programs efficiently, reliably and quickly.
Regression Models
• Regression is a data mining (machine learning) technique used to fit an equation to a dataset.
• A straight line is given by the equation y = mx + c; regression determines approximate values for m and c so that y can be calculated for a particular value of x.
• Multiple regression uses more than one input variable and allows more complex models to be fitted.
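The straight-line fit y = mx + c can be illustrated with ordinary least squares, one common way to estimate m and c. The sketch below is a minimal example under stated assumptions: the function name `fit_line` and the fertilizer-dose data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates of slope m and intercept c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Hypothetical data: fertilizer dose (kg/ha) against yield (t/ha), made up for illustration.
dose = [0, 20, 40, 60, 80]
crop_yield = [1.0, 1.5, 2.0, 2.5, 3.0]

m, c = fit_line(dose, crop_yield)
predicted = m * 50 + c  # estimate y for a new value of x
```

Multiple regression follows the same idea with several input variables, which is usually handled with a matrix formulation rather than the closed-form sums used here.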
Summary
Data Mining Methodologies
• Neural Networks
• K-means
• Fuzzy set
• Bayesian network
• K-nearest Neighbour
• Support Vector Machine
• Decision Tree Analysis
• WEKA Tool
Neural Networks
• An information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information.
Bayesian Network
• Originates from Bayes' theorem
• Relies on posterior probability, the probability of an event updated in the light of evidence
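Bayes' theorem gives the posterior probability as P(H | E) = P(E | H) P(H) / P(E). The sketch below works one example for a binary hypothesis; the function name `posterior` and all the numbers (infestation rate, trap hit rate, false alarm rate) are hypothetical and used only to show the arithmetic.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from Bayes' theorem for a binary hypothesis H and evidence E."""
    # Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|not H)P(not H).
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers, chosen only for illustration:
# 10% of fields are infested, a trap signals 90% of infested fields,
# and it gives a false alarm on 20% of healthy fields.
p_infested_given_alarm = posterior(prior=0.10, likelihood=0.90, false_positive_rate=0.20)
```

With these assumed inputs the posterior is 0.09 / 0.27, about one in three, which shows how a fairly reliable signal can still leave substantial uncertainty when the prior is low.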
Waikato Environment for Knowledge Analysis (WEKA)
i. Machine learning software written in Java
ii. Free software
iii. Analyzes data from agricultural domains
iv. Visualization tools and algorithms for data analysis and predictive modelling
Advantages of WEKA Include
• Runs on almost any modern computing platform.
• A comprehensive collection of data preprocessing and modeling techniques.
• Ease of use due to its graphical user interfaces.
• Supports several standard data mining tasks, more specifically data preprocessing, clustering, classification, regression, visualization, and feature selection.
Commercial Tools
• Oracle Data Miner
– http://www.oracle.com
• Data To Knowledge
– http://alg.ncsa.uiuc.edu
• SAS
– http://www.sas.com/
• Clementine
– http://spss.com/clemetine/
• Intelligent Miner
– http://www-306.ibm.com/software
MATLAB
• MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and fourth-generation programming language.
• It is a proprietary programming language developed by MathWorks.
• MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, and creation of user interfaces.
• It can interface with programs written in other languages, including C, C++, Java, Fortran and Python.
R
• R is a programming language and software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing.
• It is widely used among statisticians and data miners for developing statistical software and for data analysis.
• Polls, surveys of data miners, and studies of scholarly literature databases show that R's popularity has increased substantially in recent years.
Role of Data Mining in Agriculture
• Influence of climate on kharif and rabi crops
• Crop yield estimation
• Estimation of damage caused by pests
• Mushroom grading
• Spatial data mining reveals interesting patterns related to agriculture
Role in Agriculture Domain
Data mining methodologies | Applications
Neural Networks | Weather forecasting, prediction of rainfall
K-means | Classifying soil in combination with GPS, wine fermentation problem, yield prediction
Fuzzy set | Detecting weeds in precision agriculture
Bayesian network | Models for agricultural purposes developed with the Bayesian network learning method
K-nearest Neighbour | Simulating daily precipitation and other weather conditions
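As one illustration of the methods in this table, the basic K-means algorithm alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its cluster. The sketch below is a minimal version under stated assumptions: the soil features (pH and organic carbon), the sample values, and the choice of starting centroids are all hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Basic K-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from p to every centroid.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical soil samples as (pH, organic carbon %); values are made up.
samples = [(5.1, 0.4), (5.3, 0.5), (5.0, 0.45), (7.8, 1.1), (8.0, 1.2), (7.9, 1.0)]
centroids, clusters = kmeans(samples, centroids=[samples[0], samples[3]])
```

The starting centroids are fixed here so the result is repeatable; practical implementations usually pick them randomly or with a seeding heuristic and scale the features first.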
Typical Applications of Data Mining
Review of Literature
A Study on Effect of Weather Parameters by Artificial Neural Networks on Yield of Aonla (Indian Gooseberry) under Different Fertilizers Treatments
• The temperature range in April has a high correlation coefficient with yield (highest)
• Extreme variations negatively affect the aonla fruit yield.
Kulshrestha et al., 2010, Anand
Geospatial Data Mining Techniques: Knowledge Discovery in Agriculture
• Visualization of the spatial OLAP of Gujarat
• Concluded that the integration of computer science with agriculture will generate new emission in the management of agricultural information
Bhojani, 2013, Anand
Time Series Forecasting of Losses due to Pod Borer, Pod Fly and Productivity of Pigeonpea (Cajanus cajan) for North West Plain Zone (NWPZ) by Using Artificial Neural Network (ANN)
Pod damage by pod fly:
– 2010-11: damage found as 22.21%
– 2011-12: damage found as 18.77%
– 2012-13: damage found as 16.72%
A linear regression between the network outputs and the corresponding targets gave an R² value of 0.88, indicating that the fit was reasonably good for all data sets.
Kumari et al., 2014, Varanasi
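For a simple linear regression between network outputs and targets, R² equals the squared Pearson correlation between the two series. The sketch below shows that calculation; the function name `r_squared` and the sample values are hypothetical and are not the study's data.

```python
def r_squared(outputs, targets):
    """R² of a simple linear regression, computed as the squared Pearson correlation."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    return cov ** 2 / (var_o * var_t)

# Hypothetical network outputs and observed targets (not the study's data).
outputs = [1.0, 2.0, 3.0, 4.0]
targets = [1.0, 2.0, 3.0, 5.0]
r2 = r_squared(outputs, targets)
```

A value close to 1 means the outputs track the targets closely along a straight line; a value near 0 means a linear relation explains little of the variation.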
A Survey on Data Mining Techniques for Crop Yield Prediction
• The K-Means algorithm is used to forecast pollution
• K Nearest Neighbor (KNN) is applied for simulating daily precipitation
• The K-Means approach is used for classifying soils in combination with GPS
Ramesh et al., 2014, Bangalore
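The K Nearest Neighbor idea mentioned on this slide can be sketched as follows: to estimate a value for a new day, find the k most similar past days and average their observed values. This is a minimal sketch, not the surveyed authors' method; the features (temperature and humidity), the function name `knn_predict`, and all values are hypothetical.

```python
def knn_predict(train, query, k=3):
    """Predict a value as the mean of the targets of the k nearest training points."""
    # Sort training items by squared Euclidean distance to the query point.
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / k

# Hypothetical days: (temperature in °C, relative humidity %) -> precipitation in mm.
history = [
    ((30.0, 40.0), 0.0),
    ((29.0, 45.0), 0.0),
    ((25.0, 80.0), 12.0),
    ((24.0, 85.0), 15.0),
    ((26.0, 78.0), 9.0),
]
estimate = knn_predict(history, query=(25.0, 82.0), k=3)
```

The features are used unscaled here for brevity; in practice they are usually normalized so that no single feature dominates the distance.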
A Survey on Data Mining Techniques in Agriculture
• The use of information technology in agriculture can change how decisions are made, and farmers can obtain better yields
• Data mining plays a crucial role in decision making on several issues related to the agriculture field.
Geetha et al., 2015, Coimbatore
Prediction of Students Recruitment Process Using Data Mining Techniques with Classification Rules
Malathi et al., 2015, Tamilnadu
Text Recognition
• Feature extraction
• Clustering and pattern matching
• k-NN classifier
Krishan et al., 2016, Haridwar
The Application of Data Mining Techniques to Characterize Agricultural Soil Profiles
Armstrong et al., Australia
Implementation of Data Mining Techniques for Meteorological Data Analysis
Sarah et al., Palestine
Data Mining - An Evolutionary View of Agriculture
• GPS techniques may be employed to discover important information from agriculture-related data, such as soil identification.
• Data mining techniques were adopted to estimate crop yield from existing data.
Abhishek et al., 2014, Amaravati
Data Mining in Agriculture on Crop Price Prediction: Techniques and Applications
• Data mining techniques were adopted to estimate crop prices from existing data
• The K-Means approach utilizes only the basic algorithm
Manpreet et al., 2014, Bangalore
Data Mining Techniques for Predicting Crop Productivity – A Review Article
• Data mining is a relatively novel research field and is expected to grow in the future
• A multidisciplinary approach that integrates computer science with agriculture will help in forecasting and managing agricultural crops effectively.
Veenadhari et al., 2011, Bhopal
Applying Data Mining Techniques in the Field of Agriculture and Allied Sciences
• Mentions different data mining techniques used in the agriculture domain
• The K-means algorithm, the ID3 algorithm, and k nearest neighbour are the most widely used
Yethiraj, 2012, Bangalore
Data Mining Techniques and Applications to Agricultural Yield Data
• The accuracy of the MLR technique is given as 98%, and that of the K-Means algorithm as 96%
Ramesh, 2013, Andhra Pradesh
Conclusion
• Data mining is a boon for large data in agriculture
• Extraction of knowledge is a big challenge
• Many data mining techniques have been developed to tackle this challenge
• Skill is also required to handle the tools and techniques
THANK YOU

Data mining in agriculture

  • 1. 06/04/16 07:10 PM1 DATA MINING IN AGRICULTURE PRESENTED BY:- SIBANANDA KHATAI
  • 2. 06/04/16 07:10 PM2 Contents  HISTORY  INTRODUCTION  DATA VS. INFORMATION  DATA MINING- CONCEPT  TECHNIQUES USED IN DATA MINING  ALGORITHMS USED IN DATA MINING  ROLE OF DATA MINING  REVIEW OF LITERATURE  CONCLUSION
  • 3. 06/04/16 07:10 PM3  Agriculture is a business with risk  Depends on climate, geography, political and economic factors  Some risks which can be quantified by mathematical, statistical methods, and advanced computing  Challenge is to extract information from agri. Databases  Data mining is such a technology which can bring the knowledge to agriculture development IntRoDUCtIon
  • 5. eVoLUtIon oF DAtA sCIenCe 06/04/16 07:10 PM5 Evolutionary Steps Enabling Technologies Product Providers Data Collection (1960s) Computers, tapes IBM Data Access (1980s) Relational databases Oracle, Informix, IBM, Microsoft Data Warehousing OLAP, multi dimensional databases, data warehouses Pilot, Microstrategy Data Mining (Emerging Today) Advanced algorithms, multiprocessor computers, Pilot, IBM,others
  • 8.  Databases today are huge: – More than 1,000,000 entities/records/rows – From 10 to 10,000 fields/attributes/variables – Giga-bytes ,tera-bytes and now in peta & exa  Databases a growing at an unprecendented rate  The corporate world is a cut-throat world – Decisions must be made rapidly – Decisions must be made with maximum knowledge 06/04/16 07:10 PM8 RApID gRowtH oF DAtA
  • 9. 06/04/16 07:10 PM 9 KnowLeDge DIsCoVeRY In DAtABAses (KDD) Data Cleaning Data Integration Databases Data Warehouse Task-relevant Data Selection Data Mining Pattern Evaluation KNOWLEDGEKNOWLEDGE
  • 10. 06/04/16 07:10 PM10 DAtA pResentAtIon Graphical-bar charts, pie charts histograms Geometric- scatter plot Icon-based- using colors figures as icons Pixel-based- data as colored pixels 0 5 10 15 20 25 30 35 40 10000 30000 50000 70000 90000
  • 11. wHAt Is DAtA MInIng ? 06/04/16 07:10 PM11 DATA MINING
  • 12. DeFInItIon 06/04/16 07:10 PM12 The process of discovering interesting and useful patterns and relationships in large volumes of data. Encyclopaedia Britannica • Other synonym of data mining are knowledge extraction, pattern analysis, data archeology
  • 14. goALs oF DAtA MInIng  Prediction: How certain attibutes within the data will behave in the future.  Identification: Identify the existence of an item, an event,an activity.  Classification: Partition the data into categories.  Optimization: Optimize the use of limited resources 06/04/16 07:10 PM14
  • 15. Why Mine Data?  Lots of data is being collected and warehoused – Web data, e-commerce – Purchases at department/grocery stores – Bank/Credit Card transactions – Crop databases  Computers have become cheaper and more powerful 06/04/16 07:10 PM15
  • 16. Why Mine Data? Scientific VieWpoint  Data collected and stored at enormous speeds (GB/hour) – Remote sensors on a satellite – Telescopes scanning the skies – Microarrays generating gene expression data – Scientific simulations generating terabytes of data  Traditional techniques infeasible for raw data  Data mining may help scientists – In classifying and segmenting data – In Hypothesis Formation 06/04/16 07:10 PM16
  • 17. SoURceS of Data foR MininG  Data warehouses  Transactional databases  Spatial and Temporal  Time-series  Multimedia, text 06/04/16 07:10 PM17
  • 18. What Are the Techniques Used in Data Mining? 1. Statistics 2. Machine learning 3. Database systems 4, 5, 6, 7 … Let's see
  • 20. 2. Machine Learning • The ability to automatically learn to recognize complex patterns and make intelligent decisions based on data.
  • 21. 3. Database Systems and Data Warehouses • A repository of information collected from multiple sources, stored under a unified schema, that usually resides at a single site.
  • 22. 4. Information Retrieval (IR) • The activity of obtaining information resources relevant to an information need from a collection of information resources. • Textual data mining and multimedia mining, integrated with information retrieval methods, are of great importance.
  • 24. 6. Pattern Recognition • A branch of machine learning that focuses on the recognition of patterns and regularities in data
  • 25. 7. High-Performance Computing • The use of parallel processing to run advanced application programs efficiently, reliably, and quickly.
  • 26. Regression Models • Regression is a data mining (machine learning) technique used to fit an equation to a dataset. • Linear regression fits a straight line, y = mx + c, by estimating the values of m and c so that y can be calculated for a particular value of x. • Multiple regression uses more than one input variable and allows more complex models to be fitted.
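As an illustration of the straight-line fit described on the regression slide, here is a minimal sketch in Python of ordinary least squares for y = mx + c. The function name `fit_line` and the rainfall/yield numbers are assumptions invented for this example; they do not come from the slides.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (m, c) for the line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # slope
    c = mean_y - m * mean_x    # intercept
    return m, c


# Hypothetical data (not from the slides): seasonal rainfall in mm vs. yield in t/ha
rainfall = [100.0, 200.0, 300.0, 400.0]
crop_yield = [1.5, 2.0, 2.5, 3.0]

m, c = fit_line(rainfall, crop_yield)
predicted = m * 250.0 + c      # estimated yield for 250 mm of rainfall
print(m, c, predicted)
```

Multiple regression follows the same idea with several input variables, which in practice is usually handled by a statistics or linear algebra library rather than written by hand.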
  • 28. Data Mining Methodologies • Neural networks • K-means • Fuzzy set • Bayesian network • K-nearest neighbour • Support vector machine • Decision tree analysis • WEKA tool
  • 29. Neural Networks • An information processing paradigm inspired by the way biological nervous systems, such as the brain, process information.
  • 30. Bayesian Network • Originated from Bayes' theorem • Bayes' theorem gives the posterior probability of an event given the observed evidence
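The posterior probability mentioned on the Bayesian network slide can be shown with a small worked calculation. The sketch below applies Bayes' theorem to a single two-state event; the function name `posterior` and the crop-disease probabilities are hypothetical values chosen only for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E).

    prior               -- P(H), probability of the hypothesis before seeing evidence
    likelihood          -- P(E | H), probability of the evidence if H is true
    false_positive_rate -- P(E | not H), probability of the evidence if H is false
    """
    # Total probability of the evidence over both cases
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 10% of plots are diseased, leaf spots appear on
# 80% of diseased plots and on 20% of healthy plots.
p_disease_given_spots = posterior(prior=0.10, likelihood=0.80, false_positive_rate=0.20)
print(round(p_disease_given_spots, 4))
```

A full Bayesian network chains many such conditional probabilities along a directed graph; this example covers only the single-step update.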
  • 31. Waikato Environment for Knowledge Analysis (WEKA) i. Machine learning software written in Java ii. Free software iii. Used to analyze data from agricultural domains iv. Provides visualization tools and algorithms for data analysis and predictive modelling
  • 32. Advantages of WEKA Include • Runs on almost any modern computing platform. • A comprehensive collection of data preprocessing and modeling techniques. • Ease of use due to its graphical user interfaces. • Supports several standard data mining tasks, more specifically data preprocessing, clustering, classification, regression, visualization, and feature selection
  • 33. Commercial Tools • Oracle Data Miner – http://www.oracle.com • Data To Knowledge – http://alg.ncsa.uiuc.edu • SAS – http://www.sas.com/ • Clementine – http://spss.com/clemetine/ • Intelligent Miner – http://www-306.ibm.com/software
  • 34. MATLAB • MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and fourth-generation programming language. • A proprietary programming language developed by MathWorks. • MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, and creation of user interfaces. • It can interface with programs written in other languages, including C, C++, Java, Fortran and Python.
  • 35. R • R is a programming language and software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. • Widely used among statisticians and data miners for developing statistical software and for data analysis. • Polls, surveys of data miners, and studies of scholarly literature databases show that R's popularity has increased substantially in recent years.
  • 36. Role of Data Mining in Agriculture • Influence of climate on kharif and rabi crops • Crop yield estimation • Estimation of damage caused by pests • Mushroom grading • Spatial data mining reveals interesting patterns related to agriculture
  • 37. Role in Agriculture Domain (data mining methodology: applications) • Neural networks: weather forecasts, prediction of rainfall • K-means: classifying soil in combination with GPS, wine fermentation problem, yield prediction • Fuzzy set: detecting weeds in precision agriculture • Bayesian network: models developed for agricultural purposes based on the Bayesian network learning method • K-nearest neighbour: simulating daily precipitation and other weather conditions
  • 38. Typical Applications of Data Mining
  • 40. A Study on Effect of Weather Parameters by Artificial Neural Networks on Yield of Aonla (Indian Gooseberry) Under Different Fertilizer Treatments • The range of temperature in April has a high correlation coefficient with yield (the highest) • Extreme variations negatively affect aonla fruit yield. (Kulshrestha et al., 2010, Anand)
  • 41. Geospatial Data Mining Techniques: Knowledge Discovery in Agriculture • Visualization of the spatial OLAP of Gujarat • Concluded that integration of computer science with agriculture will generate new emission in management of agricultural information (Bhojani, 2013, Anand)
  • 42. Time Series Forecasting of Losses Due to Pod Borer, Pod Fly and Productivity of Pigeonpea (Cajanus cajan) for North West Plain Zone (NWPZ) by Using Artificial Neural Network (ANN) • Pod damage by pod fly: 22.21% in 2010-11, 18.77% in 2011-12, and 16.72% in 2012-13 • A linear regression between network outputs and the corresponding targets gave an R² value of 0.88, indicating that the fit was reasonably good for all data sets (Kumari et al., 2014, Varanasi)
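The R² figure quoted on the pigeonpea slide comes from regressing network outputs on their targets. For a simple linear regression of this kind, R² equals the squared Pearson correlation between the two series, which the following sketch computes. The function name and the sample values are assumptions for illustration and are not the study's data.

```python
def r_squared(targets, outputs):
    """R^2 of a simple linear regression of outputs on targets.

    For a one-variable linear regression this equals the squared
    Pearson correlation coefficient between the two series.
    """
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    return (s_to ** 2) / (s_tt * s_oo)


# Hypothetical target values and model outputs (not from the study)
targets = [1.0, 2.0, 3.0, 4.0]
outputs = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(targets, outputs)
print(round(r2, 3))
```

Values close to 1 indicate that the outputs track the targets closely along a straight line.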
  • 43. A Survey on Data Mining Techniques for Crop Yield Prediction • The K-means algorithm is used to forecast pollution • K-nearest neighbour (KNN) is applied for simulating daily precipitation • The K-means approach is used for classifying soils in combination with GPS (Ramesh et al., 2014, Bangalore)
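To make the K-means approach named in this survey concrete, the following is a minimal sketch of the basic algorithm (Lloyd's iteration) in plain Python. The soil measurements, the starting centroids, and the function name `kmeans` are hypothetical; the surveyed papers may use different features and tooling.

```python
def kmeans(points, centroids, iterations=10):
    """Basic K-means: alternate between assigning points and updating centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical soil samples: (pH, organic carbon %)
samples = [(5.0, 0.4), (5.2, 0.5), (5.1, 0.6), (7.8, 1.2), (8.0, 1.0), (7.9, 1.1)]
initial = [(5.0, 0.4), (8.0, 1.0)]   # fixed starting centroids for reproducibility

final_centroids, groups = kmeans(samples, initial)
print(final_centroids)
print(groups)
```

Fixed starting centroids are used here so the result is reproducible; practical implementations usually choose them randomly and run several restarts.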
  • 44. A Survey on Data Mining Techniques in Agriculture • The use of information technology in agriculture can improve decision making, and farmers can achieve better yields • Data mining plays a crucial role in decision making on several issues related to agriculture. (Geetha et al., 2015, Coimbatore)
  • 45. Prediction of Students Recruitment Process Using Data Mining Techniques with Classification Rules (Malathi et al., 2015, Tamil Nadu)
  • 46. Text Recognition • Feature extraction • Clustering and pattern matching • k-NN classifier (Krishan et al., 2016, Haridwar)
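The k-NN classifier listed on this slide can be sketched in a few lines: a query is labelled by majority vote among its k closest training examples. The two-dimensional feature vectors and the labels "A" and "B" below are invented for illustration; the cited work's actual features are not given in the slides.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training examples.

    train -- list of (feature_tuple, label) pairs
    query -- feature tuple with the same length as the training features
    """
    # Sort training examples by squared Euclidean distance to the query
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors extracted from character images
training = [
    ((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.9, 1.1), "A"),
    ((3.0, 3.2), "B"), ((3.1, 2.9), "B"), ((2.9, 3.0), "B"),
]

label_near_a = knn_predict(training, (1.1, 1.0))
label_near_b = knn_predict(training, (3.0, 3.0))
print(label_near_a, label_near_b)
```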
  • 47. The Application of Data Mining Techniques to Characterize Agricultural Soil Profiles (Armstrong et al., Australia)
  • 48. Implementation of Data Mining Techniques for Meteorological Data Analysis (Sarah et al., Palestine)
  • 49. Data Mining - An Evolutionary View of Agriculture • GPS techniques may be employed for discovering important information from agriculture-related data, such as soil identification. • Data mining techniques were adopted to estimate crop yield from existing data. (Abhishek et al., 2014, Amaravati)
  • 50. Data Mining in Agriculture on Crop Price Prediction: Techniques and Applications • Data mining techniques were adopted to estimate crop prices from existing data • The K-means approach uses only the basic algorithm (Manpreet et al., 2014, Bangalore)
  • 51. Data Mining Techniques for Predicting Crop Productivity – A Review Article • Data mining is a relatively novel research field and is expected to grow in the future • A multidisciplinary approach that integrates computer science with agriculture will help in forecasting and managing agricultural crops effectively. (Veenadhari et al., 2011, Bhopal)
  • 52. Applying Data Mining Techniques in the Field of Agriculture and Allied Sciences • Describes the different data mining techniques used in the agriculture domain • The K-means algorithm, the ID3 algorithm, and k-nearest neighbour are the most used (Yethiraj, 2012, Bangalore)
  • 53. Data Mining Techniques and Applications to Agricultural Yield Data • Accuracy is reported as 98% using the MLR technique and 96% using the K-means algorithm (Ramesh, 2013, Andhra Pradesh)
  • 54. Conclusion • Data mining is a boon for large data in agriculture • Extraction of knowledge is a big challenge • Many data mining techniques have been developed to tackle this challenge • Skill is also required to handle the tools and techniques

Editor's Notes

  1. Temporal data is data that varies over time. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed in such a way as to test theories that the current values of one or more independent time series affect the current value of another time series, this type of analysis of time series is not called "time series analysis", which focuses on comparing values of a single time series or multiple dependent time series at different points in time.
  2. A mathematical function in terms of random variables and probability functions.
  3. A data warehouse needs consistent integration of quality data. Flat files are simple data files in text or binary format.
  4. In mathematics, fuzzy sets are sets whose elements have degrees of membership. Fuzzy sets were introduced by Lotfi A. Zadeh as an extension of the case of multi-valued logic.