With the growing adoption of machine learning, it is worth asking how it can help reap benefits in agriculture. It could be a boon for farmer welfare.
Application of Machine Learning in Agriculture
2. Master Seminar-I
Application of Machine Learning in Agriculture
Aman Vasisht
PGS20AGR8404
Dept. of Agricultural Statistics
UNIVERSITY OF AGRICULTURAL SCIENCES, DHARWAD
COLLEGE OF AGRICULTURE, DHARWAD
3. OUTLINE
MACHINE LEARNING AND ITS APPLICATIONS
TYPES OF MACHINE LEARNING ALGORITHMS
CASE STUDIES
CONCLUSION
REFERENCES
4. QUICK QUESTIONNAIRE
How many of you have heard about Machine Learning ?
How many of you know about Machine Learning ?
How many of you are using Machine Learning ?
5. What is Machine Learning ?
• It is the science of programming computers so they can learn
from data.
• A type of AI that allows applications to become more accurate
in predicting outcomes.
[Figure: Venn diagram relating Artificial Intelligence, Machine Learning, Deep Learning and Data Science]
AI : Programs with the ability to learn and reason like humans
ML : Algorithms with the ability to learn and make informed decisions
DL : Artificial neural networks that adapt and learn from vast amounts of data
6. APPLICATIONS IN AGRICULTURE :
• Yield Prediction : An accurate model can help farm owners to take
informed management decisions for their farm.
• Disease Detection : Use of algorithms can help to identify diseased
plants with good accuracy.
• Crop quality : Grading of commodities can be automated using measurable quality parameters.
• Livestock Management : Managing farms and animals according to monitored conditions and parameters.
7. TERMINOLOGY :
• Features : The distinct traits that can be used to describe each observation in a quantitative manner.
• Label or Target : The final outcome or variable which depends on the contribution of the features.
• Training : Fitting the algorithm to a dataset.
• Testing : Checking the accuracy of the predicted values on data the model has not seen.
Let’s understand in a better way
8. Training :
[Image: a red apple, labelled "Apple"]
Features : 1. Color : Reddish 2. Type : Fruit 3. Shape, etc.
[Image: the Apple company logo]
Features : 1. Color : Greyish 2. Type : Company logo 3. Shape, etc.
[Image: a yellowish fruit]
Features : 1. Color : Yellowish 2. Type : Fruit 3. Shape, etc.
10. TYPES OF MACHINE LEARNING :
Supervised Learning : We are able to predict future outcomes based on past data. It requires both features and labels to be given to the model for it to be trained.
Unsupervised Learning : We are able to identify hidden patterns from the input data provided. By making the data more readable and organized, the patterns, similarities, or anomalies become more evident.
11. SUPERVISED LEARNING :
• Let the feature variables be 'X' and the output or label variable be 'Y'. You use an algorithm to learn the mapping function from the input to the output:
Y = f(X)
The goal is to approximate the mapping function so well that when you have new input data (X), you can predict the output variable (Y) for that data.
Classification : A classification problem is when the output variable is a category,
such as :
Effective or non-effective
Disease or no disease, etc.
12. MAJOR ALGORITHMS
Classification : The algorithms under this category are :
• KNN
• SVM
• Logistic Regression
• Decision Tree Classifier
• Naive Bayes
Regression : A regression problem is when the output variable is a real value, such as "yield" or "weight". The algorithms under this category are :
• Linear Regression
• Multiple Regression
• Polynomial Regression
• Lasso Regression
• Ridge Regression
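To make these two categories concrete, here is a minimal sketch with scikit-learn: one classifier and one regressor fitted on synthetic data. The datasets, model choices and parameters are illustrative assumptions, not part of the seminar.

```python
# A minimal supervised-learning sketch: one classifier, one regressor.
# The synthetic datasets and model choices are illustrative only.
from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LinearRegression

# Classification: the label is a category (e.g. disease / no disease).
Xc, yc = make_classification(n_samples=200, n_features=4, random_state=0)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(Xc, yc, random_state=0)
clf = KNeighborsClassifier(n_neighbors=5).fit(Xc_tr, yc_tr)
print("classification accuracy:", clf.score(Xc_te, yc_te))

# Regression: the label is a real value (e.g. yield in tons).
Xr, yr = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, random_state=0)
reg = LinearRegression().fit(Xr_tr, yr_tr)
print("regression R^2:", reg.score(Xr_te, yr_te))
```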
13. UNSUPERVISED LEARNING :
• Unsupervised learning is where you only have input data (X) and no corresponding
output variables.
• The goal for unsupervised learning is to model the underlying structure or
distribution in the data in order to learn more about the data.
• Clustering: A clustering problem is where you want to discover the inherent
groupings in the data, such as grouping customers by purchasing behavior.
Algorithms : DBSCAN, K-means clustering, Hierarchical clustering
14. [Figure: in supervised learning, labelled input data ("these are known fruits") is fed to a model, which predicts "it's an apple" for new input; in unsupervised learning, unlabelled input data is fed to a model, which groups similar items on its own]
15. GOOD OR BAD MACHINE LEARNING MODEL :
• The main goal of each machine learning model is to generalize well.
• Here, generalization is the ability of an ML model, after training on a dataset, to produce reliable and accurate output on new data.
• Underfitting and overfitting are the two conditions to check when assessing whether the model generalizes well.
Before understanding overfitting and underfitting, let's understand some basic
terms :
Bias
Variance
16. Bias-Variance Tradeoff :
Let Y be the dependent variable and X the independent variable, with
Y = f(X) + ε, where ε ~ N(0, σ²_ε).
We may estimate a model f̂(X) of f(X) using regression. The expected squared prediction error at a point x is then
Err(x) = E[(Y - f̂(x))²]
This error may be decomposed into bias and variance components:
Err(x) = (E[f̂(x)] - f(x))² + E[(f̂(x) - E[f̂(x)])²] + σ²_ε
Err(x) = Bias² + Variance + Irreducible Error
[Figure: bias-variance bullseye diagram showing predictions under the four combinations of low/high bias and low/high variance]
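The decomposition can be checked numerically: refit the same model class on many fresh training sets and measure the squared bias and the variance of its predictions at one test point. A small sketch, where the true function, noise level and model are invented for illustration:

```python
# Empirical bias-variance check at a single test point x0.
# The true function, noise level and model class are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(x)            # the true f(X)
sigma_eps = 0.3                    # sd of the irreducible noise
x0 = 1.5                           # test point

preds = []
for _ in range(500):               # 500 independent training sets
    x = rng.uniform(0, 3, 30)
    y = f(x) + rng.normal(0, sigma_eps, 30)
    a1, a0 = np.polyfit(x, y, 1)   # fit a straight line (a high-bias model here)
    preds.append(a0 + a1 * x0)

preds = np.array(preds)
bias2 = (preds.mean() - f(x0)) ** 2
variance = preds.var()
print(f"Bias^2 = {bias2:.4f}, Variance = {variance:.4f}, "
      f"Irreducible = {sigma_eps**2:.4f}")
```

A straight line fitted to a curved f shows high bias and low variance; a very flexible model would reverse the two.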
18. SIMPLE REGRESSION :
• Linear regression is one of the easiest and most popular Machine Learning
algorithms that is used for predictive analysis.
• y = a₀ + a₁x + ε
y = dependent variable (target variable)
x = independent variable (predictor variable)
a₀ = intercept of the line
a₁ = linear regression coefficient (slope)
We wish to find a₀ and a₁ such that Σ(yᵢ - (a₀ + a₁xᵢ))² is minimum. The least-squares estimates are:
a₁ = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)²  and  a₀ = ȳ - a₁x̄
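These closed-form estimates take a few lines of NumPy; the data below are made up:

```python
# Closed-form least squares for y = a0 + a1*x, following the formulas above.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)      # made-up data
y = np.array([2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8, 10.1])

a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a0 = y.mean() - a1 * x.mean()
print(f"fitted line: y = {a0:.3f} + {a1:.3f} x")
```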
19. [Figure: scatter plot of Y against X; the fitted line is y = 0.7019x + 2.4094 with R² = 0.8363]
Imagine we add a couple of large values to the data: will it affect the regression line? Let's check.
[Figure: the same data with outliers added; the fitted line shifts to y = 1.5952x - 4.3564 and R² drops to 0.4629]
20. Detection of outliers in Machine Learning model:
• Using Z score :
• The Z score tells whether a data value is greater or smaller than the mean and how many standard deviations away from the mean it is.
• Z = (x - x̄) / σ
• Values outside x̄ ± 3σ (i.e. |Z| > 3) are considered outliers.
Q. What is the most appropriate
measure of central tendency when the
data has outliers?
The median is usually preferred in these
situations because the value of the mean
can be distorted by the outliers.
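A sketch of the rule in NumPy, with an invented sample and one planted outlier:

```python
# Z-score rule: flag values more than 3 standard deviations from the mean.
import numpy as np

rng = np.random.default_rng(0)
data = np.append(rng.normal(loc=12.0, scale=0.5, size=25), 35.0)  # 35.0 planted

z = (data - data.mean()) / data.std()
print("outliers:", data[np.abs(z) > 3])   # expected to flag the planted 35.0
```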
21. • Inter-Quartile Range (IQR) proximity rule :
The data points which fall below Q1 - 1.5·IQR or above Q3 + 1.5·IQR are outliers.
The box plot, also termed a box-and-whisker plot, is a graphical method; the very purpose of this diagram here is to identify outliers and discard them from the data series.
[Figure: box plot of crop production against crop area, with outliers beyond the whiskers]
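The same fences are easy to compute directly; again the data are invented:

```python
# IQR proximity rule: fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR.
import numpy as np

data = np.array([3.1, 3.4, 3.6, 3.8, 4.0, 4.1, 4.3, 4.6, 9.5])  # 9.5 planted
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
print("fences:", (round(low, 2), round(high, 2)))
print("outliers:", data[(data < low) | (data > high)])  # expected: [9.5]
```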
22. ASSUMPTIONS OF REGRESSION ANALYSIS IN ML :
• Linear and additive
• No auto correlation
• No multicollinearity
• Homoscedasticity
• Normal distribution of errors
These assumptions are violated often, and if the violations are overlooked by the researcher, the model can become unreliable for prediction.
23. REGULARIZATION :
• Regularization is an important concept used to avoid overfitting, especially when performance on the training data is much better than on the test data.
• Two types :
L2 Ridge regression
L1 Lasso regression
L2 Ridge regression :
• It performs 'L2 regularization', i.e. it adds a penalty equivalent to the square of the magnitude of the coefficients. Thus, it optimizes the following:
Objective = RSS + λ · (sum of squares of coefficients)
24. Here RSS is the loss term and λ · (sum of squares of coefficients) is the penalty term.
• λ is the tuning parameter which balances the emphasis given to minimizing RSS versus minimizing the sum of squared coefficients.
• In the majority of cases, it is used to prevent overfitting.
• It is also commonly used to handle multicollinearity.
• It reduces model complexity by coefficient shrinkage.
L1 Lasso regression :
LASSO stands for Least Absolute Shrinkage and Selection Operator.
• Lasso regression performs L1 regularization, i.e. it adds the sum of the absolute values of the coefficients as a penalty in the optimization objective.
25. • Objective = RSS + λ · (sum of absolute values of coefficients)
It is generally used when we have a large number of features, because it automatically performs feature selection, which makes it preferable to ridge regression in that setting.
[Figure: contours of RSS moving away from the minimum, meeting the diamond-shaped L1 constraint region]
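A short scikit-learn sketch contrasting the two penalties; the data are synthetic, with one near-duplicate feature, and the alpha values (scikit-learn's name for λ) are illustrative:

```python
# Ridge (L2) vs Lasso (L1) on data with a near-duplicate feature.
# The data and alpha values (scikit-learn's lambda) are illustrative.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=100)        # multicollinearity
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(size=100)   # only 2 features matter

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print("ridge coefs:", np.round(ridge.coef_, 2))  # all shrunk, none exactly zero
print("lasso coefs:", np.round(lasso.coef_, 2))  # irrelevant coefs typically zeroed
```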
26. CLASSIFICATION :
• A common job of machine learning algorithms is to recognize objects and separate them into categories.
KNN (K-Nearest Neighbor) algorithm:
• K-NN is a non-parametric algorithm.
• It is also called a lazy learner algorithm.
• At the training phase, the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into a category based on some distance measure.
• One of these measures is the Minkowski distance:
D(p, q) = ( Σᵢ₌₁ⁿ |pᵢ - qᵢ|^c )^(1/c)
where c is a parameter and p, q are two points.
27. Euclidean distance (when c = 2) : √( Σᵢ₌₁ⁿ (pᵢ - qᵢ)² )
Manhattan distance (when c = 1) : Σᵢ₌₁ⁿ |pᵢ - qᵢ|
[Figure: scatter plot of two classes with a query point and its K = 5 nearest neighbors marked]
• K, the number of neighbors, is generally taken to be odd: 3, 5, etc.
• Very simple.
• Works with any number of classes.
• Re-scaling is very important, as it is a distance-based algorithm.
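A sketch of KNN with both Minkowski orders, rescaling first because the algorithm is distance-based; the Iris data and K = 5 are illustrative choices:

```python
# KNN with Manhattan (c = 1) and Euclidean (c = 2) distances.
# Features are standardized first, since KNN is distance-based.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for p, name in [(1, "Manhattan"), (2, "Euclidean")]:
    knn = make_pipeline(StandardScaler(),
                        KNeighborsClassifier(n_neighbors=5, p=p))
    knn.fit(X_tr, y_tr)
    print(f"{name} (c = {p}): accuracy = {knn.score(X_te, y_te):.3f}")
```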
28. Accuracy :
Confusion matrix (n observations) :

               Predicted 0   Predicted 1
   Actual 0    TN            FP
   Actual 1    FN            TP

TN : True Negative, FP : False Positive, FN : False Negative, TP : True Positive.
Let's see an example (n = 150) :

               Predicted Healthy   Predicted Unhealthy
   Healthy     40                  10
   Unhealthy   5                   95

Accuracy = correctly predicted / (TN + FP + FN + TP)
Error rate = wrongly predicted / (TN + FP + FN + TP)
Therefore,
Accuracy = (40 + 95)/(40 + 10 + 5 + 95) = 0.9, or 90%
Error rate = 15/150 = 0.1
But how do we know which number to take as K? Is it 5, 7 or any other number?
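One standard answer: try a range of odd K values and keep the one with the best cross-validated accuracy. A sketch, where the dataset and the K range are illustrative:

```python
# Choosing K: keep the odd K with the best 5-fold cross-validated accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             X, y, cv=5).mean()
          for k in range(1, 20, 2)}          # odd K: 1, 3, ..., 19
best_k = max(scores, key=scores.get)
print(f"best K = {best_k}, CV accuracy = {scores[best_k]:.3f}")
```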
29. Support Vector Machine (SVM) :
• In the SVM algorithm, we plot each data item as a point in n-dimensional
space (where n is a number of features you have) with the value of each
feature being the value of a particular coordinate.
• The goal is to find the decision boundary that separates the classes.
Two types :
• Linear SVM : used when a dataset can be classified into two classes by a single straight line.
• Non-linear SVM : used when a dataset cannot be classified by using a straight line.
[Figure: scatter plot of two classes separable by a straight line]
30. [Figure: maximum-margin hyperplane separating two classes, with the support vectors lying on the margin]
Terminology :
Hyperplane : The best decision line or boundary.
Support vectors : The points of each class closest to the hyperplane.
Margin : The distance between the support vectors and the hyperplane. It should be maximum.
Kernel : A kernel function generally transforms non-linear data into linearly separable data.
31. To transform the non-linear data :
• y = x² (for 1-D non-linear data). By adding this dimension, we get a two-dimensional space.
• z = x² + y² (for 2-D non-linear data). By adding this dimension, we get a three-dimensional space.
Radial Basis Function (RBF) :
• It computes the similarity, i.e. how close points x₁ and x₂ are to each other:
k(x₁, x₂) = exp( -‖x₁ - x₂‖² / (2σ²) )
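A sketch of the kernel and of an RBF-kernel SVM on data no straight line can separate; the points, dataset and gamma value are illustrative:

```python
# The RBF kernel decays with squared distance; an RBF-kernel SVM then
# separates data that no straight line can. Parameters are illustrative.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

def rbf(x1, x2, sigma=1.0):
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))

a = np.array([0.0, 0.0])
print(rbf(a, np.array([0.1, 0.1])))   # close points: value near 1
print(rbf(a, np.array([3.0, 3.0])))   # distant points: value near 0

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
clf = SVC(kernel="rbf", gamma=0.5).fit(X, y)  # gamma plays the role of 1/(2*sigma^2)
print("training accuracy:", clf.score(X, y))
```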
32. UNSUPERVISED MACHINE LEARNING :
• Unsupervised learning is the training of a machine using information that is neither
classified nor labeled.
• It groups unsorted information according to similarities, patterns, and differences without
any prior training of data.
Hierarchical Clustering :
It involves creating clusters in a predefined order, where similar clusters are grouped together and arranged in a hierarchical manner.
Non-Hierarchical Clustering :
It forms a fixed set of clusters directly, without the tree-like structure of hierarchical clustering. K-means clustering and DBSCAN are two effective algorithms.
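A minimal sketch of both families on the same synthetic, unlabelled data; the cluster counts and the data are illustrative:

```python
# Hierarchical vs non-hierarchical clustering on the same unlabelled data.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering, KMeans

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)   # labels discarded

hier = AgglomerativeClustering(n_clusters=3).fit_predict(X)   # tree-like merging
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print("hierarchical cluster sizes:", np.bincount(hier))
print("k-means cluster sizes:    ", np.bincount(km))
```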
33. [Figure: scatter plot with three well-separated clusters]
Three clusters are formed when the data is uniform, i.e. when the groups are easily separable by eye. What if the data is non-uniform?
34. DBSCAN (DENSITY-BASED SPATIAL CLUSTERING OF
APPLICATIONS WITH NOISE) :
• The DBSCAN algorithm has a key idea that for each point of a cluster, the
neighborhood of a given radius has to contain at least a minimum number of points.
• DBSCAN algorithm requires two parameters:
Min_pts: The minimum number of points (a threshold) clustered together for a region
to be considered dense.
Eps (ε): A distance measure that will be used to locate the points in the neighborhood
of any point.
• In this algorithm, there are 3 types of data points.
Core point : A point that has at least Min_pts points within distance eps.
Border point : A point which has fewer than Min_pts points within eps, but which lies in the neighborhood of a core point.
Noise or outlier : A point which is neither a core point nor a border point.
35. [Figure: DBSCAN example with Min_pts = 4. Red: core points. Green: border points that are still part of the cluster because they are within epsilon of a core point, but do not meet the Min_pts criterion. Blue: noise point, not assigned to any cluster]
Important points :
• Other clustering methods are suitable only for compact and well-separated clusters; on non-uniform data, DBSCAN is much better.
• It is robust to outliers.
• It builds clusters from dense regions, and low-density points are left out as noise.
• The minimum number of points should be taken as at least 3.
• DBSCAN uses the Euclidean distance √( Σᵢ₌₁ⁿ (pᵢ - qᵢ)² ) by default.
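A sketch of DBSCAN on moon-shaped (non-uniform) data with two planted outliers; the eps and min_samples values below are illustrative, not tuned:

```python
# DBSCAN on moon-shaped (non-uniform) data with two planted outliers.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
X = np.vstack([X, [[3.0, 3.0], [-3.0, -3.0]]])          # obvious outliers

labels = DBSCAN(eps=0.2, min_samples=4).fit_predict(X)  # eps and Min_pts
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters found:", n_clusters)                    # the two moons
print("noise points  :", int(np.sum(labels == -1)))     # label -1 marks noise
```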
36. IMAGE FEATURE EXTRACTION :
Texture extraction, using the Grey Level Co-occurrence Matrix (GLCM) :
• Choose the number of distinct intensity levels in the image; this determines the size of the GLCM.
• Find an intermediate matrix A by counting how frequently a pixel p occurs in a particular spatial relationship with a pixel q.
• Calculate the GLCM by dividing each element of matrix A by the sum of the elements of matrix A.
Color extraction :
• Extract the three components red, green and blue from the image.
• Convert the color image to an HSV image.
• Extract the hue, saturation and intensity of the image.
• For each component extracted, compute the mean, variance and range.
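A sketch of both extractions with scikit-image (an assumed library choice; the deck names none), using a random array as a stand-in for a leaf photo. Mean, variance and range over the six R, G, B, H, S, V components give the 18 color features mentioned in the first case study below:

```python
# Texture (GLCM) and color (RGB/HSV statistics) features from one image.
# scikit-image is an assumed choice; the random array stands in for a photo.
import numpy as np
from skimage import img_as_ubyte
from skimage.color import rgb2gray, rgb2hsv
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))                        # placeholder RGB image

# Texture: GLCM of the grey-level image, then summary properties.
grey = img_as_ubyte(rgb2gray(img))                   # 256 intensity levels
glcm = graycomatrix(grey, distances=[1], angles=[0],
                    levels=256, symmetric=True, normed=True)  # normed: A / sum(A)
texture = [graycoprops(glcm, prop)[0, 0]
           for prop in ("contrast", "homogeneity", "energy", "correlation")]

# Color: mean, variance and range for each of R, G, B and H, S, V.
hsv = rgb2hsv(img)
color = []
for plane in (img, hsv):
    for c in range(3):
        ch = plane[..., c]
        color += [ch.mean(), ch.var(), np.ptp(ch)]

print(len(texture), "texture features,", len(color), "color features")  # 4 and 18
```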
38. CASE STUDY-1 :
CLASSIFICATION OF GRAPE LEAVES USING KNN
AND SVM CLASSIFIERS
Bharate and Shirdhonkar (2020)
39. DATA SOURCE:
• This case study proposes a technique to classify the grape leaf as healthy and unhealthy.
• Database consisted of 90 images of grape leaves.
Training : 30 images of healthy and 30 images of unhealthy leaves.
Testing : 30 images including healthy and unhealthy leaves.
• Feature extraction (Image processing) :
Texture and color features are extracted using Grey Level Co-occurrence Matrix
(GLCM).
[Figure: (a) a healthy leaf, (b) an unhealthy leaf]
40. RESULTS :

   Parameter           Proposed method (SVM)   Proposed method (KNN)
   Features            4 texture & 18 color    4 texture & 18 color
   Classifier          SVM                     KNN
   Number of samples   30                      30
   Accuracy (%)        90                      96.66

[Figure: bar chart comparing the accuracies: SVM 90%, KNN 96.66%]
It is noticed that the accuracy of KNN is better than that of the SVM model. This is because KNN computes the distance from a point to all neighbors, finds the nearest neighbors and then decides the class. On the other hand, SVM considers only the support vectors to find the hyperplane and then decides the class.
41. CONCLUSIONS OF CASE STUDY :
• Automation will be a boon for farmers, helping protect their plants from diseases and increase the yield.
• The KNN classifier gives better accuracy than the SVM classifier.
• As future work, the system can be trained to identify the diseases present on the grape leaves and also provide possible solutions.
• Automatic image-capturing cameras can be installed with the help of government bodies; the captured images can then be passed through feature selection, training and testing with several algorithms, and the algorithm with the best accuracy used for identifying infected leaves at scale.
42. CASE STUDY-2 :
CROP AND FERTILIZER RECOMMENDATION
SYSTEM BASED ON SOIL CLASSIFICATION
Akshatha et al. (2022)
43. DATA SOURCE :
• The case study mainly focuses on classifying the soil records gathered from GKVK UAS,
Bangalore, Karnataka.
• It includes samples from various taluks of Chikkamagaluru district, such as Tarikere, Kadur, Sringeri and Koppa.
Soil samples : 1550 (Training – 70%, Testing – 30%)
Attributes : N, P, K, Ca, Mg, Lime, C, S and moisture.
Algorithm used : SVM, KNN
   Soil nutrition class (4 classes)   Crops suggested
   Class 0 (low fertility)            Beans, green peas, carrot, onion
   Class 1 (moderate fertility)       Radish, cowpea, cabbage, cauliflower
   Class 2 (high fertility)           Sugarcane, paddy, bajra, guava
   Class 3 (very high fertility)      Barley, cotton, tobacco, sunflower
44. Results :
Sample soil records :

   Ca     Mg     K       S    N       Lime   C     P    Moisture   Class
   9.653  6.585  142     108  226.05  5.83   1.29  18   0.9        1
   19.88  22.2   339.35  77   308.25  6.45   2     298  0.8        2
   2.931  41.22  514.29  108  277.42  6.43   0.74  48   0.6        1

[Figure: confusion matrix for the SVM model, true class against predicted class]

   Correctly classified   Incorrectly classified   Total testing data   Accuracy of SVM
   845                    240                      1085                 77.85%

Class-0 labels (1st row of the confusion matrix) :
58% predicted as class-0 (correct)
27% misclassified as class-1
1.1% misclassified as class-2
14% misclassified as class-3
45. CONCLUSIONS OF CASE STUDY :
• The KNN algorithm was also used; it gave a lower accuracy of 72.04%.
• The SVM algorithm obtained higher accuracy as it captured the non-linearity in the data.
• Based on the classification of the soil class, crops can be recommended.
• This can help farmers grow the best-suited crop that is adaptable to their soil condition.
• The model can be improved with more hyperparameter tuning, which can increase its accuracy and ultimately help farmers learn their farm's soil fertility level and the crops suggested for that level.
46. CASE STUDY-3 :
Sugar Cane Crop Yield Estimation Using K-Nearest
Neighbors
Kumar and Balakrishnan (2018)
47. DATA SOURCE :
• The dataset includes the predictors : Rainfall, pH, Organic Carbon, Area, S, Cu, Fe, P, Mn, N, Fibre.
• Dependent variable : Yield (tons)
• Crop considered : Sugarcane
• State : Telangana
• Period : annual data from 1901 to 2016.
• Data were re-scaled before analysis.
49. CONCLUSIONS OF CASE STUDY :
• Accurate yield predictions across different areas can help farmers get better profit from their crops.
• KNN can be an alternative approach for regression, although it is usually applied to classification problems.
• In future, predictions can be made using different algorithms and their accuracies compared to choose the best among them.
50. CASE STUDY-4 :
MULTIVARIATE WEATHER ANOMALY DETECTION USING DBSCAN CLUSTERING
Wibisono et al. (2021)
51. DATA SOURCE :
• Dataset : 8 attributes used from daily weather data.
• Place : Semarang city, Indonesia.
• Algorithms used : DBSCAN & PCA
   Attribute                   Data type
   Min. temperature            Numerical
   Max. temperature            Numerical
   Average temperature         Numerical
   Average humidity (%)        Numerical
   Sun exposure time (hours)   Numerical
   Maximum wind speed (m/s)    Numerical
   Average wind speed (m/s)    Numerical
   Rainfall (mm)               Numerical
52. RESULTS :
• eps = 0.19 was used for DBSCAN.
• PC1 mainly consisted of average temperature, maximum temperature and average humidity.
• PC2 mainly consisted of minimum temperature (Tn).
53. CONCLUSIONS :
• The experimental results demonstrated that DBSCAN is capable of identifying peculiar data points that deviate from the 'normal' data distribution.
• The anomalous weather was characterized by high humidity and low temperature.
• PCA can be utilized with DBSCAN in the detection of noise.
54. CONCLUSIONS ON MACHINE LEARNING :
• No algorithm is appropriate for all situations.
• Choosing a technique depends on pattern, type of data and experience of the
analyst.
• Using ML algorithms as a pipeline can save the analyst's time and give fast solutions to the farmer.
• There is a wide scope of application of ML in agriculture, especially in plant
disease classification, soil or crop classification and prediction of yield of
crops.
• Automation can help reduce the biotic and abiotic stresses prevailing in fields across the country.
55. REFERENCES :
• Akshatha, G. C. and Shastry, K. A., 2022. Crop and fertilizer recommendation system based on soil classification. In: Recent Advances in Artificial Intelligence and Data Engineering, pp. 29-40.
• Bharate, A. A. and Shirdhonkar, M. S., 2020. Classification of grape leaves using KNN and SVM classifiers. In: 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), pp. 745-749.
• Kumar, N. N. and Balakrishnan, M., 2018. Sugar cane crop yield estimation using K-Nearest Neighbors. Journal of Advance Research in Dynamical and Control Systems, 10(4), pp. 199-207.
• Wibisono, S., Anwar, M. T., Supriyanto, A. and Amin, I. H. A., 2021. Multivariate weather anomaly detection using DBSCAN clustering algorithm. Journal of Physics: Conference Series, 1869(1), p. 012077.