Machine learning is increasingly being applied to agriculture, where it can support yield prediction, disease detection, crop grading and farm management. This can be a boon for farmer welfare.
Application of Machine Learning in Agriculture
1.
2. Master Seminar-I
Application of Machine Learning in Agriculture
Aman Vasisht
PGS20AGR8404
Dept. of Agricultural Statistics
UNIVERSITY OF AGRICULTURAL SCIENCES,
DHARWAD
COLLEGE OF AGRICULTURE, DHARWAD
3. OUTLINE
MACHINE LEARNING AND ITS
APPLICATIONS
TYPES OF MACHINE LEARNING
ALGORITHMS
CASE STUDIES
CONCLUSION
REFERENCES
4. QUICK QUESTIONNAIRE
How many of you have heard about Machine Learning ?
How many of you know about Machine Learning ?
How many of you are using Machine Learning ?
5. What is Machine Learning ?
• It is the science of programming computers so they can learn from data.
• A type of AI that allows applications to become more accurate in predicting outcomes.
[Figure: diagram relating Artificial Intelligence, Machine Learning, Deep Learning and Data Science]
AI : Programs with the ability to learn and reason like humans
ML : Algorithms with the ability to learn and make informed decisions
DL : Artificial neural networks that adapt and learn from vast amounts of data
6. APPLICATIONS IN AGRICULTURE :
• Yield Prediction : An accurate model can help farm owners make informed management decisions for their farm.
• Disease Detection : Algorithms can help to identify diseased plants with good accuracy.
• Crop quality : Commodities can be graded using a set of measured parameters.
• Livestock Management : Managing farms according to controlled conditions and parameters.
7. TERMINOLOGY :
• Features : The distinct, measurable traits that can be used to describe a label in a quantitative manner.
• Label or Target : The final outcome or variable which is dependent on the contribution of the features.
• Training : Fitting the algorithm to a dataset.
• Testing : Checking the accuracy of the predicted values.
Let’s understand this with an example
8. Training :
Apple
Features :
1. Color : Reddish
2. Type : Fruit
3. Shape
etc..
Features :
1. Color : Greyish
2. Type : Company Logo
3. Shape
etc..
Features :
1. Color : Yellowish
2. Type : Fruit
3. Shape
etc..
10. TYPES OF MACHINE LEARNING :
Supervised Learning: - We are able to predict future outcomes
based on past data. It requires both features and labels to be given to
the model for it to be trained.
Unsupervised Learning: - We are able to identify hidden patterns
from the input data provided. By making the data more readable and
organized, the patterns, similarities, or anomalies become more
evident.
11. SUPERVISED LEARNING :
• Let the feature variables be ‘X’ and the output or label variable be ‘Y’. An algorithm is used to learn the mapping function from the input to the output.
Y = f(X)
The goal is to approximate the mapping function so well that, given new input data (X), you can predict the output variable (Y) for that data.
Classification : A classification problem is when the output variable is a category, such as :
Effective or non-effective
Disease or no disease, etc.
12. MAJOR ALGORITHMS
Classification : The algorithms under this category are :
• KNN
• SVM
• Logistic Regression
• Decision Tree Classifier
• Naive Bayes
Regression : A regression problem is when the output variable is a real value, such as “yield” or “weight”. The algorithms under this category are :
• Linear Regression
• Multiple Regression
• Polynomial Regression
• Lasso Regression
• Ridge Regression
13. UNSUPERVISED LEARNING :
• Unsupervised learning is where you only have input data (X) and no corresponding
output variables.
• The goal for unsupervised learning is to model the underlying structure or
distribution in the data in order to learn more about the data.
• Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
Algorithms : DBSCAN, K-Means clustering, Hierarchical clustering
14. [Figure: Supervised Learning : input data with known labels ("These are known fruits") → model → prediction ("It’s an apple"). Unsupervised Learning : input data → model.]
15. GOOD OR BAD MACHINE LEARNING MODEL :
• The main goal of each machine learning model is to generalize well.
• Generalization is the ability of an ML model, once trained on a dataset, to produce reliable and accurate output on new data.
• Underfitting and overfitting are the two conditions to check when judging the performance of the model and whether it is generalizing well.
Before understanding overfitting and underfitting, let's understand some basic terms :
Bias
Variance
16. Bias-Variance Tradeoff :
Let Y be the dependent variable and X the independent variable:
$Y = f(X) + \epsilon, \quad \epsilon \sim N(0, \sigma_\epsilon)$
We may estimate a model $\hat{f}(X)$ of $f(X)$ using regression. Therefore the error is
$\mathrm{Err}(X) = E[(Y - \hat{f}(X))^2]$
This error may then be decomposed into bias and variance components:
$\mathrm{Err}(X) = (E[\hat{f}(X)] - f(X))^2 + E[(\hat{f}(X) - E[\hat{f}(X)])^2] + \sigma_\epsilon^2$
$\mathrm{Err}(X) = \mathrm{Bias}^2 + \mathrm{Variance} + \text{Irreducible Error}$
[Figure: 2 × 2 grid of low/high bias against low/high variance]
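The decomposition above can be checked numerically. The sketch below is only an illustration: the "true" function, the noise level, the evaluation point x0 = 9 and the two candidate models (a constant and a straight line) are all assumptions made for this example, not part of the seminar. It refits each model on many simulated training sets and reports the squared bias and the variance of the prediction at x0. The irreducible error would add `noise_sd ** 2` on top of these two terms.

```python
import random
from statistics import fmean, pvariance

def true_f(x):
    # Hypothetical true relationship, assumed only for this illustration.
    return 2.0 * x + 1.0

def fit_constant(xs, ys):
    # Very simple model: always predict the mean of y (tends to underfit).
    mean_y = fmean(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    # Least-squares straight line.
    mx, my = fmean(xs), fmean(ys)
    a1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a0 = my - a1 * mx
    return lambda x: a0 + a1 * x

def simulate(fit, x0, n_sets=200, n_points=20, noise_sd=1.0, seed=0):
    """Refit on many noisy training sets; return (bias^2, variance, mse) at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sets):
        xs = [rng.uniform(0, 10) for _ in range(n_points)]
        ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(fit(xs, ys)(x0))
    bias_sq = (fmean(preds) - true_f(x0)) ** 2
    variance = pvariance(preds)
    # Mean squared distance of the predictions from the true f(x0).
    mse = fmean((p - true_f(x0)) ** 2 for p in preds)
    return bias_sq, variance, mse

for name, fit in [("constant", fit_constant), ("line", fit_line)]:
    b2, var, mse = simulate(fit, x0=9.0)
    print(f"{name}: bias^2={b2:.3f} variance={var:.3f} sum={b2 + var:.3f} mse={mse:.3f}")
```

For each model the printed `sum` and `mse` agree, which is the bias-variance identity for the reducible part of the error.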
17.
18. SIMPLE REGRESSION :
• Linear regression is one of the easiest and most popular Machine Learning algorithms used for predictive analysis.
• $y = a_0 + a_1 x + \varepsilon$
y = Dependent variable (target variable)
x = Independent variable (predictor variable)
a0 = Intercept of the line
a1 = Linear regression coefficient
We wish to find a0 and a1 such that $\sum_i (y_i - (a_0 + a_1 x_i))^2$ is minimum:
$a_0 = \bar{y} - a_1 \bar{x}$ and $a_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$
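As a minimal sketch, the closed-form expressions for a0 and a1 can be coded directly. The function name and the four data points are hypothetical, chosen so the points lie exactly on y = 1 + 2x.

```python
def fit_simple_regression(xs, ys):
    """Return (a0, a1) minimising the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a1 = s_xy / s_xx           # slope
    a0 = y_bar - a1 * x_bar    # intercept
    return a0, a1

# Hypothetical data lying exactly on y = 1 + 2x.
a0, a1 = fit_simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(a0, a1)  # 1.0 2.0
```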
19. [Figure: scatter plot of Y against X]
[Figure: best fit line, y = 0.7019x + 2.4094, R² = 0.8363]
Imagine if we add a couple of large values in the data, will it affect the regression line? Let’s check it.
[Figure: best fit line with outliers, y = 1.5952x - 4.3564, R² = 0.4629]
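The effect can be reproduced on a small scale. The data below are made up for illustration and are not the values behind the slide's charts. The sketch fits the line and computes R² (the squared correlation, which equals R² for a one-variable linear fit) before and after two large values are appended.

```python
def slope_intercept_r2(xs, ys):
    """Least-squares slope, intercept and R^2 for a single predictor."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r2 = s_xy ** 2 / (s_xx * s_yy)
    return slope, intercept, r2

# Hypothetical, nearly linear data.
xs = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
ys = [4, 5, 7, 8, 9, 11, 12, 14, 15, 16]
clean = slope_intercept_r2(xs, ys)

# Same data with two unusually large y values appended.
with_outliers = slope_intercept_r2(xs + [5, 7], ys + [40, 45])

print("clean:         slope=%.3f intercept=%.3f R2=%.3f" % clean)
print("with outliers: slope=%.3f intercept=%.3f R2=%.3f" % with_outliers)
```

Two extra points are enough to change the fitted line and sharply reduce R², which is why outliers are screened before fitting.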
20. Detection of outliers in Machine Learning model:
• Using Z score :
• The Z score tells whether a data value is greater or smaller than the mean and how many standard deviations away it is from the mean.
• $Z = \frac{x - \bar{x}}{\sigma}$
• Values above $\bar{x} + 3\sigma$ or below $\bar{x} - 3\sigma$ are considered outliers.
Q. What is the most appropriate measure of central tendency when the data has outliers?
The median is usually preferred in these situations because the value of the mean can be distorted by the outliers.
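A minimal sketch of the Z-score rule follows, using the population standard deviation from Python's `statistics` module. The function name and the sample values are hypothetical. Note that with very few observations no value can reach |Z| > 3, so the rule needs a reasonably large sample.

```python
from statistics import fmean, pstdev

def zscore_outliers(values, threshold=3.0):
    """Return the values whose Z score lies beyond +/- threshold."""
    mean = fmean(values)
    sd = pstdev(values)  # population standard deviation
    return [v for v in values if abs((v - mean) / sd) > threshold]

# Hypothetical measurements with one unusually large value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
        12, 10, 11, 12, 13, 11, 12, 10, 11, 60]
print(zscore_outliers(data))  # [60]
```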
21. • Inter-Quartile Range (IQR) proximity rule :
The data points which fall below Q1 – 1.5 IQR or above Q3 + 1.5 IQR are outliers.
A box plot, also called a box-and-whisker plot, is a graphical method for showing this.
Its purpose here is to identify outliers so that they can be discarded from the data series.
[Figure: box plots of crop production and crop area]
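The IQR proximity rule can be sketched with `statistics.quantiles`, which returns the three quartile cut points when `n=4`. Quartile conventions differ slightly between tools, so the exact fences may vary. The function name and the data are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical data series with one extreme value.
print(iqr_outliers([2, 4, 4, 5, 5, 6, 6, 7, 8, 30]))  # [30]
```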
22. ASSUMPTIONS OF REGRESSION ANALYSIS IN ML :
• Linear and additive
• No auto correlation
• No multicollinearity
• Homoscedasticity
• Normal distribution of errors
These assumptions are often violated. If a researcher overlooks such a violation, the model can become unreliable for prediction.
23. REGULARIZATION :
• Regularization is an important concept used to avoid overfitting, especially when the training and test data differ considerably.
• Two types :
L2 Ridge regression
L1 Lasso regression
L2 Ridge regression :
• It performs ‘L2 regularization’, i.e. it adds a penalty equivalent to the square of the magnitude of the coefficients. Thus, it optimizes the following:
Objective = RSS + 𝜆 * (sum of squares of coefficients)
24. In the ridge objective, RSS is the loss term and 𝜆 * (sum of squares of coefficients) is the penalty term.
• 𝜆 is the tuning parameter which balances the emphasis given to minimizing RSS versus minimizing the sum of squares of coefficients.
• It is mainly used to prevent overfitting and to handle multicollinearity.
• It reduces the model complexity by coefficient shrinkage.
L1 Lasso regression :
• LASSO stands for Least Absolute Shrinkage and Selection Operator.
• Lasso regression performs L1 regularization, i.e. it adds the sum of the absolute values of the coefficients to the optimization objective.
25. • Objective = RSS + 𝜆 * (sum of absolute values of coefficients)
• It is generally used when we have a large number of features, because it automatically performs feature selection, which gives it an advantage over ridge regression.
[Figure: constraint region and RSS contours as RSS moves away from its minimum]
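The difference between the two penalties is easiest to see in the simplest possible case. The sketch below assumes a single feature and no intercept (an assumption made only to keep the algebra closed-form), so the fitted model is y ≈ a·x. Minimising RSS + 𝜆a² gives a = Sxy / (Sxx + 𝜆), while minimising RSS + 𝜆|a| gives a soft-thresholded estimate that becomes exactly zero once 𝜆 is large enough. Function names and data are hypothetical.

```python
import math

def _sums(xs, ys):
    s_xy = sum(x * y for x, y in zip(xs, ys))
    s_xx = sum(x * x for x in xs)
    return s_xy, s_xx

def ols_slope(xs, ys):
    s_xy, s_xx = _sums(xs, ys)
    return s_xy / s_xx

def ridge_slope(xs, ys, lam):
    # Minimises sum((y - a*x)^2) + lam * a^2
    s_xy, s_xx = _sums(xs, ys)
    return s_xy / (s_xx + lam)

def lasso_slope(xs, ys, lam):
    # Minimises sum((y - a*x)^2) + lam * |a|  (soft thresholding)
    s_xy, s_xx = _sums(xs, ys)
    shrunk = max(abs(s_xy) - lam / 2, 0.0)
    return math.copysign(shrunk, s_xy) / s_xx

# Hypothetical data on the line y = 2x.
xs, ys = [1, 2, 3], [2, 4, 6]
for lam in (0, 14, 56):
    print(lam, ridge_slope(xs, ys, lam), lasso_slope(xs, ys, lam))
```

Ridge shrinks the coefficient towards zero without reaching it, whereas lasso sets it to exactly zero at 𝜆 = 56. This is the mechanism behind lasso's automatic feature selection.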
26. CLASSIFICATION :
• A common job of machine learning algorithms is to recognize objects and separate them into categories.
KNN (K-Nearest Neighbor) algorithm:
• K-NN is a non-parametric algorithm.
• It is also called a lazy learner algorithm.
• At the training phase the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into a category based on a distance measure.
• One of these measures is the Minkowski distance:
$D(p, q) = \left( \sum_{i=1}^{n} |p_i - q_i|^c \right)^{1/c}$
c : a parameter
p, q are two points
27. Euclidean distance (when c = 2) :
$\sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
Manhattan distance (when c = 1) :
$\sum_{i=1}^{n} |p_i - q_i|$
[Figure: points P and Q; scatter plot illustrating K = 5]
• The number of neighbors K is generally taken as odd : 3, 5, etc.
• Very simple
• Works with any number of classes
• Re-scaling is very important as it is a distance-based algorithm.
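A minimal sketch of KNN with the Minkowski distance is given below. The function names, the two-feature training points and the class labels are hypothetical. In practice the features would be re-scaled first, as noted above.

```python
from collections import Counter

def minkowski(p, q, c=2):
    """Minkowski distance; c=2 is Euclidean, c=1 is Manhattan."""
    return sum(abs(a - b) ** c for a, b in zip(p, q)) ** (1 / c)

def knn_predict(train, query, k=5, c=2):
    """train is a list of (features, label); return the majority label
    among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: minkowski(item[0], query, c))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training data.
train = [((1, 1), "healthy"), ((1, 2), "healthy"), ((2, 1), "healthy"),
         ((8, 8), "unhealthy"), ((9, 8), "unhealthy"), ((8, 9), "unhealthy")]

print(minkowski((0, 0), (3, 4), c=2))  # 5.0
print(minkowski((0, 0), (3, 4), c=1))  # 7.0
print(knn_predict(train, (2, 2), k=3))  # healthy
```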
28. Accuracy :
Confusion matrix (rows : actual value, columns : predicted value)
n | Predicted 0 | Predicted 1
Actual 0 | TN | FP
Actual 1 | FN | TP
TN : True Negative
FP : False Positive
FN : False Negative
TP : True Positive
Let’s see an example (n = 150) :
n = 150 | Predicted healthy | Predicted unhealthy
Actual healthy | 40 | 10
Actual unhealthy | 5 | 95
Accuracy = Correctly predicted / (TN + FP + FN + TP)
Error rate = Wrongly predicted / (TN + FP + FN + TP)
Therefore,
Accuracy = (40 + 95)/(40 + 10 + 5 + 95) = 0.9 or 90%
Error rate = 15/150 = 0.1
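The same arithmetic as a short sketch. The function name is hypothetical, and the counts are those of the example, with "healthy" treated as class 0 as in the table layout.

```python
def accuracy_and_error(tn, fp, fn, tp):
    """Accuracy and error rate from the four confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total
    error_rate = (fp + fn) / total
    return accuracy, error_rate

acc, err = accuracy_and_error(tn=40, fp=10, fn=5, tp=95)
print(acc, err)  # 0.9 0.1
```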
But how do we know which number
to take as K ?
Is it 5, 7 or any other number?
29. Support Vector Machine (SVM) :
• In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features) with the value of each feature being the value of a particular coordinate.
• The goal is to find the decision boundary that separates the classes.
Two types :
• Linear SVM : the dataset can be classified into two classes by using a single straight line.
• Non-linear SVM : the dataset cannot be classified by using a straight line.
[Figure: scatter plot of two classes]
30. [Figure: two classes separated by the maximum-margin hyperplane, with the maximum margin and the support vectors marked]
Terminology :
Hyperplane : The best decision line or boundary.
Support vectors : The points from each class that lie closest to the hyperplane.
Margin : The distance between the support vectors and the hyperplane. It should be maximum.
Kernel : A kernel function generally transforms non-linear data into linearly separable data.
31. To transform the non-linear data :
• $y = x^2$ (for 1D non-linear data)
By adding this dimension, we will get two-dimensional space.
• $z = x^2 + y^2$ (for 2D non-linear data)
By adding this dimension, we will get three-dimensional space.
Radial Basis Function (RBF) :
• It computes the similarity, or how close points x1 and x2 are to each other.
$k(x_1, x_2) = \exp\left( -\frac{\lVert x_1 - x_2 \rVert^2}{2\sigma^2} \right)$
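Both ideas can be sketched in a few lines. The function names and the sample points are hypothetical. `lift_1d` adds the x² dimension to 1D data, and `rbf_kernel` evaluates the RBF similarity, which is 1 for identical points and decays towards 0 as they move apart.

```python
import math

def lift_1d(x):
    """Map a 1D value to 2D by adding x^2 as a second coordinate."""
    return (x, x ** 2)

def rbf_kernel(x1, x2, sigma=1.0):
    """RBF similarity exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / (2 * sigma ** 2))

# Hypothetical 1D data: inner points (-1, 1) and outer points (-3, 3)
# cannot be split by one threshold on x, but after lifting they are
# separated by the horizontal line x^2 = 5.
print([lift_1d(x) for x in (-3, -1, 1, 3)])

print(rbf_kernel((1.0, 2.0), (1.0, 2.0)))   # identical points
print(rbf_kernel((0.0, 0.0), (1.0, 0.0)))   # one unit apart
print(rbf_kernel((0.0, 0.0), (5.0, 0.0)))   # far apart
```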
32. UNSUPERVISED MACHINE LEARNING :
• Unsupervised learning is the training of a machine using information that is neither classified nor labeled.
• It groups unsorted information according to similarities, patterns, and differences without any prior training on the data.
Hierarchical Clustering :
It creates clusters in a predefined order: similar clusters are grouped together and arranged in a hierarchical manner.
Non Hierarchical Clustering :
It forms new clusters by merging or splitting clusters, and does not follow a tree-like structure as hierarchical clustering does. K-means clustering and DBSCAN are two effective algorithms.
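As an illustration of a non-hierarchical method, here is a minimal K-means sketch. It repeatedly assigns each point to its nearest centroid and then recomputes each centroid as the mean of its points. The function names, the sample points and the starting centroids are assumptions for this example. Real implementations usually pick the starting centroids at random and run several restarts.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(cluster):
    # Coordinate-wise mean of the points in one cluster.
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))

def kmeans(points, centroids, n_iter=100):
    """Return (centroids, clusters) after alternating assign/update steps."""
    centroids = list(centroids)
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:  # assignments have stabilised
            break
        centroids = updated
    return centroids, clusters

# Hypothetical, well-separated data and starting centroids.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```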
33. [Figure: clusters] 3 clusters are formed when the data is uniform, i.e. when the data is easily separable with the naked eye. What if the data is non-uniform?
34. DBSCAN (DENSITY-BASED SPATIAL CLUSTERING OF APPLICATIONS WITH NOISE) :
• The key idea of the DBSCAN algorithm is that, for each point of a cluster, the neighborhood of a given radius has to contain at least a minimum number of points.
• The DBSCAN algorithm requires two parameters:
Min_pts : The minimum number of points (a threshold) clustered together for a region to be considered dense.
Eps (ε) : A distance measure that will be used to locate the points in the neighborhood of any point.
• In this algorithm, we have 3 types of data points.
Core Point : A point that has at least MinPts points within eps.
Border Point : A point which has fewer than MinPts points within eps but is in the neighborhood of a core point.
Noise or outlier : A point which is neither a core point nor a border point.
35. [Figure: DBSCAN example with Min_pts : 4, showing core, border and noise points]
Red : Core points
Green : Border points, which are still part of the cluster because they are within epsilon of a core point but do not themselves meet the min_points criterion.
Blue : Noise point, not assigned to a cluster.
Important points :
• Other clustering methods are suitable only for compact and well-separated clusters. In non-uniform data, DBSCAN is much better.
• It is robust to outliers.
• It builds clusters from dense regions; points in lower-density regions are left unassigned as noise.
• MinPts should be at least 3.
• DBSCAN uses Euclidean distance by default: $\sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
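The core, border and noise definitions can be turned into a compact sketch. This is an illustrative implementation, not the one used in the seminar. It assumes the common convention that a point counts as its own neighbour and that a core point has at least `min_pts` points within `eps`. The sample coordinates are hypothetical.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)      # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may become a border point later
            continue
        cluster_id += 1                # i is a core point: start a cluster
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id     # border point of this cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:   # j is also core: keep expanding
                queue.extend(j_neighbours)
    return labels

# Hypothetical data: two dense squares, one border point and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2.2, 1.0),
          (10, 10), (10, 11), (11, 10), (11, 11), (5, 5)]
print(dbscan(points, eps=1.5, min_pts=4))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The point (2.2, 1.0) has only two points within eps, so it is not a core point, but it joins cluster 0 as a border point because it is within eps of the core point (1, 1). The point (5, 5) stays noise.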
36. IMAGE FEATURE EXTRACTION :
Texture extraction
• Find the number of different intensity levels in the image. This determines the size of the GLCM.
• Find the intermediate matrix A by counting how frequently a pixel p occurs in a particular spatial relationship with pixel q.
• Calculate the GLCM by dividing each element of matrix A by the sum of the elements of matrix A.
Color extraction
• Extract the three components red, green and blue from the image.
• Convert the color image to an HSV image.
• Extract the hue, saturation and intensity of the image.
• For each component extracted, compute the mean, variance and range.
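The texture-extraction steps above can be sketched as follows. The spatial relationship is assumed here to be "q is the pixel immediately to the right of p", and the 4 × 4 image with four grey levels is a hypothetical example. The function name is also an assumption.

```python
def glcm(image, levels, offset=(0, 1)):
    """Normalised grey-level co-occurrence matrix.

    image  : list of rows of integer grey levels in range(levels)
    offset : (row step, column step) from pixel p to its partner q
    """
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    # Intermediate matrix A: how often level i at p is paired with level j at q.
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    # Divide every element of A by the sum of all elements of A.
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]

# Hypothetical 4 x 4 image with 4 intensity levels (so the GLCM is 4 x 4).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
for row in glcm(image, levels=4):
    print([round(v, 3) for v in row])
```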
Grey Level Co-occurrence matrix (GLCM) is used to find:
38. CASE STUDY-1 :
CLASSIFICATION OF GRAPE LEAVES USING KNN
AND SVM CLASSIFIERS
Anil A. Bharate, M. S. Shirdhonkar (2020)
39. DATA SOURCE:
• This case study proposes a technique to classify a grape leaf as healthy or unhealthy.
• The database consisted of 90 images of grape leaves.
Training : 30 images of healthy and 30 images of unhealthy leaves.
Testing : 30 images including healthy and unhealthy leaves.
• Feature extraction (Image processing) :
Texture and color features are extracted using Grey Level Co-occurrence Matrix (GLCM).
[Figure: a) Healthy leaf, b) Unhealthy leaf]
40. RESULTS :
Parameter            Proposed method (SVM)    Proposed method (KNN)
Features             4 texture & 18 color     4 texture & 18 color
Classifier           SVM                      KNN
Number of samples    30                       30
Comparison of Results (bar chart) : Accuracy (%) : SVM 90, KNN 96.66
The accuracy of KNN is better than that of the SVM model. A likely reason is that KNN computes the distance from a query point to all training points, finds the nearest neighbors and then decides the class from them, whereas SVM considers only the support vectors to find the separating hyperplane and then decides the class.
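The KNN decision rule described above can be sketched as follows. This is a minimal illustration, not the classifier used in the case study: the function name knn_predict, the two-dimensional feature vectors, the labels and k=3 are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest neighbors."""
    # Distance from the query to every training point.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Majority vote among the k closest points.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors (for example, two scaled image features).
train_X = [(0.2, 0.3), (0.1, 0.2), (0.3, 0.25), (0.8, 0.7), (0.9, 0.8), (0.75, 0.9)]
train_y = ["healthy", "healthy", "healthy", "unhealthy", "unhealthy", "unhealthy"]

print(knn_predict(train_X, train_y, (0.25, 0.25)))
print(knn_predict(train_X, train_y, (0.9, 0.8)))
```

Because every prediction scans all training points, KNN needs no training step but becomes slower as the dataset grows.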
41. CONCLUSIONS OF CASE STUDY :
• Automation will be a boon for farmers, helping them protect their plants from diseases and increase the yield.
• The KNN classifier gives better accuracy than the SVM classifier.
• As future work, the system can be trained to identify the diseases present on the grape leaves and also suggest possible solutions.
• Automatic image-capturing cameras can be installed with the help of government bodies. The captured images can then be sent for feature selection and used to train and test several algorithms, so that the algorithm with the best accuracy is chosen for identifying infected leaves at scale in future.
42. CASE STUDY-2 :
CROP AND FERTILIZER RECOMMENDATION SYSTEM BASED ON SOIL CLASSIFICATION
Akshatha et al. (2022)
43. DATA SOURCE :
• The case study mainly focuses on classifying the soil records gathered from GKVK UAS, Bangalore, Karnataka.
• It includes samples from various taluks of Chikkamagaluru district, such as Tarikere, Kadur, Sringeri and Koppa.
Soil samples : 1550 (Training – 70%, Testing – 30%)
Attributes : N, P, K, Ca, Mg, Lime, C, S and moisture.
Algorithm used : SVM, KNN
Classification of soil nutrition into 4 classes    Crops suggested
Class 0 (low fertile)                              Beans, green peas, carrot, onion
Class 1 (moderately fertile)                       Radish, cowpea, cabbage, cauliflower
Class 2 (high fertile)                             Sugarcane, paddy, bajra, guava
Class 3 (very high fertile)                        Barley, cotton, tobacco, sunflower
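Once a classifier has predicted the soil class, the recommendation step reduces to a lookup in the table above. A minimal sketch, assuming the predicted class arrives as an integer 0 to 3; the names CROPS_BY_CLASS and recommend_crops are hypothetical.

```python
# Crops suggested per soil fertility class, copied from the table above.
CROPS_BY_CLASS = {
    0: ["Beans", "green peas", "carrot", "onion"],          # low fertile
    1: ["Radish", "cowpea", "cabbage", "cauliflower"],      # moderately fertile
    2: ["Sugarcane", "paddy", "bajra", "guava"],            # high fertile
    3: ["Barley", "cotton", "tobacco", "sunflower"],        # very high fertile
}


def recommend_crops(predicted_class):
    """Return the suggested crops for a predicted soil class (0 to 3)."""
    if predicted_class not in CROPS_BY_CLASS:
        raise ValueError(f"unknown soil class: {predicted_class}")
    return CROPS_BY_CLASS[predicted_class]


print(recommend_crops(2))
```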
44. Results :
Ca      Mg      K       S    N       Lime  C     P    Moisture  Class
9.653   6.585   142     108  226.05  5.83  1.29  18   0.9       1
19.88   22.2    339.35  77   308.25  6.45  2     298  0.8       2
2.931   41.22   514.29  108  277.42  6.43  0.74  48   0.6       1
Figure: Confusion matrix for the SVM model (true class vs. predicted class).
Correctly classified    Incorrectly classified    Total testing data    Accuracy of SVM
845                     240                       1085                  77.85%
Class-0 labels (1st row of the confusion matrix) :
58% correctly predicted as class-0
27% misclassified as class-1
1.1% misclassified as class-2
14% misclassified as class-3
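Row percentages and overall accuracy of this kind are read off a confusion matrix as sketched below. The counts here are hypothetical and deliberately small; they are not the values from the case study, and the function names are assumptions.

```python
def accuracy(cm):
    """Overall accuracy: correctly classified (diagonal) divided by all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def row_percentages(cm):
    """Express each row (one true class) as percentages of that row's total."""
    return [[100.0 * value / sum(row) for value in row] for row in cm]


# Hypothetical 3-class confusion matrix: rows are true classes, columns predicted.
cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 3, 7],
]
print(accuracy(cm))
print(row_percentages(cm)[0])
```

In this made-up matrix, 80% of the true class-0 samples are predicted correctly and 20% are misclassified as class-1.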
45. CONCLUSIONS OF CASE STUDY :
• The KNN algorithm was also applied and gave a lower accuracy of 72.04%.
• The SVM algorithm obtained higher accuracy as it captured non-linearity in the data.
• Based on the predicted soil class, crops can be recommended.
• This can help farmers grow the crop best suited to their soil condition.
• The model can be improved with more hyperparameter tuning, which can increase its accuracy and ultimately help farmers know the fertility level of their farm soil and the crops suggested for that level.
46. CASE STUDY-3 :
Sugar Cane Crop Yield Estimation Using K-Nearest Neighbors
Kumar and Balakrishnan (2018)
47. DATA SOURCE :
• The dataset includes the predictors : Rainfall, pH, Organic Carbon, Area, S, Cu, Fe, P, Mn, N, Fibre.
• Dependent variable : Yield (tons)
• Crop considered : Sugarcane
• State : Telangana
• Period : 1901 to 2016, annual data.
• Data were re-scaled before analysis.
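Re-scaling followed by KNN regression can be sketched as follows. This is an illustration under assumptions: the slides do not say which re-scaling was used, so min-max scaling to [0, 1] is assumed here; the two predictors (rainfall, pH), all numbers, k=3 and the function names are hypothetical.

```python
import math


def min_max_scale(rows):
    """Scale every column to [0, 1]; return scaled rows and a reusable scaler."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]

    def scale(row):
        # Constant columns are mapped to 0.0 to avoid division by zero.
        return [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]

    return [scale(row) for row in rows], scale


def knn_regress(train_X, train_y, query, k=3):
    """Predict a numeric target as the mean target of the k nearest neighbors."""
    nearest = sorted(zip((math.dist(x, query) for x in train_X), train_y))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical records: (rainfall in mm, soil pH) and yield in tons.
X = [[800, 6.5], [900, 6.8], [1000, 7.0], [1100, 7.2], [1200, 7.5]]
y = [60.0, 65.0, 70.0, 75.0, 80.0]

X_scaled, scale = min_max_scale(X)
prediction = knn_regress(X_scaled, y, scale([1000, 7.0]), k=3)
print(prediction)
```

Re-scaling matters for KNN because, without it, a predictor with large numeric values such as rainfall would dominate the distance over a small-valued one such as pH.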
49. CONCLUSIONS OF CASE STUDY :
• Accurate yield predictions across different areas can help farmers get better profit from their crops.
• KNN is mostly used for classification problems, but it can also serve as an alternative approach for regression.
• In future, predictions can be made using different algorithms and their accuracies compared to choose the best among them.
50. CASE STUDY-4 :
MULTIVARIATE WEATHER ANOMALY DETECTION USING DBSCAN CLUSTERING ALGORITHM
Wibisono et al. (2021)
51. DATA SOURCE :
• Dataset : 8 attributes used from daily weather data.
• Place : Semarang city, Indonesia.
• Algorithms used : DBSCAN & PCA
Attribute                    Data type
Min. temperature             Numerical
Max. temperature             Numerical
Average temperature          Numerical
Average humidity (%)         Numerical
Sun exposure time (hours)    Numerical
Maximum wind speed (m/s)     Numerical
Average wind speed (m/s)     Numerical
Rainfall (mm)                Numerical
52. RESULTS :
eps = 0.19
PC1 mainly consisted of : Avg. temperature, Max. temperature and Avg. humidity.
PC2 mainly consisted of : minimum temperature (Tn).
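Statements such as "PC1 mainly consisted of ..." come from the loadings of a principal component, i.e. the weight each attribute receives in it. A minimal sketch of computing the first component with power iteration in plain Python follows; it is an illustration only, not the procedure of the cited study. The two-column data, the function name and the iteration count are assumptions, and the method assumes the starting vector is not orthogonal to the first component and that the data are not constant.

```python
import math


def first_principal_component(rows, iterations=200):
    """Unit-length loading vector of the first principal component."""
    n, d = len(rows), len(rows[0])
    # Center every column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical data with two attributes; the second varies twice as much as the first.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
loadings = first_principal_component(rows)
print(loadings)
```

Here the second attribute has the larger loading, so it would be described as the one the component "mainly consists of". The projected points can then be passed to DBSCAN to flag noise.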
53. CONCLUSIONS :
• The results showed that anomalous weather is characterized by high humidity and low temperature.
• The experimental results demonstrated that DBSCAN is capable of identifying peculiar data points that deviate from the ‘normal’ data distribution.
• PCA can be used together with DBSCAN in the detection of noise.
54. CONCLUSIONS OF MACHINE LEARNING :
• No algorithm is appropriate for all situations.
• Choosing a technique depends on pattern, type of data and experience of the
analyst.
• Using ML algorithms as a pipeline can save the analyst's time and give fast solutions to the farmer.
• There is wide scope for applying ML in agriculture, especially in plant disease classification, soil or crop classification and crop yield prediction.
• Automation can help reduce the biotic and abiotic stresses prevailing in the country's fields.
55. REFERENCES :
• Akshatha, G.C. and Shastry, K.A., 2022. Crop and fertilizer recommendation system
based on soil classification, Recent Advances in Artificial Intelligence and Data
Engineering (pp. 29-40).
• Bharate, A.A. and Shirdhonkar, M.S., 2020. Classification of grape leaves using KNN
and SVM classifiers, 2020 Fourth International Conference on Computing
Methodologies and Communication (ICCMC) (pp. 745-749).
• Kumar, N.N. and Balakrishnan, M., 2018. Sugar cane crop yield estimation using K-Nearest Neighbors, Journal of Advance Research in Dynamical and Control Systems, 10(4), (pp. 199-207).
• Wibisono, S., Anwar, M.T., Supriyanto, A. and Amin, I.H.A., 2021. Multivariate weather
anomaly detection using DBSCAN clustering algorithm, Journal of Physics: Conference
Series (Vol. 1869, No. 1, p. 012077).