3. Introduction
■ Machine Learning (ML) is a vital decision-support tool for crop yield prediction. It informs
decisions such as what to cultivate and how to improve and maintain soil quality, humidity,
etc.
■ Numerous ML techniques have been applied in this field to support crop yield forecasting research.
Forecasting crop yield well before cultivation would help farmers take appropriate
measures for marketing and storage.
■ This paper proposes an improved classification-based interactive model that predicts crop
yield for the state of Punjab, India, before cultivation begins, based on
major parameters, viz. climate and production area.
■ The performance of the system is also analyzed using four ML algorithms: “Random Forest”,
“Lasso Regression”, “Linear Regression”, and “Support Vector Machine”. Based on the literature
reviewed, Lasso Regression has not previously been used to predict crop yield. The system's 96.72%
accuracy demonstrates its effectiveness in helping farmers select the most appropriate crops for harvesting.
5. Proposed Model
■ Based on the literature, very little research has been
carried out on real data; moreover, the “Lasso Regression” algorithm has
not previously been used in crop prediction.
■ This paper proposes a novel model to predict yield for the state of Punjab,
India, and analyzes it with the ‘Random Forest’, ‘Linear Regression’, ‘Lasso
Regression’, and ‘Support Vector Machine’ algorithms.
■ Predicting crop yield before harvest would help policy makers and
farmers take suitable measures for marketing and loading.
■ The research in this paper aims to give stakeholders an estimate of how
much crop will be produced, based on the farming season,
crop, farm area, and production in previous years.
6. Results and Discussion
■ In this study, machine learning played a central role in building a model
for crop yield prediction.
■ Various machine learning algorithms are applied to the dataset using the Python
programming language.
■ The dataset was preprocessed using the SSIS technique.
■ Different metrics are used to evaluate the performance of the
algorithms.
■ Random Forest, Linear Regression, Lasso Regression, and Support Vector
Machine are applied to the records for the state of Punjab, India, in the
dataset to predict crop yield in advance of harvesting.
■ These algorithms were compared using the R2-Score and MSE.
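The comparison described above can be sketched in Python as follows. This is a minimal illustration, not the study's actual pipeline: it assumes scikit-learn is available, uses synthetic placeholder data rather than the Punjab dataset, assumes regression variants of each algorithm (since R2-Score and MSE are regression metrics), and picks hypothetical hyperparameters and feature names.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def make_synthetic_data(n_rows=300, seed=0):
    """Build placeholder data: two numeric features and a noisy target.

    The features are hypothetical stand-ins (area, rainfall); they are
    NOT the Punjab dataset used in the study.
    """
    rng = np.random.default_rng(seed)
    area = rng.uniform(1.0, 100.0, n_rows)
    rainfall = rng.uniform(200.0, 1200.0, n_rows)
    production = 3.0 * area + 0.05 * rainfall + rng.normal(0.0, 5.0, n_rows)
    return np.column_stack([area, rainfall]), production


def compare_models(X, y, seed=0):
    """Fit the four algorithms and return R2 and MSE on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Hyperparameters below are illustrative assumptions, not tuned values.
    models = {
        "Random Forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        "Linear Regression": LinearRegression(),
        "Lasso Regression": Lasso(alpha=0.1),
        "Support Vector Machine": SVR(kernel="rbf"),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        scores[name] = {
            "r2": r2_score(y_test, predictions),
            "mse": mean_squared_error(y_test, predictions),
        }
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, metrics in compare_models(X, y).items():
        print(f"{name}: R2={metrics['r2']:.4f}, MSE={metrics['mse']:.2f}")
```

The ranking produced on this synthetic data says nothing about the ranking reported in the study; the sketch only shows how the same held-out split and the same two metrics can be applied to all four models.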
7. R2-Score and MSE
R2-Score: It measures the ‘goodness of fit’ of a model; the best possible R2-Score value
is 1. The closer the value of R-square is to 1, the better the model. R2 can also be negative if
the model fits worse than a baseline that always predicts the mean [15].
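R-square = 1 − (SS_res / SS_total)
SS_res: Residual sum of squares, i.e. the sum of (actual − predicted)² over all samples
SS_total: Total sum of squares, i.e. the sum of (actual − mean of actual)² over all samples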
Mean Squared Error: MSE is a statistical measure calculated as the mean of
the squared differences between actual and estimated values. The lower the value of MSE, the better
the model.
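Both metrics can be computed directly from their definitions. The following is a minimal sketch in plain Python; the function names `mse` and `r2` and the sample values are illustrative and are not taken from the study.

```python
def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """R-square = 1 - SS_res / SS_total."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    if ss_total == 0:
        raise ValueError("R-square is undefined when all actual values are equal")
    return 1 - ss_res / ss_total


if __name__ == "__main__":
    actual = [2.0, 4.0, 6.0, 8.0]          # made-up values for illustration
    good_fit = [2.5, 3.5, 6.5, 7.5]        # each prediction is off by 0.5
    mean_only = [5.0, 5.0, 5.0, 5.0]       # always predicts the mean
    reversed_fit = [8.0, 6.0, 4.0, 2.0]    # worse than predicting the mean

    print(f"{mse(actual, good_fit):.2f}")      # 0.25
    print(f"{r2(actual, good_fit):.2f}")       # 0.95
    print(f"{r2(actual, mean_only):.2f}")      # 0.00
    print(f"{r2(actual, reversed_fit):.2f}")   # -3.00
```

The last two cases show why R2 can be zero or negative: a model that always predicts the mean scores 0, and a model that does worse than that scores below 0.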
8. Yield Prediction through Proposed Model
■ The figure illustrates the yield prediction results for “rice”
in the ‘kharif’ season for all districts, as rice is a
major crop grown in Punjab.
■ Column 7 in the figure shows the actual production year
by year from 1997 to 2014, as extracted from the
dataset.
■ Column 8 shows the ‘Prediction’ for each
year. As the table shows, the highest accuracy of the
proposed model, 96.72%, is obtained with the Random Forest
algorithm, and the model predicts crop yield here
with the same accuracy (96.72%). This validates the
effectiveness of our model. The graph in the figure plots the
actual versus predicted production. Similarly, our
model can predict the yield for every crop
and district of Punjab.
9. Graph for actual and predicted yield of ‘rice’ in ‘kharif’ season for Punjab
10. Conclusions
■ This research uses an ML approach and evaluates the performance of four
supervised ML algorithms, Random Forest, Linear Regression, Lasso Regression, and SVM, in
predicting crop yield in advance.
■ In our proposed model, Random Forest proves the most accurate, with an accuracy of
96.72% in crop yield prediction.
■ Lasso Regression also outperforms SVM on large real-world data.
■ This model would help reduce the difficulties faced by farmers and
serve as a source of the information they need to maximize
profits.
11. References
1. Khosla, E., Dharavath, R., & Priya, R. (2020). Crop yield prediction using aggregated rainfall-based modular artificial neural networks and
support vector regression. Environ Dev Sustain, 22, 5687–5708.
2. Kannan, E., & Sundaram, S. (2011). Analysis of trends in India's agricultural growth. Bangalore, India.
3. Lobell, D. B., & Burke, M. B. (2010). On the use of statistical models to predict crop yield responses to climate change. Agricultural and
Forest Meteorology, 150(11), 1443–1452.
4. Dharmaraja, S., Jain, V., Anjoy, P., et al. (2020). Empirical analysis for crop yield forecasting in India. Agric Res, 9, 132–138.
5. Mahesh, B. (2020). Machine learning algorithms - a review. International Journal of Science and Research (IJSR), 9, 381–386.
6. P. Priya, U. Muthaiah, & M. Balamurugan. (2018). Predicting yield of the crop using machine learning algorithm. International Journal of
Engineering Science Research Technology, 7(4).
7. J. Jeong, J. Resop, & N. Mueller. (2018). Random forests for global and regional crop yield prediction. PLoS ONE.
8. E. Manjula, S. (2017). A model for prediction of crop yield. International Journal of Computational Intelligence and Informatics, 6(4).
9. Paswan, R., & Begum, S. (2013). Regression and neural networks models for prediction of crop production. International Journal of Scientific &
Engineering Research, 4(9).
10. Veenadhari, S., Misra, B., & Singh, D. (2017). Machine learning approach for forecasting crop yield based on climatic parameters. International Conference
on Computer Communication and Informatics.
11. D. Ramesh & B. Vardhan. (2015). Analysis of crop yield prediction using data mining techniques. International Journal of Research in Engineering
and Technology, 4(1), pp. 47-473.
12. Shweta K Shahane & Prajakta V Tawale. (2016). Prediction on crop cultivation. International Journal of Advanced Research in Computer Science and
Electronics Engineering (IJARCSEE), 5(10).
13. Konstantinos P. Ferentinos, Costas P. Yialouris, Petros Blouchos, Georgia Moschopoulou, & Spyridon Kintzios. (2013). Pesticide residue screening
using a novel artificial neural network combined with a bioelectric cellular biosensor. Hindawi Publishing Corporation, BioMed Research
International, Volume V
14. Chaudhary, S., Arora, Y., & Yadav, N. (2020). Optimization of random forest algorithm for breast cancer detection.
15. Sellam, V., & Poovammal, E. (2016). Prediction of crop yield using regression analysis. Indian Journal of Science and Technology, 9(38), 1–5.