Comparative Analysis of RMSE and MAP Metrics for
Evaluating CNN and LSTM Models
(PAPER ID – T-306) (CMT ID – 9)
Gagandeep Kaur, Satish Saini
SERB sponsored International Conference
RACCAI 2023
12th-13th October 2023
CEC Jhangeri, Mohali
Introduction
• In machine learning, evaluating and comparing model performance is crucial
• The choice of evaluation metric can significantly affect decision-making, especially for
complex models like CNNs and LSTMs
• This presentation compares the utility of two key evaluation metrics, RMSE and MAP,
in the context of CNNs and LSTMs
• CNNs and LSTMs play vital roles in many domains, which makes rigorous model
assessment necessary
• RMSE and MAP are both widely used, and each has its own strengths and suits
different tasks
• The objective is to determine how RMSE and MAP behave when evaluating the
predictive abilities of CNNs and LSTMs
• The research aims to give practitioners and researchers guidance on selecting
appropriate evaluation criteria
• The presentation covers the nuances of RMSE and MAP and their applicability in
deep learning
Machine Learning Models – CNN & LSTM
CNN
• CNNs automatically learn hierarchical features from data, which makes them
particularly suited to image analysis
• Their convolutional layers extract low-level features such as edges and gradually
build up to more complex features, enabling pattern recognition
• CNNs typically consist of convolutional layers, pooling layers for spatial downsampling,
and fully connected layers for classification
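The convolution and pooling steps described above can be illustrated with a minimal pure-Python sketch. It is not the model used in this study: the 5x5 toy image, the hand-picked vertical-edge kernel, and the helper names `conv2d` and `max_pool2d` are all assumptions chosen for illustration. A trained CNN would learn its kernels from data.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the operation a convolutional layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling, used for spatial downsampling."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


# Toy image (assumption): dark left part (0), bright right part (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel (assumption) that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)  # 4x4 map, non-zero only at the edge column
pooled = max_pool2d(feature_map)     # 2x2 map after downsampling
```

The feature map is non-zero only in the column where the intensity changes, and pooling keeps that response while halving each spatial dimension.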
LSTM
• LSTMs are a type of recurrent neural network (RNN) designed to capture
dependencies in sequential data
• They excel at time series forecasting, speech recognition, and natural language
processing, where past information influences future predictions
• LSTMs are composed of recurrent units with memory cells, which allow them to carry
information and capture temporal dependencies across time steps
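To make the role of the memory cell concrete, here is a rough sketch of one time step of a scalar LSTM cell using the standard gate equations. The weights, the input sequence, and the function name `lstm_step` are hypothetical; real LSTM layers use learned weight matrices and vector states.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell.

    w maps each gate name to a tuple (w_x, w_h, b); all values here are illustrative.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old memory to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the memory to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c = f * c_prev + i * g            # updated memory cell
    h = o * math.tanh(c)              # updated hidden state
    return h, c


# Hypothetical weights shared by all gates, for illustration only.
weights = {gate: (0.5, 0.5, 0.0)
           for gate in ("forget", "input", "output", "candidate")}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:            # a short toy sequence
    h, c = lstm_step(x, h, c, weights)
```

Because `c` is carried from one step to the next and only partly overwritten, earlier inputs can still influence later outputs.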
Performance Parameters
RMSE (Root Mean Square Error)
• RMSE is a widely adopted metric in machine learning, especially for regression tasks
• It quantifies the typical magnitude of the errors between predicted and actual values
by taking the square root of the mean of the squared differences
• Lower RMSE values indicate more accurate predictions, which makes the metric
particularly useful for continuous numerical outputs
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
where
$N$ is the number of samples,
$y_i$ is the actual value,
$\hat{y}_i$ is the predicted value.
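As a small worked illustration of the RMSE definition, the sketch below computes it with the Python standard library. The sample values and the function name `rmse` are made up for the example and are not taken from the experiments.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    squared_errors = ((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sum(squared_errors) / n)


# Illustrative values only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

score = rmse(actual, predicted)  # squared errors 1, 0, 4, 1 -> sqrt(6 / 4)
```

Because the errors are squared before averaging, the single error of 2 contributes four times as much as each error of 1, so RMSE penalizes large deviations more heavily than small ones.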
MAP (Mean Average Precision)
• MAP is often used to assess the performance of information retrieval and ranking
systems
• It summarizes ranking quality by averaging the precision measured at each rank
where a relevant item appears, then taking the mean over all queries
• MAP is valuable when evaluating models that must prioritize and rank items
effectively, such as recommendation systems and search engines
$$\mathrm{AP}(q) = \frac{1}{n_{rel}}\sum_{k=1}^{n} P(k)\cdot rel(k)$$
where
$n_{rel}$ is the number of relevant items,
$n$ is the number of ranked items,
$P(k)$ is the precision at rank $k$,
$rel(k)$ is an indicator function that equals 1 if the item at rank $k$ is relevant and 0 otherwise.
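The AP formula can be sketched in a few lines of Python. MAP is then taken, as is standard, to be the mean of AP over all queries. The relevance lists and the function names below are illustrative assumptions, and the sketch assumes every relevant item appears somewhere in the ranked list, so that the number of relevant items can be counted from the list itself.

```python
def average_precision(relevance):
    """AP for one query; relevance holds rel(k) as 0/1 flags in ranked order."""
    n_rel = sum(relevance)  # assumption: all relevant items appear in the ranking
    if n_rel == 0:
        return 0.0
    hits = 0
    total = 0.0
    for k, rel_k in enumerate(relevance, start=1):
        if rel_k:
            hits += 1
            total += hits / k  # P(k) * rel(k), with P(k) = hits / k
    return total / n_rel


def mean_average_precision(rankings):
    """Mean of AP over a non-empty collection of per-query rankings."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two illustrative queries with made-up relevance flags.
query_a = [1, 0, 1, 0]  # relevant at ranks 1 and 3 -> (1/1 + 2/3) / 2
query_b = [0, 1]        # relevant at rank 2        -> (1/2) / 1

ap_a = average_precision(query_a)
ap_b = average_precision(query_b)
map_score = mean_average_precision([query_a, query_b])
```

Under this definition each AP value, and therefore MAP, lies between 0 and 1, and higher is better. A ranking that places all relevant items first scores 1.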
Importance of Model Evaluation
• Accurate model evaluation is critical to ensuring that machine learning models are
reliable and effective in their intended tasks
• The appropriate evaluation metric depends on the specific problem, the goals,
and the type of data being processed
• Selecting the wrong metric can lead to misleading conclusions about a model's
performance
Methodology
Dataset Selection:
• We used two datasets: a curated Kaggle flower dataset totaling 2,746 images, and the
CIFAR-10 dataset
• The flower dataset was chosen to represent a diverse range of floral species, growth
stages, and environmental conditions
Programming Environment:
• All programming was done in Python in the Google Colab environment, which offers a
convenient, collaborative platform for machine learning development
Experimental Phases:
• The experimental process consisted of several key phases, each contributing to the
overall research outcomes
Model Training:
• In the initial phase, Convolutional Neural Networks (CNNs) and Long Short-Term
Memory networks (LSTMs) were trained on both datasets
• These models learned to recognize visual features in images
Fine-Tuning:
• Fine-tuning followed: an iterative process of adjusting model parameters and
architectures
• This refinement improved the models' performance and their adaptability to the
specific tasks
Results
• RMSE is an effective metric for measuring the accuracy of continuous predictions
• It was applied to assess the precision of predictions for both CNN-generated image
intensities and LSTM-generated transcriptions
• MAP (Mean Average Precision), by contrast, serves as a comprehensive evaluation tool
• It is well suited to assessing model performance in tasks that require precise
retrieval, such as object detection and speech transcription
FIGURE 1. RMSE and MAP of the CNN model over 100 epochs on the Kaggle dataset (2,746 images)
FIGURE 2. RMSE and MAP of the LSTM model over 100 epochs on the Kaggle dataset (2,746 images)
FIGURE 3. RMSE and MAP of the CNN model over 100 epochs on the CIFAR-10 dataset
FIGURE 4. RMSE and MAP of the LSTM model over 100 epochs on the CIFAR-10 dataset
TABLE 1. RMSE AND MAP AFTER 100 EPOCHS FOR CNN AND LSTM USING THE KAGGLE DATASET (2,746 IMAGES)
Model   RMSE   MAP
CNN     1.77   1.40
LSTM    1.77   1.36

TABLE 2. RMSE AND MAP AFTER 100 EPOCHS FOR CNN AND LSTM USING THE CIFAR-10 DATASET
Model   RMSE   MAP
CNN     2.39   1.12
LSTM    2.89   1.61
Conclusion
• RMSE and MAP are robust evaluation metrics for CNNs and LSTMs
• Understanding their complementary nature improves model assessment
• Model effectiveness depends on factors such as dataset quality, number of epochs,
batch size, and kernel dimensions
• Choosing the right metric for the task at hand is crucial
• Choose CNNs for image tasks and LSTMs for sequential data, taking into account
specific needs and available resources
Future Scope
• Future research can explore hybrid CNN-LSTM architectures for tasks that demand both
spatial and sequential understanding
• Optimization techniques can further enhance model performance for specific applications
References
[1] Kaur, G., & Saini, S. (2023, March). Comparison of State Vector Machine and Decision Tree-Content Based Image Retrieval Algorithms to Perceive
Accuracy. In 2023 1st International Conference on ] Innovations in High Speed Communication and Signal Processing (IHCSP) (pp. 11-15). IEEE.
[2] Kaur, G., Saini, S., & Sehgal, A. (2022). Introduction to Artificial Intelligence. In Artificial Intelligence (pp. 1-20). Chapman and Hall/CRC.
[3] Kaur, G., Saini, S., & Sehgal, A. (2022). Machine Learning–Principles and Algorithms. In Artificial Intelligence (pp. 21-54). Chapman and Hall/CRC.
[4] Kaur, G., Saini, S., & Sehgal, A. (2022). Applications of Machine Learning and Deep Learning. In Artificial Intelligence (pp. 55-70). Chapman and
Hall/CRC.
[5] Devi, P., & Parmar, M. (2017). A Survey on CBIR Techniques and Learning Algorithm Comparison. International Journal of Latest Trends in
Engineering and Technology, 8(1), 197-205.
[6] Xue, W., Wenxia, X., & Guodong, L. (2016, August). Image Edge Detection Algorithm Research Based on the CNN's Neighborhood Radius Equals 2.
In 2016 International Conference on Smart Grid and Electrical Automation (ICSGEA) (pp. 115-119). IEEE.
[7] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11),
2278-2324.
[8] Bhatia, N. (2010). Survey of nearest neighbor techniques. arXiv preprint arXiv:1007.0085
[9] Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., ... & Zieba, K. (2016). End to end learning for self-driving cars. arXiv
preprint arXiv:1604.07316.
[10] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In
Proceedings of the IEEE international conference on computer vision (pp. 1026-1034).
[11] Liu, S. S., & Tian, Y. T. (2010, June). Facial expression recognition method based on gabor wavelet features and fractional power polynomial kernel [
PCA. In International Symposium on Neural Networks (pp. 144-151). Springer, Berlin, Heidelberg.
[12] Waibel, A., & Lee, K. F. (Eds.). (1990). Readings in speech recognition. Elsevier.
[13] Pazzani, M., & Billsus, D. (1997). Learning and revising user profiles: The identification of interesting web sites. Machine learning, 27(3), 313-331.
[14] Chan, P., & Stolfo, S. J. (1999). Toward scalable learning with non-uniform distributions: Effects and a multi-classifier approach. In In Proceedings of
the Fourth International Conference on Knowledge Discovery and Data Mining.
[15] Guzella, T. S., & Caminhas, W. M. (2009). A review of machine learning approaches to spam filtering. Expert Systems with Applications, 36(7),
10206-10222.
[16] Huang, C. L., Chen, M. C., & Wang, C. J. (2007). Credit scoring with a data mining approach based on support vector machines. Expert systems with
applications, 33(4), 847-856.
[17] Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65(6), 386.
[18] McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics, 5(4),
115-133.
[19] Van Houdt, G., Mosquera, C., & Nápoles, G. (2020). A review on the long short-term memory model. Artificial Intelligence Review, 53, 5929-5955.
Thank You

Comparative Analysis of RMSE and MAP for Evaluating CNN and LSTM Models

  • 1. Comparative Analysis of RMSE and MAP Metrics for Evaluating CNN and LSTM Models (PAPER ID – T-306) (CMT ID – 9) Gagandeep Kaur, Satish Saini SERB sponsored International Conference RACCAI 2023 12th- 13th October 2023 CEC Jhangeri, Mohali
  • 2. Introduction  In the field of machine learning, evaluating and comparing model performance is crucial  The choice of evaluation metrics can significantly impact decision-making, especially for complex models like CNNs and LSTMs  This presentation explores and compares the utility of two key evaluation metrics: RMSE and MAP, specifically within the context of CNNs and LSTMs  Machine learning models, such as CNNs and LSTMs, play vital roles in various domains, necessitating rigorous model assessment  RMSE and MAP are two widely used metrics, each with unique strengths and suitability for different tasks
  • 3.  The objective is to determine how RMSE and MAP perform when evaluating the predictive abilities of CNNs and LSTMs  This research aims to provide insights into selecting appropriate evaluation criteria, aiding practitioners and researchers in making informed choices  The presentation delves into the world of model evaluation, revealing the nuances of RMSE and MAP and their applicability in the dynamic landscape of deep learning
  • 4. Machine Learning Model – CNN & LSTM CNN  CNNs are adept at automatically learning hierarchical features from data, making them particularly suited for image analysis  They employ convolutional layers to extract low-level features like edges and gradually build up to more complex features, enabling pattern recognition  CNNs typically consist of convolutional layers, pooling layers for spatial downsampling, and fully connected layers for classification
  • 5. LSTM  LSTMs are a subset of recurrent neural networks (RNNs) designed to capture dependencies in sequential data  They excel in tasks involving time series forecasting, speech recognition, and natural language processing where past information influences future predictions  LSTMs are composed of recurrent units with memory cells, which allow them to capture temporal dependencies through time steps
  • 6. Performance Parameters RMSE (Root Mean Square Error)  RMSE is a widely adopted metric in machine learning, especially in regression tasks  It quantifies the average magnitude of errors between predicted values and actual values by taking the square root of the mean of the squared differences  Lower RMSE values indicate better model accuracy, making it particularly useful when dealing with continuous numerical predictions RMSE = 𝟏 𝐍 𝐢=𝟏 𝐍 (𝐲𝐢 − 𝐲𝐢 ^ )2 where, N represents sample’s number, 𝑦𝑖 denotes the actual value, 𝑦𝑖 ^ represents the predicted value.
  • 7. MAP (Mean Average Precision)  MAP, on the other hand, is often used to assess the performance of information retrieval and ranking systems  It measures the precision of a model's ranking by calculating the average precision across various recall levels  MAP is valuable when evaluating models that need to prioritize and rank items effectively, such as recommendation systems and search engines 𝐀𝐏 𝐪 = 𝟏 𝐧𝐫𝐞𝐥 𝐤=𝟏 𝐧 𝐏 𝐤 . 𝐫𝐞𝐥(𝐤) where, 𝑛𝑟𝑒𝑙 represents relevant items number, 𝑃 𝑘 is the precision at rank k, 𝑟𝑒𝑙(𝑘) is an indicator function denoting relevance.
  • 8. Importance of Model Evaluation  Accurate model evaluation is critical in ensuring that machine learning models are reliable and effective in their intended tasks  The choice of the appropriate evaluation metric depends on the specific problem, goals, and the type of data being processed  Selecting the wrong metric can lead to misleading conclusions about a model's performance
  • 10. Dataset Selection:  We utilized a curated Kaggle flower dataset, totaling 2,746 images and Cifar-10 dataset  The dataset was meticulously chosen to represent a diverse array of floral species, growth stages, and environmental conditions, ensuring its representativeness Programming Environment:  Our programming tasks were conducted in Python, utilizing the Google Colab environment, which offers a convenient and collaborative platform for machine learning development
  • 11. Experimental Phases:  Our experimental process consisted of several key phases, each contributing to the overall research outcomes Model Training:  The initial phase involved model training, where Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs) were exposed to both the datasets  These models learned to recognize visual features in images
  • 12. Fine-Tuning: Fine-tuning followed, an iterative process of adjusting model parameters and architectures. This refinement improved the models' performance and their adaptability to specific tasks.
  • 13. Results. RMSE is an effective metric for measuring the accuracy of continuous predictions. It has been applied to assess the precision of predictions in both CNN-generated image intensities and LSTM-generated transcriptions. Conversely, MAP (Mean Average Precision) serves as a comprehensive evaluation tool. It excels in assessing the models' performance in tasks requiring precise retrieval, such as object detection and speech transcription.
  • 14. Figure 1. RMSE and MAP of the CNN model over 100 epochs on the Kaggle dataset (2,746 images)
  • 15. Figure 2. RMSE and MAP of the LSTM model over 100 epochs on the Kaggle dataset (2,746 images)
  • 16. Figure 3. RMSE and MAP of the CNN model over 100 epochs on the CIFAR-10 dataset
  • 17. Figure 4. RMSE and MAP of the LSTM model over 100 epochs on the CIFAR-10 dataset
  • 18. Table 1. RMSE and MAP after 100 epochs for CNN and LSTM on the Kaggle dataset (2,746 images)
        Model   RMSE   MAP
        CNN     1.77   1.40
        LSTM    1.77   1.36
    Table 2. RMSE and MAP after 100 epochs for CNN and LSTM on the CIFAR-10 dataset
        Model   RMSE   MAP
        CNN     2.39   1.12
        LSTM    2.89   1.61
  • 19. Conclusion. RMSE and MAP are robust evaluation metrics for CNNs and LSTMs. Understanding their complementary nature enhances model assessment. Model effectiveness depends on factors such as dataset quality, number of epochs, batch size, and kernel dimensions. Choosing the right metric is crucial for diverse tasks. Choose CNNs for image tasks and LSTMs for sequential data, based on specific needs and available resources.
  • 20. Future Scope. Future research can explore hybrid CNN-LSTM architectures for tasks that demand both spatial and sequential understanding. Optimization techniques can further enhance model performance for specific applications.