Comparative Analysis of RMSE and MAP for Evaluating CNN and LSTM Models

Comparative Analysis of RMSE and MAP Metrics for
Evaluating CNN and LSTM Models
(PAPER ID – T-306) (CMT ID – 9)
Gagandeep Kaur, Satish Saini
SERB sponsored International Conference
RACCAI 2023
12th-13th October 2023
CEC Jhangeri, Mohali

Introduction
 In the field of machine learning, evaluating and comparing model performance is crucial
 The choice of evaluation metrics can significantly impact decision-making, especially for
complex models like CNNs and LSTMs
 This presentation explores and compares the utility of two key evaluation metrics: RMSE
and MAP, specifically within the context of CNNs and LSTMs
 Machine learning models, such as CNNs and LSTMs, play vital roles in various domains,
necessitating rigorous model assessment
 RMSE and MAP are two widely used metrics, each with unique strengths and suitability
for different tasks

 The objective is to determine how RMSE and MAP perform when evaluating the
predictive abilities of CNNs and LSTMs
 This research aims to provide insights into selecting appropriate evaluation criteria,
aiding practitioners and researchers in making informed choices
 The presentation examines the nuances of RMSE and MAP and their applicability to
deep learning models

Machine Learning Model – CNN & LSTM
CNN
 CNNs are adept at automatically learning hierarchical features from data, making them
particularly suited for image analysis
 They employ convolutional layers to extract low-level features like edges and gradually
build up to more complex features, enabling pattern recognition
 CNNs typically consist of convolutional layers, pooling layers for spatial downsampling,
and fully connected layers for classification
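
The convolution and pooling steps described above can be sketched in plain Python. This is a minimal illustration, not the model used in the experiments: the toy image, the hand-written Sobel kernel, and the helper names `conv2d_valid` and `max_pool_2x2` are assumptions made for the example. In a real CNN the kernel weights are learned during training rather than fixed by hand.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    This is cross-correlation (no kernel flip), which is what deep
    learning libraries usually call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; a trailing odd row or column is dropped."""
    return [
        [
            max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
            for j in range(0, len(fmap[0]) - 1, 2)
        ]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Toy 6x6 image (assumed data): dark columns on the left, bright on the right.
image = [[0, 0, 1, 1, 1, 1] for _ in range(6)]

# Hand-written vertical-edge kernel (Sobel x), standing in for a learned filter.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

feature_map = conv2d_valid(image, sobel_x)  # strong response near the edge
pooled = max_pool_2x2(feature_map)          # spatially downsampled response

for row in feature_map:
    print(row)
print(pooled)
```

The feature map responds only where the intensity changes from dark to bright, and pooling halves each spatial dimension while keeping that response.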

LSTM
 LSTMs are a type of recurrent neural network (RNN) designed to capture
dependencies in sequential data
 They excel in tasks involving time series forecasting, speech recognition, and natural
language processing where past information influences future predictions
 LSTMs are composed of recurrent units with memory cells, which allow them to capture
temporal dependencies across time steps
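
To make the role of the memory cell concrete, the following is a minimal sketch of one LSTM unit with scalar input and scalar state, using the standard forget, input, and output gate equations. The weight values, the input sequence, and the names `lstm_step` and `params` are hypothetical and chosen only for illustration; they do not come from the experiments.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with scalar input and state."""
    p = params
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # memory cell: keep part of the past, add new input
    h = o * math.tanh(c)     # hidden state exposed to the next step
    return h, c


# Hypothetical weights; a trained network would learn these.
params = {k: 0.5 for k in ("wf", "uf", "wi", "ui", "wo", "uo", "wg", "ug")}
params.update({"bf": 1.0, "bi": 0.0, "bo": 0.0, "bg": 0.0})

h, c = 0.0, 0.0
states = []
for x in [1.0, 0.5, -0.5, 0.0]:  # assumed toy input sequence
    h, c = lstm_step(x, h, c, params)
    states.append((h, c))

for t, (h_t, c_t) in enumerate(states, start=1):
    print(f"t={t}  h={h_t:.4f}  c={c_t:.4f}")
```

At the last step the input is zero, yet the cell state stays positive because the forget gate carries information forward from earlier steps.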

Performance Parameters
RMSE (Root Mean Square Error)
 RMSE is a widely adopted metric in machine learning, especially in regression tasks
 It quantifies the average magnitude of errors between predicted values and actual values
by taking the square root of the mean of the squared differences
 Lower RMSE values indicate better model accuracy, making it particularly useful when
dealing with continuous numerical predictions
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
where,
$N$ is the number of samples,
$y_i$ denotes the actual value,
$\hat{y}_i$ represents the predicted value.
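
The RMSE definition translates directly into a few lines of Python. This is a minimal sketch using only the standard library; the function name `rmse` and the sample values are assumptions made for illustration and are not taken from the experiments.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]

score = rmse(y_true, y_pred)
print(f"RMSE = {score:.4f}")
```

Because the errors are squared before averaging, the single error of 2 contributes four times as much as the error of 1, which is why RMSE penalizes large deviations more heavily than small ones.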

MAP (Mean Average Precision)
 MAP, on the other hand, is often used to assess the performance of information retrieval
and ranking systems
 It measures the precision of a model's ranking by calculating the average precision across
various recall levels
 MAP is valuable when evaluating models that need to prioritize and rank items
effectively, such as recommendation systems and search engines
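$$\mathrm{AP}(q) = \frac{1}{n_{\mathrm{rel}}}\sum_{k=1}^{n} P(k)\cdot \mathrm{rel}(k)$$
where,
$n_{\mathrm{rel}}$ is the number of relevant items,
$n$ is the number of items in the ranked list for query $q$,
$P(k)$ is the precision at rank $k$,
$\mathrm{rel}(k)$ is an indicator function equal to 1 if the item at rank $k$ is relevant and 0 otherwise.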
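
The AP formula, and MAP as its mean over a set of queries, can be sketched as follows. The function names and the two toy rankings are assumptions for illustration. The sketch also assumes that every relevant item appears somewhere in the ranked list, so the number of relevant items can be counted from the list itself.

```python
def average_precision(relevance):
    """AP for one ranked list.

    relevance[k - 1] is 1 if the item at rank k is relevant, else 0.
    """
    n_rel = sum(relevance)
    if n_rel == 0:
        return 0.0  # convention chosen here for queries with no relevant items
    hits = 0
    total = 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / k  # P(k) * rel(k); terms with rel(k) = 0 vanish
    return total / n_rel


def mean_average_precision(rankings):
    """MAP: the mean of AP over all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two toy queries (assumed data), each a ranked list of relevance flags.
rankings = [
    [1, 0, 1, 0],  # relevant items at ranks 1 and 3
    [0, 1, 1],     # relevant items at ranks 2 and 3
]

ap_scores = [average_precision(r) for r in rankings]
map_score = mean_average_precision(rankings)
print(ap_scores, map_score)
```

The first query scores higher because its relevant items sit nearer the top of the ranking, which is the behavior MAP is designed to reward.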

Importance of Model Evaluation
 Accurate model evaluation is critical in ensuring that machine learning models are
reliable and effective in their intended tasks
 The choice of the appropriate evaluation metric depends on the specific problem, goals,
and the type of data being processed
 Selecting the wrong metric can lead to misleading conclusions about a model's
performance

Dataset Selection:
 We used two datasets: a curated Kaggle flower dataset totaling 2,746 images, and the
CIFAR-10 dataset
 The flower dataset was chosen to represent a diverse array of floral species, growth
stages, and environmental conditions
Programming Environment:
 All code was written in Python and run in the Google Colab environment, which
offers a convenient and collaborative platform for machine learning development

Experimental Phases:
 Our experimental process consisted of several key phases, each contributing to the overall
research outcomes
Model Training:
 The initial phase involved model training, where Convolutional Neural Networks (CNNs)
and Long Short-Term Memory networks (LSTMs) were trained on both datasets
 These models learned to recognize visual features in images

Fine-Tuning:
 The models were then fine-tuned through iterative adjustments to model parameters
and architectures
 This refinement improved the models' performance and adaptability to specific tasks

Results
 RMSE is an effective metric for measuring the accuracy of continuous predictions
 It has been applied to assess the precision of predictions in both CNN-generated image
intensities and LSTM-generated transcriptions
 Conversely, MAP (Mean Average Precision) serves as a comprehensive evaluation tool
 It excels in assessing the models' performance in tasks requiring precise retrieval, such as
object detection and speech transcription

FIGURE 1. RMSE and MAP of the CNN model over 100 epochs on the Kaggle dataset (2,746 images)

FIGURE 2. RMSE and MAP of the LSTM model over 100 epochs on the Kaggle dataset (2,746 images)

FIGURE 3. RMSE and MAP of the CNN model over 100 epochs on the CIFAR-10 dataset

FIGURE 4. RMSE and MAP of the LSTM model over 100 epochs on the CIFAR-10 dataset

TABLE 1. RMSE AND MAP AFTER 100 EPOCHS FOR CNN AND LSTM
USING KAGGLE DATASET CONTAINING 2,746 IMAGES
Model   RMSE   MAP
CNN     1.77   1.40
LSTM    1.77   1.36

TABLE 2. RMSE AND MAP AFTER 100 EPOCHS FOR CNN AND LSTM
USING CIFAR-10 DATASET
Model   RMSE   MAP
CNN     2.39   1.12
LSTM    2.89   1.61

Conclusion
 RMSE and MAP are robust evaluation metrics for CNNs and LSTMs
 Understanding their complementary nature enhances model assessment
 Model effectiveness depends on factors like dataset quality, epochs, batch sizes, and
kernel dimensions
 Choosing the right metric is crucial for diverse tasks
 Choose CNNs for image tasks and LSTMs for sequential data based on specific needs
and available resources

Future Scope
 Future research can explore hybrid CNN-LSTM architectures for tasks that demand both
spatial and sequential understanding
 Optimization techniques can further enhance model performance for specific applications

