This document summarizes a presentation comparing the RMSE and MAP metrics for evaluating CNN and LSTM models. It finds that RMSE is effective for measuring the accuracy of continuous predictions from CNNs and LSTMs, while MAP excels at assessing performance on tasks requiring precise retrieval. The presentation describes the methodology, including training CNNs and LSTMs on flower and CIFAR-10 datasets. Results show RMSE and MAP values after 100 epochs, with MAP generally lower. It concludes that understanding the complementary nature of RMSE and MAP enhances model assessment, and the right metric depends on the specific task.
Comparative Analysis of RMSE and MAP for Evaluating CNN and LSTM Models
1. Comparative Analysis of RMSE and MAP Metrics for
Evaluating CNN and LSTM Models
(PAPER ID – T-306) (CMT ID – 9)
Gagandeep Kaur, Satish Saini
SERB sponsored International Conference
RACCAI 2023
12th-13th October 2023
CEC Jhangeri, Mohali
2. Introduction
In the field of machine learning, evaluating and comparing model performance is crucial
The choice of evaluation metrics can significantly impact decision-making, especially for
complex models like CNNs and LSTMs
This presentation explores and compares the utility of two key evaluation metrics: RMSE
and MAP, specifically within the context of CNNs and LSTMs
Machine learning models, such as CNNs and LSTMs, play vital roles in various domains,
necessitating rigorous model assessment
RMSE and MAP are two widely used metrics, each with unique strengths and suitability
for different tasks
3. The objective is to determine how RMSE and MAP perform when evaluating the
predictive abilities of CNNs and LSTMs
This research aims to provide insights into selecting appropriate evaluation criteria,
aiding practitioners and researchers in making informed choices
The presentation delves into the world of model evaluation, revealing the nuances of
RMSE and MAP and their applicability in the dynamic landscape of deep learning
4. Machine Learning Model – CNN & LSTM
CNN
CNNs are adept at automatically learning hierarchical features from data, making them
particularly suited for image analysis
They employ convolutional layers to extract low-level features like edges and gradually
build up to more complex features, enabling pattern recognition
CNNs typically consist of convolutional layers, pooling layers for spatial downsampling,
and fully connected layers for classification
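The convolution and pooling stages described above can be sketched in plain Python. This is a minimal illustration, not the presenters' implementation: the 5x5 `image`, the hand-picked `edge_kernel`, and the helper names `conv2d` and `max_pool2d` are all assumptions made for the example. A trained CNN would learn its kernels rather than use a fixed one.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation of a 2-D list with a 2-D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 5x5 image: dark columns (0) on the left, bright columns (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked vertical-edge kernel (assumption; a real CNN learns these weights).
edge_kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, edge_kernel)  # 4x4 feature map, non-zero only at the edge
pooled = max_pool2d(features)          # 2x2 map after spatial downsampling
print(features)
print(pooled)
```

The feature map responds only where the dark-to-bright transition occurs, and pooling keeps that response while halving each spatial dimension. In a full CNN, the pooled maps would then be flattened and passed to fully connected layers for classification.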
5. LSTM
LSTMs are a type of recurrent neural network (RNN) designed to capture long-range
dependencies in sequential data
They excel in tasks involving time series forecasting, speech recognition, and natural
language processing where past information influences future predictions
LSTMs are composed of recurrent units with memory cells, which allow them to carry
information across time steps and so capture temporal dependencies
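To make the role of the memory cell concrete, here is a minimal sketch of one time step of a single-unit LSTM using the standard gate equations. The scalar weights in `params` are hypothetical and untrained, and the function name `lstm_step` is an assumption for illustration. Real LSTMs use weight matrices and vector states.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM with scalar input and scalar state."""

    def gate(name, activation):
        # Each gate has an input weight w, a recurrent weight u, and a bias b.
        w, u, b = params[name]
        return activation(w * x + u * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old memory to keep
    i = gate("input", sigmoid)         # how much new information to write
    g = gate("candidate", math.tanh)   # proposed new memory content
    o = gate("output", sigmoid)        # how much of the memory to expose

    c = f * c_prev + i * g             # updated memory cell
    h = o * math.tanh(c)               # updated hidden state
    return h, c


# Hypothetical, untrained weights (w, u, b) chosen only for illustration.
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.6, 0.2, 0.0),
    "candidate": (0.9, 0.3, 0.0),
    "output": (0.4, 0.1, 0.0),
}

h, c = 0.0, 0.0
states = []
for x in [1.0, 0.5, -0.5]:             # a short input sequence
    h, c = lstm_step(x, h, c, params)
    states.append((h, c))
print(states)
```

Because `c` is carried from one step to the next and only partly overwritten, information from earlier inputs can still influence later outputs.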
6. Performance Parameters
RMSE (Root Mean Square Error)
RMSE is a widely adopted metric in machine learning, especially in regression tasks
It quantifies the average magnitude of errors between predicted values and actual values
by taking the square root of the mean of the squared differences
Lower RMSE values indicate better model accuracy, making it particularly useful when
dealing with continuous numerical predictions
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}$$
where,
$N$ represents the number of samples,
$y_i$ denotes the actual value,
$\hat{y}_i$ represents the predicted value.
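The RMSE definition above translates directly into a few lines of Python. The sample values in `y_true` and `y_pred` are made up for illustration and are not taken from the experiments.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical values: squared errors are 1, 0, 4, 0, so the mean is 1.25.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]
score = rmse(y_true, y_pred)
print(round(score, 3))  # sqrt(1.25), about 1.118
```

Because the errors are squared before averaging, a single large error (here the miss of 2) contributes more to the score than several small ones.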
7. MAP (Mean Average Precision)
MAP, on the other hand, is often used to assess the performance of information retrieval
and ranking systems
It measures the precision of a model's ranking by calculating the average precision across
various recall levels
MAP is valuable when evaluating models that need to prioritize and rank items
effectively, such as recommendation systems and search engines
$$\mathrm{AP}(q) = \frac{1}{n_{rel}} \sum_{k=1}^{n} P(k) \cdot rel(k)$$
where,
$n_{rel}$ represents the number of relevant items,
$P(k)$ is the precision at rank $k$,
$rel(k)$ is an indicator function denoting relevance.
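As a sketch of how the AP formula is applied, the code below computes AP for one ranked list and then MAP as the mean of AP over several queries. It assumes binary relevance labels and that every relevant item appears somewhere in the ranked list, so that the count of 1s equals the number of relevant items. The two example rankings are hypothetical.

```python
def average_precision(relevance):
    """AP for one ranked list; relevance[k-1] is 1 if the item at rank k is relevant, else 0."""
    n_rel = sum(relevance)
    if n_rel == 0:
        return 0.0  # convention assumed here: no relevant items gives AP = 0
    hits = 0
    total = 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / k  # P(k), counted only where rel(k) = 1
    return total / n_rel


def mean_average_precision(ranked_lists):
    """MAP: the mean of AP over all queries."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Hypothetical relevance labels for two queries.
queries = [
    [1, 0, 1, 0],  # AP = (1/1 + 2/3) / 2 = 5/6
    [0, 1, 0, 0],  # AP = (1/2) / 1 = 1/2
]
map_score = mean_average_precision(queries)
print(round(map_score, 4))  # (5/6 + 1/2) / 2 = 2/3, about 0.6667
```

Under this definition each AP value, and therefore MAP, lies between 0 and 1, with 1 meaning every relevant item is ranked ahead of every irrelevant one.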
8. Importance of Model Evaluation
Accurate model evaluation is critical in ensuring that machine learning models are
reliable and effective in their intended tasks
The choice of the appropriate evaluation metric depends on the specific problem, goals,
and the type of data being processed
Selecting the wrong metric can lead to misleading conclusions about a model's
performance
10. Dataset Selection:
We used two datasets: a curated Kaggle flower dataset of 2,746 images and the CIFAR-10 dataset
The flower dataset was chosen to cover a diverse range of floral species, growth
stages, and environmental conditions
Programming Environment:
Our programming tasks were conducted in Python, utilizing the Google Colab
environment, which offers a convenient and collaborative platform for machine learning
development
11. Experimental Phases:
Our experimental process consisted of several key phases, each contributing to the overall
research outcomes
Model Training:
The initial phase involved model training, where Convolutional Neural Networks (CNNs)
and Long Short-Term Memory networks (LSTMs) were trained on both datasets
These models learned to recognize visual features in images
12. Fine-Tuning:
Subsequently, fine-tuning took place, a crucial iterative process involving adjustments to
model parameters and architectures
This iterative refinement enhanced the models' performance and adaptability to specific
tasks, optimizing their capabilities
13. Results
RMSE is an effective metric for measuring the accuracy of continuous predictions
It has been applied to assess the precision of predictions in both CNN-generated image
intensities and LSTM-generated transcriptions
Conversely, MAP (Mean Average Precision) serves as a comprehensive evaluation tool
It excels in assessing the models' performance in tasks requiring precise retrieval, such as
object detection and speech transcription
14. FIGURE 1. RMSE and MAP of the CNN model over 100 epochs on the Kaggle dataset (2,746 images)
15. FIGURE 2. RMSE and MAP of the LSTM model over 100 epochs on the Kaggle dataset (2,746 images)
16. FIGURE 3. RMSE and MAP of the CNN model over 100 epochs on the CIFAR-10 dataset
17. FIGURE 4. RMSE and MAP of the LSTM model over 100 epochs on the CIFAR-10 dataset
18. TABLE 1. RMSE and MAP after 100 epochs for CNN and LSTM using the Kaggle dataset (2,746 images)
Model   RMSE   MAP
CNN     1.77   1.40
LSTM    1.77   1.36
TABLE 2. RMSE and MAP after 100 epochs for CNN and LSTM using the CIFAR-10 dataset
Model   RMSE   MAP
CNN     2.39   1.12
LSTM    2.89   1.61
19. Conclusion
RMSE and MAP are robust evaluation metrics for CNNs and LSTMs
Understanding their complementary nature enhances model assessment
Model effectiveness depends on factors like dataset quality, epochs, batch sizes, and
kernel dimensions
Choosing the right metric is crucial for diverse tasks
Choose CNNs for image tasks and LSTMs for sequential data based on specific needs
and available resources
20. Future Scope
Future research can explore hybrid CNN-LSTM architectures for tasks that demand both
spatial and sequential understanding
Optimization techniques can further enhance model performance for specific applications