This paper proposes two end-to-end deep auto-encoder approaches for segmenting moving objects in surveillance videos when limited training data is available. The first uses transfer learning, with a pre-trained VGG-16 model as the encoder and its transposed architecture as the decoder. The second uses a multi-depth auto-encoder with convolutional and upsampling layers. Both approaches enlarge the training set with data augmentation, combining PCA-based and traditional techniques. The models are trained and evaluated on the CDnet2014 dataset and outperform other models trained with limited data.
Compact optimized deep learning model for edge: a review (IJECEIAES)
Most real-time computer vision applications, such as pedestrian detection, augmented reality, and virtual reality, rely heavily on convolutional neural networks (CNN) for real-time decision support. Edge intelligence is also becoming necessary for low-latency real-time applications that must process data at the source device. Processing massive amounts of data impacts memory footprint, prediction time, and energy consumption, which are essential performance metrics in machine-learning-based internet of things (IoT) edge clusters. However, deploying deep, dense, heavily weighted CNN models on resource-constrained embedded systems and limited edge computing resources, with their memory and battery constraints, poses significant challenges for developing a compact optimized model. Energy consumption in edge IoT networks can be reduced by reducing computation and data transmission between IoT devices and gateway devices, so there is high demand for energy-efficient deep learning models that can be deployed on edge devices. Furthermore, recent studies show that smaller compressed models achieve performance comparable to larger deep learning models. This review article surveys state-of-the-art techniques in edge intelligence, and we propose a new research framework for designing compact optimized deep learning (DL) models for deployment on edge devices.
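The compact-model techniques such a review covers often start from post-training quantization. As a minimal sketch (plain NumPy, with illustrative names, not tied to any specific paper), mapping float32 weights to int8 and back looks like this:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization of a float32 weight tensor to int8."""
    scale = np.max(np.abs(w)) / 127.0  # one scale factor per tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from int8 values and the scale."""
    return q.astype(np.float32) * scale

# A mock weight tensor: int8 storage is 4x smaller than float32, and the
# round-trip error is bounded by half a quantization step.
np.random.seed(0)
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q.dtype, float(np.max(np.abs(w - w_hat))) <= scale / 2 + 1e-6)
```

Per-channel scales and quantization-aware fine-tuning refine the same idea; this per-tensor version is only the simplest starting point.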
Real-time eyeglass detection using transfer learning for non-standard facial... (IJECEIAES)
The aim of this paper is to build a real-time eyeglass detection framework based on deep features in facial or ocular images, which are a prime factor in forensic analysis, authentication systems, and more. Eyeglass detection methods have generally been developed on cleaned, fine-tuned facial datasets; this yields a well-developed model, but even slight deviations can degrade performance, giving poor results on real-time, non-standard facial images. A robust model is therefore introduced, trained on custom non-standard facial data. A pre-trained convolutional neural network (CNN) based on the Inception V3 architecture is fine-tuned via its hyper-parameters to achieve high accuracy and good precision on non-standard facial images in real time. This resulted in accuracy scores of about 99.2% and 99.9% on the training and testing datasets respectively, in little time, demonstrating the robustness of the model across conditions.
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I... (ijaia)
In the era of the fourth industrial revolution, measuring and ensuring the reliability, efficiency, and safety of industrial systems and components is a key concern. Predicting the performance degradation or remaining useful life (RUL) of equipment over time from its historical sensor data enables companies to greatly reduce maintenance costs: they can prevent costly unexpected breakdowns and become more profitable and competitive in the marketplace. This paper introduces a deep learning-based method that combines convolutional neural networks (CNN) and long short-term memory (LSTM) networks to predict RUL for industrial equipment. The proposed method does not depend on any degradation-trend assumptions and can learn complex, temporally representative, and distinguishing patterns in the sensor data. To evaluate its efficiency and effectiveness, we ran two experiments: RUL estimation and predicting the status of IoT devices over a two-week period. Experiments are conducted on NASA's publicly available turbofan engine dataset. The deep learning-based approach achieved high prediction accuracy, outperforming standard well-accepted machine learning algorithms and achieving competitive performance against state-of-the-art methods.
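A CNN-LSTM input of the kind described is typically built by slicing each engine's multivariate sensor history into fixed-length windows, each labeled with the cycles remaining at its last timestep. A hypothetical NumPy sketch (names and window length are illustrative, not from the paper):

```python
import numpy as np

def make_rul_windows(series, window=30):
    """Slice one run-to-failure record of shape (cycles, sensors) into
    overlapping windows, labeling each with the remaining useful life
    (cycles left until failure) at the window's last cycle."""
    n_cycles = len(series)
    X, y = [], []
    for end in range(window, n_cycles + 1):
        X.append(series[end - window:end])
        y.append(n_cycles - end)   # 0 at the final (failure) cycle
    return np.stack(X), np.array(y)

# Mock run: 100 cycles of 14 sensor channels (C-MAPSS-style layout).
run = np.random.randn(100, 14)
X, y = make_rul_windows(run, window=30)
print(X.shape, y[0], y[-1])   # (71, 30, 14) 70 0
```

Each `(window, sensors)` slice feeds the convolutional front end, while the window ordering gives the LSTM its temporal context.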
A new approach to image classification based on a deep multiclass AdaBoosting e... (IJECEIAES)
In recent years, deep learning methods have proven effective at solving complex problems; convolution-based learning, in particular, is widely applied to image classification and processing. Hybrid methods are another multi-component family of machine learning methods, categorized into independent and dependent types, and the AdaBoost algorithm is one of them. Image classification now has many applications, and several algorithms have been presented for binary and multi-class classification, but most of them depend heavily on the data. The present study combines deep learning with an associated hybrid method to classify images, on the premise that the combination can reduce the classification error rate. The proposed algorithm consists of the AdaBoost hybrid method and a bi-layer convolutional learning method. Implemented and analyzed on the multi-class MNIST dataset, the proposed method achieved a lower error rate than the AdaBoost and convolution methods alone, and the network showed more stability than the other methods.
Performance investigation of two-stage detection techniques using traffic lig... (IAESIJAI)
Object detection is the process of using a camera to monitor an object or a group of objects over time. It has a variety of uses, including security and surveillance, video communication, traffic light detection (TLD), and object detection from compressed video in public places. Object tracking has recently become a popular topic in computer science, particularly in the data science community, thanks to the use of deep learning (DL) in artificial intelligence (AI). DL, with the convolutional neural network (CNN) as one of its techniques, usually relies on two-stage detection methods for TLD. Despite the successes recorded in TLD through two-stage detection methods, no study has analyzed these methods experimentally, examining the strengths and weaknesses reported by researchers. This study therefore analyzes the application of DL techniques to TLD. We implemented object detection for TLD using five two-stage detection methods on a traffic light dataset, with a Jupyter notebook and the sklearn libraries. Judged by the standard performance metrics, Faster R-CNN was best in detection accuracy, F1-score, precision, and recall, with 0.89, 0.93, 0.83, and 0.90 respectively.
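The metrics the study reports come straight from the sklearn libraries it mentions. A minimal sketch with mock per-image labels (the numbers below are illustrative, not the study's detections):

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# Mock labels: 1 = traffic light present / detected, 0 = absent / missed.
y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 1, 1]

print("accuracy ", accuracy_score(y_true, y_pred))    # (TP+TN) / all
print("precision", precision_score(y_true, y_pred))   # TP / (TP+FP)
print("recall   ", recall_score(y_true, y_pred))      # TP / (TP+FN)
print("f1       ", f1_score(y_true, y_pred))          # harmonic mean of P and R
```

With these mock labels the counts are TP=6, FP=1, FN=1, TN=2, so accuracy is 0.8 and precision, recall, and F1 all equal 6/7.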
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption (IJAEMSJORNAL)
In recent years, modeling human behaviors and activity patterns to recognize or detect special events has attracted considerable research interest. Various methods abound for building intelligent vision systems that understand a scene and draw correct semantic inferences from the observed dynamics of moving targets. Many systems include detection, storage of video information, and human-computer interfaces. Here we present not only an update that expands previous similar surveys but also an emphasis on contextual abnormal detection of human activity, especially in video surveillance applications. The main purpose of this survey is to identify existing methods extensively and to characterize the literature in a manner that brings key challenges to attention.
Content-based image retrieval (CBIR) uses content features to search and retrieve images in a large database. Earlier, hand-designed feature descriptors based on visual cues such as shape, colour, and texture were researched to represent these images. Deep learning technologies have since been widely applied as an alternative to the hand-engineered designs that dominated for over a decade: the features are learnt automatically from the data. This research work proposes an integrated dual deep convolutional neural network (IDD-CNN) comprising two distinctive CNNs: the first CNN exploits the general features, and a further custom CNN is designed to exploit the custom features. Moreover, a novel directed graph is designed, comprising two blocks, a learning block and a memory block, which helps in finding the similarity among images; since this research considers a large dataset, an optimal strategy is introduced for compact features. IDD-CNN is evaluated on two distinctive benchmark datasets, including the Oxford dataset, using the mean average precision (mAP) metric, and comparative analysis shows IDD-CNN outperforms the other existing models.
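Once compact features are extracted, the retrieval step in a CBIR system of this kind reduces to nearest-neighbor search by cosine similarity. A NumPy sketch with mock descriptors (a real system would use the learned CNN features):

```python
import numpy as np

def cosine_retrieve(query, gallery, top_k=3):
    """Rank gallery feature vectors by cosine similarity to the query."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                       # cosine similarity to every gallery item
    order = np.argsort(-sims)[:top_k]  # indices of the top_k most similar
    return order, sims[order]

# Mock 128-D descriptors for a 1000-image gallery.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 128))
query = gallery[42] + 0.05 * rng.normal(size=128)  # near-duplicate of item 42
idx, sims = cosine_retrieve(query, gallery)
print(idx[0])   # the near-duplicate, item 42, ranks first
```

Normalizing once and taking a matrix-vector product keeps the lookup a single BLAS call, which is why compact fixed-length descriptors matter at large scale.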
Deep Learning Applications and Image Processing (ijtsrd)
With the rapid development of digital technologies, the analysis and processing of data has become an important problem. In particular, classifying, clustering, and processing complex and multi-structured data required the development of new algorithms, and Deep Learning solutions to Big Data problems have emerged from this process. Deep Learning can be described as an advanced variant of artificial neural networks. Deep Learning algorithms are commonly used in healthcare, facial and voice recognition, defense, security, and autonomous vehicles. Image processing is one of the most common applications of Deep Learning: Deep Learning software is commonly used to capture and process images and to remove errors. Image processing methods are used in many fields, such as medicine, radiology, the military industry, face recognition, security systems, transportation, astronomy, and photography. In this study, current Deep Learning algorithms are investigated and their relationship to software commonly used in image processing is determined. Ahmet Özcan | Mahmut Ünver | Atilla Ergüzen, "Deep Learning Applications and Image Processing", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6, Issue-2, February 2022. URL: https://www.ijtsrd.com/papers/ijtsrd49142.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/49142/deep-learning-applications-and-image-processing/ahmet-özcan
Advances in this technological era have produced an enormous pool of information, stored at multiple places globally and in multiple formats. This article presents a methodology for extracting video lectures delivered by domain experts in Computer Science using a Generalized Gamma Mixture Model. The feature extraction is based on DCT transformations. To build the model, the dataset is pooled from YouTube video lectures in the Computer Science domain, and the generated outputs are evaluated using precision and recall.
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION (ijaia)
Most currently known methods treat person re-identification as a classification problem and commonly use neural networks. However, these methods use only high-level convolutional features to express the feature representation of pedestrians. Moreover, the current datasets for person re-identification are relatively small; with so little training data, deep convolutional networks are difficult to train adequately, so it is worthwhile to introduce auxiliary datasets to help training. To solve this problem, this paper proposes a novel deep transfer learning method that combines a comparison model with a classification model and fuses multi-level convolutional features on the basis of transfer learning. In a multi-layer convolutional network, each layer's features are a dimensionality reduction of the previous layer's results, but the information in multi-level features is not only inclusive but also complementary to some degree; the information gap between different layers of a convolutional neural network can be used to extract a better feature expression. Finally, the proposed algorithm is fully tested on four datasets (VIPeR, CUHK01, GRID, and PRID450S), and the obtained re-identification results prove its effectiveness.
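A common minimal realization of multi-level fusion is to global-average-pool each chosen layer's activation maps and concatenate the results. A hypothetical NumPy sketch (layer shapes are illustrative, not the paper's architecture):

```python
import numpy as np

def fuse_multilevel(feature_maps):
    """Global-average-pool each layer's (C, H, W) activations and concatenate.

    Lower layers contribute fine spatial detail, higher layers contribute
    semantics; the concatenated vector carries both, which is the
    complementarity the fusion idea exploits.
    """
    pooled = [fm.mean(axis=(1, 2)) for fm in feature_maps]  # one C-vector per layer
    return np.concatenate(pooled)

# Mock activations from three depths of a CNN (channels grow with depth).
maps = [np.random.randn(64, 56, 56),
        np.random.randn(128, 28, 28),
        np.random.randn(256, 14, 14)]
descriptor = fuse_multilevel(maps)
print(descriptor.shape)   # (448,) = 64 + 128 + 256
```

The fused descriptor then feeds whatever comparison or classification head the method trains on top.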
A Parallel Architecture for Multiple-Face Detection Technique Using AdaBoost ... (Hadi Santoso)
Face detection is a very important biometric application in the field of image analysis and computer vision. The basic face detection method is the AdaBoost algorithm with a cascade of Haar-like feature classifiers, based on the framework proposed by Viola and Jones. Real-time multiple-face detection, for instance on high-resolution CCTV, is a computation-intensive procedure; performed sequentially, it will not achieve optimal real-time performance. In this paper we propose an architectural design for a parallel, multiple-face detection technique based on Viola and Jones' framework. To do this systematically, we look at the problem from four points of view: data processing taxonomy, parallel memory architecture, the parallel programming model, and the design of the parallel program. We also build a prototype of the proposed parallel technique and conduct a series of experiments to investigate the speedup gained.
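The simplest data-parallel scheme of the kind such a design considers assigns disjoint subsets of frames to workers. A stdlib sketch where the detector is only a stand-in (not an actual Viola-Jones cascade):

```python
from concurrent.futures import ThreadPoolExecutor

def detect_faces(frame_id):
    """Stand-in for running a Viola-Jones cascade on one frame.

    Returns (frame_id, detection_count); a real detector would slide
    cascaded Haar-feature classifiers over the frame instead.
    """
    return frame_id, frame_id % 3   # mock detection count

# Data parallelism over frames: each worker processes its own frames
# independently, so no synchronization is needed between detections.
frames = range(8)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(detect_faces, frames))
print(results)
```

A real implementation would weigh threads against processes (and shared versus distributed memory) exactly along the design axes the paper enumerates, since the cascade itself is CPU-bound.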
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION (cscpconf)
This paper provides insight into the transferability of deep CNN features to unsupervised problems. We study the impact of different pretrained CNN feature extractors on image set clustering for object classification as well as fine-grained classification. We propose a rather straightforward pipeline, combining deep-feature extraction using a CNN pretrained on ImageNet with a classic clustering algorithm, to classify sets of images. This approach is compared to state-of-the-art image-clustering algorithms and provides better results. These results strengthen the belief that supervised training of deep CNNs on large datasets with high class variability extracts better features than most carefully designed engineering approaches, even for unsupervised tasks. We also validate our approach on a robotic application: sorting and storing objects smartly based on clustering.
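The second half of the pipeline described is just a classic clustering algorithm run over the extracted features. A sketch with scikit-learn's k-means, where well-separated mock blobs stand in for the ImageNet-CNN descriptors of three object classes:

```python
import numpy as np
from sklearn.cluster import KMeans

# Mock "deep features": three well-separated 64-D blobs of 20 images each,
# standing in for pretrained-CNN descriptors (illustrative data only).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(loc=c, scale=0.1, size=(20, 64))
                   for c in (0.0, 5.0, 10.0)])

# The classic-clustering half of the pipeline: k-means over the features.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(feats)

# With separable features, each blob collapses into a single cluster.
print([len(set(labels[i:i + 20])) for i in (0, 20, 40)])
```

The pipeline's whole premise is visible here: if the feature extractor separates classes well, even a simple clusterer recovers them without labels.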
Bibliometric analysis highlighting the role of women in addressing climate ch... (IJECEIAES)
Fossil fuel consumption has increased quickly, contributing to climate change that is evident in unusual flooding, droughts, and global warming. Over the past ten years, women's involvement in society has grown dramatically, and they have played a noticeable role in reducing climate change. A bibliometric analysis of data from the last ten years has been carried out to examine the role of women in addressing climate change. The findings are discussed in relation to the sustainable development goals (SDGs), particularly SDG 7 and SDG 13. The results consider contributions made by women in various sectors while taking geographic dispersion into account. The bibliometric analysis delves into topics including women's leadership in environmental groups, their involvement in policymaking, their contributions to sustainable development projects, and the influence of gender diversity on attempts to mitigate climate change. This study's results highlight how women have influenced climate-related policies and actions, point out areas of research deficiency, and offer recommendations for increasing the role of women in addressing climate change and achieving sustainability. To achieve more successful results, this initiative aims to highlight the significance of gender equality and to encourage inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter... (IJECEIAES)
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
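The droop control mentioned above ties frequency to active power and voltage to reactive power through linear coefficients, so each distributed generator sheds load proportionally without communication. A NumPy sketch of the conventional P-f / Q-V droop law (coefficients are illustrative, not values from the paper):

```python
import numpy as np

def droop(P, Q, f0=50.0, V0=1.0, kp=0.05 / 10.0, kq=0.04 / 10.0):
    """Conventional droop law for an islanded DG unit:

        f = f0 - kp * P,    V = V0 - kq * Q

    Illustrative coefficients: 0.05 Hz of frequency droop and 0.04 p.u.
    of voltage droop spread over a 0..10 unit power range.
    """
    return f0 - kp * np.asarray(P), V0 - kq * np.asarray(Q)

# As active/reactive load steps up, frequency and voltage sag linearly,
# which is what lets parallel DGs share load changes in islanding mode.
f, V = droop(P=[0.0, 5.0, 10.0], Q=[0.0, 5.0, 10.0])
print(f)
print(V)
```

A secondary control layer would then restore f and V to their nominal values; the droop layer alone only guarantees stable sharing.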
More Related Content
Similar to End-to-end deep auto-encoder for segmenting a moving object with limited training data
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionIJAEMSJORNAL
In recent years, the modeling of human behaviors and patterns of activity for recognition or detection of special events has attracted considerable research interest. Various methods abounding to build intelligent vision systems aimed at understanding the scene and making correct semantic inferences from the observed dynamics of moving targets. Many systems include detection, storage of video information, and human-computer interfaces. Here we present not only an update that expands previous similar surveys but also a emphasis on contextual abnormal detection of human activity , especially in video surveillance applications. The main purpose of this survey is to identify existing methods extensively, and to characterize the literature in a manner that brings to attention key challenges.
Content-based image retrieval (CBIR) uses the content features for
retrieving and searching the images in a given large database. Earlier,
different hand feature descriptor designs are researched based on cues that
are visual such as shape, colour, and texture used to represent these images.
Although, deep learning technologies have widely been applied as an
alternative to designing engineering that is dominant for over a decade. The
features are automatically learnt through the data. This research work
proposes integrated dual deep convolutional neural network (IDD-CNN),
IDD-CNN comprises two distinctive CNN, first CNN exploits the features
and further custom CNN is designed for exploiting the custom features.
Moreover, a novel directed graph is designed that comprises the two blocks
i.e. learning block and memory block which helps in finding the similarity
among images; since this research considers the large dataset, an optimal
strategy is introduced for compact features. Moreover, IDD-CNN is
evaluated considering the two distinctive benchmark datasets the oxford
dataset considering mean average precision (mAP) metrics and comparative
analysis shows IDD-CNN outperforms the other existing model.
Deep Learning Applications and Image Processingijtsrd
With the rapid development of digital technologies, the analysis and processing of data has become an important problem. In particular, classification, clustering and processing of complex and multi structured data required the development of new algorithms. In this process, Deep Learning solutions for solving Big Data problems are emerging. Deep Learning can be described as an advanced variant of artificial neural networks. Deep Learning algorithms are commonly used in healthcare, facial and voice recognition, defense, security and autonomous vehicles. Image processing is one of the most common applications of Deep Learning. Deep Learning software is commonly used to capture and process images by removing the errors. Image processing methods are used in many fields such as medicine, radiology, military industry, face recognition, security systems, transportation, astronomy and photography. In this study, current Deep Learning algorithms are investigated and their relationship with commonly used software in the field of image processing is determined. Ahmet Özcan | Mahmut Ünver | Atilla Ergüzen "Deep Learning Applications and Image Processing" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-2 , February 2022, URL: https://www.ijtsrd.com/papers/ijtsrd49142.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/49142/deep-learning-applications-and-image-processing/ahmet-özcan
The advances of this technological era have resulted in an enormous pool of
information, stored at multiple places globally and in multiple formats.
This article highlights a methodology for extracting the video lectures
delivered by experts in the domain of Computer Science using a Generalized
Gamma Mixture Model. The feature extraction is based on DCT transformations.
To build the model, the data set is pooled from YouTube video lectures in
the domain of Computer Science. The outputs generated are evaluated using
precision and recall.
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION (ijaia)
Most of the currently known methods treat the person re-identification task as a classification problem and commonly use neural networks. However, these methods use only high-level convolutional features to express the feature representation of pedestrians. Moreover, the current data sets for person re-identification are relatively small. Under this limitation of the training set, deep convolutional networks are difficult to train adequately, so it is very worthwhile to introduce auxiliary data sets to help training. To solve this problem, this paper proposes a novel deep transfer learning method that combines a comparison model with a classification model and multi-level fusion of convolutional features on the basis of transfer learning. In a multi-layer convolutional network, the features of each layer are a dimensionality reduction of the previous layer's results, but the information in multi-level features is not only inclusive but also complementary to a certain degree. We can use the information gap between different layers of a convolutional neural network to extract a better feature expression. Finally, the proposed algorithm is fully tested on four data sets (VIPeR, CUHK01, GRID, and PRID450S), and the obtained re-identification results prove its effectiveness.
A Parallel Architecture for Multiple-Face Detection Technique Using AdaBoost ... (Hadi Santoso)
Face detection is a very important biometric application in the field of image
analysis and computer vision. The basic face detection method is the AdaBoost
algorithm with cascading Haar-like feature classifiers, based on the
framework proposed by Viola and Jones. Real-time multiple-face detection,
for instance on CCTVs with high resolution, is a computation-intensive
procedure. If the procedure is performed sequentially, optimal real-time
performance will not be achieved. In this paper we propose an architectural
design for a parallel multiple-face detection technique based on Viola
and Jones' framework. To do this systematically, we look at the problem
from four points of view, namely: data-processing taxonomy, parallel memory
architecture, the parallel programming model, and the design of the
parallel program. We also build a prototype of the proposed parallel
technique and conduct a series of experiments to investigate the speedup
gained.
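The data-parallel view described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: a high-resolution frame is split into overlapping tiles, independent workers scan each tile, and per-tile detections are merged back into frame coordinates. The detector is a stub standing in for a real Viola-Jones cascade (e.g. OpenCV's `CascadeClassifier`); all names and parameters here are illustrative assumptions.

```python
# Sketch: data-parallel multiple-face detection over overlapping image tiles.
# The "detector" is a stub; a real system would run a Haar cascade per tile.
from concurrent.futures import ThreadPoolExecutor

TILE, OVERLAP = 256, 32   # tiles overlap so faces on tile borders are not missed

def detect_in_tile(job):
    """Stub detector: maps tile-local (x, y, w, h) boxes to frame coordinates."""
    x0, y0, local_faces = job
    return [(x0 + fx, y0 + fy, fw, fh) for (fx, fy, fw, fh) in local_faces]

def make_jobs(width, height, faces):
    """Yield one (x0, y0, faces-inside-tile) job per overlapping tile."""
    step = TILE - OVERLAP
    for y0 in range(0, height, step):
        for x0 in range(0, width, step):
            inside = [(fx - x0, fy - y0, fw, fh) for (fx, fy, fw, fh) in faces
                      if x0 <= fx < x0 + TILE and y0 <= fy < y0 + TILE]
            yield (x0, y0, inside)

def detect_parallel(width, height, faces, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(detect_in_tile, make_jobs(width, height, faces)))
    merged = {box for boxes in results for box in boxes}   # dedupe overlap hits
    return sorted(merged)

faces = [(40, 40, 24, 24), (300, 120, 24, 24), (500, 400, 24, 24)]
print(detect_parallel(640, 480, faces))
```

A hardware or GPU realization would replace the thread pool with truly parallel processing elements, which is the point of the architectural design discussed above.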
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION (cscpconf)
This paper aims at providing insight into the transferability of deep CNN features to
unsupervised problems. We study the impact of different pretrained CNN feature extractors on
the problem of image-set clustering for object classification as well as fine-grained
classification. We propose a rather straightforward pipeline combining deep-feature extraction,
using a CNN pretrained on ImageNet, with a classic clustering algorithm to classify sets of
images. This approach is compared to state-of-the-art image-clustering algorithms and
provides better results. These results strengthen the belief that supervised training of deep CNNs
on large datasets, with a large variability of classes, extracts better features than most carefully
designed engineering approaches, even for unsupervised tasks. We also validate our approach
on a robotic application, consisting in sorting and storing objects smartly based on clustering
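The two-stage pipeline above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the "deep features" are made-up 2-D stand-ins (real ones would come from, e.g., an ImageNet-pretrained network's penultimate layer), and the classic clustering step is a tiny k-means.

```python
# Sketch: deep-feature extraction (stubbed) followed by classic k-means clustering.
import math

def kmeans(points, k, iters=20):
    centers = [points[i] for i in range(k)]        # simple deterministic init
    groups = []
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[i].append(p)
        # update step: move each non-empty center to its group mean
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return groups

# two well-separated "feature" blobs standing in for two object classes
features = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
clusters = kmeans(features, k=2)
print([sorted(g) for g in clusters])
```

The design choice the paper argues for is precisely this separation of concerns: a fixed, supervised-pretrained extractor does the hard representational work, so even a plain clustering algorithm suffices downstream.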
Similar to "End-to-end deep auto-encoder for segmenting a moving object with limited training data" (20)
Bibliometric analysis highlighting the role of women in addressing climate ch... (IJECEIAES)
Fossil fuel consumption has increased quickly, contributing to climate change
that is evident in unusual flooding, droughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they have succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing climate change. The analysis's
findings are discussed in relation to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results consider contributions made
by women in various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency, and offer recommendations on
how to increase the role of women in addressing climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter... (IJECEIAES)
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo... (IJECEIAES)
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
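The NARX structure described above regresses the next output on lagged outputs, lagged inputs, and nonlinear terms. The following is a hedged sketch of that idea, not the paper's method: a toy battery surrogate with made-up coefficients generates data, and the parameters of a NARX-style model that is linear in its (nonlinear) regressors are recovered by ordinary least squares via the normal equations.

```python
# Sketch: NARX-style identification of a toy nonlinear system by least squares.

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def battery(y_prev, u_prev):
    """Toy nonlinear dynamics standing in for the cell (illustrative only)."""
    return 0.8 * y_prev + 0.2 * u_prev - 0.05 * y_prev ** 2

u = [0.5, 1.0, 0.2, 0.8, 0.1, 0.9, 0.4, 0.7, 0.3, 0.6]   # excitation input
y = [1.0]
for t in range(1, len(u)):
    y.append(battery(y[t - 1], u[t - 1]))

# NARX regressors: phi(t) = [y(t-1), u(t-1), y(t-1)^2]
X = [[y[t - 1], u[t - 1], y[t - 1] ** 2] for t in range(1, len(u))]
target = y[1:]
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * yt for r, yt in zip(X, target)) for i in range(3)]
theta = solve(XtX, Xty)
print([round(v, 3) for v in theta])   # recovers roughly [0.8, 0.2, -0.05]
```

The nonlinear regressor is what lets this model class capture behavior a purely linear ARX model would miss, which mirrors the paper's argument for accounting for nonlinearities.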
Smart grid deployment: from a bibliometric analysis to a survey (IJECEIAES)
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and have
attracted significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ... (IJECEIAES)
One of the problems associated with power systems is the islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences for the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). Based on the comparison of all criteria together, the multi-criteria
decision-making analysis yields overall weights of 24.7% for passive
methods, 7.8% for active methods, 5.6% for hybrid methods, 14.5% for remote
methods, 26.6% for signal processing-based methods, and 20.8% for
computational intelligence-based methods. Thus, it can be seen from the total weights
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
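The AHP weighting step above can be illustrated compactly. This sketch uses the common column-normalization and row-averaging approximation of the principal eigenvector of a reciprocal pairwise-comparison matrix; the 3x3 matrix is a made-up example, not the paper's data.

```python
# Sketch: AHP priority weights from a reciprocal pairwise-comparison matrix.

def ahp_weights(M):
    n = len(M)
    # normalize each column, then average each row to approximate the
    # principal eigenvector of the comparison matrix
    col_sums = [sum(M[r][c] for r in range(n)) for c in range(n)]
    norm = [[M[r][c] / col_sums[c] for c in range(n)] for r in range(n)]
    return [sum(row) / n for row in norm]

# criterion A judged 3x as important as B and 5x as important as C (illustrative)
M = [[1, 3, 5],
     [1 / 3, 1, 2],
     [1 / 5, 1 / 2, 1]]
w = ahp_weights(M)
print([round(x, 3) for x in w])   # weights sum to 1, largest for criterion A
```

Repeating this over the paper's criteria and alternatives, and combining the levels of the hierarchy, is what produces overall method weights like the 26.6% reported for signal processing-based detection.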
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi... (IJECEIAES)
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m², the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38%, respectively, compared to
conventional methods. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b... (IJECEIAES)
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
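The classic P&O loop that the paper modifies can be sketched as follows. This is a minimal software illustration under stated assumptions: the PV curve is a toy parabola peaking at 17 V, not a real panel model, and the fixed perturbation step is exactly the source of the steady-state oscillation the paper aims to reduce.

```python
# Sketch: classic perturb-and-observe MPPT on a toy PV power curve.

def pv_power(v):
    """Toy PV curve: peak of 100 W at 17 V (illustrative only)."""
    return max(0.0, 100.0 - 0.5 * (v - 17.0) ** 2)

def perturb_and_observe(v=10.0, step=0.5, iters=60):
    p_prev, direction = pv_power(v), +1
    for _ in range(iters):
        v += direction * step        # perturb the operating voltage
        p = pv_power(v)
        if p < p_prev:               # power dropped: reverse the perturbation
            direction = -direction
        p_prev = p
    return v

v_mpp = perturb_and_observe()
print(round(v_mpp, 2))   # settles into an oscillation around the 17 V maximum
```

Making `step` adaptive (large far from the peak, small near it), as the modified P&O and FLC approaches do, shrinks that residual oscillation and speeds up the transient response.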
Adaptive synchronous sliding control for a robot manipulator based on neural ... (IJECEIAES)
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronizes
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed; the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de... (IJECEIAES)
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students' learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded-system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA designs. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples during experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
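The idea behind compressing acquisition signals before transmission can be illustrated with a simple software analogue; this is not the paper's hardware scheme, and the signal is made up. Slowly varying signals have small sample-to-sample differences, so delta encoding followed by run-length coding shrinks the stream while remaining exactly reversible.

```python
# Sketch: lossless delta + run-length compression of a slowly varying signal.

def compress(samples):
    # delta-encode: first sample verbatim, then successive differences
    deltas = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    out, i = [], 0
    while i < len(deltas):
        j = i
        while j < len(deltas) and deltas[j] == deltas[i]:
            j += 1
        out.append((deltas[i], j - i))   # (delta value, run length)
        i = j
    return out

def decompress(pairs):
    deltas = [d for d, n in pairs for _ in range(n)]
    samples, acc = [], 0
    for d in deltas:
        acc += d
        samples.append(acc)
    return samples

signal = [100, 101, 102, 103, 103, 103, 103, 104, 105, 106]
packed = compress(signal)
print(len(signal) / len(packed))   # compression ratio for this toy signal
```

A higher ratio means more internal samples fit through the same debug link, which is how the lab module acquires more data than uncompressed approaches.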
Detecting and resolving feature envy through automated machine learning and m... (IJECEIAES)
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba... (IJECEIAES)
Rapidly and remotely monitoring the status parameters of solar cell systems (solar irradiance, temperature, and humidity) is a critical issue in enhancing their efficiency. Hence, in the present article an improved smart prototype of an internet of things (IoT) technique based on an embedded system using the NodeMCU ESP8266 (ESP-12E) was carried out experimentally. Three different regions in Egypt (Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity with the proposed IoT system. The monitored solar irradiance, temperature, and humidity data were visualized live on Ubidots via the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1,000, 245-958, and 187-692 W/m², respectively, during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system make it a strong candidate for application in monitoring solar cell systems. On the other hand, the obtained solar power radiation results for the three considered regions strongly favor Luxor and Cairo, rather than El-Beheira, as suitable places to build a solar cell system station.
An efficient security framework for intrusion detection and prevention in int... (IJECEIAES)
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threats to data security in IoT operations. Thereby, IoT security systems must monitor and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived toward identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to misclassification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data for training; such complex datasets are impossible to source in a large network like IoT. To address this problem, the proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with related works, the experimental outcome shows that the model performs well on a benchmark dataset, accomplishing an improved detection accuracy of approximately 99.21%.
Developing a smart system for infant incubators using the internet of things ... (IJECEIAES)
This research develops an incubator system that integrates the internet of things and artificial intelligence to improve care for premature babies. The system workflow starts with sensors that collect data from the incubator. The data is then sent in real time to the internet of things (IoT) broker Eclipse Mosquitto using the message queue telemetry transport (MQTT) protocol version 5.0. After that, the data is stored in a database for analysis using the long short-term memory network (LSTM) method and displayed in a web application through an application programming interface (API) service. The experiment produced 2,880 rows of data stored in the database. The correlation coefficient between the target attribute and the other attributes ranges from 0.23 to 0.48. Next, several experiments were conducted to evaluate the model's predicted values on the test data. The best results were obtained using a two-layer LSTM configuration, each layer with 60 neurons and a lookback setting of 6. This model produces an R² value of 0.934, with a root mean square error (RMSE) of 0.015 and a mean absolute error (MAE) of 0.008. In addition, the R² value was also evaluated for each attribute used as input, with values between 0.590 and 0.845.
A review on internet of things-based stingless bee's honey production with im... (IJECEIAES)
Honey is produced exclusively by honeybees and stingless bees, both of which are well adapted to tropical and subtropical regions such as Malaysia. Stingless bees are known for producing small amounts of honey with a unique flavor profile. A problem identified is that many stingless bee colonies collapse due to weather, temperature, and environmental conditions. It is critical to understand the relationship between stingless bee honey production and environmental conditions in order to improve honey production. Thus, this paper presents a review of stingless bee honey production and prediction modeling. About 54 previous studies have been analyzed and compared to identify the research gaps, and a framework for modeling the prediction of stingless bee honey is derived. The result presents the comparison and analysis of internet of things (IoT) monitoring systems, honey production estimation, convolutional neural networks (CNNs), and automatic identification methods for bee species. Based on image detection methods, the top three in efficiency are CNN at 98.67%, densely connected convolutional networks with YOLO v3 at 97.7%, and DenseNet201 convolutional networks at 99.81%. This study is significant in assisting researchers in developing a model for predicting stingless bee honey output, which is important for a stable economy and food security.
A trust based secure access control using authentication mechanism for intero... (IJECEIAES)
The internet of things (IoT) is a revolutionary innovation in many aspects of our society, including interactions, financial activity, and global security domains such as the military and the battlefield internet. Due to the limited energy and processing capacity of network devices, security, energy consumption, compatibility, and device heterogeneity are long-term IoT problems. As a result, energy and security are critical for data transmission across edge and IoT networks. Existing IoT interoperability techniques need more computation time, have unreliable authentication mechanisms that break easily, lose data easily, and have low confidentiality. In this paper, a key agreement protocol-based authentication mechanism for IoT devices is offered as a solution to this issue. This system makes use of information exchange, which must be secured to prevent access by unauthorized users. Using the compact Contiki/Cooja simulator, the performance and design of the suggested framework are validated. The simulation findings are evaluated based on the detection of malicious nodes after 60 minutes of simulation. The suggested trust method, which is based on privacy access control, reduced the packet loss ratio to 0.32%, consumed 0.39% power, and had the greatest average residual energy of 0.99 mJ at 10 nodes.
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbers (IJECEIAES)
In real-world applications, data are subject to ambiguity due to several factors; fuzzy sets and fuzzy numbers provide a great tool to model such ambiguity. In case of hesitation, the complement of a membership value in fuzzy numbers can differ from the non-membership value, in which case we can model the problem using intuitionistic fuzzy numbers, as they provide flexibility by defining both a membership and a non-membership function. In this article, we consider the intuitionistic fuzzy linear programming problem with intuitionistic polygonal fuzzy numbers, which are a generalization of the polygonal fuzzy numbers previously found in the literature. We present a modification of the simplex method that can be used to solve any general intuitionistic fuzzy linear programming problem after approximating the problem by an intuitionistic polygonal fuzzy number with n edges. The method is given in a simple tableau formulation and then applied to numerical examples for clarity.
The performance of artificial intelligence in prostate magnetic resonance im... (IJECEIAES)
Prostate cancer is the predominant form of cancer observed in men worldwide. The application of magnetic resonance imaging (MRI) as a guidance tool for conducting biopsies has been established as a reliable and well-established approach in the diagnosis of prostate cancer. The diagnostic performance of MRI-guided prostate cancer diagnosis exhibits significant heterogeneity due to the intricate and multi-step nature of the diagnostic pathway. The development of artificial intelligence (AI) models, specifically through the utilization of machine learning techniques such as deep learning, is assuming an increasingly significant role in the field of radiology. In the realm of prostate MRI, a considerable body of literature has been dedicated to the development of various AI algorithms. These algorithms have been specifically designed for tasks such as prostate segmentation, lesion identification, and classification. The overarching objective of these endeavors is to enhance diagnostic performance and foster greater agreement among different observers within MRI scans for the prostate. This review article aims to provide a concise overview of the application of AI in the field of radiology, with a specific focus on its utilization in prostate MRI.
Seizure stage detection of epileptic seizure using convolutional neural networks (IJECEIAES)
According to the World Health Organization (WHO), seventy million individuals worldwide suffer from epilepsy, a neurological disorder. While electroencephalography (EEG) is crucial for diagnosing epilepsy and monitoring the brain activity of epilepsy patients, it requires a specialist to examine all EEG recordings to find epileptic behavior. This procedure needs an experienced doctor, and a precise epilepsy diagnosis is crucial for appropriate treatment. To identify epileptic seizures, this study employed a convolutional neural network (CNN) based on raw scalp EEG signals to discriminate between preictal, ictal, postictal, and interictal segments. The potential of these characteristics is explored by examining how well time-domain signals work in the detection of epileptic signals using the intracranial Freiburg Hospital (FH) database, the scalp Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) database, and the Temple University Hospital (TUH) EEG corpus. To test the viability of this approach, two types of experiments were carried out: binary classification (preictal, ictal, or postictal, each versus interictal) and four-class classification (interictal versus preictal versus ictal versus postictal). The average accuracy for stage detection using the CHB-MIT database was 84.4%, while the Freiburg database's time-domain signals had an accuracy of 79.7%, and the highest accuracy of 94.02% was obtained when classifying the interictal versus preictal stages in the TUH EEG database.
Analysis of driving style using self-organizing maps to analyze driver behavior (IJECEIAES)
Modern life is strongly associated with the use of cars, but the increase in acceleration and maneuverability leads to a dangerous driving style for some drivers. In these conditions, the development of a method that allows tracking the behavior of the driver is relevant. The article provides an overview of existing methods and models for assessing the functioning of motor vehicles and driver behavior. Based on this, a combined algorithm for recognizing driving style is proposed. To do this, a set of input data was formed, including 20 descriptive features about the environment, the driver's behavior, and the characteristics of the functioning of the car, collected using OBD-II. The generated data set is fed to a Kohonen network, where clustering is performed according to driving style and degree of danger. Assigning the driving characteristics to a particular cluster makes it possible to switch to the individual indicators of a particular driver and to consider individual driving characteristics. Applying the method makes it possible to identify potentially dangerous driving styles and thereby help prevent accidents.
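The Kohonen-network step described above can be sketched minimally. This is an illustration under stated assumptions: the feature vectors are made-up (speed, |acceleration|) pairs, not the paper's 20-feature OBD-II data, and the map is reduced to two units with no neighborhood decay, so each unit simply specializes into one driving-style cluster.

```python
# Sketch: a tiny Kohonen-style map clustering driving-feature vectors.
import math

def train_som(data, n_units=2, epochs=50, lr=0.3):
    units = [list(data[i]) for i in range(n_units)]   # init from first samples
    for _ in range(epochs):
        for x in data:
            # best-matching unit (BMU) by Euclidean distance
            bmu = min(range(n_units), key=lambda u: math.dist(x, units[u]))
            for j in range(len(x)):                   # pull BMU toward sample
                units[bmu][j] += lr * (x[j] - units[bmu][j])
    return units

def assign(x, units):
    return min(range(len(units)), key=lambda u: math.dist(x, units[u]))

# calm vs aggressive driving, as made-up (speed km/h, |accel| m/s^2) features
data = [(50, 0.5), (55, 0.7), (52, 0.6), (110, 3.0), (120, 3.5), (115, 3.2)]
units = train_som(data)
labels = [assign(x, units) for x in data]
print(labels)   # calm samples share one unit, aggressive samples the other
```

A full implementation would use a 2-D grid of units with a shrinking neighborhood radius, so that nearby units learn related styles and the danger level can be read off the map topology.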
Hyperspectral object classification using hybrid spectral-spatial fusion and ... (IJECEIAES)
Because of its spectral-spatial and temporal resolution of greater areas, hyperspectral imaging (HSI) has found widespread application in the field of object classification. HSI is typically used to accurately determine an object's physical characteristics as well as to locate related objects with appropriate spectral fingerprints. As a result, HSI has been extensively applied to object identification in several fields, including surveillance, agricultural monitoring, environmental research, and precision agriculture. However, because of their enormous size, objects require a lot of time to classify; for this reason, both spectral and spatial feature fusion have been adopted. The existing classification strategy leads to increased misclassification, and the feature fusion method is unable to preserve the inherent semantic features of objects. This study addresses these research difficulties by introducing a hybrid spectral-spatial fusion (HSSF) technique to minimize feature size while maintaining intrinsic object qualities. Lastly, a soft-margin kernel is proposed for a multi-layer deep support vector machine (MLDSVM) to reduce misclassification. The standard Indian Pines dataset is used for the experiment, and the outcome demonstrates that the HSSF-MLDSVM model performs substantially better in terms of accuracy and Kappa coefficient.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
End-to-end deep auto-encoder for segmenting a moving object with limited training data
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 12, No. 6, December 2022, pp. 6045~6057
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i6.pp6045-6057
Journal homepage: http://ijece.iaescore.com
End-to-end deep auto-encoder for segmenting a moving object
with limited training data
Abdeldjalil Kebir1, Mahmoud Taibi2
1 Department of Electronics, Faculty of Sciences of Engineers, Laboratory of Automatic and Signal Annaba, University Badji Mokhtar, Annaba, Algeria
2 Department of Electronics, Faculty of Sciences of Engineers, Laboratory LERICA, University Badji Mokhtar, Annaba, Algeria
Article history: Received Sep 9, 2021; Revised Jun 5, 2022; Accepted Jul 2, 2022
ABSTRACT
Deep learning-based approaches have been widely used in various
applications, including segmentation and classification. However, a large
amount of data is required to train such techniques. Indeed, in the
surveillance video domain, there are few accessible data due to acquisition
and experiment complexity. In this paper, we propose an end-to-end deep
auto-encoder system for object segmenting from surveillance videos. Our
main purpose is to enhance the process of distinguishing the foreground
object when only limited data are available. To this end, we propose two
approaches based on transfer learning and multi-depth auto-encoders to
avoid over-fitting by combining classical data augmentation and principal
component analysis (PCA) techniques to improve the quality of training
data. Our approach achieves good results outperforming other popular
models, which used the same principle of training with limited data. In
addition, a detailed explanation of these techniques and some
recommendations are provided. Our methodology constitutes a useful
strategy for increasing samples in the deep learning domain and can be
applied to improve segmentation accuracy. We believe that our strategy has
a considerable interest in various applications such as medical and biological
fields, especially in the early stages of experiments where there are few
samples.
Keywords: Auto-encoder, Data augmentation, Limited data, Principal component analysis, Segmentation, VGG16
This is an open access article under the CC BY-SA license.
Corresponding Author:
Abdeldjalil Kebir
Department of Electronics, Faculty of Sciences of engineers, Laboratory of Automatic and Signal Annaba,
University Badji Mokhtar
bp:12, 23000, Annaba, Algeria
Email: kebirabdeldjalil@gmail.com
1. INTRODUCTION
In recent years, deep learning architectures have provided state-of-the-art results in various computer
vision-related tasks, including image classification, object detection, and natural language processing (NLP)
[1]–[3], to name a few. The deep learning concept is an artificial intelligence (AI) subfield which is different
from machine learning techniques in how it learns representations from data. Unlike traditional machine
learning techniques, deep learning models autonomously extract the hidden features from the data using a
hierarchical network through numerous layers. Over the last few years, a wide range of deep learning
architectures have been developed, examined, and discussed [3], [4]. In general, deep learning techniques
may be divided into four main categories, namely: recurrent neural networks (RNNs), convolutional neural
networks (CNNs), auto-encoders (AEs), and sparse coding [5].
Recently, deep learning models have become one of the most important tools for solving several computer vision-related tasks, especially image segmentation-based applications and dynamic background modeling [6]–[8], because they provide better performance than traditional machine learning methods. Several studies in the literature targeted the development of deep learning-based object segmentation models such as the works in [6], [9]. Most of these works studied the performance of deep learning segmentation methods when large datasets are used to train the models. However, only a few studies explored how to train and enhance a deep learning model's performance on small datasets.
The segmentation of foreground regions that depict moving objects in videos is the core concept in
most computer vision systems. Object segmentation is a crucial step, yet it presents a challenging task for many video surveillance applications like people counting, action recognition, and traffic
monitoring [10]–[12]. Also, building an accurate model that is capable of segmenting moving objects in
low-quality videos is even more challenging. In addition, other problems such as the presence of shadow,
illumination change, dynamic background, and bad weather conditions can make the modeling task more
complex. Moreover, the segmentation applied to small datasets remains a crucial challenge in computer
vision which is also often the case in many real-world applications.
Training deep learning models on a small dataset has attracted particular attention in recent research
studies in several fields. However, only a few works have addressed such a problem. For example, to
overcome the problem of small dataset size, Salehinejad et al. [13] used a cylindrical transformation
technique in a cylindrical coordinate system. Applying such transformations, they were able to make an
object segmentation from 3D abdominal tomography achieving higher performance than the fully
convolutional networks (FCNs) [14] in the case of using a limited number of annotated images. In order to
mitigate the lack of training data, Keshari et al. [15] proposed a spectro-spatial feature convolutional neural network (SSF-CNN) architecture that modified the structure and strength of the filters obtained by CNN to
reduce the number of learnable parameters. The proposed technique has proven its effectiveness for
real-world newborn face recognition problems and multi-object classification. Salehinejad et al. [16] used a
pixel-level radial transformation in a polar coordinate system for each image in order to increase the dataset
samples’ number. The proposed approach increased the models’ generalization performance for various
datasets.
The current study is part of a deep learning model developed for moving object segmentation. Deep learning models learn high-level features from the data through the successive network layers using simple learning methods. Furthermore, large databases are necessary in order to efficiently reconstruct the resulting segmentation mask and obtain better results from these precise features. However, in
the real-world scenario, large databases are not always available. Based on this fact, we propose enhancing the precision of segmenting moving objects with little training data by using and comparing end-to-end auto-encoders based on transfer learning and multi-depth techniques. Furthermore, most researchers
use only traditional and general data augmentation techniques, such as rotation and translation, to enlarge the database. Moreover, it is a common practice to fix the training data and change
the model architecture. In the present work, some changes to the data are carried out to increase the number
of samples. For this reason, we propose, compare and discuss object segmentation-oriented techniques to
augment and enhance the quality of the training dataset that helps the model extract the relevant
characteristics and cover the lack of necessary samples.
Given that deep learning is an active research topic in several fields, we position our work as one of the first contributions that deals with the problem of training with little data
in the area of deep learning. The results obtained from the comparative experiments between the proposed
approaches and the well-known models show that the strategies used to improve and increase the database
provide good results and help the model generalization. Hence, our work is considered to be an essential
source of contribution to the research community and can be used in other areas when a large dataset is
needed.
The rest of the paper is organized as follows. In section 2, we present the theoretical basis for the deep learning methods, concepts, and materials used. The proposed approaches and their performance evaluation
are discussed in section 3. We discuss and conclude the paper in sections 4 and 5, respectively.
2. METHODS AND MATERIALS
In this section, we aim to present the methods and materials used to build a robust object
segmentation system from video surveillance. We provide a detailed description of each of the main building
blocks introduced in our methodology. In addition, the details of the proposed approach and the different data
augmentation strategies are presented.
2.1. Auto-encoder
The auto-encoder is a successful deep neural network (DNN) type, which is considered among the
unsupervised algorithms. It aims to reproduce the input data at the output [17], [18] where both the input and
output layers have the same number of neurons. The auto-encoder consists of two main parts, which are the
encoder and the decoder. The encoder’s main role is to compress the input data into a lower-dimensional
representation through the use of non-linear transformation while preserving the valuable features from the
data by deleting the unnecessary elements. Then, the encoder outputs are fed to the decoder part to
decompress them and reconstruct the original data from the generated lower dimension data. According to
the literature, there exist four main auto-encoder architectures, including convolutional auto-encoder,
variational auto-encoder, denoising auto-encoder, and sparse auto-encoder. Auto-encoders can be adopted in
several applications like data denoising and dimensionality reduction [19]. Figure 1 shows the overall
network architecture.
Figure 1. The architecture of the general auto-encoder approach
The auto-encoder has three essential blocks: i) encoder part: the encoder aims to encode all of the relevant information about the input in the latent space; ii) latent space: it holds a compressed representation of the input; and iii) decoder part: the decoder aims to reproduce the input data at the output level using only the data in the latent space. The input data X is encoded by a nonlinear encoder function E into Z = E(X); Z is then decoded into Y = D(Z) through a nonlinear decoder function D, which approximates the original data X, as shown in Figure 2. We can describe this algorithm in its simplest form as (1).
Y = D(E(X)) (1)
The learning process minimizes the loss function between the input X and output Y as (2).
Loss(X, Y) = ||X − Y||^2 = ||X − D(Z)||^2 = ||X − D(E(X))||^2 (2)
Figure 2. The basic expression of a general auto-encoder
2.2. Transfer learning
Transfer learning is an interesting approach to training efficient deep learning models when only
small datasets are available. Compared to deep learning models trained from scratch, the transfer learning
4. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 6, December 2022: 6045-6057
6048
technique aims to improve the model accuracy using lower computational power. The transfer learning
concept could be considered as a two-step technique. Firstly, it aims to learn data representation by training a
model on available datasets containing a large number of annotated data. Then, it uses this representation to
build a new model based on the pre-trained model using a smaller dataset, by training only some selected
layers or the final decision layer [20], [21].
Transfer learning [22], [23] is a machine learning method, where a model developed for a given task
is reused as a reference for another model on a second task. The concept is to use the knowledge learned
from the first model when solving a new problem. In other words, we can say that it is a transfer of
knowledge. However, the benefit of using transfer learning is that large dataset training is not needed to
avoid over-fitting and not many computational resources are required.
2.3. Data augmentation
The performance of deep learning models depends on the size of the training dataset. However, the
lack of available datasets, in several fields, is one of the most critical issues facing researchers. To overcome
such a problem, several solutions have been proposed over the years, including transfer learning and data
augmentation. To this end, in the current study, we tested dataset augmentation [20]. Data augmentation is a
procedure that aims to enlarge the dataset size by applying some transformations, where both the original and
the created images are used to train the model [24]. Therefore, our main objective is to use the existing
dataset to generate new data to avoid the over-fitting problem while improving the model performance. One
of the main data augmentation techniques is to perform some adjustments and geometric transformations,
including cropping, translation, scaling, mirroring, rotating, and changing lighting conditions. These methods
are widely used in the literature to solve problems related to image and video processing, including detection,
recognition, and segmentation, to name a few.
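The geometric transformations mentioned above can be sketched with NumPy array operations on a single grayscale frame; the shift amount and crop window below are arbitrary illustrative choices, not the paper's parameters:

```python
import numpy as np

def augment(frame):
    """Generate simple geometric variants of one training frame
    (a sketch of classical data augmentation; parameters are illustrative)."""
    return {
        "mirror": np.fliplr(frame),                # horizontal flip
        "rotate": np.rot90(frame),                 # 90-degree rotation
        "shift": np.roll(frame, shift=5, axis=1),  # crude horizontal translation
        # center crop then 2x nearest-neighbor upscale, a crude "zoom"
        "crop_zoom": np.kron(frame[8:24, 8:24], np.ones((2, 2))),
    }

frame = np.arange(32 * 32).reshape(32, 32)
aug = augment(frame)  # four extra frames from one original
```

In practice, libraries such as Keras' ImageDataGenerator perform these transformations on the fly during training.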
2.4. Principal component analysis
Principal component analysis (PCA) [25] is an unsupervised technique based on a simple linear transformation; it is a dimensionality reduction technique [26] whose main goal is to compress data. It is used in many image processing applications such as image compression [27] and face recognition [28]. In our case, it is used as a new feature extraction, compression, and reconstruction technique to preserve and extract essential features linearly at various levels of the data distribution, with the aim of using it as a new data augmentation technique.
The implementation used for the reconstruction and compression of color frames using PCA can be divided into 3 main steps: i) splitting the frames into 3 channels R, G, and B arrays; ii) performing the PCA and selecting the N most dominant eigenvalues on each color value matrix; and iii) recreating the original frames by merging the R, G, and B components.
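The three steps above can be sketched as follows. This SVD-based reconstruction is one common way to realize per-channel PCA; the component-selection rule (keeping a fraction of the leading components) is an assumption, not the paper's exact implementation:

```python
import numpy as np

def pca_reconstruct(channel, keep_ratio):
    """Compress and reconstruct one color channel with PCA,
    keeping only the leading components (e.g. 0.8 for the 80% level)."""
    mean = channel.mean(axis=0)
    centered = channel - mean
    # SVD of the centered channel gives the principal directions
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    k = max(1, int(keep_ratio * len(S)))
    # project onto the first k components, then map back and re-add the mean
    return (U[:, :k] * S[:k]) @ Vt[:k] + mean

def pca_augment_frame(frame, ratios=(0.8, 0.6, 0.4, 0.2)):
    """Steps i-iii: split into R, G, B, compress each channel,
    and merge back into one reconstructed frame per compression level."""
    out = []
    for r in ratios:
        channels = [pca_reconstruct(frame[..., c], r) for c in range(3)]
        out.append(np.stack(channels, axis=-1))
    return out
```

Each call thus yields four new frames per original, matching the four PCA levels used in the training strategy.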
2.5. Evaluation protocol
The proposed approaches in this study are based on transfer learning and multi-depth auto-encoder.
To perform the segmentation of the moving object, we employ the auto-encoder as supervised learning for
both approaches. For the first approach, we construct the network by fine-tuning the VGG-16 network [29],
[30] that was pre-trained on the famous ImageNet dataset [29], [31] as the encoder part. Then, we
changed the fully connected layers with a latent space. On the other hand, the transposed architecture of the
VGG-16 has been used in the decoder part to reconstruct the resulting mask of the input frames. The
reconstruction process aims to increase the encoder output size to reconstruct the original input data through
upsampling and convolution operations, which are called transposed VGG-16 architecture. Finally, we only
train the latent space and the decoder part with the CDnet2014 dataset [32] while there is no training process
on the pre-trained encoder part.
In the second approach, convolution and pooling layers are stacked to build the encoder part,
whereas, upsampling layers are used in the decoder part to up-sample the images in the latent space. The
hidden layers are in multi-depth. For this approach, we have trained the whole model with the CDnet2014
dataset. Figures 3 and 4 show the overall architectures proposed in the current study, which are based on
transfer learning and multi-depth auto-encoder architectures.
2.6. Dataset and metrics
The CDnet2014 dataset (change detection) [32] is adopted to train and test our model. It consists of
real videos captured in challenging scenarios as shown in Table 1. For further generalization of the training
process, we select all video sequences (53 scenes), which contain 11 video categories from the CDnet2014
dataset; each video has an average of 2,000 frames.
Figure 3. Structure layers of the transfer learning-based model approach
Figure 4. Detailed information of five models at different depths
Table 1. List of categories and video names in the CDnet2014 dataset
Categories/Challenges Video names
Baseline Highway, Office, Pedestrians, PETS2006
Camera Jitter Badminton, Sidewalk, Traffic, Boulevard
Bad Weather Skating, Wet snow, Blizzard, Snowfall
Dynamic Background Boats, Canoe, Fountain1, Fountain2, Fall, Overpass
Intermittent Object Motion Abandoned box, Street light, Parking, Sofa, Tram stop, Winter driveway
Low Frame rate Port_0_17 fps, Tram crossroad_1 fps, tunnel exit_0_35 fps, Turnpike_0_5 fps
Night Videos Bridge entry, busy boulevard, fluid highway, Street corner at night, Tram station, Winter street
PTZ Continuous pan, Intermittent pan, Two-position ptz cam, Zoom in zoom out
Shadow Back door, Copy machine, Bungalows, Bus station, Cubicle, People in shade
Thermal Corridor, Library, Lakeside, Dining room, Park
Turbulence Turbulence0, Turbulence1, Turbulence2, Turbulence3
Several metrics are adopted to evaluate the deep learning-based models [32], including specificity,
precision, f-measure, false positive rate, false negative rate, and percentage of wrong classifications, by using
the four parameters of the confusion matrix. These metrics can be measured according to:
Specificity = TN / (TN + FP) (3)

False Positive Rate = FP / (TN + FP) (4)

False Negative Rate = FN / (TP + FN) (5)

Percentage of Wrong Classifications = 100 × (FN + FP) / (TP + FN + FP + TN) (6)

F-Measure = 2 × (Precision × Recall) / (Precision + Recall) (7)

Precision = TP / (TP + FP) (8)
where TP, FP, FN, and TN denote true positive, false positive, false negative, and true negative, respectively.
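Given the four confusion-matrix counts, equations (3)-(8) reduce to a few ratios, as in this sketch (the example counts in the usage note are invented for illustration):

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Evaluation metrics (3)-(8) computed from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)                              # (8)
    return {
        "specificity": tn / (tn + fp),                      # (3)
        "fpr": fp / (tn + fp),                              # (4)
        "fnr": fn / (tp + fn),                              # (5)
        "pwc": 100.0 * (fn + fp) / (tp + fn + fp + tn),     # (6)
        "f_measure": 2 * precision * recall / (precision + recall),  # (7)
        "precision": precision,
    }
```

For example, with tp=50, fp=10, fn=5, tn=935 over 1,000 pixels, the percentage of wrong classifications is 100 × 15 / 1000 = 1.5.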
2.7. Training data process
In this sub-section, we present the training data process of our data selection strategies. We manually chose twenty-five frames from each video that contain important foreground objects to help our model learn to segment the foreground accurately. The model has been fed across a variety of data training strategies, including: i) the PCA strategy: generating, for each selected training frame, four different projections of the principal components (geometric transformation) using the PCA technique at rates of 80%, 60%, 40%, and 20% to eliminate non-informative variables; ii) the data augmentation (DA) strategy: enlarging the data by generating four morphological transformation frames for each selected training frame, including translating, flipping, zooming, and rotating; iii) the PCA and DA strategy: merging the aforementioned strategies (PCA and DA) to build more data frames for each video. Figure 5 shows the results of various transformation techniques and strategies. Table 2 provides the description and the amount of samples used for the training process of all strategies.
In our experiments, we perform our implementation using the open-source library Keras, developed by Chollet et al. [33]. The training process is done on the Google Colaboratory platform through a Tesla K80 GPU [34], [35] for 100 epochs. We selected RMSprop as the main
optimizer to train our model. Binary cross entropy (BCE) loss function is used to compute the loss between
the ground truth label and the predicted result, which can be measured using (9):
BCE(Y, X̃) = −(Y × log(X̃) + (1 − Y) × log(1 − X̃)) (9)
where Y and X̃ denote the ground-truth label and the label predicted by the models, respectively. We train the
networks with 80% frames from the training data and 20% frames as validation. We evaluate the models with
50% of the dataset.
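Equation (9) can be computed directly, averaged over pixels; the clipping constant `eps` below is a standard numerical guard against log(0) and is our addition, not part of the paper's formula:

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy of equation (9), averaged over all pixels."""
    # clip predictions away from exactly 0 or 1 to keep the logs finite
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```

Keras' built-in binary cross-entropy loss implements the same quantity.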
Figure 5. Examples of the different strategies transformation
Table 2. Description and number of training samples used for each strategy
Strategy Description Number of samples used to train the model
DA 25(frames) x 53(all video) x 5 6625
PCA 25(frames) x 53(all video) x 5 6625
PCA+DA 25(frames) x 53(all video) x 9 11925
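The counts in Table 2 follow from 25 frames per video, 53 videos, and the number of frames kept per original: the original plus four generated variants per strategy, or plus eight when the strategies are combined:

```python
frames_per_video = 25
videos = 53

# 1 original frame + 4 generated variants per strategy
da_samples = frames_per_video * videos * (1 + 4)       # DA strategy
pca_samples = frames_per_video * videos * (1 + 4)      # PCA strategy
# combined: 1 original + 4 DA variants + 4 PCA projections
combined = frames_per_video * videos * (1 + 4 + 4)     # PCA+DA strategy
```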
3. EXPERIMENTAL RESULTS
In the current section, we aim to present the implementation and the achieved results using our
proposed approaches. Moreover, to illustrate the effectiveness of our models, we compare them with the
conventional algorithms. More detail is described in subsections 3.1 to 3.3.
3.1. Experiments
For the first approach, we freeze the first 14 layers of the VGG16 (encoder part), then we execute
the training for the remaining layers of the latent space and all the decoder part (VGG16 transposed). The
dropout layer [31] applied after every convolution layer of the decoder part is set to a rate of 0.2.
Figure 3 shows an explanatory diagram.
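In Keras, freezing pre-trained layers amounts to setting `layer.trainable = False` on each of them. The minimal stand-in below mirrors that bookkeeping without requiring TensorFlow; the layer names and counts are illustrative, not the actual VGG-16 layer list:

```python
class Layer:
    """Minimal stand-in for a Keras layer's trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

# illustrative stack: 14 pre-trained encoder layers, a latent space, a decoder
model_layers = ([Layer(f"encoder_{i}") for i in range(14)]
                + [Layer("latent")]
                + [Layer(f"decoder_{i}") for i in range(13)])

# freeze the pre-trained encoder (as in Keras: layer.trainable = False)
for layer in model_layers[:14]:
    layer.trainable = False

trainable = [l.name for l in model_layers if l.trainable]
```

Only the latent space and decoder then receive gradient updates during training, matching the first approach.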
We developed five models in the second approach, starting with four hidden layers and eventually
increasing to eighteen hidden layers. The dropout layer [31] applied after every convolution layer is set to a rate of 0.2 to help generalize the model. Figure 4 shows the detailed information and structure concerning
the layers of the different multi-depth approach models.
3.2. Evaluation of the proposed approaches
In this sub-section, we analyze the training strategies using the obtained results and present their
influences on the adopted dataset. Subsection 3.2.1 covers the transfer learning approach and subsection 3.2.2 covers the multi-depth approach.
3.2.1. Transfer learning approach
The obtained results using the transfer learning approach are shown in Table 3 and Figure 6. The
results clearly show that the PCA+DA strategy outperformed the other strategies providing better
performance. Figure 6 shows the training and validation accuracies and losses graphs.
Table 3. The test results obtained by PCA, DA and PCA+DA strategies
Strategy Sp FPR FNR PWC FM Pr
PCA 0.993 0.0058 0.3458 1.383 0.692 0.748
DA 0.997 0.0029 0.3338 1.073 0.742 0.853
PCA+DA 0.997 0.0020 0.3571 1.027 0.811 0.873
According to the loss graph of the classical DA strategy as shown in Figure 6, we can clearly see the
gap between validation and training loss indicating that the model is over-fitting. The over-fitting could be
due to the lack of training samples. Also, as shown in Figure 6, the gap between train and test losses is
reduced in the case of using the PCA strategy, and there is only a slight difference between training and validation loss values. Therefore, we can see that the over-fitting effect has been reduced thanks to the PCA transformation, whose higher-precision features facilitate the learning task.
Figure 6. Used strategies training and validation accuracies and losses graphs
The proposed PCA+DA strategy provides the best and lowest training and testing losses over the
other two strategies as shown in Figure 6. Thus, the model avoids over-fitting through the number of samples
added to the training data-set. For the accuracy curves in Figure 6, observe that with each increase of the
PCA samples in the training data the accuracy increases and both curves converge even more.
3.2.2. Multi-depth approach
The current study sought to evaluate the auto-encoder model in various depths, we analyzed the
influence and the results of the multi-depth training model. For that, both loss function and accuracy function
curves between training and validation data have been plotted. We fed the five models only with the DA
strategy. The results are shown in Figures 7 and 8.
We can see that for the models (A), (B), and (C), the validation and training losses are very close but
have a higher error. Validation and training accuracy have a low precision value. According to Figure 8, we
can see that the models (D) and (E) start to over-fit, and the error begins to decrease. The validation
accuracy and training accuracy start to increase.
Generally, increasing the number of layers may result in better accuracy and lower loss, since increasing the depth increases the capacity of the model (it can learn more complex representations); but with little training data, it also carries a high risk of over-fitting. We show that the model (E) provides
better results than the other methods in terms of accuracy and error. However, the model (E) starts to over-fit
from around epoch 50. To this end, we selected the model (E) as the base model to be improved by adding
more training data. The results of the improved version are shown in Figure 9.
The model (F) is the same model (E) but trained with more data. In addition to the data generated
using the classical DA techniques, the data used to train the model (F) is generated using the PCA strategy.
Hence, the model is trained using both DA and PCA strategies. As shown in Figure 9, we can notice that the
validation loss and accuracy of the model (F) are improved making it more robust to over-fitting. Our main
goal is to achieve high performance in terms of accuracy and loss using deep CNN architectures with small
datasets while avoiding over-fitting.
Figure 7. Used strategies training and validation accuracy graphs
Figure 8. Used strategies training and validation loss graphs
3.3. Comparison with reference algorithms
The amount of training data used to produce the models also differs between approaches. For that reason, we compared our models with methods that use the same principle of training with few training data. We compared our model with the one developed by Babaee et al. [9]. This model is trained
by 5% (100 frames) of frames from each video sequence. Furthermore, both BSUV-Net [36] and fast
BSUV-Net 2.0 [37] proposed background subtraction algorithms for unseen videos based on a fully CNN.
They introduced a spatio-temporal data augmentation technique to overcome the lack of training samples
issue. Also, our approach is compared with other traditional algorithms, including SuBSENSE [38], IUTIS-5
[39], and PAWCS [40], where the models are trained with few frames
from each video sequence of the dataset. We compare with our best models for each approach, both
multi-depth auto-encoder (MD AE) and transfer learning auto-encoder (TL AE) training with the PCA+DA
strategy. The results obtained using our models are compared with other studies as shown in Table 4.
Figure 9. Used strategies training and validation accuracy and loss graphs
Table 4. A comparison of our result approach with reference algorithms
Method Sp FPR FNR PWC FM Pr
TL AE 0.997 0.0020 0.3571 1.027 0.811 0.873
Fast BSUV-Net 2.0 [37] 0.995 0.0044 0.1819 0.905 0.803 0.842
BSUV-Net [36] 0.995 0.0054 0.1797 1.140 0.787 0.811
CNN [9] 0.990 0.0095 0.2455 1.992 0.755 0.833
MD AE 0.990 0.0071 0.4467 1.836 0.747 0.809
IUTIS-5 [39] 0.995 0.0052 0.2151 1.198 0.772 0.808
PAWCS [40] 0.995 0.0051 0.2280 1.199 0.740 0.786
SuBSENSE [38] 0.990 0.0096 0.1876 1.678 0.741 0.751
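The metrics reported in Table 4 follow the standard pixel-level definitions used by the CDnet 2014 benchmark. As a minimal illustrative sketch (the helper name and the confusion counts below are hypothetical, not taken from the paper), they can be computed from per-pixel true/false positives and negatives:

```python
# Hypothetical helper computing the CDnet 2014-style evaluation metrics
# shown in Table 4 from pixel-level confusion counts.
def cdnet_metrics(tp, fp, tn, fn):
    """Return Sp, FPR, FNR, PWC (%), F-measure and precision."""
    specificity = tn / (tn + fp)                   # Sp: true-negative rate
    fpr = fp / (fp + tn)                           # false positive rate
    fnr = fn / (fn + tp)                           # false negative rate
    pwc = 100.0 * (fn + fp) / (tp + fn + fp + tn)  # percentage of wrong classifications
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"Sp": specificity, "FPR": fpr, "FNR": fnr,
            "PWC": pwc, "FM": f_measure, "Pr": precision}

# Illustrative counts only; not taken from any experiment in the paper.
m = cdnet_metrics(tp=900, fp=50, tn=9000, fn=100)
```

Because the F-measure balances precision against recall, a model can lead on FM despite a relatively high FNR when its precision is high, as TL AE does in Table 4.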
For further evaluation, we compare our approaches with other state-of-the-art models. Figure 10
provides qualitative comparison results. Three frames from video sequences of the CDnet 2014 dataset are
selected as demonstrative examples. The first and second columns in Figure 10 show the input frames and
the ground truth, respectively. The third column presents our deep auto-encoder model based on transfer
learning trained with the PCA+DA strategy, while the remaining columns show the results of the reference
models. The results in Table 4 and Figure 10 show that our approach based on the PCA+DA training strategy
provides better results and more selective segmentation than the other models.
Figure 10. Visual comparison of the foreground masks generated by the proposed and reference models
4. DISCUSSION
In the first experiment, the model was trained using a transfer learning technique, while in the
second experiment we delved deeper into the auto-encoder architecture by training a multi-depth model. As
shown in section 3, the PCA+DA strategy provides significant improvements across all objective metrics
over the other adopted strategies, for both the transfer learning and multi-depth approaches. Consequently,
we used the models trained with this strategy for the comparison with the popular CNN model [9] and the
state-of-the-art methods. From the results in Table 4 and Figure 10, we can see that our model based
on TL AE achieves better performance than the reference algorithms.
When comparing our two developed approaches, the performance achieved by the transfer learning
approach is due to the knowledge transferred to the first layers of the encoder, which were pre-trained with
VGG16 on the ImageNet dataset. Furthermore, the key information carried by the principal components of
the PCA transformation is ordered according to their power of representation, which encourages the features
in the dataset to be statistically independent. Adding images transformed by PCA at several rates increases
the training dataset size while preserving the essential information, and provides adapted, meaningful data
that compensates for the loss in the first layers of the decoder part of the network. As a result, we demonstrate
the effectiveness of the proposed novel data augmentation strategy: by preserving the necessary information,
it proves its ability to avoid over-fitting.
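The PCA step described above can be sketched in a few lines. The sketch below is illustrative only and rests on our own assumptions (the function name, retention rates, and the SVD route are ours, not the authors' exact pipeline): each batch of frames is reconstructed from a truncated set of principal components, so the augmented copies keep the high-variance structure needed for segmentation while discarding low-variance detail.

```python
import numpy as np

# Illustrative PCA-based augmentation: reconstruct images from the top-k
# principal components at several retention rates (hypothetical parameters).
def pca_augment(images, rates=(0.5, 0.75, 0.9)):
    """images: (n, h, w) array; returns one reconstructed batch per rate."""
    n, h, w = images.shape
    flat = images.reshape(n, -1).astype(np.float64)
    mean = flat.mean(axis=0)
    # SVD of the centred data yields the principal directions in vt.
    u, s, vt = np.linalg.svd(flat - mean, full_matrices=False)
    out = []
    for r in rates:
        k = max(1, int(r * len(s)))                 # keep the top-k components
        recon = (u[:, :k] * s[:k]) @ vt[:k] + mean  # truncated reconstruction
        out.append(recon.reshape(n, h, w))
    return out

rng = np.random.default_rng(0)
batch = rng.random((8, 16, 16))       # toy stand-in for a batch of frames
augmented = pca_augment(batch)        # three augmented copies of the batch
```

Keeping all components (rate 1.0) recovers the original batch exactly, which is a quick sanity check that the reconstruction is correct.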
5. CONCLUSION
Motivated by the recent development of moving object segmentation methods based on deep
learning, we presented experiments comparing two deep learning approaches trained using three different
strategies to increase the data size. The main purpose of the proposed methods is to increase the dataset size
using different strategies in order to improve model accuracy. The adopted data augmentation strategies are:
a PCA-based geometric transformation and classical data augmentation. These strategies were adopted to
reduce the over-fitting problem and to generate the features required for moving object segmentation while
improving model performance. The deep learning architecture used in our experiments is the deep
convolutional auto-encoder, which is widely used for image segmentation tasks.
The main objective is to enhance object segmentation based on a supervised deep auto-encoder
using limited training data. It can be concluded that combining morphological and geometrical
transformations for model training helps the model improve its generalization capabilities and produces a
precise model with minimal training data. Traditional data augmentation techniques (mirroring, rotation,
and shifting) rely on changing the placement of the coordinates within the same mathematical plane, which
produces correlated variables and offers only minimal enhancement. In contrast, our work demonstrates the
value of purposefully enriching the training data with PCA, which creates a new representation of the
variables in a new plane with high variance, extracts the variables required for the segmentation task, and
removes unnecessary variables that could distort the prediction results.
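For contrast, the classical augmentations mentioned above only move coordinates within the same plane, so the generated variants remain strongly correlated with the original. A minimal sketch (the helper name and shift amount are hypothetical):

```python
import numpy as np

# Illustrative classical geometric augmentation: mirroring, rotation
# and shifting of a single frame (hypothetical helper, not the paper's code).
def classic_augment(image, shift=4):
    """image: (h, w) array; returns mirrored, rotated and shifted variants."""
    return [
        np.fliplr(image),               # horizontal mirror
        np.flipud(image),               # vertical mirror
        np.rot90(image, k=2),           # 180-degree rotation (shape preserved)
        np.roll(image, shift, axis=1),  # horizontal shift with wrap-around
    ]

frame = np.arange(16).reshape(4, 4)     # toy 4x4 stand-in for a frame
variants = classic_augment(frame)       # four correlated variants of one frame
```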
REFERENCES
[1] N. Rachburee and W. Punlumjeak, “An assistive model of obstacle detection based on deep learning: YOLOv3 for visually
impaired people,” International Journal of Electrical and Computer Engineering (IJECE), vol. 11, no. 4, pp. 3434–3442, Aug.
2021, doi: 10.11591/ijece.v11i4.pp3434-3442.
[2] J. Li, A. Sun, J. Han, and C. Li, “A survey on deep learning for named entity recognition,” IEEE Transactions on Knowledge and
Data Engineering, vol. 34, no. 1, pp. 50–70, Jan. 2022, doi: 10.1109/TKDE.2020.2981314.
[3] A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, P. Martinez-Gonzalez, and J. Garcia-Rodriguez, “A survey on
deep learning techniques for image and video semantic segmentation,” Applied Soft Computing, vol. 70, pp. 41–65, Sep. 2018,
doi: 10.1016/j.asoc.2018.05.018.
[4] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew, “Deep learning for visual understanding: A review,”
Neurocomputing, vol. 187, pp. 27–48, Apr. 2016, doi: 10.1016/j.neucom.2015.09.116.
[5] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A survey of deep neural network architectures and their
applications,” Neurocomputing, vol. 234, pp. 11–26, Apr. 2017, doi: 10.1016/j.neucom.2016.12.038.
[6] Y. Wang, Z. Luo, and P.-M. Jodoin, “Interactive deep learning method for segmenting moving objects,” Pattern Recognition
Letters, vol. 96, pp. 66–75, Sep. 2017, doi: 10.1016/j.patrec.2016.09.014.
[7] J. Gracewell and M. John, “Dynamic background modeling using deep learning autoencoder network,” Multimedia Tools and
Applications, vol. 79, no. 7–8, pp. 4639–4659, Feb. 2020, doi: 10.1007/s11042-019-7411-0.
[8] A. Bouguettaya, H. Zarzour, A. Kechida, and A. M. Taberkit, “Vehicle detection from UAV imagery with deep learning: a
review,” IEEE Transactions on Neural Networks and Learning Systems, pp. 1–21, 2021, doi: 10.1109/TNNLS.2021.3080276.
[9] M. Babaee, D. T. Dinh, and G. Rigoll, “A deep convolutional neural network for video sequence background subtraction,”
Pattern Recognition, vol. 76, pp. 635–649, Apr. 2018, doi: 10.1016/j.patcog.2017.09.040.
[10] W. Ge, Z. Guo, Y. Dong, and Y. Chen, “Dynamic background estimation and complementary learning for pixel-wise
foreground/background segmentation,” Pattern Recognition, vol. 59, pp. 112–125, Nov. 2016, doi: 10.1016/j.patcog.2016.01.031.
[11] T. Bouwmans, “Traditional and recent approaches in background modeling for foreground detection: An overview,” Computer
Science Review, vol. 11–12, pp. 31–66, May 2014, doi: 10.1016/j.cosrev.2014.04.001.
[12] H. Liu, X. Han, X. Li, Y. Yao, P. Huang, and Z. Tang, “Deep representation learning for road detection using Siamese network,”
Multimedia Tools and Applications, vol. 78, no. 17, pp. 24269–24283, Sep. 2019, doi: 10.1007/s11042-018-6986-1.
[13] H. Salehinejad, S. Naqvi, E. Colak, J. Barfett, and S. Valaee, “Cylindrical transform: 3D semantic segmentation of kidneys with
limited annotated images,” in 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Nov. 2018,
pp. 539–543, doi: 10.1109/GlobalSIP.2018.8646668.
[14] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in 2015 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), Jun. 2015, pp. 3431–3440, doi: 10.1109/CVPR.2015.7298965.
[15] R. Keshari, M. Vatsa, R. Singh, and A. Noore, “Learning structure and strength of CNN filters for small sample size training,” in
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2018, pp. 9349–9358, doi:
10.1109/CVPR.2018.00974.
[16] H. Salehinejad, S. Valaee, T. Dowdell, and J. Barfett, “Image augmentation using radial transform for training deep neural
networks,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2018,
pp. 3016–3020, doi: 10.1109/ICASSP.2018.8462241.
[17] D. Bank, N. Koenigstein, and R. Giryes, “Autoencoders,” arXiv preprint arXiv:2003.05991, Mar. 2020.
[18] S. С. Leonov, A. Vasilyev, A. Makovetskii, V. Kuznetsov, and J. Diaz-Escobar, “An algorithm for selecting face features using
deep learning techniques based on autoencoders,” in Applications of Digital Image Processing XLI, Sep. 2018, doi:
10.1117/12.2321068.
[19] S. A. Ebiaredoh-Mienye, E. Esenogho, and T. G. Swart, “Artificial neural network technique for improving prediction of credit
card default: A stacked sparse autoencoder approach,” International Journal of Electrical and Computer Engineering (IJECE),
vol. 11, no. 5, pp. 4392–4402, Oct. 2021, doi: 10.11591/ijece.v11i5.pp4392-4402.
[20] K. Seddiki et al., “Towards CNN representations for small mass spectrometry data classification: from transfer learning to
cumulative learning,” bioRxiv, Mar. 2020.
[21] I. Idrissi, M. Azizi, and O. Moussaoui, “Accelerating the update of a DL-based IDS for IoT using deep transfer learning,”
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS), vol. 23, no. 2, pp. 1059–1067, Aug. 2021, doi:
10.11591/ijeecs.v23.i2.pp1059-1067.
[22] M. Al-Smadi, M. Hammad, Q. B. Baker, and S. A. Al-Zboon, “A transfer learning with deep neural network approach for diabetic
retinopathy classification,” International Journal of Electrical and Computer Engineering (IJECE), vol. 11, no. 4, pp. 3492–3501,
Aug. 2021, doi: 10.11591/ijece.v11i4.pp3492-3501.
[23] F. Zhuang et al., “A comprehensive survey on transfer learning,” Proceedings of the IEEE, vol. 109, no. 1, pp. 43–76, Jan. 2021,
doi: 10.1109/JPROC.2020.3004555.
[24] C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol. 6, no. 1,
Dec. 2019, doi: 10.1186/s40537-019-0197-0.
[25] S. M. Shaharudin, N. Ahmad, and S. M. C. M. Nor, “A modified correlation in principal component analysis for torrential rainfall
patterns identification,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 9, no. 4, pp. 655–661, Dec. 2020, doi:
10.11591/ijai.v9.i4.pp655-661.
[26] C. Kamlaskar and A. Abhyankar, “Multilinear principal component analysis for iris biometric system,” Indonesian Journal of
Electrical Engineering and Computer Science (IJEECS), vol. 23, no. 3, pp. 1458–1469, Sep. 2021, doi:
10.11591/ijeecs.v23.i3.pp1458-1469.
[27] C. Clausen and H. Wechsler, “Color image compression using PCA and backpropagation learning,” Pattern Recognition, vol. 33,
no. 9, pp. 1555–1560, Sep. 2000, doi: 10.1016/S0031-3203(99)00126-0.
[28] P. C. Yuen, “Human face recognition using PCA on wavelet subband,” Journal of Electronic Imaging, vol. 9, no. 2, Apr. 2000,
doi: 10.1117/1.482742.
[29] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 3rd International
Conference on Learning Representations, Sep. 2014.
[30] A. W. Reza, M. M. Hasan, N. Nowrin, and M. M. Ahmed Shibly, “Pre-trained deep learning models in automatic COVID-19
diagnosis,” Indonesian Journal of Electrical Engineering and Computer Science (IJEECS), vol. 22, no. 3, pp. 1540–1547, Jun.
2021, doi: 10.11591/ijeecs.v22.i3.pp1540-1547.
[31] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in
Neural Information Processing Systems, 2012, vol. 25.
[32] Y. Wang, P.-M. Jodoin, F. Porikli, J. Konrad, Y. Benezeth, and P. Ishwar, “CDnet 2014: an expanded change detection
benchmark dataset,” in 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Jun. 2014, pp. 393–400,
doi: 10.1109/CVPRW.2014.126.
[33] F. Chollet, Keras: The python deep learning library. Astrophysics Source Code Library, 2018.
[34] E. Bisong, “Google colaboratory,” in Building Machine Learning and Deep Learning Models on Google Cloud Platform,
Berkeley, CA: Apress, 2019, pp. 59–64, doi: 10.1007/978-1-4842-4470-8_7.
[35] T. S. Gunawan et al., “Development of video-based emotion recognition using deep learning with Google Colab,” TELKOMNIKA
(Telecommunication Computing Electronics and Control), vol. 18, no. 5, pp. 2463–2471, Oct. 2020, doi:
10.12928/telkomnika.v18i5.16717.
[36] M. O. Tezcan, P. Ishwar, and J. Konrad, “BSUV-Net: a fully-convolutional neural network for background subtraction of unseen
videos,” in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), Mar. 2020, pp. 2763–2772, doi:
10.1109/WACV45572.2020.9093464.
[37] M. O. Tezcan, P. Ishwar, and J. Konrad, “BSUV-Net 2.0: spatio-temporal data augmentations for video-agnostic supervised
background subtraction,” IEEE Access, vol. 9, pp. 53849–53860, 2021, doi: 10.1109/ACCESS.2021.3071163.
[38] P.-L. St-Charles, G.-A. Bilodeau, and R. Bergevin, “SuBSENSE: a universal change detection method with local adaptive
sensitivity,” IEEE Transactions on Image Processing, vol. 24, no. 1, pp. 359–373, Jan. 2015, doi: 10.1109/TIP.2014.2378053.
[39] S. Bianco, G. Ciocca, and R. Schettini, “Combination of video change detection algorithms by genetic programming,” IEEE
Transactions on Evolutionary Computation, vol. 21, no. 6, pp. 914–928, Dec. 2017, doi: 10.1109/TEVC.2017.2694160.
[40] P.-L. St-Charles, G.-A. Bilodeau, and R. Bergevin, “Universal background subtraction using word consensus models,” IEEE
Transactions on Image Processing, vol. 25, no. 10, pp. 4768–4781, Oct. 2016, doi: 10.1109/TIP.2016.2598691.
BIOGRAPHIES OF AUTHORS
Abdeldjalil Kebir is a Ph.D. student at Badji Mokhtar University-Annaba (Algeria)
and a member of the Automatic and Signal Processing Laboratory of Annaba (LASA). He received
his BS and MS degrees (Communication and Digital Processing) from the same institution in 2011
and 2013, respectively. His main research interests include video and image segmentation using
recent machine learning methods. He can be contacted at email: kebirabdeldjalil@gmail.com.
Mahmoud Taibi received his BSc in Electrical Engineering from the USTO University-Oran
(Algeria) in 1980, and an MSc degree from Badji-Mokhtar University-Annaba (Algeria) in 1996.
He has been a full professor in Computer Science at Badji-Mokhtar University-Annaba since 2006
and is a member of the LERICA laboratory. His research interests focus on intelligent systems,
intrusion detection, methods used in automatic object detection and tracking systems, and
techniques used in the field of deep learning. He can be contacted at email:
mahmoudtaibi@yahoo.fr.