This document presents a vector-based backpropagation algorithm for a supervised convolutional neural network (CNN) model. The key points are:
- The CNN model consists of one convolution layer followed by three fully connected hidden layers for classification of handwritten digits using the MNIST dataset.
- The classical convolution operation is replaced by a matrix operation to avoid mathematical complexities. Convolution maps and filters are represented as vectors.
- Forward propagation involves applying the new convolution and pooling operations to extract features, then passing the output through the fully connected layers.
- Backpropagation is used to update the CNN parameters (filters, weights, biases) via gradient descent to minimize a cost function, with update equations derived for both the convolutional layer and the fully connected layers.
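The replacement of classical convolution by a matrix operation can be illustrated with an im2col-style sketch: each filter-sized patch is unrolled into a row, so the whole feature map becomes one matrix-vector product. This is a minimal illustration of the general idea, not the paper's exact vectorization; the `im2col` helper and the 2x2 averaging filter are hypothetical.

```python
import numpy as np

def im2col(image, k):
    """Unroll every k x k patch of a 2-D image into the rows of a matrix."""
    h, w = image.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((out_h * out_w, k * k))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = image[i:i + k, j:j + k].ravel()
    return cols

def conv_as_matmul(image, filt):
    """Valid 'convolution' (strictly, cross-correlation; flip the filter for
    true convolution) expressed as a single matrix-vector product."""
    k = filt.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    return (im2col(image, k) @ filt.ravel()).reshape(out_h, out_w)

img = np.arange(16, dtype=float).reshape(4, 4)
filt = np.ones((2, 2)) / 4.0          # 2x2 averaging filter
print(conv_as_matmul(img, filt))
```

Because both the filter and the patches are now plain vectors, the backward pass reduces to ordinary matrix calculus, which is the simplification the paper exploits.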
Adaptive lifting based image compression scheme using interactive artificial ... (csandit)
This paper presents an image compression method using the Interactive Artificial Bee Colony (IABC) optimization algorithm. The proposed method reduces storage requirements and facilitates data transmission by reducing transmission costs. To obtain the finest quality of compressed image, the IABC uses local search to determine different update coefficients, and the best update coefficient is chosen optimally. In the update step, local search alters the center pixels with the coefficient in 8 different directions over a considerable window size to produce the compressed image, evaluated in terms of both PSNR and compression ratio. The IABC brings the idea of universal gravitation into the attraction between onlooker bees and employed bees: by passing different values of the control parameter, the universal gravitation involved in the IABC assigns various quantities to the single onlooker bee and the employed bees. As a result, the proposed work gives better PSNR than existing methods.
This document proposes using machine learning techniques to predict COVID-19 infections based on chest x-ray images. Specifically, it involves using discrete wavelet transform to extract space-frequency features from chest x-rays, reducing the dimensionality of features using Shannon entropy, and then training standard machine learning classifiers like logistic regression, support vector machine, decision tree, and convolutional neural network on the extracted features to classify images as COVID-19 positive or negative. The document provides background on the proposed techniques of discrete wavelet transform, entropy, and various machine learning models.
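As a rough sketch of this kind of pipeline, a single-level Haar transform followed by histogram entropy per subband turns an image into a handful of scalar features. The specific wavelet, entropy estimator, and bin count here are illustrative assumptions, not the document's stated choices.

```python
import numpy as np

def haar2d(img):
    """One level of the 2-D Haar wavelet transform (LL, LH, HL, HH subbands)."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row details
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def shannon_entropy(coeffs, bins=32):
    """Shannon entropy of a coefficient histogram, summarizing one subband."""
    hist, _ = np.histogram(coeffs.ravel(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
xray = rng.random((64, 64))           # stand-in for a chest x-ray patch
subbands = haar2d(xray)
features = [shannon_entropy(s) for s in subbands]   # one scalar per subband
print(features)
```

The resulting low-dimensional feature list is what a classifier such as logistic regression or an SVM would then be trained on.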
Road Segmentation from satellite images (YoussefKitane)
The document describes a method for road segmentation from satellite images using a fully convolutional neural network approach. Specifically, it proposes using a ResNet-50 encoder with a custom decoder consisting of transpose convolutional layers. To address the limited size of the training dataset, the method employs transfer learning using an ImageNet-pretrained ResNet-50 and data augmentation. It also pre-trains the network on the larger SpaceNet roads dataset before fine-tuning on the target dataset. Experimental results show the proposed approach achieves better performance than baselines that do not use these techniques, demonstrating their effectiveness for addressing the small dataset size.
The document describes a vehicle detection system using a fully convolutional regression network (FCRN). The FCRN is trained on patches from aerial images to predict a density map indicating vehicle locations. The proposed system is evaluated on two public datasets and achieves higher precision and recall than comparative shallow and deep learning methods for vehicle detection in aerial images. The system could help with applications like urban planning and traffic management.
This document provides an overview of convolutional neural networks (CNNs) for image and video recognition. It discusses that CNNs have greatly improved image classification accuracy on ImageNet over the years. CNNs consist of convolutional layers that apply filters to extract features, pooling layers that reduce the spatial size, and fully connected layers for classification. Training involves tuning parameters through backpropagation, while inference uses a trained model for classification. Example networks discussed include AlexNet, VGG16, GoogLeNet and ResNet, which contain increasing numbers of parameters and computational operations.
Fixed-Point Code Synthesis for Neural Networks (gerogepatton)
Over the last few years, neural networks have started penetrating safety-critical systems to take decisions in robots, rockets, autonomous cars, etc. A problem is that these critical systems often have limited computing resources. Often, they use fixed-point arithmetic for its many advantages (speed, compatibility with small memory devices). In this article, a new technique is introduced to tune the formats (precision) of already trained neural networks using fixed-point arithmetic, which can be implemented using integer operations only. The new optimized neural network computes the output with fixed-point numbers without degrading the accuracy beyond a threshold fixed by the user. A fixed-point code is synthesized for the new optimized neural network, ensuring respect of the threshold for any input vector belonging to the range [xmin, xmax] determined during the analysis. From a technical point of view, we do a preliminary analysis of our floating-point neural network to determine the worst cases, then we generate a system of linear constraints among integer variables that we can solve by linear programming. The solution of this system is the new fixed-point format of each neuron. The experimental results obtained show the efficiency of our method, which can ensure that the new fixed-point neural network has the same behavior as the initial floating-point neural network.
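The integer-only evaluation that fixed-point formats enable can be sketched as follows, assuming a single global Q-format for brevity (the article assigns a format per neuron via linear programming; the helper names are hypothetical):

```python
import numpy as np

def to_fixed(x, frac_bits):
    """Quantize floats to signed fixed-point integers with `frac_bits` fractional bits."""
    return np.round(x * (1 << frac_bits)).astype(np.int64)

def fixed_dot(wq, xq, frac_bits):
    """Dot product in pure integer arithmetic; the raw product carries
    2*frac_bits fractional bits, so shift back down to frac_bits."""
    acc = int(np.dot(wq, xq))          # integer multiply-accumulate
    return acc >> frac_bits

def from_fixed(q, frac_bits):
    return q / (1 << frac_bits)

w = np.array([0.5, -1.25, 2.0])
x = np.array([1.0, 0.5, -0.75])
frac = 12                              # Q-format with 12 fractional bits
wq, xq = to_fixed(w, frac), to_fixed(x, frac)
approx = from_fixed(fixed_dot(wq, xq, frac), frac)
exact = float(np.dot(w, x))
print(approx, exact)                   # close, within the chosen precision
```

The format-tuning problem in the article is precisely choosing `frac_bits` per neuron so that `approx` stays within the user's error threshold of `exact` for all inputs in [xmin, xmax].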
AILABS - Lecture Series - Is AI the New Electricity? Topic: Classification a... (AILABS Academy)
1. The document discusses classification and estimation using artificial neural networks. It provides examples of classification problems from industries like mining and banking loan approval.
2. It describes the basic components of an artificial neural network including the feedforward architecture with multiple layers of neurons and the backpropagation algorithm for learning network weights.
3. Examples are given to illustrate how neural networks can perform nonlinear classification and estimation through combinations of linear perceptron units in multiple layers with the backpropagation algorithm for training the network weights.
A comparison of first and second order training algorithms for artificial neu... (Cemal Ardil)
This document compares first and second order training algorithms for artificial neural networks. It summarizes that feedforward network training is a special case of functional minimization where no explicit model of the data is assumed. Gradient descent, conjugate gradient, and quasi-Newton methods are discussed as first and second order training methods. Conjugate gradient and quasi-Newton methods are shown to outperform gradient descent methods experimentally using share rate data. The backpropagation algorithm and its variations are described for finding the gradient of the error function with respect to the network weights. Conjugate gradient techniques are discussed as a way to find the search direction without explicitly computing the Hessian matrix.
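The contrast between first- and second-order methods can be made concrete on a toy quadratic, where conjugate gradient converges in at most n steps while fixed-step gradient descent needs many more. A minimal numpy sketch, not the paper's experimental setup:

```python
import numpy as np

def grad_descent(A, b, x0, lr, steps):
    """First-order: fixed-step gradient descent on f(x) = 1/2 x'Ax - b'x."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * (A @ x - b)
    return x

def conjugate_gradient(A, b, x0, steps):
    """Conjugate gradient: curvature-aware search directions, no explicit Hessian inverse."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    for _ in range(steps):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # SPD Hessian of a toy quadratic
b = np.array([1.0, 1.0])
x0 = np.zeros(2)
x_cg = conjugate_gradient(A, b, x0, 2)   # exact in n=2 steps for a quadratic
x_gd = grad_descent(A, b, x0, 0.1, 100)  # needs many steps to get close
print(x_cg, x_gd)                        # both approach the minimizer A^{-1} b
```

This illustrates the document's experimental finding in miniature: the conjugate direction update reaches the minimizer far faster than plain gradient steps on the same error surface.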
- The document presents a neural network model for recognizing handwritten digits. It uses a dataset of 20x20 pixel grayscale images of digits 0-9.
- The proposed neural network has an input layer of 400 nodes, a hidden layer of 25 nodes, and an output layer of 10 nodes. It is trained using backpropagation to classify images.
- The model achieves an accuracy of over 96.5% on test data after 200 iterations of training, outperforming a logistic regression model which achieved 91.5% accuracy. Future work could involve classifying more complex natural images.
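A forward pass through the described 400-25-10 topology can be sketched as follows; the sigmoid activation and random weights are assumptions for illustration, since the summary does not state the activation function:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass of a 400-25-10 network: 20x20 image in, 10 class scores out."""
    h = sigmoid(W1 @ x + b1)           # hidden layer, 25 units
    return sigmoid(W2 @ h + b2)        # output layer, one score per digit

rng = np.random.default_rng(1)
W1 = rng.normal(0, 0.1, (25, 400))    # input: 20x20 image flattened to 400
b1 = np.zeros(25)
W2 = rng.normal(0, 0.1, (10, 25))
b2 = np.zeros(10)

x = rng.random(400)                    # stand-in for one digit image
scores = forward(x, W1, b1, W2, b2)
pred = int(np.argmax(scores))          # predicted digit 0-9
print(pred, scores.shape)
```

Backpropagation training would then adjust W1, b1, W2, b2 against the gradient of a cost over these scores.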
Reconfiguration layers of convolutional neural network for fundus patches cla... (journalBEEI)
Convolutional neural network (CNN) is a method of supervised deep learning. Architectures such as AlexNet, VGG16, VGG19, ResNet-50, ResNet-101, GoogLeNet, Inception-V3, Inception-ResNet-V2, and SqueezeNet have 25 to 825 layers. This study aims to simplify the layers of CNN architectures and increase accuracy for fundus patch classification. Fundus patches are classified into two categories: normal and neovascularization. The data used for classification come from MESSIDOR and the Retina Image Bank, totaling 2,080 patches. Results show the best accuracy of 93.17% for the original data and 99.33% for the augmented data using a 31-layer CNN. It consists of an input layer, 7 convolutional layers, 7 batch normalization layers, 7 rectified linear units, 6 max-pooling layers, a fully connected layer, a softmax layer, and an output layer.
RK7(5) is a numerical method that was used to minimize the local truncation error via a step-size selection algorithm. A time-multiplexing CNN simulator was modified using RK7(5) to improve the performance of the simulator and the quality of the output image for edge detection. The results showed better performance than those in the literature.
Fuzzy clustering algorithms cannot obtain a good clustering effect when the sample characteristics are not obvious, and they require the number of clusters to be determined in advance. For this reason, this paper proposes an adaptive fuzzy kernel clustering algorithm. The algorithm first uses an adaptive function of the clustering number to calculate the optimal number of clusters; the samples of the input space are then mapped to a high-dimensional feature space using a Gaussian kernel and clustered in that feature space. Matlab simulation results confirmed that the algorithm's performance is greatly improved over classical clustering algorithms, with faster convergence and more accurate clustering results.
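The kernelized step can be sketched as a fuzzy c-means membership update computed with feature-space distances, using the identity ||phi(x) - phi(v)||^2 = 2(1 - K(x, v)) for a Gaussian kernel. The adaptive cluster-number function is not reproduced here; the centers and sigma are assumed fixed for illustration:

```python
import numpy as np

def gaussian_kernel(x, v, sigma):
    return np.exp(-np.sum((x - v) ** 2) / (2 * sigma ** 2))

def kernel_memberships(X, centers, sigma, m=2.0):
    """Fuzzy membership update using squared distances in the kernel-induced
    feature space: ||phi(x) - phi(v)||^2 = 2 * (1 - K(x, v)) for a Gaussian kernel."""
    d2 = np.array([[2.0 * (1.0 - gaussian_kernel(x, v, sigma)) for v in centers]
                   for x in X])                      # n x c squared distances
    d2 = np.maximum(d2, 1e-12)                      # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))                  # standard FCM weighting
    return inv / inv.sum(axis=1, keepdims=True)     # rows sum to 1

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
U = kernel_memberships(X, centers, sigma=1.0)
print(U.round(3))                                   # points near a center get high membership
```

A full run would alternate this membership update with a center update until convergence, with the cluster count supplied by the paper's adaptive function.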
Drobics, M. 2001: Data mining using synergies between self-organising maps and... (ArchiLab 7)
The document describes a three-stage approach to data mining that uses self-organizing maps, clustering, and fuzzy rule induction. In the first stage, a self-organizing map is used to reduce the data size while preserving topology. In the second stage, clustering identifies regions of interest. In the third stage, fuzzy rules are generated to describe the clusters. The approach was tested on image and real-world datasets and produced intuitive results.
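Stage one can be illustrated with a minimal 1-D self-organising map update; the grid shape, learning rate, and neighbourhood kernel below are illustrative assumptions:

```python
import numpy as np

def som_step(weights, x, bmu_radius, lr):
    """One update of a 1-D self-organising map: find the best-matching unit,
    then pull it and its grid neighbours toward the sample. This is what
    preserves topology while shrinking the data to the map nodes."""
    dists = np.linalg.norm(weights - x, axis=1)
    bmu = int(np.argmin(dists))
    for i in range(len(weights)):
        h = np.exp(-((i - bmu) ** 2) / (2 * bmu_radius ** 2))   # neighbourhood kernel
        weights[i] += lr * h * (x - weights[i])
    return bmu

rng = np.random.default_rng(2)
weights = rng.random((10, 3))          # 10 map nodes, 3-D data
data = rng.random((200, 3))
for x in data:
    som_step(weights, x, bmu_radius=2.0, lr=0.1)
print(weights.shape)                   # the reduced dataset fed to stage two
```

The trained node weights (far fewer than the raw samples) are what the clustering stage then operates on.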
International Journal of Applied Sciences and Innovation, vol. 2015, no. 1 - ... (sophiabelthome)
This document presents a finite element model using cubic elements to characterize electromagnetic fields in a 3D waveguide transmission line. It uses the free and open-source GNU Octave software to perform the electromagnetic analysis and solve the Maxwell equations. The cubic finite element discretization is shown to provide an efficient solution with sparse matrices, reducing computational cost. Numerical results demonstrate good agreement between the cubic element model and analytical solutions for the electric and magnetic fields in the waveguide.
Algorithm Finding Maximum Concurrent Multicommodity Linear Flow with Limited ... (IJCNCJournal)
Graphs and extended networks are powerful mathematical tools applied in many fields such as transportation, communication, informatics, and economics. Algorithms to find the Maximum Concurrent Multicommodity Flow with Limited Cost on extended traffic networks were introduced in our previous work. However, with those algorithms, the capacity of a two-sided line is shared fully between the two directions. This work studies the more general and practical case, where flows on two-sided lines are limited by a single parameter called the regulating coefficient. The algorithm is coded in the Java programming language with an extended network database in the MySQL database management system and produces exact results.
The document describes a study that used deep learning and convolutional neural networks to develop an image-based detection model for classifying four types of nuts (hazelnut, walnut, pecan, forest nut) with 100% accuracy on test data. The model was developed using Python in Google Colab, utilizing a dataset of 1595 images. A VGG16 model pre-trained on ImageNet was used to extract features from the images. The model contains convolutional and max pooling layers for feature extraction, and fully connected layers for classification. Training, validation, and testing of the model was performed in Google Colab using a GPU, demonstrating the feasibility of deep learning for nut detection applications.
This is a preliminary study whose objective is the reconstruction of missing parts or scratches in digital images, an important task used extensively in artwork restoration. This restoration can be done using two approaches: image inpainting and texture synthesis. There are many techniques within these two approaches that can carry out the process optimally and accurately. In this paper, the advantages and disadvantages of most algorithms of the image inpainting approach are discussed. Among the different algorithms, the proposed dynamic masking method outperformed the other techniques. This modification produces rapid and simple reconstruction of small missing and damaged portions of images, two to three orders of magnitude faster than current methods, while producing comparable results.
A simple framework for contrastive learning of visual representations (Devansh16)
Link: https://machine-learning-made-simple.medium.com/learnings-from-simclr-a-framework-contrastive-learning-for-visual-representations-6c145a5d8e99
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
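The contrastive objective at the heart of SimCLR (NT-Xent) can be sketched in a few lines of numpy; the batch size, temperature, and embedding dimension below are arbitrary toy values:

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss for a batch of paired augmented views (z1[i], z2[i]):
    each embedding must identify its partner view among all other embeddings."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine-similarity space
    sim = z @ z.T / tau
    n = len(z1)
    np.fill_diagonal(sim, -np.inf)                      # exclude self-similarity
    # the positive for sample i is its other view, at index (i + n) mod 2n
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    losses = -sim[np.arange(2 * n), pos] + logsumexp    # cross-entropy per sample
    return float(losses.mean())

rng = np.random.default_rng(3)
z1 = rng.normal(size=(8, 16))
z2 = z1 + 0.01 * rng.normal(size=(8, 16))   # "augmented" views close to z1
print(nt_xent(z1, z2))                       # low loss: positives are near-identical
```

In the real framework z1 and z2 come from a projection head over an encoder, and the batch size matters precisely because every other sample in the batch serves as a negative.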
Comments: ICML'2020. Code and pretrained models at this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as: arXiv:2002.05709 [cs.LG]
(or arXiv:2002.05709v3 [cs.LG] for this version)
Submission history
From: Ting Chen [view email]
[v1] Thu, 13 Feb 2020 18:50:45 UTC (5,093 KB)
[v2] Mon, 30 Mar 2020 15:32:51 UTC (5,047 KB)
[v3] Wed, 1 Jul 2020 00:09:08 UTC (5,829 KB)
Improved interpretability for Computer-Aided Assessment of Retinopathy of Pre... (Mara Graziani)
1. The document presents methods for improving the interpretability and accuracy of computer-aided severity assessment of retinopathy of prematurity (ROP).
2. Both traditional handcrafted feature extraction and deep learning approaches are discussed, including the use of vessel segmentation, feature extraction, and classification models.
3. The authors propose using regression concept vectors to relate deep learning features to continuous clinical measures and concepts, in order to provide individualized explanations of a model's assessments. This allows interpretation of what features the models may be focusing on.
This document provides an overview of deep learning algorithms, including deep neural networks, convolutional neural networks, deep belief networks, and restricted Boltzmann machines. It discusses key concepts such as learning in deep neural networks, the evolution timeline of deep learning approaches, deep architectures, and restricted Boltzmann machines. It also covers training restricted Boltzmann machines using contrastive divergence, constructing deep belief networks by stacking restricted Boltzmann machines, and practical considerations for pre-training and fine-tuning deep belief networks.
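The contrastive divergence training step mentioned above can be sketched for a tiny binary RBM (CD-1; biases are omitted for brevity, and the layer sizes and learning rate are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(W, v0, lr, rng):
    """One CD-1 weight update for a binary RBM: contrast the data-driven
    statistics <v h> with those of a one-step Gibbs reconstruction."""
    h0_p = sigmoid(v0 @ W)                       # hidden probabilities given data
    h0 = (rng.random(h0_p.shape) < h0_p) * 1.0   # sample binary hidden states
    v1_p = sigmoid(h0 @ W.T)                     # reconstructed visible probabilities
    h1_p = sigmoid(v1_p @ W)                     # hidden probabilities given reconstruction
    W += lr * (np.outer(v0, h0_p) - np.outer(v1_p, h1_p))
    return W

rng = np.random.default_rng(7)
W = rng.normal(0, 0.01, (6, 3))                  # 6 visible, 3 hidden units
v = (rng.random(6) < 0.5) * 1.0                  # one binary training vector
W = cd1_update(W, v, lr=0.1, rng=rng)
print(W.shape)
```

Stacking RBMs trained this way, each on the hidden activities of the previous one, is the greedy pre-training recipe for deep belief networks described above.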
This summary provides an overview of the key points from the document:
1) The document presents the use of General Regression Neural Networks (GRNN) to predict propagation path loss in an urban environment based on measurements taken in Kavala, Greece.
2) Two neural network models are studied - one for path loss prediction and another using error control. Their performance is compared to measured path loss values based on error metrics.
3) For line-of-sight predictions, the GRNN model achieves better performance than empirical models due to using multiple input parameters and generalization. For non-line-of-sight, a third GRNN model including street orientation has the lowest error rates.
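A GRNN prediction is essentially a Gaussian-kernel-weighted average of the training targets, which keeps the sketch short; the distances and losses below are made-up toy values, not the Kavala measurements:

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma):
    """General Regression Neural Network: a kernel-weighted average of training
    targets, where sigma is the single smoothing parameter of the network."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return float(np.dot(w, y_train) / w.sum())

# hypothetical "path loss" data: loss grows with distance
dist = np.array([[0.1], [0.5], [1.0], [2.0], [4.0]])     # km
loss = np.array([80.0, 95.0, 102.0, 109.0, 116.0])       # dB
pred = grnn_predict(dist, loss, np.array([1.5]), sigma=0.5)
print(pred)   # falls between the neighbouring measurements
```

The models in the study extend this with multiple input parameters (e.g. street orientation for the non-line-of-sight case), but the prediction mechanism is the same weighted average.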
The document reports on the results of three image processing projects. The first project implemented Lloyd-Max quantization to reduce image file sizes and Retinex theory to compensate for uneven illumination. The second project used principal component analysis to compute eigenfaces for face recognition. The third project performed linear discriminant analysis and tensor-based linear discriminant analysis for binary classification and visual object recognition. Illumination compensation subtracted an estimated illumination plane from image intensities to reduce shadows. Eigenfaces were the principal components of a training set of face images. Tensor-based linear discriminant analysis treated images as higher-order tensors to outperform conventional LDA.
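The eigenfaces computation reduces to a principal component analysis of mean-centred face vectors; a compact SVD-based sketch with random stand-in data:

```python
import numpy as np

def eigenfaces(faces, k):
    """Eigenfaces: the top-k principal components of mean-centred face images.
    SVD of the data matrix avoids forming the covariance matrix explicitly."""
    mean = faces.mean(axis=0)
    centred = faces - mean
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    return mean, Vt[:k]                # each row of Vt[:k] is one eigenface

def project(face, mean, components):
    return components @ (face - mean)  # low-dimensional face descriptor

rng = np.random.default_rng(4)
faces = rng.random((20, 64))           # 20 "images" flattened to 64 pixels
mean, comps = eigenfaces(faces, k=5)
code = project(faces[0], mean, comps)
print(code.shape)                      # (5,) descriptor used for recognition
```

Recognition then compares these short descriptors (e.g. by nearest neighbour) instead of raw pixel vectors; the tensor-based LDA in the third project replaces this flattening with higher-order structure.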
Performance Comparison of Image Retrieval Using Fractional Coefficients of Tr... (CSCJournals)
The thirst for better and faster retrieval techniques has always fuelled research in content-based image retrieval (CBIR). This paper presents innovative CBIR techniques based on feature vectors formed from fractional coefficients of transformed images using the Discrete Cosine, Walsh, Haar, and Kekre's transforms. The energy compaction of these transforms into the low-order coefficients is exploited to greatly reduce the feature vector size per image by taking only fractional coefficients of the transformed image. The feature vectors are extracted from the transformed image in several ways: first considering all the coefficients of the transformed image, and then fourteen reduced coefficient sets (50%, 25%, 12.5%, 6.25%, 3.125%, 1.5625%, 0.7813%, 0.39%, 0.195%, 0.097%, 0.048%, 0.024%, 0.012%, and 0.006% of the complete transformed image). The four transforms are applied to the gray-image equivalents and to the colour components of images to extract Gray and RGB feature sets, respectively. Instead of using all coefficients of the transformed images as the feature vector for image retrieval, these fourteen reduced coefficient sets for gray as well as RGB feature vectors are used, resulting in better performance and lower computation. The proposed CBIR techniques are implemented on a database of 1000 images spread across 11 categories. For each proposed CBIR technique, 55 queries (5 per category) are fired at the database, and net average precision and recall are computed for all feature sets per transform. The results show performance improvement (higher precision and recall values) with fractional coefficients compared to the complete transform of the image, at reduced computation, resulting in faster retrieval.
Finally, Kekre's transform surpasses all the other discussed transforms in performance, with the highest precision and recall values for fractional coefficients (6.25% and 3.125% of all coefficients), and computation is lowered by 94.08% compared to DCT.
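Taking fractional coefficients can be sketched for the DCT case: keep the top-left block of the 2-D transform, whose side length scales with the square root of the desired fraction. The square selection region is an assumption (the paper may select coefficients differently), and only the DCT of the four transforms is shown:

```python
import numpy as np
from scipy.fft import dctn

def fractional_dct_features(img, fraction):
    """Keep only the top-left fraction of the 2-D DCT coefficients as the
    feature vector; energy compaction concentrates most information there."""
    coeffs = dctn(img, norm='ortho')
    h = max(1, int(round(img.shape[0] * np.sqrt(fraction))))
    w = max(1, int(round(img.shape[1] * np.sqrt(fraction))))
    return coeffs[:h, :w].ravel()

rng = np.random.default_rng(5)
img = rng.random((32, 32))
full = fractional_dct_features(img, 1.0)       # all 1024 coefficients
quarter = fractional_dct_features(img, 0.25)   # 25% -> 16x16 = 256 coefficients
print(full.size, quarter.size)
```

Shrinking the fraction shortens the feature vector (and thus the per-query comparison cost) at the price of discarding high-frequency detail, which is exactly the precision/computation trade-off the paper measures.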
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI (ijcsit)
In this document, we propose a simple algorithm for the encryption of gray-scale images, although the scheme is perfectly usable on color images. Prior to encryption, the proposed algorithm applies a pair of permutation processes inspired by the Bernoulli map. The permutation disperses the image information to hinder unauthorized recovery of the original image. The image is then encrypted using the XOR function between a sequence generated from the same Bernoulli map and the image data obtained after the two permutation processes. Finally, to verify the algorithm, the gray-scale Lena pattern image was used, calculating histograms for each stage of the encryption process. The histograms demonstrate the evolving dispersion of the pattern image throughout the algorithm.
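The keystream-XOR stage can be sketched as follows (the permutation stages are omitted). A Bernoulli-type map x -> r*x mod 1 with r slightly below 2 is used here because the exact doubling map collapses to zero in finite-precision floats; that tweak is an implementation assumption, not from the paper:

```python
import numpy as np

def bernoulli_keystream(seed, n, r=1.99999):
    """Byte keystream from a Bernoulli-type chaotic map x -> r*x mod 1.
    (r slightly below 2 keeps the float iteration from degenerating to 0.)"""
    x = seed
    out = np.empty(n, dtype=np.uint8)
    for i in range(n):
        x = (r * x) % 1.0
        out[i] = int(x * 256) % 256
    return out

def xor_encrypt(img_bytes, seed):
    """XOR the image bytes with the chaotic keystream; XOR is its own inverse,
    so the same call with the same seed decrypts."""
    ks = bernoulli_keystream(seed, img_bytes.size)
    return img_bytes ^ ks

rng = np.random.default_rng(6)
img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
cipher = xor_encrypt(img.ravel(), seed=0.3456789).reshape(img.shape)
plain = xor_encrypt(cipher.ravel(), seed=0.3456789).reshape(img.shape)
print(np.array_equal(plain, img))      # True: decryption recovers the image
```

The seed plays the role of the key: only a party holding it can regenerate the same chaotic keystream and invert the XOR.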
Fuzzy clustering algorithm can not obtain good clustering effect when the sample characteristic is not
obvious and need to determine the number of clusters firstly. For thi0s reason, this paper proposes an
adaptive fuzzy kernel clustering algorithm. The algorithm firstly use the adaptive function of clustering
number to calculate the optimal clustering number, then the samples of input space is mapped to highdimensional
feature space using gaussian kernel and clustering in the feature space. The Matlab simulation
results confirmed that the algorithm's performance has greatly improvement than classical clustering algorithm and has faster convergence speed and more accurate clustering results
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...ArchiLab 7
The document describes a three-stage approach to data mining that uses self-organizing maps, clustering, and fuzzy rule induction. In the first stage, a self-organizing map is used to reduce the data size while preserving topology. In the second stage, clustering identifies regions of interest. In the third stage, fuzzy rules are generated to describe the clusters. The approach was tested on image and real-world datasets and produced intuitive results.
International journal of applied sciences and innovation vol 2015 - no 1 - ...sophiabelthome
This document presents a finite element model using cubic elements to characterize electromagnetic fields in a 3D waveguide transmission line. It uses the free and open-source GNU Octave software to perform the electromagnetic analysis and solve the Maxwell equations. The cubic finite element discretization is shown to provide an efficient solution with sparse matrices, reducing computational cost. Numerical results demonstrate good agreement between the cubic element model and analytical solutions for the electric and magnetic fields in the waveguide.
Algorithm Finding Maximum Concurrent Multicommodity Linear Flow with Limited ...IJCNCJournal
Graphs and extended networks are is powerful mathematical tools applied in many fields as transportation,
communication, informatics, economy, … Algorithms to find Maximum Concurrent Multicommodity Flow
with Limited Cost on extended traffic networks are introduced in the works we did. However, with those
algorithms, capacities of two-sided lines are shared fully for two directions. This work studies the more
general and practical case, where flows are limited to use two-sided lines with a single parameter called
regulating coefficient. The algorithm is presented in the programming language Java. The algorithm is
coded in programming language Java with extended network database in database management system
MySQL and offers exact results.
The document describes a study that used deep learning and convolutional neural networks to develop an image-based detection model for classifying four types of nuts (hazelnut, walnut, pecan, forest nut) with 100% accuracy on test data. The model was developed using Python in Google Colab, utilizing a dataset of 1595 images. A VGG16 model pre-trained on ImageNet was used to extract features from the images. The model contains convolutional and max pooling layers for feature extraction, and fully connected layers for classification. Training, validation, and testing of the model was performed in Google Colab using a GPU, demonstrating the feasibility of deep learning for nut detection applications.
This is a preliminary study and the objective of this study has been to reconstruct of missing parts or scratches of digital images is an important field used extensively in artwork restoration. This restoration can be done by using two approaches, image inpainting, and texture synthesis. There are many techniques for the two previous approaches that can carry out the process optimally and accurately. In this paper, the advantages and disadvantages of most algorithms of the image inpainting approach are discussed. Among the different algorithms, the proposed dynamic masking method outperformed than other techniques. This modification produces rapid and simple for reconstruction of small missing and damaged portions of images that are two to three orders of magnitude faster than current methods while producing comparable results with respect to other.
A simple framework for contrastive learning of visual representationsDevansh16
Link: https://machine-learning-made-simple.medium.com/learnings-from-simclr-a-framework-contrastive-learning-for-visual-representations-6c145a5d8e99
If you'd like to discuss something, text me on LinkedIn, IG, or Twitter. To support me, please use my referral link to Robinhood. It's completely free, and we both get a free stock. Not using it is literally losing out on free money.
Check out my other articles on Medium. : https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let's connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819
My Substack: https://devanshacc.substack.com/
Live conversations at twitch here: https://rb.gy/zlhk9y
Get a free stock on Robinhood: https://join.robinhood.com/fnud75
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
Comments: ICML'2020. Code and pretrained models at this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as: arXiv:2002.05709 [cs.LG]
(or arXiv:2002.05709v3 [cs.LG] for this version)
Submission history
From: Ting Chen [view email]
[v1] Thu, 13 Feb 2020 18:50:45 UTC (5,093 KB)
[v2] Mon, 30 Mar 2020 15:32:51 UTC (5,047 KB)
[v3] Wed, 1 Jul 2020 00:09:08 UTC (5,829 KB)
Improved interpretability for Computer-Aided Assessment of Retinopathy of Pre...Mara Graziani
1. The document presents methods for improving the interpretability and accuracy of computer-aided severity assessment of retinopathy of prematurity (ROP).
2. Both traditional handcrafted feature extraction and deep learning approaches are discussed, including the use of vessel segmentation, feature extraction, and classification models.
3. The authors propose using regression concept vectors to relate deep learning features to continuous clinical measures and concepts, in order to provide individualized explanations of a model's assessments. This allows interpretation of what features the models may be focusing on.
This document provides an overview of deep learning algorithms, including deep neural networks, convolutional neural networks, deep belief networks, and restricted Boltzmann machines. It discusses key concepts such as learning in deep neural networks, the evolution timeline of deep learning approaches, deep architectures, and restricted Boltzmann machines. It also covers training restricted Boltzmann machines using contrastive divergence, constructing deep belief networks by stacking restricted Boltzmann machines, and practical considerations for pre-training and fine-tuning deep belief networks.
This summary provides an overview of the key points from the document:
1) The document presents the use of General Regression Neural Networks (GRNN) to predict propagation path loss in an urban environment based on measurements taken in Kavala, Greece.
2) Two neural network models are studied - one for path loss prediction and another using error control. Their performance is compared to measured path loss values based on error metrics.
3) For line-of-sight predictions, the GRNN model achieves better performance than empirical models due to using multiple input parameters and generalization. For non-line-of-sight, a third GRNN model including street orientation has the lowest error rates.
The document reports on the results of three image processing projects. The first project implemented Lloyd-Max quantization to reduce image file sizes and Retinex theory to compensate for uneven illumination. The second project used principal component analysis to compute eigenfaces for face recognition. The third project performed linear discriminant analysis and tensor-based linear discriminant analysis for binary classification and visual object recognition. Illumination compensation subtracted an estimated illumination plane from image intensities to reduce shadows. Eigenfaces were the principal components of a training set of face images. Tensor-based linear discriminant analysis treated images as higher-order tensors to outperform conventional LDA.
Performance Comparison of Image Retrieval Using Fractional Coefficients of Tr...CSCJournals
The thirst of better and faster retrieval techniques has always fuelled to the research in content based image retrieval (CBIR). The paper presents innovative content based image retrieval (CBIR) techniques based on feature vectors as fractional coefficients of transformed images using Discrete Cosine, Walsh, Haar and Kekre’s transforms. Here the advantage of energy compaction of transforms in higher coefficients is taken to greatly reduce the feature vector size per image by taking fractional coefficients of transformed image. The feature vectors are extracted in fourteen different ways from the transformed image, with the first being considering all the coefficients of transformed image and then fourteen reduced coefficients sets (as 50%, 25%, 12.5%, 6.25%, 3.125%, 1.5625% ,0.7813%, 0.39%, 0.195%, 0.097%, 0.048%, 0.024%, 0.012% and 0.06% of complete transformed image) are considered as feature vectors. The four transforms are applied on gray image equivalents and the colour components of images to extract Gray and RGB feature sets respectively. Instead of using all coefficients of transformed images as feature vector for image retrieval, these fourteen reduced coefficients sets for gray as well as RGB feature vectors are used, resulting into better performance and lower computations. The proposed CBIR techniques are implemented on a database having 1000 images spread across 11 categories. For each proposed CBIR technique 55 queries (5 per category) are fired on the database and net average precision and recall are computed for all feature sets per transform. The results have shown performance improvement (higher precision and recall values) with fractional coefficients compared to complete transform of image at reduced computations resulting in faster retrieval. 
Finally Kekre’s transform surpasses all other discussed transforms in performance with highest precision and recall values for fractional coefficients (6.25% and 3.125% of all coefficients) and computation are lowered by 94.08% as compared to DCT.
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIijcsit
In this document, we propose a simple algorithm for the encryption of gray-scale images, although the
scheme is perfectly usable in color images. Prior to encryption, the proposed algorithm includes a pair of
permutation processes, inspired by the Bernoulli mapping. The permutation disperses the image
information to hinder the unauthorized recovery of the original image. The image is encrypted using the
XOR function between a sequence generated from the same Bernoulli mapping and the image data,
obtained after two permutation processes. Finally, for the verification of the algorithm, the gray-scale Lena
pattern image was used; calculating histograms for each stage alongside of the encryption process. The
histograms prove dispersion evolution for pattern image during whole algorithm.
Similar to Vector-Based Back Propagation Algorithm of.pdf (20)
End-to-end pipeline agility - Berlin Buzzwords 2024Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Introduction to Jio Cinema**:
- Brief overview of Jio Cinema as a streaming platform.
- Its significance in the Indian market.
- Introduction to retention and engagement strategies in the streaming industry.
2. **Understanding Retention and Engagement**:
- Define retention and engagement in the context of streaming platforms.
- Importance of retaining users in a competitive market.
- Key metrics used to measure retention and engagement.
3. **Jio Cinema's Content Strategy**:
- Analysis of the content library offered by Jio Cinema.
- Focus on exclusive content, originals, and partnerships.
- Catering to diverse audience preferences (regional, genre-specific, etc.).
- User-generated content and interactive features.
4. **Personalization and Recommendation Algorithms**:
- How Jio Cinema leverages user data for personalized recommendations.
- Algorithmic strategies for suggesting content based on user preferences, viewing history, and behavior.
- Dynamic content curation to keep users engaged.
5. **User Experience and Interface Design**:
- Evaluation of Jio Cinema's user interface (UI) and user experience (UX).
- Accessibility features and device compatibility.
- Seamless navigation and search functionality.
- Integration with other Jio services.
6. **Community Building and Social Features**:
- Strategies for fostering a sense of community among users.
- User reviews, ratings, and comments.
- Social sharing and engagement features.
- Interactive events and campaigns.
7. **Retention through Loyalty Programs and Incentives**:
- Overview of loyalty programs and rewards offered by Jio Cinema.
- Subscription plans and benefits.
- Promotional offers, discounts, and partnerships.
- Gamification elements to encourage continued usage.
8. **Customer Support and Feedback Mechanisms**:
- Analysis of Jio Cinema's customer support infrastructure.
- Channels for user feedback and suggestions.
- Handling of user complaints and queries.
- Continuous improvement based on user feedback.
9. **Multichannel Engagement Strategies**:
- Utilization of multiple channels for user engagement (email, push notifications, SMS, etc.).
- Targeted marketing campaigns and promotions.
- Cross-promotion with other Jio services and partnerships.
- Integration with social media platforms.
10. **Data Analytics and Iterative Improvement**:
- Role of data analytics in understanding user behavior and preferences.
- A/B testing and experimentation to optimize engagement strategies.
- Iterative improvement based on data-driven insights.
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
II. CLASSICAL CONVOLUTION NEURAL NETWORK (CNN) MODEL
A. Convolution Operation
The convolution operation between an image X ∈ ℜ^(u×u) and a filter F ∈ ℜ^(v×v) is defined as follows:

$$X \circledast F = C_{\left(\frac{u-v+2\,\mathrm{Pad}}{s}+1\right)\times\left(\frac{u-v+2\,\mathrm{Pad}}{s}+1\right)} \tag{1}$$

where

$$C[a,b] = \sum_{k=0}^{u}\sum_{l=0}^{u} X[k,l]\,F[a-k,\,b-l] \tag{2}$$

Here, ⊛ denotes the convolution operation. The stride s denotes the number of pixels by which F slides over X. The padding Pad is the number of zeros applied around X. a, b, k, and l are the row and column indices of C and X.
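As a concrete illustration of Eqs. (1)-(2), the following is a minimal NumPy sketch of the flipped-filter convolution with stride and zero-padding (function and variable names are ours, not from the paper):

```python
import numpy as np

def conv2d(X, F, stride=1, pad=0):
    """True 2-D convolution (flipped filter), in the spirit of Eqs. (1)-(2)."""
    Xp = np.pad(X, pad)                      # apply Pad zeros around X
    u, v = Xp.shape[0], F.shape[0]
    n = (u - v) // stride + 1                # output size: (u - v + 2*Pad)/s + 1
    Ff = F[::-1, ::-1]                       # flipping turns correlation into convolution
    C = np.empty((n, n))
    for a in range(n):
        for b in range(n):
            patch = Xp[a*stride:a*stride+v, b*stride:b*stride+v]
            C[a, b] = np.sum(patch * Ff)
    return C

X = np.arange(16, dtype=float).reshape(4, 4)
F = np.ones((2, 2))
print(conv2d(X, F).shape)   # (3, 3): (4 - 2 + 0)/1 + 1 = 3
```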
B. Convolution Layer
The convolution layer shown in Fig.1 is a model consisting
of a convolution map and a pooling map. Here, in the
convolution operation the padding Pad = 0 and the stride
s = 1.
Fig. 1. Convolution Layer (ConvL).
Convolution Map
The convolution map is designed as follows:
• An input image X ∈ ℜ^(u×u).
• r filters F_j ∈ ℜ^(v×v), j = 1, 2, ..., r.
• A bias matrix B_fj ∈ ℜ^((u−v+1)×(u−v+1)).
• A non-linear activation function f.
• An output matrix C_pj ∈ ℜ^((u−v+1)×(u−v+1)).

The output convolution map is defined as follows:

$$C_j = X \circledast F_j + B_{fj} \tag{3}$$

$$C_{pj} = f(C_j) \tag{4}$$
Pooling Map
A pooling map is an essential unit of the CNN architecture. This step is used to reduce the computational complexity of the network by minimizing the dimension of C_pj. Average pooling and max pooling are examples of the pooling operation [18], [19]. Typically, a kernel K_j of size (2×2) with a stride equal to 2 can be applied to calculate the average or the maximum value of each patch of C_pj. The output pooling operation P_j has size ((u−v+1)/2 × (u−v+1)/2).
In appendix 1, we present the average and max-pooling
operations.
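The 2×2, stride-2 average and max pooling described above can be sketched as follows (a NumPy illustration under our own naming; the paper's model uses average pooling):

```python
import numpy as np

def pool2x2(Cp, mode="average"):
    """2x2 pooling with stride 2, halving each spatial dimension."""
    n = Cp.shape[0] // 2
    # Split the map into (n, n) non-overlapping 2x2 blocks
    patches = Cp[:2*n, :2*n].reshape(n, 2, n, 2).swapaxes(1, 2)
    if mode == "average":
        return patches.mean(axis=(2, 3))
    return patches.max(axis=(2, 3))

Cp = np.array([[ 1.,  2.,  3.,  4.],
               [ 5.,  6.,  7.,  8.],
               [ 9., 10., 11., 12.],
               [13., 14., 15., 16.]])
print(pool2x2(Cp))              # averages of each 2x2 patch
print(pool2x2(Cp, mode="max"))  # maxima of each 2x2 patch
```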
C. Fully Connected Layer
The convolution filters detect features of the input images, called local features. A fully connected layer is added in series with the convolution layer to recognize the input images [12], [15], [26]. As shown in Fig. 2, a fully connected layer has an input vector Y^0 corresponding to the concatenation of the r CNN pooling maps P_j. The output vector Y^t defines the image classes. Typically, a series of t fully connected hidden layers is added between the input and output vectors to enhance the CNN performance.
Fig. 2. Fully Connected layer (FCL).
The basic equations of the fully connected hidden layers are as follows:

$$H^i = W^i Y^{(i-1)} + B^i \tag{5}$$

$$Y^i = f(H^i) \tag{6}$$

where H^i is the weighted-sum vector, B^i defines the bias vector, W^i is the weight matrix that represents the interconnection between hidden layers, and Y^i denotes the output vector of the selected fully connected hidden layer.
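Equations (5)-(6) for a single fully connected hidden layer can be sketched as follows (the ReLU activation used in Section IV is assumed here; the shapes are illustrative):

```python
import numpy as np

def fc_layer(Y_prev, W, B, f=lambda H: np.maximum(H, 0.0)):
    """One hidden layer: H^i = W^i Y^(i-1) + B^i, then Y^i = f(H^i)."""
    H = W @ Y_prev + B
    return f(H)

rng = np.random.default_rng(0)
Y0 = rng.standard_normal((8, 1))   # e.g. a concatenated pooling output
W1 = rng.standard_normal((5, 8))
B1 = rng.standard_normal((5, 1))
Y1 = fc_layer(Y0, W1, B1)
print(Y1.shape)  # (5, 1)
```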
D. Convolution Neural Networks Model
The convolution neural network shown in Fig. 3 is composed of one convolution layer ConvL in series with t fully connected hidden layers L. X is the input image to be recognized by the CNN model. Y^t is the CNN output corresponding to the image recognition.
Authorized licensed use limited to: Tsinghua University. Downloaded on December 19,2020 at 08:24:47 UTC from IEEE Xplore. Restrictions apply.
Fig. 3. Convolution Neural Network Architecture.
III. VECTOR-BASED CNN MODEL
This section aims to replace the classical convolution operation by a matrix operation.
Definition: We define the vector expression of any matrix M ∈ ℜ^(n×n) as follows:

$$M = \begin{bmatrix} M^{1T} \\ M^{2T} \\ \vdots \\ M^{nT} \end{bmatrix}, \qquad \bar{M}_{(n^2\times 1)} = \begin{bmatrix} M^{1} \\ M^{2} \\ \vdots \\ M^{n} \end{bmatrix}$$
In fact, in this section the output convolution map C_pj of size ((u−v+1)×(u−v+1)) will be transformed into a vector C̄_pj of dimension ((u−v+1)² × 1).

Based on appendix 2, the output convolved vector C̄_pj and the output average pooling vector P̄_j are defined as follows:

$$\bar{C}_j = X_x \cdot \bar{F}_j + \bar{B}_{fj} \tag{7}$$

$$\bar{C}_{pj} = f(\bar{C}_j) \tag{8}$$

$$\bar{P}_j = \begin{bmatrix} \frac{\bar{c}_{pj1}+\bar{c}_{pj2}+\bar{c}_{pj3}+\bar{c}_{pj4}}{4} \\ \frac{\bar{c}_{pj5}+\bar{c}_{pj6}+\bar{c}_{pj7}+\bar{c}_{pj8}}{4} \\ \vdots \\ \frac{\bar{c}_{pj(w^2-3)}+\bar{c}_{pj(w^2-2)}+\bar{c}_{pj(w^2-1)}+\bar{c}_{pjw^2}}{4} \end{bmatrix} \tag{9}$$

Note that X_x is the ((u−v+1)² × v²) input image matrix, F̄_j is the (v² × 1) filter vector, and B̄_fj is the ((u−v+1)² × 1) convolution bias vector.
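The matrix form of Eq. (7) can be illustrated with a small im2col-style sketch. Note that the row and column ordering below is the usual sliding-window one; Appendix 2 of the paper orders window entries in the flipped (convolution) order, which amounts to a fixed permutation of the columns of X_x and of the entries of F̄:

```python
import numpy as np

def make_Xx(X, v):
    """Stack each v x v window of X as one row -> ((u-v+1)^2, v^2) matrix."""
    u = X.shape[0]
    w = u - v + 1
    rows = [X[a:a+v, b:b+v].ravel() for a in range(w) for b in range(w)]
    return np.array(rows)

u, v = 5, 3
rng = np.random.default_rng(1)
X = rng.standard_normal((u, u))
F = rng.standard_normal((v, v))

Xx = make_Xx(X, v)
C_bar = Xx @ F.ravel()   # Eq. (7) without the bias term

# Direct sliding-window check (same ordering as make_Xx)
C_direct = np.array([[np.sum(X[a:a+v, b:b+v] * F) for b in range(u-v+1)]
                     for a in range(u-v+1)])
print(np.allclose(C_bar, C_direct.ravel()))  # True
```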
A. Forward Propagation
The CNN model developed in this study consists of one convolution layer in series with three fully connected hidden layers (i = 3).

Convolution Layer
Based on equations (7), (8), and (9), the output convolution map and the average pooling map of a model consisting of r vector filters can be written as follows:

$$\bar{C}_{((r\times w^2)\times 1)} = X_{o((r\times w^2)\times(r\times v^2))} \cdot \bar{F}_{((r\times v^2)\times 1)} + \bar{B}_{f((r\times w^2)\times 1)} \tag{10}$$

$$\bar{C}_p = f(\bar{C}) \tag{11}$$

$$\bar{P}_{((r\times \frac{w^2}{4})\times 1)} = \begin{bmatrix} \bar{P}_1 \\ \bar{P}_2 \\ \vdots \\ \bar{P}_r \end{bmatrix} \tag{12}$$

where w = u − v + 1 and

$$X_{o((r\times w^2)\times(r\times v^2))} = \begin{bmatrix} X_x & 0 & \cdots & 0 \\ 0 & X_x & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_x \end{bmatrix}, \qquad \bar{F}_{((r\times v^2)\times 1)} = \begin{bmatrix} \bar{F}_1 \\ \bar{F}_2 \\ \vdots \\ \bar{F}_r \end{bmatrix}, \qquad \bar{B}_{f((r\times w^2)\times 1)} = \begin{bmatrix} \bar{B}_{f1} \\ \bar{B}_{f2} \\ \vdots \\ \bar{B}_{fr} \end{bmatrix}$$
Concatenation
Concatenation is the operation that defines the input of the fully connected layer as a function of the r pooling operations:

$$Y^0_{(m\times 1)} = \bar{P} \tag{13}$$

where m = r × w²/4.
Fully Connected Layer
The fully connected hidden layer equations are derived from [14].

Layer 1:
$$Y^1_{(n\times 1)} = f(W^1_{(n\times m)} Y^0_{(m\times 1)} + B^1_{(n\times 1)}) \tag{14}$$
where n denotes the number of artificial neurons in the first fully connected hidden layer.

Layer 2:
$$Y^2_{(o\times 1)} = f(W^2_{(o\times n)} Y^1_{(n\times 1)} + B^2_{(o\times 1)}) \tag{15}$$
where o denotes the number of artificial neurons in the second fully connected hidden layer.

Layer 3:
$$Y^3_{(p\times 1)} = f(W^3_{(p\times o)} Y^2_{(o\times 1)} + B^3_{(p\times 1)}) \tag{16}$$
where p is the dimension of the input labeled data.
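The three-layer forward pass of Eqs. (14)-(16) can be chained as follows (dimensions m, n, o, p follow the paper's notation; the values and the ReLU activation are illustrative):

```python
import numpy as np

relu = lambda H: np.maximum(H, 0.0)

def forward_fc(Y0, params):
    """Chain the three hidden layers of Eqs. (14)-(16)."""
    Y = Y0
    for W, B in params:
        Y = relu(W @ Y + B)
    return Y

m, n, o, p = 8, 6, 6, 10
rng = np.random.default_rng(2)
params = [(rng.standard_normal((n, m)), rng.standard_normal((n, 1))),
          (rng.standard_normal((o, n)), rng.standard_normal((o, 1))),
          (rng.standard_normal((p, o)), rng.standard_normal((p, 1)))]
Y0 = rng.standard_normal((m, 1))   # concatenated pooling vector, Eq. (13)
Y3 = forward_fc(Y0, params)
print(Y3.shape)  # (10, 1)
```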
B. Back Propagation
To update the CNN parameters and perform the learning process, a back propagation algorithm is developed to minimize a cost function E. In our analysis, the mean squared error cost function [12], [15] is used:

$$E = \frac{1}{2}\,(Y^3 - Y_d)^T (Y^3 - Y_d) \tag{17}$$

Equation (18) shows the gradient descent method used to update the CNN parameters:

$$\ell_{new} = \ell_{old} - \alpha \left(\frac{\partial E}{\partial \ell_{old}}\right)^T \tag{18}$$
Here, ℓ_new represents the update of the convolution bias vector B̄_f, the filter vector F̄, the weight matrix W, and the fully connected bias vector B. α is the learning rate, which can be chosen as a constant or a variable with a positive value.
We note that the update equations of the CNN parameters involve the computation of (∂E/∂ℓ_old). We develop below the parameter updates for each layer.
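The update rule of Eq. (18) can be sketched as follows; since the gradient below is stored with the same shape as the parameter, the transpose in Eq. (18) is already absorbed (the values are illustrative):

```python
import numpy as np

def gd_update(param, grad, alpha=0.3):
    """Eq. (18): l_new = l_old - alpha * dE/dl_old (same-shape gradient)."""
    return param - alpha * grad

W = np.array([[1.0, -2.0], [0.5, 3.0]])
dE_dW = np.array([[0.1, 0.0], [-0.2, 0.4]])
print(gd_update(W, dE_dW))  # [[0.97, -2.0], [0.56, 2.88]]
```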
Fully Connected Layer
For the fully connected layer, the detailed development of the back propagation is proposed in [14], where

$$\frac{\partial E}{\partial B^i} = \left[\left(W^{(i-1)T} \cdot \frac{\partial E}{\partial B^{(i-1)}}\right) * f'(H^i)\right]^T \tag{19}$$

$$\frac{\partial E}{\partial W^i} = \left(\frac{\partial E}{\partial B^i}\right)^T \cdot Y^{(i-1)T} \tag{20}$$

The operation ∗ is defined in appendix 3.
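The chain rule behind Eqs. (19)-(20) can be checked numerically. The sketch below uses the standard column-vector convention rather than the paper's row-vector layout, and verifies one weight gradient by finite differences:

```python
import numpy as np

# One ReLU hidden layer, linear output, MSE cost E = 0.5*||Y2 - Yd||^2.
relu  = lambda H: np.maximum(H, 0.0)
drelu = lambda H: (H > 0).astype(float)

rng = np.random.default_rng(3)
W1, B1 = rng.standard_normal((4, 5)), rng.standard_normal((4, 1))
W2, B2 = rng.standard_normal((3, 4)), rng.standard_normal((3, 1))
Y0, Yd = rng.standard_normal((5, 1)), rng.standard_normal((3, 1))

def cost(W1_):
    Y1 = relu(W1_ @ Y0 + B1)
    Y2 = W2 @ Y1 + B2
    return 0.5 * float((Y2 - Yd).T @ (Y2 - Yd))

# Backward pass
H1 = W1 @ Y0 + B1
Y1 = relu(H1)
Y2 = W2 @ Y1 + B2
d2 = Y2 - Yd                   # dE/dB2 (output layer)
d1 = (W2.T @ d2) * drelu(H1)   # dE/dB1: propagate through W2, gate by f'
dW1 = d1 @ Y0.T                # dE/dW1, the analogue of Eq. (20)

# Finite-difference check on one weight
eps, (i, j) = 1e-6, (2, 3)
Wp = W1.copy(); Wp[i, j] += eps
num = (cost(Wp) - cost(W1)) / eps
print(abs(num - dW1[i, j]) < 1e-4)  # True
```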
Convolution Layer

$$\frac{\partial E}{\partial \bar{B}_f}\bigg|_{(1\times(r\times w^2))} = \frac{\partial E}{\partial \bar{C}}\,\frac{\partial \bar{C}}{\partial \bar{B}_f} \tag{21}$$

$$\frac{\partial E}{\partial \bar{F}}\bigg|_{(1\times(r\times v^2))} = \frac{\partial E}{\partial \bar{C}}\,\frac{\partial \bar{C}}{\partial \bar{F}} \tag{22}$$

By substituting (10) into (21) and (22), we obtain

$$\frac{\partial \bar{C}}{\partial \bar{B}_f}\bigg|_{((r\times w^2)\times(r\times w^2))} = I \tag{23}$$

$$\frac{\partial \bar{C}}{\partial \bar{F}}\bigg|_{((r\times w^2)\times(r\times v^2))} = X_o \tag{24}$$

Here, I is the identity matrix.
Computing ∂E/∂C̄:

$$\frac{\partial E}{\partial \bar{C}}\bigg|_{(1\times(r\times w^2))} = \frac{\partial E}{\partial \bar{C}_p}\,\frac{\partial \bar{C}_p}{\partial \bar{C}} = \frac{\partial E}{\partial \bar{C}_p}\bigg|_{(1\times(r\times w^2))} * f'(\bar{C})^T_{(1\times(r\times w^2))} \tag{25}$$

Definition: let us define the operation Inc of any vector U as follows:

$$U_{(n\times 1)} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}, \qquad Inc(U) = \begin{bmatrix} \frac{u_1}{4} \\ \frac{u_1}{4} \\ \frac{u_1}{4} \\ \frac{u_1}{4} \\ \vdots \\ \frac{u_n}{4} \\ \frac{u_n}{4} \\ \frac{u_n}{4} \\ \frac{u_n}{4} \end{bmatrix}_{(4n\times 1)} \tag{26}$$

In this section, the Inc operation is used to increase the size of the average pooling vector map, where

$$\frac{\partial E}{\partial \bar{C}_p} = Inc\left(\frac{\partial E}{\partial \bar{P}}\right) \tag{27}$$
Since Y^0 = P̄,

$$\frac{\partial E}{\partial \bar{P}} = \frac{\partial E}{\partial B^1}\,W^1 \tag{28}$$

By substituting (28), (27), (25), (24), and (23) into (22) and (21), we obtain

$$\frac{\partial E}{\partial \bar{B}_f} = Inc\left(\frac{\partial E}{\partial B^1}\,W^1\right) * f'(\bar{C})^T \tag{29}$$

$$\frac{\partial E}{\partial \bar{F}} = \frac{\partial E}{\partial \bar{B}_f} \cdot X_o \tag{30}$$
IV. SIMULATION RESULTS AND DISCUSSION
In this section, the MNIST handwritten digits data are used to test the CNN performance using the proposed matrix operation. This database consists of a training set of 60,000 images and a testing set of 10,000 images. The images are grayscale, of dimension (28 × 28). The (10 × 1) output vector classifies the digits from 0 to 9. To enhance the CNN performance, the following hyper-parameters are varied:
• CNN width.
• CNN height.
The performance of the CNN model corresponds to the ratio between the total number of correct classifications and the number of test images.
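This performance metric can be computed as, for example (the predictions and labels below are illustrative):

```python
import numpy as np

def performance(Y_pred, Y_true):
    """Ratio of correct classifications to the number of test images."""
    return np.mean(Y_pred == Y_true)

preds  = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])
labels = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9])
print(performance(preds, labels))  # 0.9
```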
TABLE I
CNN PERFORMANCE ACCORDING TO THE VARIATION OF THE WIDTH AND HEIGHT HYPER-PARAMETERS

N° of training images | N° of filters | Size of filters | Performance
----------------------|---------------|-----------------|------------
5,000                 | −             | −               | 0.9325
                      | 5             | (3² × 1)        | 0.9414
                      | 5             | (9² × 1)        | 0.9508
                      | 5             | (13² × 1)       | 0.9478
                      | 10            | (9² × 1)        | 0.9575
                      | 20            | (9² × 1)        | 0.9582
10,000                | −             | −               | 0.9481
                      | 5             | (3² × 1)        | 0.9570
                      | 5             | (9² × 1)        | 0.9589
                      | 5             | (13² × 1)       | 0.9579
                      | 10            | (9² × 1)        | 0.9623
                      | 20            | (9² × 1)        | 0.9677
30,000                | −             | −               | 0.9773
                      | 5             | (3² × 1)        | 0.9785
                      | 5             | (9² × 1)        | 0.9810
                      | 5             | (13² × 1)       | 0.9798
                      | 10            | (9² × 1)        | 0.9835
                      | 20            | (9² × 1)        | 0.9871
60,000                | −             | −               | 0.9790
                      | 5             | (3² × 1)        | 0.9831
                      | 5             | (9² × 1)        | 0.9867
                      | 5             | (13² × 1)       | 0.9859
                      | 10            | (9² × 1)        | 0.9874
                      | 20            | (9² × 1)        | 0.9883
The simulated CNN model consists of:
• One convolution layer. The convolution activation function is the ReLU. The pooling operation used is the average.
• A fully connected layer. It comprises 3 hidden layers. Each hidden layer consists of 200 artificial neurons. The activation function is the same as that used in the convolution layer. The learning rate equals 0.3.
As shown in Table I, we tested the CNN performance through the variation of:
• The number of convolution filter vectors.
• The size of each filter.
• The number of training images.
Here (−) denotes a model consisting of just a fully connected layer.
The peak CNN performance is 0.9883. We obtained it by increasing the size and the number of convolution filter vectors and the number of training images.
V. CONCLUSION
In this paper, a new matrix operation that substitutes the classical convolution operation is developed. The MNIST handwritten digits data are used to test the influence of the CNN hyper-parameters on the model performance. The peak performance achieved is 0.9883. It is obtained using a CNN model composed of one convolution layer and three fully connected hidden layers. The results deduced from the proposed simulation do not represent the optimal CNN hyper-parameter configuration. Further increases in the number of convolution layers and the size of the training dataset could enhance the CNN performance.
APPENDIX
Appendix 1: Average and max-pooling operations
$$
Cp_j = \begin{pmatrix}
cp_{11} & cp_{12} & \cdots & cp_{1(u-v+1)} \\
cp_{21} & cp_{22} & \cdots & cp_{2(u-v+1)} \\
\vdots & \vdots & \ddots & \vdots \\
cp_{(u-v+1)1} & cp_{(u-v+1)2} & \cdots & cp_{(u-v+1)(u-v+1)}
\end{pmatrix}
\tag{31}
$$
The pooling map is defined as follows:
$$
P_j^{Ave} = \begin{pmatrix}
P_{11} & P_{12} & \cdots & P_{1\frac{u-v+1}{2}} \\
P_{21} & P_{22} & \cdots & P_{2\frac{u-v+1}{2}} \\
\vdots & \vdots & \ddots & \vdots \\
P_{\frac{u-v+1}{2}1} & P_{\frac{u-v+1}{2}2} & \cdots & P_{\frac{u-v+1}{2}\frac{u-v+1}{2}}
\end{pmatrix}
\tag{32}
$$
For the average pooling operation,
$$
P_{11} = \frac{cp_{11}+cp_{12}+cp_{21}+cp_{22}}{4}
$$
$$
P_{1\frac{u-v+1}{2}} = \frac{cp_{1(u-v)}+cp_{1(u-v+1)}+cp_{2(u-v)}+cp_{2(u-v+1)}}{4}
$$
$$
P_{\frac{u-v+1}{2}1} = \frac{cp_{(u-v)1}+cp_{(u-v)2}+cp_{(u-v+1)1}+cp_{(u-v+1)2}}{4}
$$
$$
P_{\frac{u-v+1}{2}\frac{u-v+1}{2}} = \frac{cp_{(u-v)(u-v)}+cp_{(u-v)(u-v+1)}+cp_{(u-v+1)(u-v)}+cp_{(u-v+1)(u-v+1)}}{4}
$$
For the max pooling operation,
$$
P_{11} = \max(cp_{11}, cp_{12}, cp_{21}, cp_{22})
$$
$$
P_{1\frac{u-v+1}{2}} = \max(cp_{1(u-v)}, cp_{1(u-v+1)}, cp_{2(u-v)}, cp_{2(u-v+1)})
$$
$$
P_{\frac{u-v+1}{2}1} = \max(cp_{(u-v)1}, cp_{(u-v)2}, cp_{(u-v+1)1}, cp_{(u-v+1)2})
$$
$$
P_{\frac{u-v+1}{2}\frac{u-v+1}{2}} = \max(cp_{(u-v)(u-v)}, cp_{(u-v)(u-v+1)}, cp_{(u-v+1)(u-v)}, cp_{(u-v+1)(u-v+1)})
$$
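The two pooling rules above can be sketched in NumPy, assuming the non-overlapping 2 × 2 windows implied by the indices (each pooled entry covers one 2 × 2 block of the convolution map, so the pooled side is (u − v + 1)/2):

```python
import numpy as np

def pool2x2(cp, mode="average"):
    """Reduce an (n, n) convolution map to (n//2, n//2) over 2x2 blocks."""
    n = cp.shape[0] // 2
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            block = cp[2*i:2*i+2, 2*j:2*j+2]   # the four cp entries of one window
            out[i, j] = block.mean() if mode == "average" else block.max()
    return out

cp = np.arange(1, 17, dtype=float).reshape(4, 4)
print(pool2x2(cp, "average"))   # [[ 3.5  5.5] [11.5 13.5]]
print(pool2x2(cp, "max"))       # [[ 6.  8.] [14. 16.]]
```

The loop mirrors the formulas directly; a production implementation would use a reshape-and-reduce instead of explicit loops.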
Appendix 2: Matrix Operation
For a CNN model consisting of an input image $X \in \Re^{(u \times u)}$ convolved with a filter $F \in \Re^{(v \times v)}$,
$$
X = \begin{pmatrix} X^{1T} \\ X^{2T} \\ \vdots \\ X^{uT} \end{pmatrix}, \quad
F = \begin{pmatrix} F^{1T} \\ F^{2T} \\ \vdots \\ F^{vT} \end{pmatrix}
$$
where
$$
X^{iT} = \begin{bmatrix} x^i_1 & x^i_2 & \cdots & x^i_u \end{bmatrix}, \quad
F^{iT} = \begin{bmatrix} f^i_1 & f^i_2 & \cdots & f^i_v \end{bmatrix}
$$
We define $X^{iT}_{j \to k}$ as follows:
$$
X^{iT}_{j \to k} = \begin{bmatrix} x^i_j & x^i_{j+1} & \cdots & x^i_k \end{bmatrix}
$$
The proposed matrix operation that substitutes the classical convolution operation is:
$$
X \circledast F = X_x|_{((u-v+1)^2 \times v^2)} \cdot \bar{F}|_{(v^2 \times 1)}
= \begin{pmatrix}
X^{uT}_{1 \to v} & X^{(u-1)T}_{1 \to v} & \cdots & X^{(u-v+1)T}_{1 \to v} \\
X^{uT}_{2 \to (v+1)} & X^{(u-1)T}_{2 \to (v+1)} & \cdots & X^{(u-v+1)T}_{2 \to (v+1)} \\
\vdots & \vdots & \ddots & \vdots \\
X^{uT}_{(u-v+1) \to u} & X^{(u-1)T}_{(u-v+1) \to u} & \cdots & X^{(u-v+1)T}_{(u-v+1) \to u} \\
X^{(u-1)T}_{1 \to v} & X^{(u-2)T}_{1 \to v} & \cdots & X^{(u-v)T}_{1 \to v} \\
X^{(u-1)T}_{2 \to (v+1)} & X^{(u-2)T}_{2 \to (v+1)} & \cdots & X^{(u-v)T}_{2 \to (v+1)} \\
\vdots & \vdots & \ddots & \vdots \\
X^{(u-1)T}_{(u-v+1) \to u} & X^{(u-2)T}_{(u-v+1) \to u} & \cdots & X^{(u-v)T}_{(u-v+1) \to u} \\
\vdots & \vdots & \ddots & \vdots \\
X^{vT}_{1 \to v} & X^{(v-1)T}_{1 \to v} & \cdots & X^{1T}_{1 \to v} \\
X^{vT}_{2 \to (v+1)} & X^{(v-1)T}_{2 \to (v+1)} & \cdots & X^{1T}_{2 \to (v+1)} \\
\vdots & \vdots & \ddots & \vdots \\
X^{vT}_{(u-v+1) \to u} & X^{(v-1)T}_{(u-v+1) \to u} & \cdots & X^{1T}_{(u-v+1) \to u}
\end{pmatrix}
\begin{pmatrix} F^1 \\ F^2 \\ \vdots \\ F^v \end{pmatrix}
\tag{33}
$$
Each row of $X_x$ stacks the $v$ image-row segments covered by one filter position, so the single product $X_x \cdot \bar{F}$ yields all $(u-v+1)^2$ convolution outputs at once.
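Equation (33) amounts to an im2col construction: every filter window of the image becomes one row of a matrix, and one matrix-vector product replaces the sliding-window convolution. A NumPy sketch of the same idea follows; note it uses the plain raster (unflipped) row ordering rather than the flipped-row ordering displayed above, which differs only by a fixed permutation of $X_x$ and $\bar{F}$.

```python
import numpy as np

def im2col(X, v):
    """Stack every (v, v) window of X as a row -> ((u-v+1)^2, v^2) matrix."""
    u = X.shape[0]
    m = u - v + 1
    return np.array([X[r:r+v, c:c+v].ravel()
                     for r in range(m) for c in range(m)])

def matrix_conv(X, F):
    """The windowed product as a single matrix-vector multiply, reshaped to a map."""
    v = F.shape[0]
    m = X.shape[0] - v + 1
    return (im2col(X, v) @ F.ravel()).reshape(m, m)

# Small worked example: 2x2 all-ones filter just sums each window.
X = np.arange(16, dtype=float).reshape(4, 4)
F = np.ones((2, 2))
print(matrix_conv(X, F))   # [[10. 14. 18.] [26. 30. 34.] [42. 46. 50.]]
```

The payoff of this formulation is that the whole convolution layer reduces to one dense matrix product, which is easy to differentiate and to vectorize.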
Appendix 3: The definition of the (∗) operation
Let us define the operation (∗) for any vectors $U$ and $V$:
$$
U_{(n \times 1)} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \quad
V_{(n \times 1)} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}
$$
$$
U * V = \begin{pmatrix} u_1 v_1 \\ u_2 v_2 \\ \vdots \\ u_n v_n \end{pmatrix}
\tag{34}
$$
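The (∗) operation of (34) is the elementwise (Hadamard) product; in NumPy it is simply the `*` operator on equal-length vectors:

```python
import numpy as np

U = np.array([1.0, 2.0, 3.0])
V = np.array([4.0, 5.0, 6.0])
print(U * V)   # [ 4. 10. 18.]
```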