The document discusses Abhishek Sharma's PhD defense talk on learning from multiple views of data. It presents an overview of his work on semantic segmentation for extracting visual features from images, on a recursive context propagation network for incorporating contextual information, and on constructing a common representation space for matching content across modalities such as images and text.
A Novel GA-SVM Model For Vehicles And Pedestrial Classification In Videos (ijtsrd)
The paper presents a novel algorithm for object classification in videos based on an improved support vector machine (SVM) and a genetic algorithm. One of the problems of the support vector machine is the selection of appropriate kernel parameters, which has limited the accuracy of the SVM over the years. This research optimizes the parameters of the SVM radial basis function kernel using the genetic algorithm. Moving-object classification is a requirement in smart visual surveillance systems, as it allows the system to know what kind of object is in the scene and to recognize the actions the object can perform. The paper presents a GA-SVM machine learning approach for real-time object classification in videos. Radial distance signal features are extracted from the silhouettes of objects detected in videos, normalized, and fed into the GA-SVM model. A classification rate of 99.39% is achieved with the genetically trained SVM, compared with 99.1% for the standard SVM. A comparison with other classifiers in terms of classification accuracy shows better performance than the standard SVM, artificial neural network (ANN), genetic artificial neural network (GANN), K-nearest neighbor (K-NN), and K-means classifiers. Akintola Kolawole G., "A Novel GA-SVM Model For Vehicles And Pedestrial Classification In Videos", International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1, Issue-4, June 2017. URL: http://www.ijtsrd.com/papers/ijtsrd109.pdf http://www.ijtsrd.com/computer-science/artificial-intelligence/109/a-novel-ga-svm-model-for-vehicles-and-pedestrial-classification-in-videos/akintola-kolawole-g
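A minimal sketch of the idea, assuming scikit-learn and a toy dataset: a small genetic algorithm searches log-scaled (C, gamma) values for the RBF kernel, with cross-validated accuracy as the fitness. The population size, mutation scale, and generation count are illustrative choices, not the paper's.

```python
# Hedged sketch: tuning SVM RBF parameters (C, gamma) with a simple genetic
# algorithm, in the spirit of the GA-SVM approach described above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

def fitness(ind):
    C, gamma = 10.0 ** ind          # genes encode log10(C) and log10(gamma)
    clf = SVC(kernel="rbf", C=C, gamma=gamma)
    return cross_val_score(clf, X, y, cv=3).mean()

pop = rng.uniform(-3, 3, size=(20, 2))       # initial random population
for gen in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]  # select the fittest half
    kids = parents[rng.integers(0, 10, 10)] + rng.normal(0, 0.3, (10, 2))
    pop = np.vstack([parents, kids])         # elitism plus mutated offspring

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("best log10(C), log10(gamma):", best)
```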
We test whether modern computer-vision algorithms can predict, from users' eye-movement patterns, whether they are reading relevant information. The slides accompany the video presentation at https://youtu.be/ZebBgUhL-EU
The full research paper is available at https://dl.acm.org/doi/10.1145/3343413.3377960 and at https://arxiv.org/abs/2001.05152
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training... (CSCJournals)
The document compares the Levenberg-Marquardt and Scaled Conjugate Gradient algorithms for training a multilayer perceptron neural network for image compression. It finds that the two algorithms performed comparably overall: the Levenberg-Marquardt algorithm achieved slightly better accuracy, as measured by average training accuracy and mean squared error, while the Scaled Conjugate Gradient algorithm trained faster, as measured by average training iterations. The study compresses the standard Lena test image with both algorithms and analyzes the results.
Time-series forecasting of indoor temperature using pre-trained Deep Neural N... (Francisco Zamora-Martinez)
Artificial neural networks have proved to be good at time-series forecasting problems and are widely studied in the literature. Traditionally, shallow architectures were used because of convergence problems when training deep models. Recent research findings enable the training of deep architectures, opening an interesting new research area called deep learning. This paper presents a study of deep learning techniques applied to time-series forecasting on a real indoor temperature forecasting task, examining performance under different hyper-parameter configurations. With deep models, better generalization performance on the test set and a reduction in over-fitting were observed.
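As a rough illustration of the setup, assuming a synthetic temperature-like series and scikit-learn's MLPRegressor: a sliding window of past readings feeds a network with several hidden layers. The window length and layer sizes stand in for the hyper-parameters the paper studies.

```python
# Hedged sketch: time-series forecasting with a moderately deep MLP over a
# sliding window of past values. The "indoor temperature" series is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

t = np.arange(2000)
series = 21 + 2 * np.sin(2 * np.pi * t / 96)          # daily-cycle toy signal
series += np.random.default_rng(0).normal(0, 0.1, t.size)

lag = 24                                               # window of past readings
X = np.lib.stride_tricks.sliding_window_view(series[:-1], lag)
y = series[lag:]                                       # next value to predict

split = 1500
model = MLPRegressor(hidden_layer_sizes=(64, 64, 64),  # "deep" vs. one layer
                     max_iter=500, random_state=0)
model.fit(X[:split], y[:split])
print("test R^2:", model.score(X[split:], y[split:]))
```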
Incorporating Kalman Filter in the Optimization of Quantum Neural Network Par... (Waqas Tariq)
The Kalman filter has been used to estimate the instantaneous states of linear dynamic systems and is a good tool for inferring missing information from noisy measurements. The quantum neural network is another approach to merging fuzzy logic with neural networks, achieved by applying quantum mechanics theory to the structure of the neural network. The gradient descent algorithm has been widely used to train neural networks, but its tendency to become trapped in local minima is one of its disadvantages. This paper presents an algorithm to train the quantum neural network using the extended Kalman filter.
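The paper's target is a quantum neural network; the sketch below only illustrates the underlying extended-Kalman-filter update on a tiny classical one-neuron model, treating the weights as the filter state. The noise covariances and target function are assumptions.

```python
# Hedged sketch: extended-Kalman-filter training of a one-neuron tanh model.
# EKF state = weight vector; the measurement is the network output, and the
# Jacobian of the output with respect to the weights plays the role of H.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y_true = np.tanh(X @ np.array([1.5, -0.8]))    # target function (assumed)

w = np.zeros(2)                 # weights treated as the EKF state
P = np.eye(2) * 10.0            # state covariance
Q, R = 1e-5 * np.eye(2), 0.1    # process / measurement noise (tuning choices)

for x, y in zip(X, y_true):
    pred = np.tanh(x @ w)
    H = ((1 - pred ** 2) * x).reshape(1, 2)    # Jacobian d(pred)/d(w)
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T / S                            # Kalman gain (S is 1x1 here)
    w = w + (K * (y - pred)).ravel()           # state (weight) update
    P = P - K @ H @ P + Q                      # covariance update

print("recovered weights:", w)   # should approach [1.5, -0.8]
```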
We trained a large, deep convolutional neural network to classify the 1.2 million
high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different
classes. On the test data, we achieved top-1 and top-5 error rates of 37.5%
and 17.0% which is considerably better than the previous state-of-the-art. The
neural network, which has 60 million parameters and 650,000 neurons, consists
of five convolutional layers, some of which are followed by max-pooling layers,
and three fully-connected layers with a final 1000-way softmax. To make training
faster, we used non-saturating neurons and a very efficient GPU implementation
of the convolution operation. To reduce overfitting in the fully-connected
layers we employed a recently-developed regularization method called “dropout”
that proved to be very effective. We also entered a variant of this model in the
ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%,
compared to 26.2% achieved by the second-best entry.
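A compact PyTorch rendering of the described layout (five convolutional layers, three fully-connected layers with dropout, 1000-way output). Local response normalization and the original two-GPU grouping are omitted, so this is a sketch rather than a faithful reproduction.

```python
# Hedged sketch of the five-conv / three-FC architecture described above.
import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
            nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
            nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
            nn.Linear(4096, num_classes),   # 1000-way softmax applied in the loss
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

out = AlexNetSketch()(torch.randn(1, 3, 227, 227))
print(out.shape)   # torch.Size([1, 1000])
```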
The document discusses image recognition using convolutional neural networks (CNNs). It explains that CNNs consist of multiple layers of small neuron collections that look at small portions of an input image called receptive fields. The results are tiled to overlap and represent the original image better. CNNs learn filters through training rather than relying on hand-engineered features. Convolution involves calculating the overlap between functions as one is translated, and is used in CNNs to identify patterns across translated versions of inputs like images. Pointwise nonlinearities are applied between CNN layers to introduce nonlinearity.
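A direct numpy illustration of that idea: a small filter is translated across the image, and each output value is the overlap (dot product) with one receptive field, followed by a pointwise ReLU. The Sobel-like kernel below stands in for a learned filter.

```python
# Hedged sketch: convolution as "overlap while translating", computed directly.
import numpy as np

image = np.random.default_rng(0).random((8, 8))
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)   # Sobel-like edge filter

h, w = image.shape
k = kernel.shape[0]
out = np.zeros((h - k + 1, w - k + 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        receptive_field = image[i:i + k, j:j + k]      # one small image patch
        out[i, j] = np.sum(receptive_field * kernel)   # overlap at this shift

out = np.maximum(out, 0)   # pointwise nonlinearity between layers (ReLU)
print(out.shape)           # (6, 6) feature map
```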
A Time Series ANN Approach for Weather Forecasting (ijctcm)
Weather forecasting is one of the most challenging problems around the world, both for its experimental value in meteorology and as a typical unbiased time-series forecasting problem in scientific research. Many methods have been proposed by various scientists, with the aim of predicting more accurately. This paper contributes to the same goal using an artificial neural network (ANN), simulated in MATLAB, to predict two important weather parameters: maximum and minimum temperature. The model was trained on 60 years of real data (1901-1960) and tested over the following 40 years to forecast maximum and minimum temperature. The results, based on the mean squared error (MSE), confirm that this multilayer-perceptron model has the potential for successful application to weather forecasting.
A SURVEY OF SPIKING NEURAL NETWORKS AND SUPPORT VECTOR MACHINE PERFORMANCE BY... (ijdms)
This document summarizes research on parallelizing spiking neural networks (SNNs) and support vector machines (SVMs) using GPUs. It reviews related work applying SNNs and SVMs to tasks like classification and pattern recognition. SNNs and SVMs can benefit from parallelization but require different approaches due to their computational characteristics. SNNs are better suited to FPGAs than GPUs due to their non-linear learning equations. SVMs parallelize well on GPUs by solving matrix operations in parallel. The document discusses factors to consider in parallelizing SNNs and SVMs, such as hardware limitations and memory requirements.
Optimization of Number of Neurons in the Hidden Layer in Feed Forward Neural ... (IJERA Editor)
The architecture of an Artificial Neural Network (ANN) is based on the problem domain; it is fixed during the training phase on sample data and used to infer results for the remaining data in the testing phase. Normally the architecture consists of three layers: an input layer with one node per known input value, a hidden layer, and an output layer whose nodes carry the results computed from the input and hidden nodes. The number of nodes in the hidden layer is usually decided heuristically, so that an optimum value is obtained within a reasonable number of iterations while the other parameters keep their default values. This study focuses on Cascade-Correlation Neural Networks (CCNN) using the Back-Propagation (BP) algorithm, which determines the number of neurons during the training phase itself by appending one neuron per iteration until the error condition is satisfied, and gives a promising result on the optimum number of hidden-layer neurons.
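A loose sketch of the growing idea, assuming scikit-learn: hidden neurons are added one at a time and training repeats until the validation error stops improving. Real cascade-correlation freezes existing weights and trains each new unit on the residual error, which this simplified loop does not do.

```python
# Hedged sketch: selecting the hidden-layer size by growing it one neuron at
# a time until the error condition (no further validation improvement) holds.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])          # toy target function
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

best_mse, best_n = np.inf, 0
for n_hidden in range(1, 31):
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), max_iter=2000,
                       random_state=0).fit(X_tr, y_tr)
    mse = np.mean((net.predict(X_val) - y_val) ** 2)
    if mse < best_mse - 1e-4:          # error condition: meaningful improvement
        best_mse, best_n = mse, n_hidden
    elif n_hidden - best_n >= 5:       # patience: stop after 5 stagnant sizes
        break

print(f"selected {best_n} hidden neurons, validation MSE {best_mse:.4f}")
```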
Automatic time series forecasting using nonlinear autoregressive neural netwo... (journalBEEI)
This study aims to develop an automatic forecasting method for univariate time series using the nonlinear autoregressive neural network model with exogenous input (NARX). In this automatic setting, users only need to supply the input time series; an automatic forecasting algorithm then sets up the appropriate features, estimates the parameters of the model, and calculates forecasts without user intervention. The method includes preprocessing, tests for trends, and the application of first differences. The series were also tested for seasonality, with seasonal differences taken where the analysis indicated them, and were linearly scaled to [−1, +1]. The autoregressive lags and hidden neurons were selected through stepwise and optimization algorithms, respectively. Twenty NARX models were fitted with different random starting weights, and their forecasts were combined using an ensemble operator to obtain the final forecast. The proposed method was applied to real data, and its performance was compared with several available automatic models in the literature. Forecasting accuracy was measured by mean squared error (MSE) and mean absolute percent error (MAPE), and the results showed that the proposed method outperformed the other automatic models.
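A condensed sketch of that pipeline on toy data: first differences, linear scaling to [−1, +1], lagged inputs, 20 networks with different random starting weights, and a median as the ensemble operator. The paper's stepwise lag selection and exact ensemble operator are not reproduced here.

```python
# Hedged sketch of the automatic NARX-style pipeline described above.
import numpy as np
from sklearn.neural_network import MLPRegressor

series = np.cumsum(np.random.default_rng(0).normal(0.1, 1.0, 500))  # trended toy data

diff = np.diff(series)                          # first differences remove trend
lo, hi = diff.min(), diff.max()
scaled = 2 * (diff - lo) / (hi - lo) - 1        # linear scaling to [-1, +1]

lags = 12                                       # assumed lag order
X = np.lib.stride_tricks.sliding_window_view(scaled[:-1], lags)
y = scaled[lags:]

preds = []
for seed in range(20):                          # 20 models, random starting weights
    net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=1000,
                       random_state=seed).fit(X, y)
    preds.append(net.predict(scaled[-lags:].reshape(1, -1))[0])

step = np.median(preds)                                     # ensemble operator
forecast = series[-1] + ((step + 1) / 2 * (hi - lo) + lo)   # invert scaling + diff
print("one-step forecast:", forecast)
```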
Simulation of Single and Multilayer of Artificial Neural Network using Verilog (ijsrd.com)
This document discusses the simulation of single layer and multilayer artificial neural networks using Verilog. It begins with an introduction to artificial neural networks and their application in VLSI circuit fault diagnosis. It then provides details on the algorithm and design methodology for simulating a single layer neural network to model an AND gate, showing the calculation of error over iterations in Matlab and time taken using Verilog code. For a multilayer network modeling an XOR gate, it similarly discusses the backpropagation algorithm, showing error reduction over iterations in Matlab and time taken using Verilog. It concludes that neural networks can help minimize time to find faults in digital circuits.
This document presents research using artificial neural networks to identify toxic gases in real time. A multi-layer perceptron neural network was trained using data from a multi-sensor system that detected hydrogen sulfide, nitrogen dioxide, and their mixture. Features extracted from the sensor responses were used as inputs to the neural network. The network was trained online using backpropagation and achieved 100% accuracy classifying gases during training and 96.6% accuracy during testing, with low error rates. This model achieved better performance than previous methods and can identify low concentrations of toxic gases in real time, which has applications for air quality monitoring and safety.
Improving of artificial neural networks performance by using gpu's: a survey (csandit)
In this paper we study improving the performance of Artificial Neural Networks (ANN) by using parallel programming on GPU or FPGA architectures. It is well known that ANNs can be parallelized according to particular characteristics of the training algorithm. We discuss both approaches, software (GPU) and hardware (FPGA), and several training strategies: the perceptron training unit, Support Vector Machines (SVM), and Spiking Neural Networks (SNN). The approaches are evaluated by training speed and performance. The surveyed algorithms were coded by their authors on hardware such as Nvidia cards, FPGAs, or sequential circuits, depending on the methodology used, to compare learning time between GPU and CPU. The main applications were in pattern recognition, such as acoustic speech, odor, and clustering. According to the literature, the GPU has a great advantage over the CPU in learning time, except when rendering of images is involved, across several architectures of Nvidia cards and CPUs. The survey also includes a brief description of the types of ANN and their execution techniques, related to the research results.
IMPROVING OF ARTIFICIAL NEURAL NETWORKS PERFORMANCE BY USING GPU’S: A SURVEY (csandit)
This document provides a survey of improving the performance of artificial neural networks (ANNs) through parallel programming on GPUs. It discusses different ANN training strategies that can be parallelized, such as perceptrons, support vector machines, and spiking neural networks. GPUs provide significant speed advantages over CPUs for ANN training. The document reviews various studies that have implemented ANNs using GPUs and FPGAs, finding that GPUs reduce training time compared to CPUs, especially for algorithms involving large matrix operations like support vector machines. Spiking neural networks are better suited to FPGAs or custom circuits due to their complex temporal dynamics. The document concludes that GPUs are generally the best approach for ANN parallelization.
This document summarizes a study that used artificial neural networks (ANNs) and the Multi-Layer Perceptron model (MLP) to predict the bearing capacities of steel driven piles in sandy soils. The ANN was trained on data from full-scale pile load tests, including pile length, diameter, soil elastic modulus, and soil friction angle as inputs. The output was pile bearing capacity. The study examined factors for effective ANN behavior, trained and tested the network, and analyzed the sensitivity of the inputs on the output capacity prediction.
INVESTIGATIONS OF THE INFLUENCES OF A CNN’S RECEPTIVE FIELD ON SEGMENTATION O... (adeij1)
Segmentation of objects of various sizes has received relatively little attention in medical imaging and has been very challenging in computer vision tasks in general. We hypothesize that the receptive field of a deep model corresponds closely to the size of the object to be segmented, which could critically influence the segmentation accuracy of objects of varied sizes. In this study, we employed “AmygNet”, a dual-branch fully convolutional neural network (FCNN) with two different sizes of receptive fields, to investigate the effects of the receptive field on segmenting four major subnuclei of the bilateral amygdalae. The experiment was conducted on 14 subjects, all 3-dimensional MRI human brain images. Since the scales of the different subnuclear groups differ, investigating the accuracy for each subnuclear group under receptive fields of various sizes may reveal which receptive-field size suits objects of which scale. Under the given conditions, AmygNet with multiple receptive fields shows great potential in segmenting objects of different sizes.
Rainfall Prediction using Data-Core Based Fuzzy Min-Max Neural Network for Cl... (IJERA Editor)
This paper proposes a rainfall prediction system based on a classification technique. An advanced, modified neural network called the Data-Core-Based Fuzzy Min-Max Neural Network (DCFMNN) is used for pattern classification and applied to predict rainfall. The fuzzy min-max neural network (FMNN), which creates hyperboxes for classification and prediction, has a problem of overlapping neurons that is resolved in DCFMNN to give greater accuracy. The system consists of hyperbox formation, two kinds of neurons (overlapping neurons and classifying neurons), and classification used for prediction. For each hyperbox, its data core and the geometric center of its data are calculated. The advantages of this method are high accuracy and strong robustness. According to the evaluation results, the system gives better rainfall prediction and serves as a classification tool in real environments.
Hybrid neural networks for time series learning by Tian Guo, EPFL, Switzerland (EuroIoTa)
Time series are prevalent in the IoT environment and are used for monitoring the evolving behavior of entities or objects over time. Analyzing and mining such time-series data reveals insightful long-term and instantaneous information behind the data, e.g., trends, events, correlations, and causality.
Inspired by the recent successes of neural networks, in this talk we present a novel end-to-end hybrid neural network for learning the local and global contextual features of time series. In particular, we explore the idea of hybrid neural networks in a specific time-series learning problem, namely learning the local trend of time series. Local trends characterize the intermediate upward and downward patterns of a time series. Learning and forecasting local trends play an important role in many real applications, from investing in the stock market to resource allocation in data centers and load scheduling in the smart grid. We propose TreNet, a hybrid neural network which leverages convolutional neural networks (CNNs) to extract salient features from the raw local data of a time series and a long short-term memory recurrent neural network (LSTM) to capture the dependency among local trends. Preliminary experimental results on real datasets demonstrate the superiority of TreNet over conventional CNN, LSTM, and HMM methods and various kernel-based baselines.
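A minimal PyTorch sketch of the TreNet-style hybrid: a 1-D CNN branch for the raw local window, an LSTM branch over the preceding (duration, slope) trend pairs, and a head over the concatenated features. The feature sizes and concatenation-based fusion are assumptions rather than the paper's exact design.

```python
# Hedged sketch of a CNN + LSTM hybrid for local-trend prediction.
import torch
import torch.nn as nn

class TreNetSketch(nn.Module):
    def __init__(self, trend_dim=2, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # local raw-data branch
            nn.Conv1d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, 5, padding=2), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
        )
        self.lstm = nn.LSTM(trend_dim, hidden, batch_first=True)  # trend branch
        self.head = nn.Linear(32 + hidden, trend_dim)   # predict (duration, slope)

    def forward(self, raw, trends):
        local = self.cnn(raw).squeeze(-1)               # (B, 32) local features
        _, (h, _) = self.lstm(trends)                   # trend-sequence state
        return self.head(torch.cat([local, h[-1]], dim=1))

model = TreNetSketch()
raw = torch.randn(4, 1, 32)        # raw data of the current local window
trends = torch.randn(4, 10, 2)     # last 10 (duration, slope) trend pairs
print(model(raw, trends).shape)    # torch.Size([4, 2])
```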
The Art and Power of Data-Driven Modeling: Statistical and Machine Learning A... (WithTheBest)
This presentation illustrates distinct statistical and machine learning approaches to automated recognition of major brain tissues in 3D brain MRI.
Nataliya Portman, Postdoctoral Fellow Faculty of Science, UOIT, Oshawa, ON Canada
PhD in Applied Mathematics, University of Waterloo | Postdoctoral Research on Brain MRI Segmentation, Neuro | Current: Applied Machine Learning in Materials Science, University of Ontario Institute of Technology
Task Adaptive Neural Network Search with Meta-Contrastive Learning (MLAI2)
Most conventional Neural Architecture Search (NAS) approaches are limited in that they only generate architectures without searching for the optimal parameters. While some NAS methods handle this issue by utilizing a supernet trained on a large-scale dataset such as ImageNet, they may be suboptimal if the target tasks are highly dissimilar from the dataset the supernet is trained on. To address such limitations, we introduce a novel problem of Neural Network Search (NNS), whose goal is to search for the optimal pretrained network for a novel dataset and constraints (e.g. number of parameters), from a model zoo. Then, we propose a novel framework to tackle the problem, namely Task-Adaptive Neural Network Search (TANS). Given a model-zoo that consists of networks pretrained on diverse datasets, we use a novel amortized meta-learning framework to learn a cross-modal latent space with contrastive loss, to maximize the similarity between a dataset and a high-performing network on it, and minimize the similarity between irrelevant dataset-network pairs. We validate the effectiveness and efficiency of our method on ten real-world datasets, against existing NAS/AutoML baselines. The results show that our method instantly retrieves networks that outperform models obtained with the baselines with significantly fewer training steps to reach the target performance, thus minimizing the total cost of obtaining a task-optimal network. Our code and the model-zoo are available at https://anonymous.4open.science/r/TANS-33D6
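A small sketch of the contrastive objective described there, assuming an InfoNCE-style formulation: matched dataset-network embedding pairs sit on the diagonal of a similarity matrix and are pulled together, while off-diagonal (irrelevant) pairs are pushed apart. The encoders and temperature value are placeholders.

```python
# Hedged sketch: cross-modal contrastive loss between dataset embeddings and
# embeddings of the networks that perform well on them.
import torch
import torch.nn.functional as F

def cross_modal_contrastive(dataset_emb, network_emb, temperature=0.1):
    d = F.normalize(dataset_emb, dim=1)     # (B, D) dataset embeddings
    n = F.normalize(network_emb, dim=1)     # (B, D) matching network embeddings
    logits = d @ n.t() / temperature        # similarity of all B x B pairs
    targets = torch.arange(d.size(0))       # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Random stand-ins for the outputs of the two (omitted) encoders:
loss = cross_modal_contrastive(torch.randn(8, 128), torch.randn(8, 128))
print(float(loss))
```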
Types of Machine Learning Algorithms (CART, ID3) (Fatimakhan325)
The document summarizes several machine learning algorithms used for data mining:
- Decision trees use nodes and edges to iteratively divide data into groups for classification or prediction.
- Naive Bayes classifiers use Bayes' theorem for text classification, spam filtering, and sentiment analysis due to their multi-class prediction abilities.
- K-nearest neighbors algorithms find the closest K data points to make predictions for classification or regression problems.
- ID3, CART, and k-means clustering are also summarized highlighting their uses, advantages, and disadvantages.
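For a concrete feel of how the listed families behave, a short scikit-learn comparison on a toy dataset; the tree depth and K value are illustrative choices.

```python
# Hedged sketch: three of the summarized classifier families side by side.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for name, clf in [("decision tree", DecisionTreeClassifier(max_depth=3)),
                  ("naive Bayes", GaussianNB()),
                  ("k-NN (K=5)", KNeighborsClassifier(n_neighbors=5))]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```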
This paper presents a study using an artificial neural network (ANN) for load forecasting in the smart grid. Specifically, it uses a backpropagation network to forecast electricity load in Ontario, Canada based on weather and other input data. The paper describes collecting hourly load and weather data over two years, normalizing the data, creating a three-layer backpropagation network with different numbers of neurons, training the network using two algorithms, and testing the network on a separate data set to analyze forecast accuracy. The results show the ANN approach is able to accurately forecast electricity load based on the input factors.
In this paper, a new steganography algorithm is proposed to strengthen the security of data hiding and to increase the payload. The algorithm is based on four safety layers. The first layer compresses and encrypts the confidential message using set partitioning in hierarchical trees (SPIHT) and the Advanced Encryption Standard (AES), respectively. In the second layer, an irregular image segmentation (IIS) algorithm is applied to the cover image (Ic), based on adaptive reallocation of segment edges (ARSE) using an adaptive finite-element method (AFEM) to solve the proposed partial differential equation (PDE) numerically. In the third layer, an intelligent computing technique using a hybrid adaptive neural network with a modified ant colony optimizer (ANN_MACO) constructs a learning system; this system takes input from a support vector machine (SVM) that generates input patterns as byte-attribute features and produces new features to modify the cover image. The main innovation of the proposed steganography algorithm is applied in the fourth safety layer, which is robust enough to hide a large confidential message, up to six bits per pixel (bpp), in color images. The hiding algorithm resists statistical and visual attacks with high imperceptibility of the data hidden in the stego-images (Is). The experimental results are discussed and compared with previous steganography algorithms, demonstrating that the proposed algorithm significantly improves the security level of steganography by making it an arduous task to retrieve the embedded confidential message from color images.
Machine Learning Algorithms for Image Classification of Hand Digits and Face ... (IRJET Journal)
This document discusses machine learning algorithms for image classification using five different classification schemes. It summarizes the mathematical models behind each classification algorithm, including the Nearest Class Centroid classifier, Nearest Sub-Class Centroid classifier, k-Nearest Neighbor classifier, Perceptron trained using Backpropagation, and Perceptron trained using Mean Squared Error. It also describes the two datasets used in the experiments: the MNIST dataset of handwritten digits and the ORL face recognition dataset. The performance of the five classification schemes is compared on these datasets.
This document summarizes principal component analysis (PCA) and its application to face recognition. PCA is a technique used to reduce the dimensionality of large datasets while retaining the variations present in the dataset. It works by transforming the dataset into a new coordinate system where the greatest variance lies on the first coordinate (principal component), second greatest variance on the second coordinate, and so on. The document discusses how PCA can be used for face recognition by applying it to image datasets of faces. It reduces the dimensionality of the image data while preserving the key information needed to distinguish different faces. Experimental results show PCA provides reasonably accurate face recognition with low error rates.
The document discusses principal component analysis (PCA) and linear discriminant analysis (LDA) for dimensionality reduction in pattern recognition and their application to face recognition. PCA finds the directions along which the data varies the most to reduce dimensionality while retaining variation. LDA seeks directions that maximize between-class variation and minimize within-class variation. Studies show LDA performs better than PCA for classification when the training set is large and representative of each class.
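A short numpy sketch of the PCA step both summaries rely on: center the data, take the SVD, and project onto the leading components (eigenfaces, when rows are flattened face images). The random matrix stands in for real image data, and LDA is not shown.

```python
# Hedged sketch: PCA via SVD for dimensionality reduction, eigenface-style.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 64 * 64))            # 100 "images", 4096 pixels each

mean = X.mean(axis=0)
Xc = X - mean                             # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:50]                      # top 50 directions of greatest variance

Z = Xc @ components.T                     # 4096-D -> 50-D representation
X_rec = Z @ components + mean             # approximate reconstruction
print(Z.shape, np.linalg.norm(X - X_rec) / np.linalg.norm(X))
```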
Artificial Intelligence Applications in Petroleum Engineering - Part I (Ramez Abdalla, M.Sc)
This document discusses applications of artificial intelligence, specifically artificial neural networks and genetic algorithms, in petroleum engineering. It provides an overview of neural networks in OnePetro papers and describes the basic concepts and training processes of neural networks and genetic algorithms. It then discusses various applications of these techniques in reservoir engineering, production technologies, and oil-well drilling, including reservoir characterization, modeling, well test analysis, permeability prediction, production monitoring, drilling optimization, and more. The presentation aims to explore these applications in more depth.
The history of self-driving cars began in the 1930s with conceptual designs and progressed through the 20th century with early prototypes. Significant milestones include RCA Labs building a guided miniature car in the 1950s, vision-guided robotic vans achieving highway speeds in the 1980s, and USDOT demonstrations of automated highway systems in the 1990s. Development continued through military efforts in the 2000s and 2010s with increasing capabilities and testing of commercial applications such as mining haulage systems.
Super resolution in deep learning era - Jaejun Yoo (JaeJun Yoo)
1) The document discusses super-resolution techniques in deep learning, including inverse problems, image restoration problems, and different deep learning models.
2) Early models like SRCNN used convolutional networks for super-resolution but were shallow, while later models incorporated residual learning (VDSR), recursive learning (DRCN), and became very deep and dense (SRResNet).
3) Key developments included EDSR which provided a strong backbone model and GAN-based approaches like SRGAN which aimed to generate more realistic textures but require new evaluation metrics.
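A toy PyTorch sketch of the residual-learning idea behind VDSR, mentioned in point 2: the network predicts only the difference between the interpolated low-resolution input and the high-resolution target, with a global skip connection adding the input back. The depth and width here are far smaller than in the real model.

```python
# Hedged sketch: VDSR-style residual learning for super-resolution.
import torch
import torch.nn as nn

class ResidualSRSketch(nn.Module):
    def __init__(self, depth=8, channels=32):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(channels, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):                 # x: bicubic-upscaled LR image
        return x + self.body(x)           # global skip: learn the residual only

lr_up = torch.randn(1, 1, 64, 64)
print(ResidualSRSketch()(lr_up).shape)    # torch.Size([1, 1, 64, 64])
```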
Machine learning in science and industry — day 4 (arogozhnikov)
- tabular data approach to machine learning and when it didn't work
- convolutional neural networks and their application
- deep learning: history and today
- generative adversarial networks
- finding optimal hyperparameters
- joint embeddings
1. The document describes using a deep neural network to detect changes between two SAR images by preclassifying the images, training the neural network on selected samples, and analyzing the results.
2. A similarity matrix and variance matrix are calculated during preclassification to identify and jointly label similar pixels, while different pixels are labeled separately. Good samples are selected to train the neural network.
3. The neural network is tested on images with different types and levels of noise and performs well at change detection, with performance increasing as noise decreases. Future work could focus on accelerating the training process.
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –... (IRJET Journal)
This document summarizes research on developing parallel algorithms to optimize solving the longest common subsequence (LCS) problem. LCS is commonly used for sequence comparison in bioinformatics. Traditional sequential dynamic programming algorithms have complexity of O(mn) for sequences of lengths m and n. The document reviews parallel algorithms developed using tools like OpenMP and GPUs like CUDA to reduce computation time. It proposes the authors' own optimized parallel algorithm for multi-core CPUs using OpenMP.
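For reference, the O(mn) sequential dynamic program that the surveyed parallel versions accelerate; cells on the same anti-diagonal of the table are mutually independent, which is the parallelism OpenMP and CUDA implementations exploit.

```python
# Hedged sketch: the classic O(mn) LCS dynamic-programming recurrence.
def lcs_length(a: str, b: str) -> int:
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]   # dp[i][j] = LCS of prefixes
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1          # characters match
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

print(lcs_length("ACCGGTCG", "GTCGTTCG"))   # length of their LCS
```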
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo... (MLconf)
Graph Representation Learning with Deep Embedding Approach:
Graphs are commonly used data structure for representing the real-world relationships, e.g., molecular structure, knowledge graphs, social and communication networks. The effective encoding of graphical information is essential to the success of such applications. In this talk I’ll first describe a general deep learning framework, namely structure2vec, for end to end graph feature representation learning. Then I’ll present the direct application of this model on graph problems on different scales, including community detection and molecule graph classification/regression. We then extend the embedding idea to temporal evolving user-product interaction graph for recommendation. Finally I’ll present our latest work on leveraging the reinforcement learning technique for graph combinatorial optimization, including vertex cover problem for social influence maximization and traveling salesman problem for scheduling management.
The document summarizes Yan Xu's upcoming presentation at the Houston Machine Learning Meetup on dimension reduction techniques. Yan will cover linear methods like PCA and nonlinear methods such as ISOMAP, LLE, and t-SNE. She will explain how these methods work, including preserving variance with PCA, using geodesic distances with ISOMAP, and modeling local neighborhoods with LLE and t-SNE. Yan will also demonstrate these methods on a dataset of handwritten digits. The meetup is part of a broader roadmap of machine learning topics that will be covered in future sessions.
Targeted Visual Content Recognition Using Multi-Layer Perceptron Neural Network (ijceronline)
Visual content recognition has become an attractive research field in computer vision and machine learning over the last few decades. The focus of this work is monument recognition. Images of significant locations, captured and maintained in databases, can be consulted by travelers before visiting the places; for instance, they can use images of a famous building to obtain its description. In all these applications, visual content recognition plays a key role. Humans can learn the contents of images and quickly identify them on seeing them again. In this paper we present a constructive training algorithm for a Multi-Layer Perceptron Neural Network (MLPNN) applied to a set of targeted object recognition applications. The target set consists of famous monuments in India for travel-guide applications. The training data set (TDS) consists of 3000 images, from which Gist features are extracted and given to the neural network during the training phase. The mean squared error (MSE) on the training data is computed and used as the metric to adjust the weights of the neural network, using the back-propagation algorithm. In the constructive learning, if the MSE is less than a predefined value, the number of hidden neurons is increased. Input patterns are trained incrementally until all patterns of the TDS have been presented and learned. The weights obtained during the training phase are used in the testing phase, in which new untrained images are given to the neural network for recognition. If a test image is recognized, its details are also displayed. The accuracy of this method is found to be 95%.
Trackster Pruning at the CMS High-Granularity CalorimeterYousef Fadila
The document discusses approaches for assigning weights to layer clusters in Tracksters to indicate the likelihood of belonging to the same particle or being contaminated. The goal is to develop reproducible code, port a trained model to C, and provide a final report and presentation. Various data representations and machine learning methods are explored, including layer-cluster level, extended layer-cluster level, sequence representations using LSTM and CNN, and graph representations using GCN and adaptive sampling. Performance is evaluated on classification of purity levels. Extended layer-cluster and sequence representations showed improved performance over the basic layer-cluster approach. Notebooks containing the code are described in an appendix.
MEDIAN BASED PARALLEL STEERING KERNEL REGRESSION FOR IMAGE RECONSTRUCTIONcscpconf
This document summarizes a research paper that proposes a modified version of Steering Kernel Regression called Median Based Parallel Steering Kernel Regression for improving image reconstruction. The key points are:
1. The proposed algorithm addresses two drawbacks of the original Steering Kernel Regression technique by implementing it in parallel on GPUs and multi-cores to improve computational efficiency, and using a median filter to suppress spurious edges in the output.
2. Experimental results show the proposed algorithm achieves a speedup of 21x using GPUs and 6x using multi-cores compared to serial implementation, while maintaining comparable reconstruction quality as measured by RMSE.
3. The algorithm is implemented iteratively, applying the median filter after each iteration.
MEDIAN BASED PARALLEL STEERING KERNEL REGRESSION FOR IMAGE RECONSTRUCTIONcsandit
Image reconstruction is the process of obtaining the original image from corrupted data. Applications of image reconstruction include computer tomography, radar imaging, weather forecasting, etc. Recently, the steering kernel regression method has been applied to image reconstruction [1]. There are two major drawbacks in this technique. Firstly, it is computationally intensive. Secondly, the output of the algorithm suffers from spurious edges (especially in the case of denoising). We propose a modified version of Steering Kernel Regression called the Median Based Parallel Steering Kernel Regression Technique. In the proposed algorithm, the first problem is overcome by implementing it on GPUs and multi-cores. The second problem is addressed by a gradient-based suppression scheme that uses a median filter. Our algorithm gives better output than Steering Kernel Regression; the results are compared using Root Mean Square Error (RMSE). Our algorithm also achieves a speedup of 21x on GPUs and 6x on multi-cores.
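As a generic illustration of the median-filtering step (not the authors' exact gradient-based suppression, which is applied after each regression iteration), the following sketch removes impulse noise with a 3×3 median window.

# Median filtering suppresses impulsive artifacts while preserving edges.
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(0)
img = np.tile(np.linspace(0, 1, 64), (64, 1))          # smooth ramp image
noisy = img.copy()
spikes = rng.random(img.shape) < 0.05                  # 5% impulse noise
noisy[spikes] = rng.choice([0.0, 1.0], size=spikes.sum())

cleaned = median_filter(noisy, size=3)                 # 3x3 median window
print(float(np.abs(noisy - img).mean()), float(np.abs(cleaned - img).mean()))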
Data driven model optimization [autosaved]Russell Jarvis
Russell Jarvis is developing a general purpose optimizer called NeuronUnit to fit abstract neural models to the firing dynamics of specific biological neurons. As a proof of concept, he is using NeuronUnit to fit the Izhikevich model to a murine layer 5 neocortex pyramidal neuron. He discusses using virtual electrophysiology experiments in NeuronUnit along with real neuron recordings from the Allen Brain Atlas to derive error functions that guide the optimization of the model parameters to replicate the biological neuron.
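The Izhikevich model being fitted is compact enough to sketch directly; the following Euler integration uses standard regular-spiking parameters and is not NeuronUnit code.

import numpy as np

# Euler integration of the Izhikevich model (regular-spiking parameters):
#   dv/dt = 0.04 v^2 + 5 v + 140 - u + I
#   du/dt = a (b v - u);  spike at v >= 30 mV -> v := c, u := u + d
def izhikevich(I=10.0, a=0.02, b=0.2, c=-65.0, d=8.0, T=1000.0, dt=0.5):
    v, u = c, b * c
    spikes = []
    for step in range(int(T / dt)):
        v += dt * (0.04 * v * v + 5 * v + 140 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:                 # spike: record time and reset
            spikes.append(step * dt)
            v, u = c, u + d
    return spikes

print(len(izhikevich()))  # spike count over 1 s of constant input current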
This document summarizes research on enhancing gene expression programming (GEP) for Reynolds-averaged Navier-Stokes equations turbulence modeling with unsupervised clustering. It presents a GEP-enhanced multi-model framework that uses feature selection, dimensionality reduction, and clustering to assign different turbulence models to distinct regions of a flow, improving simulation accuracy. Results show the approach produces more accurate mean velocities and Reynolds stresses for a body-of-revolution testcase compared to baseline and GEP-driven models. Ongoing work includes optimizing the framework configuration and extending it to 3D domains.
1. Learning From Multiple Views of Data
PhD Defense talk of Abhishek Sharma
Collaborators
David W. Jacobs, Larry S. Davis, Hal Daumé III, Oncel Tuzel, Ming-Yu Liu, Abhishek Kumar, Jonghyun Choi, Murad Al Haj, Sanja Fidler and Angjoo Kanazawa
2. Overview
1. Introduction
PART - I
1. Content Extraction
   1. Semantic segmentation as a visual feature
   2. Contextual information
   3. Neural network model
PART - II
1. Cross-modal content matching
   1. Challenges
   2. PLS-based common representation
   3. Generalized Multi-view Analysis
2. Future Directions
3. Match image and sentence
Image courtesy – UIUC sentence-Image dataset: http://vision.cs.uiuc.edu/pascal-sentences/
[Figure: a text view (“Two parked jet airplanes facing opposite directions”) and an image view are both mapped into a canonical/common view]
4. Find the image based on a sentence
Two parked jet airplanes facing opposite directions
8. A simple computer-based matching of sentence and image
1. Task understanding
2. Content from text and image
1. jet airplanes
2. Two
3. Parked
4. facing opposite direction
3. Content Matching
9. Cross-view content matching challenges
Text – “Two parked jet airplanes facing opposite directions on a grassy land”
[Figure: a 10000-dimensional bag-of-words text vector (indices for words such as “jet”, “direction”, “facing”) compared against a SIFT BoW image histogram]
Challenges: Dimension mismatch, Semantic mismatch, Insufficient content. Deep learning?
10. Cross-view content matching challenge
Lack of correspondence
[Figure: two views share a common region while another region is missing; column-wise vectorization of two 8-cell grids shows that pixel-to-pixel correspondence breaks down]
Deep learning?
11. Other useful problems
Task – Face recognition
[Figure: query face matched against a face DB]
Content Extraction: Pixel, Attribute, SIFT, LBP, HOG, Gabor
Content Matching: CCA, PLS, Metric Learning, SVMs
12. Other useful problems
Task – Forensic sketch photo matching
[Figure: forensic sketch query matched against a suspect image database]
Image courtesy – Lois Gibson, “Forensic Art Essentials: A Manual for Law Enforcement Artists”
Content Extraction: SIFT, HOG, Gabor
Content Matching: Local LDA, PLS, CCA
13. This Dissertation
We are interested in extracting and matching task-dependent content across multiple modalities.
Tasks: pose-invariant face recognition; pose-lighting invariant face recognition; text-image matching; forensic image-photo matching
Content Extraction: Semantic Segmentation
Content Matching: Partial Least Squares, pose-error robust matching, Generalized Multi-view Analysis
16. Semantic Segmentation: Overview
1. Scene understanding, robotics, medical image analysis etc.
2. Related work
3. Problem formulation
4. Role of context
5. Intuitive picture
6. Mathematical picture
7. Complete Pipeline
8. Back-propagation and issues
9. Pure-node RCPN
10. Experiments
17. Related Work
1. Multi-scale CNN (Farabet, Pinheiro)
2. Deep CNN (DeepSeg)
3. Non-parametric template matching (Tighe_1, Tighe_2, Eigen, Yang)
4. CRF models (Gould, Munoz, Lempitsky, Kumar, Mottaghi, Yuille)
18. Semantic Segmentation: Problem formulation
Label each super-pixel
[Figure: input image and super-segment overlaid image produced by super-segmentation, with super-pixels labeled Road, Car, Ground]
Image courtesy – http://www.cs.unc.edu/~jtighe/Papers/ECCV10/siftflow/baseFinal.html
19. Semantic Segmentation: Context
• Labeling a super-pixel in isolation is difficult
• Without context, machines outperform humans: 77.4% vs. 72.2% (Mottaghi et al.)
[Figure: isolated super-pixels from regions labeled Building, Train, Aeroplane]
Image courtesy – Roozbeh Mottaghi, Sanja Fidler, Jian Yao, Raquel Urtasun and Devi Parikh, “Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs”, IEEE CVPR 2013
21. Semantic Segmentation: Context
• Labeling a super-pixel in isolation is difficult
• Without context, machines outperform humans: 77.4% vs. 72.2% (Mottaghi et al.)
• Use context
• MRFs and CRFs
• Typically, MRFs and CRFs use human-designed potential functions and features
• The human visual system is complex – LEARN IT FROM DATA
Roozbeh Mottaghi, Sanja Fidler, Jian Yao, Raquel Urtasun and Devi Parikh, “Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs”, IEEE CVPR 2013
22. Recursive Context Propagation Network or RCPN
1. Label each super-pixel using entire image
2. Fast feed-forward computations for real-time labeling
3. End-to-end learning
4. Modular to the segmentation pipeline
24. Semantic Segmentation - Pipeline
1. Super-pixel feature
• F_CNN = multi-scale CNN at scales 1, 2 and 4
• 8×8×16 → 2×2 maxpool → 7×7×64 → 2×2 maxpool → 7×7×256
• 256×3 = 768-dimensional pixel feature
• Field of View (FOV) for every pixel = 47×47, 94×94 and 188×188 at the three scales
• Super-pixels by LiuSeg; ~100 super-pixels per image
• v_i = average of the pixel features in each super-pixel (a sketch follows below)
• Data augmentation by 5 random average sets
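A minimal numpy sketch of the super-pixel feature step above; the random feature map and labels stand in for the multi-scale CNN output and the LiuSeg segmentation.

import numpy as np

# Average per-pixel CNN features inside each super-pixel (v_i in the slides).
def superpixel_features(pixel_feats, sp_labels):
    """pixel_feats: (H, W, D) array; sp_labels: (H, W) int array."""
    H, W, D = pixel_feats.shape
    flat_feats = pixel_feats.reshape(-1, D)
    flat_labels = sp_labels.reshape(-1)
    n_sp = flat_labels.max() + 1
    sums = np.zeros((n_sp, D))
    np.add.at(sums, flat_labels, flat_feats)               # per-super-pixel sums
    counts = np.bincount(flat_labels, minlength=n_sp)[:, None]
    return sums / np.maximum(counts, 1)                    # v_i = mean feature

feats = np.random.rand(32, 32, 768)                 # 768-dim, as in the slides
labels = np.random.randint(0, 100, size=(32, 32))   # ~100 super-pixels/image
print(superpixel_features(feats, labels).shape)     # (~100, 768)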
31. RCPN Back-propagation and Bypass Error
[Figure: RCPN parse tree – super-pixel features v1, v2 are mapped by the semantic network (sem) to x1, x2, combined (com) up a sub-tree to the root x9, decombined (dec) into context-enhanced features x̃, concatenated (cat) and passed to the labeler to produce y1 for label l1]
32. RCPN Back-propagation and Bypass Error
[Figure: the same parse tree with a path from x1 straight to the labeler – the combiner is bypassed, context is lost, and training falls into a poor local minimum]
Empirical gradient strengths: g_com ≪ g_sem ≈ g_dec ≪ g_lab
Ideal gradient strengths: g_sem < g_com < g_dec < g_lab
33. Pure-node RCPN or PN-RCPN
• RCPN + pure-node classification loss
• Benefits:
• Roughly 65% more training data
• Meaningful combinations from the combiner
• Deeper and stronger gradients
35. Grad Strength: RCPN vs. PN-RCPN
[Chart: per-module gradient strengths (sem, com, dec, lab) for RCPN vs. PN-RCPN]
RCPN: g_com ≪ g_sem ≈ g_dec ≪ g_lab
PN-RCPN: g_sem < g_com ≈ g_dec < g_lab
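For readers who want to reproduce this kind of diagnostic, a hedged PyTorch sketch of measuring per-module gradient norms follows; the tiny linear modules are placeholders, not the actual RCPN networks.

import torch
import torch.nn as nn

# Measure per-module gradient norms (g_sem, g_com, g_dec, g_lab above).
modules = nn.ModuleDict({
    "sem": nn.Linear(768, 128),   # semantic mapper (placeholder)
    "com": nn.Linear(256, 128),   # combiner (placeholder)
    "dec": nn.Linear(256, 128),   # decombiner (placeholder)
    "lab": nn.Linear(128, 8),     # labeler (placeholder)
})

x = torch.randn(4, 768)
h = torch.tanh(modules["sem"](x))
ctx = torch.tanh(modules["com"](torch.cat([h[:2], h[2:]], dim=1)))
dec = torch.tanh(modules["dec"](torch.cat([h[:2], ctx], dim=1)))
loss = modules["lab"](dec).pow(2).mean()   # dummy loss for illustration
loss.backward()

for name, m in modules.items():
    g = torch.cat([p.grad.flatten() for p in m.parameters()])
    print(name, float(g.norm()))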
36. Experiments: Datasets
We conduct semantic segmentation experiments on three datasets
Stanford Background
Color images with 8 semantic classes
Train/Test – 572/143 images
SIFT Flow
Color images with 33 semantic classes
Train/Test – 2488/200
Daimler Urban Dataset
Gray-scale images with 6 semantic classes
Train/Test – 500/200
37. Experiments: Details
• Per-pixel subtraction of 0.5
• 100 super-pixels/image for Stanford and SIFT Flow
• 800 for Daimler due to its larger image size
• 10 random parse trees with 5 random feature sets during training to avoid over-fitting
• 20 random parse trees with max-voting for testing
38. Experiments: Performance metric
1. Per-pixel accuracy (PPA)
2. Mean-class accuracy (MCA)
3. Intersection over Union (IoU) – penalizes under- and over-segmentation (a sketch follows this list)
4. Dynamic IoU (Dyn IoU) – IoU for dynamic objects
5. Time Per Image (TPI) – Both CPU and GPU
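As a concrete reference for metric 3, here is a minimal numpy implementation of per-class IoU averaged over classes; the experiments' exact averaging conventions may differ.

import numpy as np

# Mean IoU over classes, from predicted and ground-truth label maps.
def mean_iou(pred, gt, n_classes):
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

gt = np.random.randint(0, 8, size=(240, 320))        # 8 classes, as in Stanford
pred = gt.copy()
pred[:40] = np.random.randint(0, 8, size=(40, 320))  # corrupt a strip
print(round(mean_iou(pred, gt, 8), 3))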
39. Stanford Results
Method PPA MCA IoU TPI (CPU/GPU)
Gould 76.4 NA NA 30 – 600 / NA
Munoz 76.9 NA NA 12 / NA
Tighe_1 77.5 NA NA 4 / NA
Kumar 79.4 NA NA < 600 / NA
Socher 78.1 NA NA NA / NA
Lempitzky 81.9 72.4 NA > 60 /NA
Singh 74.1 62.2 NA 20 / NA
Farabet 81.4 76.0 NA 60.5 / NA
Eigen 75.3 66.5 NA 16.6 / NA
Pinheiro 80.2 69.6 NA 10 / NA
Plain-NN 80.1 69.7 56.4 1.1 / 0.4
RCPN 81.8 73.9 61.3 1.1 / 0.4
PN-RCPN 82.1 79.0 64.0 1.1 / 0.4
TM-RCPN 82.3 79.1 64.5 1.6-6.1 / 0.9-5.9
40. SIFT Flow results
Method PPA MCA IoU TPI (CPU/GPU)
Tighe 77.0 30.1 NA 8.4 / NA
Liu 76.7 NA NA 31 / NA
Singh 79.2 33.8 NA 20 / NA
Eigen 77.1 32.5 NA 16.6 / NA
Farabet 78.5 29.6 NA NA / NA
Bal. Farabet 72.3 50.8 NA NA / NA
Tighe, 24 78.6 39.2 NA 8.4 / NA
Pinheiro 77.7 29.8 NA NA / NA
Yang 79.8 48.7 NA < 12 / NA
Plain-NN 76.3 32.1 24.7 1.1 / 0.36
RCPN 79.6 33.6 26.9 1.1 / 0.4
Bal. RCPN 75.5 48.0 28.6 1.1 / 0.4
PN-RCPN 80.9 39.1 30.8 1.1 / 0.4
Bal. PN-RCPN 75.5 52.8 30.2 1.1 / 0.4
TM-RCPN 80.8 38.4 30.7 1.6-6.1 / 0.9-5.4
Bal. TM-RCPN 76.4 52.6 31.4 1.6-6.1 / 0.9-5.4
DeepSeg 85.2 51.7 39.1 NA / 0.2
46. PLS based multi-modal face recognition
[Figure: PLS bridge – view-specific projections W_X and W_Y map two views X and Y (differing in pose, resolution, or photo vs. sketch) into a common subspace where shape = identity]
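A small sketch of the PLS-bridge idea using scikit-learn's PLSCanonical on synthetic paired views; the data, dimensions, and nearest-neighbor matching rule are illustrative, not the dissertation's experimental setup.

import numpy as np
from sklearn.cross_decomposition import PLSCanonical

# Learn projections W_X, W_Y into a common subspace where paired samples are
# maximally covariant. The synthetic views stand in for, e.g., two face poses.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))                  # shared "identity" factor
X = latent @ rng.normal(size=(5, 60)) + 0.1 * rng.normal(size=(200, 60))
Y = latent @ rng.normal(size=(5, 40)) + 0.1 * rng.normal(size=(200, 40))

pls = PLSCanonical(n_components=5)
Xc, Yc = pls.fit_transform(X, Y)                    # both views in common space

# Paired samples should now be close: match by nearest neighbor across views.
dists = ((Xc[:, None, :] - Yc[None, :, :]) ** 2).sum(-1)
acc = (dists.argmin(axis=1) == np.arange(len(Xc))).mean()
print(f"cross-view match accuracy: {acc:.2f}")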
47. PLS based pose-invariant face recognition
[Chart: recognition rates (0.75–1.05 scale) in a partial comparison with PGFR, TFA, LLR and ELF under different testing scenarios, others vs. proposed]
• CMU PIE face data set for experiments
• 34 training and 34 testing subjects, intensity features
54. GMA cont..
• Multi-view extension of any generalized eigenvalue based feature extraction method
• GMA + LDA = GMLDA
D = Between-class scatter matrix; S = Within-class scatter matrix
• GMA + MFA = GMMFA
D = Penalty Graph; S = Intrinsic Graph
• GMA + LPP = GMLPP
D = Identity; S = Graph Laplacian of the Similarity matrix
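Each GMA ingredient above reduces, in the single-view case, to a generalized eigenvalue problem D w = λ S w. The sketch below solves the GMLDA instance (between-class vs. within-class scatter) on toy two-class data; it illustrates the building block, not the multi-view GMA solver itself.

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(50, 3)) for m in (0.0, 3.0)])
y = np.repeat([0, 1], 50)

mu = X.mean(axis=0)
D = np.zeros((3, 3))                                 # between-class scatter
S = np.zeros((3, 3))                                 # within-class scatter
for c in (0, 1):
    Xc = X[y == c]
    diff = (Xc.mean(axis=0) - mu)[:, None]
    D += len(Xc) * diff @ diff.T
    S += (Xc - Xc.mean(axis=0)).T @ (Xc - Xc.mean(axis=0))

evals, evecs = eigh(D, S)        # generalized eigenproblem D w = lambda S w
w = evecs[:, -1]                 # top discriminant direction (largest lambda)
print(np.round(w / np.linalg.norm(w), 3))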
55. Pros and Cons
Cross-view classification and retrieval
Kernelizable
Closed form optimal solution
Supervised
Generalize to unseen classes
Domain agnostic
56. Pros and Cons
Still not ideal
Non-probabilistic
Shallow
Similar views across test and train
57. Final Picture
[Diagram: landscape of multi-view methods for paired data across VIEW 1 and VIEW 2 – CCA/PLS/BLM and GMA in latent spaces, SVM-2K/HMFDA in the original space, alongside the IDEAL method]
58. Experiments
Pose and Lighting Invariant face recognition
• 129 training subjects in 5 illuminations
• 129 test subjects (same identities, different session) in 18 illuminations
• 120 subjects in 5 illuminations
• 129 test subjects (different identities, different session) in 18 illuminations
59. Text-Image Retrieval
• Wiki pages (2173 + 693)
• 10 Different classes
• Latent Dirichlet Allocation Model based text features
• SIFT histogram based image features
• Precision-Recall based Mean Average Precision score
• SM – Semantic matching (domain-dependent approach)
• SCM – Semantic matching in CCA latent space (two-stage domain-dependent approach)
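For reference, a minimal implementation of the Mean Average Precision score listed above; the rankings here are random stand-ins for cross-modal retrieval results.

import numpy as np

# MAP: for each query, average precision at the ranks of its relevant items,
# then average over queries.
def average_precision(relevant, ranking):
    hits, precisions = 0, []
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / rank)
    return np.mean(precisions) if precisions else 0.0

rng = np.random.default_rng(0)
n_docs, n_queries = 100, 20
mAP = np.mean([
    average_precision(set(rng.choice(n_docs, 10, replace=False)),
                      rng.permutation(n_docs))
    for _ in range(n_queries)
])
print(round(float(mAP), 3))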
60. Future Directions
• Deep learning based feature extraction
• Large-scale Data collection
• Deep multi-view algorithms vs. a common deep network
• Unsupervised training
62. References
Tighe_1: J. Tighe and S. Lazebnik. Superparsing. Int. J. Comput. Vision, 101(2):329–349, 2013.
Tighe_2: J. Tighe and S. Lazebnik. Finding things: Image parsing with regions and per-exemplar detectors. IEEE CVPR, 2013.
Gould: S. Gould, R. Fulton, and D. Koller. Decomposing a scene into geometric and semantically consistent regions. IEEE ICCV, 2009.
Munoz: D. Munoz, J. A. Bagnell, and M. Hebert. Stacked hierarchical labeling. ECCV, 2010.
Kumar: M. P. Kumar and D. Koller. Efficiently selecting regions for scene understanding. IEEE CVPR, 2010.
Lempitsky: V. Lempitsky, A. Vedaldi, and A. Zisserman. A pylon model for semantic segmentation. NIPS, 2011.
Farabet: C. Farabet, C. Couprie, L. Najman, and Y. LeCun. Learning hierarchical features for scene labeling. IEEE TPAMI, August 2013.
Eigen: D. Eigen and R. Fergus. Nonparametric image parsing using adaptive neighbor sets. IEEE CVPR, 2012.
Joint: L. Ladický, P. Sturgess, C. Russell, S. Sengupta, Y. Bastanlar, W. Clocksin, and P. Torr. Joint optimization for object class segmentation and dense stereo reconstruction. International Journal of Computer Vision, 100(2):122–133, 2012.
Liu: C. Liu, J. Yuen, and A. Torralba. Nonparametric scene parsing via label transfer. IEEE TPAMI, 33(12), Dec 2011.
LiuSeg: M.-Y. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa. Entropy rate superpixel segmentation. IEEE CVPR, 2011.
Pinheiro: P. H. O. Pinheiro and R. Collobert. Recurrent convolutional neural networks for scene parsing. ICML, 2014.
Stixmantics: T. Scharwächter, M. Enzweiler, U. Franke, and S. Roth. Stixmantics: A medium-level model for real-time semantic scene understanding. ECCV, 2014.
Yang: J. Yang, B. Price, S. Cohen, and M.-H. Yang. Context driven scene parsing with attention to rare classes. CVPR, pages 3294–3301, 2014.