This thesis analyzes neural coding in the hippocampus using statistical modeling techniques. It summarizes experiments recording neural activity from place cells in rats navigating a T-maze. Generalized linear models are used to model each neuron's firing rate as a function of position and firing history. Goodness-of-fit tests show that many neurons are well modeled. An algorithm is derived to decode position from the ensemble activity using Bayes' rule and numerical integration of the exact posterior distribution. The models and decoding algorithm are applied to analyze the hippocampal population code.
1. A Principled Statistical Analysis of Discrete Context-Dependent Neural Coding
Yifei Huang
Thesis Advisor: Prof. Uri Eden
April 14th, 2010
2. Overview
• Basics of Neural Representations
• Point Process Modeling of Neural Systems
• Hippocampal Data Analyses
– Encoding
– Decoding
– Hypothesis Tests
– Other Topics
• Summary
3. Spikes are the Language of Neurons
• Neurons send information in the form of an electrical impulse that is termed a "spike".
[Figure: electrical recordings from a neuron; an electrode captures each electrical impulse (spike), yielding a spike train.]
Slide courtesy of Mike Prerau
4. Place cells in Hippocampus
[Figure: hippocampus in human and rat; firing activity of a single place cell plotted against x position (m) and y position (m).]
5. Neural Coding
[Diagram: input x (a biological or behavioral signal) drives a neural system, which produces spiking output N. Example of x and N: the hippocampal system.]
Challenge: construct a probability model of the relationship between the spike sequence and biological or behavioral variables x.
Point Process Model: p(N | x)
6. The Decoding Problem
[Diagram: spike trains from Neuron 1, Neuron 2, …, Neuron C, each with its own parameters, combine to recover the biological stimulus via p(x | N1, …, NC).]
Challenge: track the biological stimulus/behavior signal as it evolves, e.g. position.
7. Conditional Intensity Function (CIF) and the Likelihood Function
• Point processes are modeled with the CIF:
\lambda(t \mid H_t) = \lim_{\Delta t \to 0} \frac{\Pr(\text{spike in } (t, t + \Delta t) \mid H_t)}{\Delta t} = g(t, x(t), H_t; \theta)
• In discrete time the log-likelihood for observing a spike train is:
\log L = \sum_{k=1}^{K} \log[\lambda(t_k \mid N_{1:k}) \Delta t]\, \Delta N_k - \sum_{k=1}^{K} \lambda(t_k \mid N_{1:k}) \Delta t
Example of a discretized spike train:
    t:   t1  t2  t3  t4  t5  t6  t7
    ΔN:   0   0   1   0   0   0   1
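To make the discrete-time log-likelihood concrete, here is a minimal Python sketch (not from the thesis; the intensity values and bin width are illustrative assumptions):

```python
import numpy as np

def spike_train_log_likelihood(lam, dN, dt):
    """Discrete-time point process log-likelihood.

    lam : conditional intensity lambda(t_k | H_k) per bin, in spikes/s
    dN  : spike indicators (0/1) per bin
    dt  : bin width in seconds
    """
    lam = np.asarray(lam, float)
    dN = np.asarray(dN, float)
    # sum_k log[lambda_k * dt] * dN_k  -  sum_k lambda_k * dt
    return np.sum(np.log(lam * dt) * dN) - np.sum(lam * dt)

# The discretized spike train from the slide, under a constant 10 spikes/s rate
dN = np.array([0, 0, 1, 0, 0, 0, 1])
print(spike_train_log_likelihood(np.full(dN.shape, 10.0), dN, dt=1e-3))
```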
8. Experimental Paradigm and Data
A rat was trained to alternate between left / right turns on a T-maze.
- Recordings were made from 47 hippocampal place cells.
- Position data were recorded at 30 Hz (frames/sec).
[Figure: T-maze with decision point; example left-turn and right-turn trials.]
Data acquisition by M.P. Brandon and A.L. Griffin from Prof. Hasselmo's lab
Challenge: p(N | x, context), p(x | N1, …, NC), and prediction of the future turn direction
9. Encoding Analysis
[Figure: recorded firing activity of an individual neuron.]
10. Generalized Linear Models
• Assume that the firing activity in each neuron follows a point process with CIF:
\lambda(t \mid H_t) = \exp\left( \sum_{i=1}^{P} \theta_i g_i(x(t)) + \sum_{j=1}^{Q} \gamma_j \Delta N_{t-j} \right)
where the g_i(x(t)) are spline basis functions of position and the sum over j is the history-dependent component.
• Compute the ML estimates for [θ1, …, θP, γ1, …, γQ]
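A sketch of how such a fit could be set up in Python, assuming simulated stand-in data and Gaussian bumps in place of the spline basis g_i (the thesis's actual basis and data differ):

```python
import numpy as np
import statsmodels.api as sm

# Simulated stand-ins for the recorded data (assumptions, not the thesis data)
rng = np.random.default_rng(0)
T, Q = 5000, 5                             # time bins and history lags
x = np.cumsum(rng.normal(0.0, 1.0, T))     # toy 1-D position trace
dN = (rng.random(T) < 0.02).astype(float)  # toy spike indicators

# Position basis: Gaussian bumps standing in for the spline basis g_i(x)
centers = np.linspace(x.min(), x.max(), 8)
width = (x.max() - x.min()) / 8
G = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

# History covariates Delta N_{t-j}, j = 1..Q
H = np.column_stack([np.r_[np.zeros(j), dN[:-j]] for j in range(1, Q + 1)])

# Poisson GLM with log link: log E[dN] = const + G @ theta + H @ gamma
X = sm.add_constant(np.column_stack([G, H]))
fit = sm.GLM(dN, X, family=sm.families.Poisson()).fit()
print(fit.params)  # ML estimates [const, theta_1..theta_8, gamma_1..gamma_5]
```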
11. Encoding Results
\lambda(t \mid H_t) = \exp\left( \sum_{i=1}^{P} \theta_i g_i(x(t)) + \sum_{j=1}^{Q} \gamma_j \Delta N_{t-j} \right)
[Figure: estimated firing intensity as a function of position x(t), roughly −800 to 800 along the maze.]
12. Goodness-of-fit Results
Time-rescaling theorem (Meyer, 1969; Papangelou, 1972): under the fitted CIF, the rescaled interspike intervals satisfy
z_i = \int_{s_i}^{s_{i+1}} \lambda(t \mid H_t)\, dt \quad \sim \text{ i.i.d. } \exp(1)
u_i = 1 - \exp(-z_i) \quad \sim \text{ i.i.d. Uniform}[0, 1]
Kolmogorov-Smirnov statistic (Chakravarti et al., 1967):
KS = \max |F_{emp} - F|, \quad F(u) = u
[Figure: KS plots for the fitted model with Q = 0 (no history) and Q = 17 (history-dependent).]
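The time-rescaling check translates almost directly into code; a minimal sketch, assuming the fitted intensity is available as a function of time (the constant-rate example is purely illustrative):

```python
import numpy as np

def ks_time_rescaling(spike_times, lam_fn, dt=1e-3):
    """KS distance between rescaled ISIs and Uniform[0,1] (time-rescaling check).

    spike_times : sorted spike times in seconds
    lam_fn      : t -> conditional intensity in spikes/s (the fitted model)
    """
    # z_i = integral of lambda over each interspike interval (Riemann sum)
    z = np.array([np.sum(lam_fn(np.arange(s0, s1, dt))) * dt
                  for s0, s1 in zip(spike_times[:-1], spike_times[1:])])
    u = np.sort(1.0 - np.exp(-z))        # ~ Uniform[0,1] if the model is correct
    ecdf = (np.arange(1, u.size + 1) - 0.5) / u.size
    return np.max(np.abs(ecdf - u))      # KS = max |F_emp - F|, with F(u) = u

# Example: constant 20 spikes/s Poisson data, which should pass the check
rng = np.random.default_rng(1)
spikes = np.cumsum(rng.exponential(1 / 20.0, size=500))
print(ks_time_rescaling(spikes, lambda t: np.full_like(t, 20.0)))
```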
13. Ensemble results
• For an ensemble of 47 neurons:
– All neurons displayed highly specific position-dependent firing
– Frequent peaks at multiple locations along the maze
– 19 well fit by the inhomogeneous Poisson model as measured by the KS test
– 29 well fit by the history-dependent point process model
– Structure shown in the ACF reduced when a simple history component was added
14. From Encoding to Decoding
At the kth time step, for neuron 1:
\Pr(\Delta N_k^1 \mid x_k) \propto \exp(-\lambda_k^1) \cdot (\lambda_k^1)^{\Delta N_k^1}
and likewise Pr(ΔN_k^2 | x_k), …, Pr(ΔN_k^C | x_k) for the remaining neurons.
[Diagram: spike counts from Neuron 1, Neuron 2, …, Neuron C, each with its fitted parameters, combine to form the posterior p(x_k | ΔN_k^1, …, ΔN_k^C).]
15. Derivation of the Decoding Algorithm
• Bayes Rule
\rho_k := p(x_k \mid \Delta N_k^1, \ldots, \Delta N_k^C) \propto \prod_{c=1}^{C} \Pr(\Delta N_k^c \mid x_k)\, p(x_k)
• Chapman-Kolmogorov equation
p(x_k) = \int p(x_k \mid x_{k-1})\, \underbrace{p(x_{k-1} \mid \Delta N_{k-1}^1, \ldots, \Delta N_{k-1}^C)}_{\rho_{k-1}}\, dx_{k-1}
• Numerical integration of the exact posterior distribution
\rho_k \propto \prod_{c=1}^{C} \Pr(\Delta N_k^c \mid x_k) \int p(x_k \mid x_{k-1})\, \rho_{k-1}\, dx_{k-1}
16. Decoding Analysis
Point process filter derivation:
• State model - state transition probability p(x_k | x_{k-1}). With d_k := x_k − x_{k-1}, take p(d_k) ~ N(0, σ²) if the animal does not move through the connection point. Alternatively, an empirical state model: p(d_k) ~ N(f(x_{k-1}), σ²), where f(x_{k-1}) is the expected movement of the animal based on training data.
• Conditional intensity models - spline-based GLMs
• Decoding algorithm - numerical integration of the exact posterior distribution:
\rho_k \propto \prod_{c=1}^{C} \Pr(\Delta N_k^c \mid x_k) \int p(x_k \mid x_{k-1})\, \rho_{k-1}\, dx_{k-1}
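One filter step can be sketched on a discretized track as follows, under the Gaussian random-walk state model; the grid, place-field shapes, and parameters are illustrative assumptions, not the fitted thesis models:

```python
import numpy as np

def decode_step(rho_prev, grid, lam, dN, dt, sigma):
    """One step of the grid-based (exact, numerically integrated) decoder.

    rho_prev : posterior over the position grid at step k-1
    grid     : 1-D array of position bins (a discretized track)
    lam      : (C, n_bins) array of intensities lambda^c(x) from fitted GLMs
    dN       : length-C array of spike indicators for the current time bin
    dt, sigma: time-bin width (s) and random-walk step s.d. (state model)
    """
    # Chapman-Kolmogorov one-step prediction under a Gaussian random walk
    K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / sigma) ** 2)
    K /= K.sum(axis=1, keepdims=True)         # row-normalized transition kernel
    prior = K.T @ rho_prev
    # Poisson observation update, one factor per neuron
    like = np.prod(np.exp(-lam * dt) * (lam * dt) ** dN[:, None], axis=0)
    rho = like * prior
    return rho / rho.sum()

# Toy usage: two hypothetical place cells on a linear track
grid = np.linspace(0.0, 100.0, 200)
lam = np.stack([20 * np.exp(-0.5 * ((grid - c) / 8) ** 2) for c in (30.0, 70.0)])
rho = np.full(grid.size, 1.0 / grid.size)     # flat initial posterior
rho = decode_step(rho, grid, lam, np.array([1, 0]), dt=1 / 30, sigma=2.0)
print(grid[np.argmax(rho)])                   # MAP position estimate
```

Iterating decode_step over time bins and carrying rho forward implements the Chapman-Kolmogorov recursion exactly on the grid.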
20. Hypothesis Tests for Differential Firing
Motivation: which neurons fire differently under two discrete contexts (left vs. right trial)?
[Figure: recorded data for an individual neuron; spiking activity before different turns.]
21. Tests for Differential Firing
• Classic approach:
1. Break the stem into 4-7 equally sized spatial bins.
2. Perform 2-way ANOVA on space and context.
3. Look for significance in context or interaction terms.
22. Tests for Differential Firing
• ANOVA issues:
– Doesn’t capture spiking structure
– Asymptotic requirements are often not met
– Stationarity assumption
– Highly sensitive to number of bins
– Surprising previous decoding analysis
• Alternate approach:
– Tests based on point process models with established goodness-of-fit procedures
23. Testing for a splitter is equivalent to:
H0: λL(x) = λR(x) = λ0(x)
Ha: λL(x) ≠ λR(x)
[Figure: estimated rate functions λ̂L(x) and λ̂R(x), and the common fit λ̂0(x).]
24. Test Statistics for Differential Firing
• Integrated squared error statistic:
ISE = \int_0^D \left[ \left( \hat\lambda_L(x) - \hat\lambda_0(x) \right)^2 + \left( \hat\lambda_R(x) - \hat\lambda_0(x) \right)^2 \right] dx
• Maximum difference statistic:
MD = \max_{x \in [0, D]} \left| \hat\lambda_L(x) - \hat\lambda_R(x) \right|
• Likelihood ratio statistic:
W = -2 \log \frac{L_0}{L_1}
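Once the rate estimates and maximized log-likelihoods are in hand, the three statistics are one-liners on a uniform spatial grid; a sketch (function name and inputs are assumptions):

```python
import numpy as np

def differential_firing_statistics(lam_L, lam_R, lam_0, x, logL0, logL1):
    """ISE, MD, and LR statistics from rate estimates on a spatial grid x.

    lam_L, lam_R : estimated context-specific rates lambda_hat_L/R(x)
    lam_0        : common fit lambda_hat_0(x), irrespective of context
    logL0, logL1 : maximized log-likelihoods under H0 and Ha
    """
    dx = x[1] - x[0]
    ise = np.sum((lam_L - lam_0) ** 2 + (lam_R - lam_0) ** 2) * dx
    md = np.max(np.abs(lam_L - lam_R))
    w = -2.0 * (logL0 - logL1)          # W = -2 log(L0 / L1)
    return ise, md, w
```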
25. Computing Sampling Distributions
• Nonparametric bootstrap:
– Permute/sample from trial labels to construct surrogates for each context.
• Parametric bootstrap:
– Generate spikes according to λ̂0(x) for each context.
– Alternative: sample λ̂0i(x) based on the estimated model covariance, then generate spikes.
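A sketch of the nonparametric (permutation) version, where stat_fn would refit the context-specific rates to each surrogate labeling and return, e.g., the ISE or MD statistic (all names here are assumptions; the toy usage uses a simple count statistic):

```python
import numpy as np

def permutation_null(stat_fn, trials, labels, n_perm=1000, seed=0):
    """Permutation null distribution for a differential-firing statistic.

    stat_fn : callable(trials, labels) -> scalar test statistic
    trials  : per-trial spike data (any structure stat_fn accepts)
    labels  : array of context labels ('L'/'R') per trial
    """
    rng = np.random.default_rng(seed)
    observed = stat_fn(trials, labels)
    null = np.array([stat_fn(trials, rng.permutation(labels))
                     for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p_value

# Toy usage: trials summarized by spike counts, statistic = |mean_L - mean_R|
counts = np.array([12, 15, 9, 22, 25, 19, 11, 24])
labels = np.array(['L', 'L', 'L', 'L', 'R', 'R', 'R', 'R'])
stat = lambda c, lab: abs(c[lab == 'L'].mean() - c[lab == 'R'].mean())
print(permutation_null(stat, counts, labels)[2])
```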
26. Asymptotic distribution of W
– The asymptotic distribution of the LR test statistic:
W = -2 \log \frac{L_0}{L_1} \sim \chi_p^2
where p is the number of constrained parameters under H0.
– The asymptotic result holds as the data size goes to infinity; a simulation study shows fast convergence.
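Under this asymptotic null, the p-value is a single χ² survival-function call (the numbers below are toy values for illustration):

```python
from scipy.stats import chi2

def lr_p_value(W, p):
    """Asymptotic p-value for the LR statistic: W ~ chi^2 with p degrees of freedom."""
    return chi2.sf(W, df=p)

print(lr_p_value(18.3, 4))  # toy values: W = 18.3, p = 4 constrained parameters
```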
27. Simulation Data
\lambda_L(x) = \exp\left( -\frac{(x - 245/2)^2}{2 \sigma_L^2} \right), \quad \lambda_R(x) = C \times \exp\left( -\frac{(x - 245/2)^2}{2 \sigma_R^2} \right)
\sigma_L = 20, \quad \sigma_R = 31
[Figure: simulated and estimated rates over the track (0 to 245); solid blue curves represent the estimated firing rates; a common fit is shown irrespective of context.]
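Spike trains under these rate profiles can be generated bin-wise from an inhomogeneous Poisson model; a sketch assuming a constant-speed traversal and an illustrative peak rate, neither of which is specified on the slide:

```python
import numpy as np

def simulate_trial(profile, track_len=245.0, speed=50.0, dt=1e-3, peak=15.0, seed=0):
    """Simulate one track traversal under an inhomogeneous Poisson model.

    The animal is assumed to move at constant speed, so x(t) = speed * t;
    a spike falls in each time bin with probability lambda(x) * dt.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, track_len / speed, dt)
    x = speed * t
    lam = peak * profile(x)                   # scale the unit-height profile
    return x[rng.random(t.size) < lam * dt]   # positions at which spikes occur

# Gaussian rate profiles from the slide (sigma_L = 20, sigma_R = 31)
gauss = lambda x, s: np.exp(-((x - 245.0 / 2) ** 2) / (2 * s ** 2))
spikes_L = simulate_trial(lambda x: gauss(x, 20.0), seed=1)
spikes_R = simulate_trial(lambda x: gauss(x, 31.0), seed=2)
print(len(spikes_L), len(spikes_R))
```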
28. Test Results for Simulated Data
(σL = 20 in both simulation conditions)
Test             | σR = 31 p-value                                 | σR = 34 p-value
ANOVA (4 bins)   | Main effect: 0.865; Interaction effect: 0.202   | Main effect: 0.706; Interaction effect: 0.728
ANOVA (6 bins)   | Main effect: 0.995; Interaction effect: 0.225   | Main effect: 0.729; Interaction effect: 0.001
ISE              | 0.021                                           | 0.003
MD               | 0.030                                           | 0.001
LR (Asymp. χ2)   | 0.002                                           | <0.001
29. Back to the Real Data Example…
Empirical and asymptotic distributions of the LR test statistic:
W = -2 \log \frac{L_0}{L_1} \sim \chi_p^2
[Figure: estimated rates λ̂1(x) and λ̂2(x) with the common fit λ̂0(x); histogram of bootstrap samples of W (yellow bars) against the χ²4 density (blue curve).]
30. Real Data Results
[Figure: recorded data for an individual neuron.]
Test             | p-value
ANOVA (4 bins)   | Main effect: 0.872; Interaction effect: 0.461
ANOVA (6 bins)   | Main effect: 0.790; Interaction effect: 0.843
ISE              | <0.001
MD               | <0.001
LR (Asymp. χ2)   | <0.001
31. Summary of Test Results
• From simulation study
– Tests based on point process models tend to be more powerful and robust compared to ANOVA
• From real data analysis
– The three proposed tests can capture differential firing based on the fine structure of the data
– Simple point process models were able to detect differential firing in a population for which no splitting behavior was previously identified
32. Other Topics
Relationships between spikes and neural oscillations: theta rhythmicity & theta precession.
[Figure: position x(t) and theta phase φ(t) (180 to 720 degrees) over time.]
Point process models:
\lambda(t \mid H_t) = g(t, x(t), \phi(t), H_t; \theta)
33. Conclusions
• Theory/methods developed
– Model identification paradigm for the firing activity of hippocampal neurons
– Exact decoding on general topologies
– Hypothesis test framework
• Understanding of the brain
– Decoding results suggest the neuron ensemble from hippocampus contains information about the future turn direction
– Theta rhythmicity is an essential component of hippocampal neural firing
34. Acknowledgements
Advisor: Uri Eden
Experimental Data: Mark Brandon, Amy Griffin, Professor Michael Hasselmo
Lab Members: Michael Prerau, Eugene Zaydens, Liang Meng, Kyle Lepage
Committee Members: Ashis Gangopadhyay, Michael Hasselmo, Dan Weiner, Kostas Kardaras