Finally, we will talk about Artificial Neural Networks (ANN). In this part, we will focus on the building blocks of ANNs. We will describe the perceptron and its learning rules, and discuss the gradient descent algorithm in more detail.
We will also define an important concept in ANNs: the back-propagation algorithm. For our examples we will use two libraries: neurolab and scikit-learn.
[Notebook](https://colab.research.google.com/drive/1CqYwu9NzeXuUNeR8RmkDplUd71DHWNod)
2. Plan
● 1- Introduction to ANN
● 2- Perceptron
● 3- Single Layer Perceptron
● 4- Single Layer Perceptron examples
● 5- Multi Layer Perceptron
● 6- ANN topologies
3. 1- Introduction to ANN: Concept
● The idea of an Artificial Neural Network (ANN) is to build a model based on the way the human brain learns new things.
● It can be used in any type of machine learning. It learns by extracting the different underlying patterns in a given data set.
● This extraction is performed in stages, called layers. Each layer, composed of a set of neurons, will identify a certain pattern. The following layer will identify another, more complex pattern from its previous layer.
● The most common architecture is as follows: a first layer has the training data as input; it is called the input layer. In the last one, the outputs of the neurons are the final output; it is called the output layer. The layers in between are called hidden layers.
● From now on, the term neural network will mean artificial neural network.
4. 1- Introduction to ANN: Neurolab
● Neurolab is a neural network library for Python. It supports several types of neural networks.
● For installation: pip install neurolab
● Like other machine learning techniques, a neural network needs to be trained, can be tested, and will be used to predict results.
● Here is an example of how to use neurolab to create a neural network and perform the afore-mentioned tasks. In the example: the samples are uniformly distributed from -0.5 (included) to 0.5 (excluded); the network has an input layer with 2 neurons, a hidden layer with 5 neurons, and an output layer with 1 neuron; the length of the outer list passed to the network constructor equals the number of neurons of the input layer.
5. 1- Introduction to ANN: Neurolab example details
● Data: 10 samples described by 2 features. The labels are the sum of the 2 features. In fact, the NN tries to model the sum function for values ranging from -0.5 to 0.5.
● After creating the data, the steps were:
➢ Create an instance of a neural network with the specified numbers of layers and neurons (nl.net.newff)
➢ Train the neural network (myNN.train)
➢ Predict the output for the value [0.2, 0.1] (myNN.sim)
➢ Compute the test error (the true label is known: 0.2 + 0.1)
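The original code appears only as a screenshot in the slides, so here is a minimal sketch reconstructing it from the annotations above; the training parameters (epochs, show, goal) are assumed values, not the ones from the deck:

```python
import numpy as np
import neurolab as nl

# 10 samples with 2 features each, uniformly distributed in [-0.5, 0.5);
# the label of each sample is the sum of its two features.
inp = np.random.uniform(-0.5, 0.5, (10, 2))
tar = inp.sum(axis=1).reshape(10, 1)

# Feed-forward net: the outer list has one [min, max] pair per input neuron;
# then a hidden layer with 5 neurons and an output layer with 1 neuron.
myNN = nl.net.newff([[-0.5, 0.5], [-0.5, 0.5]], [5, 1])

# Train, then predict the output for [0.2, 0.1].
myNN.train(inp, tar, epochs=500, show=100, goal=0.01)
pred = myNN.sim([[0.2, 0.1]])
test_error = abs((0.2 + 0.1) - pred[0][0])  # the true label is known: 0.2 + 0.1
print(pred, test_error)
```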
6. 2- Perceptron: Definition
● The term Perceptron refers to an input layer of data feature values, with forward weighted connections to an output layer of one single neuron, or of multiple neurons.
● One of the simplest forms of a neuron is an LTU.
● An LTU, for Linear Threshold Unit, is a component (neuron) that:
➢ computes a weighted sum of its inputs: a linear function
➢ applies a step function to the resulting sum, and outputs the result
[Figure: an LTU. The input neurons $x_1, x_2, \dots, x_n$, together with a bias neuron, are connected through the weights $w_0, w_1, \dots, w_n$ to one neuron that computes $Z = w^T X = \sum_{i=0}^{n} w_i x_i$ and produces one output value $h_w(x) = \mathrm{step}(Z)$.]
The bias neuron is used to allow a linear function to be written as a matrix multiplication: its multiplication by the corresponding weight $w_0$ will represent the intercept value $b$ in $(a_1 x_1 + a_2 x_2 + \dots + a_n x_n + b)$.
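To make the definition concrete, here is a minimal LTU in Python; the input and weight values are made up for illustration:

```python
import numpy as np

def ltu(x, w):
    """Linear Threshold Unit: a weighted sum followed by a Heaviside step."""
    z = np.dot(w, x)            # propagation function: Z = w^T x
    return 1 if z >= 0 else 0   # threshold activation: step(Z)

x = np.array([1.0, 0.4, -0.2])  # [bias input x0 = 1, x1, x2] (example values)
w = np.array([-0.1, 0.6, 0.3])  # [w0 (the intercept b), w1, w2] (example weights)
print(ltu(x, w))                # -> 1, since Z = -0.1 + 0.24 - 0.06 = 0.08 >= 0
```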
7. 2- Perceptron: The functions
● The weighted sum function is also called the propagation function.
● The step function can be:
➢ A non-linear function; in this case, it is called the threshold activation function (this is the case in an LTU). For example:
➔ the Heaviside step function: $\mathrm{heaviside}(z) = \begin{cases} 0 & \text{if } z < 0 \\ 1 & \text{if } z \ge 0 \end{cases}$
➔ the sign function: $\mathrm{sgn}(z) = \begin{cases} -1 & \text{if } z < 0 \\ 0 & \text{if } z = 0 \\ 1 & \text{if } z > 0 \end{cases}$
➢ A linear function, simply called an activation function. For example the identity function: the value computed by the propagation function is the output value of the neuron.
➢ A semi-linear function, that is monotonic and differentiable, also called an activation function.
8. 2- Perceptron: Single Layer Perceptron
● A Single Layer Perceptron (SLP) is simply a Perceptron with only one layer (without counting the input layer).
● So, it is composed of an input layer and an output layer. The latter can have one or more outputs, so it can be used for binary and for multi-output classification.
● Considering ANNs in general, a Perceptron is considered a feedforward neural network. We are going to talk about this in the next section.
● An SLP can apply 2 different kinds of rules to learn: the perceptron rule, or the delta rule. Each of the rules is associated with a certain type of activation function. To apply the delta rule, we need the activation function to be differentiable.
9. 3- Single Layer Perceptron: SLP Learning
● SLP learning with the Perceptron rule (a variant of Hebb's rule): uses a threshold activation function; the learning algorithm has the same behavior as a stochastic gradient descent algorithm.
● SLP learning with the delta rule (Widrow-Hoff rule): uses a linear or semi-linear activation function; offline training corresponds to batch gradient descent, and online (or sequential) training to the stochastic gradient descent algorithm.
10. 3- Single Layer Perceptron: SLP Learning: Perceptron Rule
● To update the weights, the following formula is used:
$$w_{i,j}^{(\text{next step})} = w_{i,j} + \eta\,(y_j - \hat{y}_j)\,x_i$$
where $w_{i,j}$ is the weight of the connection between the input node $i$ and the output node $j$ for the current training sample, $w_{i,j}^{(\text{next step})}$ is that weight for the next training sample, $\eta$ is the learning rate, $y_j$ is the true value at the node $j$ for the current training sample, $\hat{y}_j$ is the predicted value at the node $j$ for the current training sample, and $x_i$ is the $i$th input value ($i$th feature) of the current training sample.
● The concept is that each wrong prediction reinforces the weight corresponding to the feature that would have contributed to the correct prediction. The computation of the weights is repeated until the samples are classified correctly.
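As an illustration, here is a sketch of the perceptron rule on a toy, linearly separable problem (the AND function); the data and the learning rate are made up:

```python
import numpy as np

def step(z):
    return np.where(z >= 0, 1, 0)   # Heaviside threshold activation

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # toy inputs
y = np.array([0, 0, 0, 1])                              # AND labels
Xb = np.hstack([np.ones((4, 1)), X])                    # prepend bias input x0 = 1

w, eta = np.zeros(3), 0.1
for epoch in range(20):               # repeat until the samples are classified correctly
    for xi, yi in zip(Xb, y):
        y_hat = step(xi @ w)          # predict with the current weights
        w += eta * (yi - y_hat) * xi  # w_i <- w_i + eta * (y_j - y_hat_j) * x_i
print(w, step(Xb @ w))                # learned weights and final predictions
```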
11. 3- Single Layer Perceptron: Gradient Descent
● The concept of gradient descent is:
➢ With initial parameters of a model, predict an output value.
➢ Compute the gradient of the "error" ("loss") function (a function of the parameters of the learning model) at a certain point: the slope of the surface of that function at that point, calculated by its derivative at that point.
➢ Update the parameters in order to find the local minima, by a step proportional to the negative of that gradient (opposite direction ⇒ toward the local minimum of the function). In the case of:
➔ stochastic gradient descent: with one sample, predict → update the parameters for the next sample → predict with the next sample with the new parameters;
➔ batch gradient descent: predict for all samples (= 1 epoch) → update the parameters → predict again with the new parameters.
➢ Repeat the process in order to minimize the error.
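A tiny numeric illustration of these steps, on a one-parameter loss chosen only for simplicity:

```python
# Gradient descent on E(w) = (w - 3)^2, whose derivative is 2 * (w - 3).
eta, w = 0.1, 0.0
for _ in range(50):
    grad = 2 * (w - 3)   # gradient of the loss at the current point
    w -= eta * grad      # step proportional to the negative of the gradient
print(w)                 # converges toward the minimum at w = 3
```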
12. 3- Single Layer Perceptron: SLP Learning: Delta Rule
● With the activation function being linear, or semi-linear but differentiable, gradient descent is used to update the weights.
● The weights are updated as follows ($w_{i,j}^{\text{next}} = w_{i,j} + \Delta w_{i,j}$):
➢ In general:
$$\Delta w = -\eta \cdot \frac{\partial E}{\partial w}$$
➢ In the case of a linear activation function and a sum-squared error function:
➔ In online training, for a given sample:
$$\Delta w_{i,j} = \eta \cdot x_i \cdot (y_j - \hat{y}_j) = \eta \cdot x_i \cdot \delta_j$$
➔ In offline training:
$$\Delta w_{i,j} = \eta \cdot \sum_{s \in X} x_i^{s} \cdot (y_j^{s} - \hat{y}_j^{s}) = \eta \cdot \sum_{s \in X} x_i^{s} \cdot \delta_j^{s}$$
This is why it is called the delta rule. Here $\eta$ is the learning rate, $y_j^s$ is the true label that the output neuron $j$ should have for the sample $s$, $\hat{y}_j^s$ is the predicted label of the output neuron $j$ for the sample $s$, and $x_i^s$ is the value of the input feature $i$ of the sample $s$. The difference with the perceptron rule is the type of activation function used.
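Here is a sketch of the online delta rule with a linear (identity) activation, reusing the "sum of two features" toy task from the neurolab example; the learning rate and epoch count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-0.5, 0.5, (10, 2))
y = X.sum(axis=1)              # true labels: x1 + x2

w, eta = np.zeros(2), 0.1
for epoch in range(100):       # online training: update after every sample
    for xs, ys in zip(X, y):
        y_hat = xs @ w         # identity activation: the output is the weighted sum
        delta = ys - y_hat     # delta_j = y_j - y_hat_j
        w += eta * delta * xs  # delta rule: Δw_{i,j} = eta * x_i * delta_j
print(w)                       # approaches [1, 1], i.e. the sum function
```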
13. 4- Single Layer Perceptron examples: With sklearn, perceptron rule
● eta0 = 0.1 → learning rate = 0.1
● max_iter = 5000 → maximum number of iterations if there is no convergence
● shuffle = False → the samples are not shuffled after a weight is updated for the next sample
● When, instead, the samples were shuffled after each weight update, only 20 iterations were sufficient to reach convergence.
● sklearn uses the Hinge loss function with threshold = 0 to compute the error of the prediction.
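The deck shows this example only as a screenshot, so here is a hedged reconstruction from the callouts above; the dataset choice (Iris) is an assumption based on the 4-feature, 3-class setup used in the later neurolab example:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The parameters described by the callouts above.
clf = Perceptron(eta0=0.1, max_iter=5000, shuffle=False)
clf.fit(X_tr, y_tr)
print(clf.n_iter_, clf.score(X_te, y_te))  # iterations actually run, test accuracy
```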
14. 4- Single Layer Perceptron examples: With sklearn, perceptron rule (continued)
● Use of the stochastic gradient descent classifier class (SGDClassifier).
● Since there are 3 classes, and since the classification strategy used is one-vs-all, the classifier will run 3 times, each time with max epochs = 20.
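A hedged sketch of that setup, reusing the Iris split from the previous sketch: with loss="perceptron" and a constant learning rate, SGDClassifier mimics the perceptron rule (the exact parameters of the original screenshot are unknown; these are assumptions):

```python
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="perceptron", learning_rate="constant", eta0=0.1,
                    max_iter=20)  # max epochs = 20
clf.fit(X_tr, y_tr)     # with 3 classes, one-vs-all fits 3 binary classifiers
print(clf.coef_.shape)  # -> (3, 4): one weight vector per class
```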
15. 4- Single Layer Perceptron examples: Example with neurolab, delta rule
● Since we have 4 features, we need an input layer with 4 neurons. And since we have 3 classes, and since we will use the SoftMax activation function (ideal for multi-class classification), we will need 3 output neurons:
➢ Class 0 coded as 1 0 0
➢ Class 1 coded as 0 1 0
➢ Class 2 coded as 0 0 1
NB: the classes are not coded in a binary code.
● The corresponding formula of the SoftMax function is:
$$f(x_i) = \frac{e^{x_i}}{\sum_{j=0}^{N} e^{x_j}}$$
where $N$ is the number of output neurons.
16. 4- Single Layer Perceptron examples: Example with neurolab, delta rule (continued)
● Third class → third column = 1.
● Select the index corresponding to the maximum probability as the predicted class label.
● This implementation in neurolab uses the sum squared error as cost function, and it doesn't use the derivative of the activation function used.
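Since the original code is only a screenshot, here is a hedged sketch of the setup described above; the dataset (Iris) and the training parameters (epochs, show, lr) are assumptions:

```python
import numpy as np
import neurolab as nl
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
T = np.eye(3)[y]  # one-hot targets: class 2 -> [0, 0, 1] (third column = 1)

# One [min, max] pair per input feature; 3 output neurons with SoftMax.
minmax = [[X[:, i].min(), X[:, i].max()] for i in range(4)]
net = nl.net.newp(minmax, 3, transf=nl.trans.SoftMax())  # single-layer perceptron
net.trainf = nl.train.train_delta                        # train with the delta rule
net.train(X, T, epochs=100, show=20, lr=0.01)

pred = net.sim(X).argmax(axis=1)  # index of the maximum probability = class label
print((pred == y).mean())         # training accuracy
```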
17. 5- Multi Layer Perceptron: Multi-Layer Perceptron
● An MLP (Multi-Layer Perceptron) is a Perceptron with one or more hidden layers.
● It is another feed-forward artificial neural network. Each of the layers (except the output layer) includes a bias neuron.
● An ANN with more than one hidden layer is a Deep Neural Network (DNN).
● Let's consider this MLP: an input layer with 4 neurons, a second (hidden) layer with 2 neurons (for both the first and second layer, the bias neurons are not counted). The last layer is composed of 3 neurons:
[Figure: the 4-2-3 MLP. Each hidden neuron computes $Z_1 = w_1^T X$ or $Z_2 = w_2^T X$ and outputs $\mathrm{step}_1(Z_1)$ or $\mathrm{step}_1(Z_2)$; collecting the hidden outputs (plus a bias neuron) in $h$, the output neurons compute $Z_3 = w_3^T h$, $Z_4 = w_4^T h$, $Z_5 = w_5^T h$ and output $\mathrm{step}_2(Z_3)$, $\mathrm{step}_2(Z_4)$, $\mathrm{step}_2(Z_5)$. Signals flow from the inputs to the outputs during forward propagation, and errors flow in the opposite direction during backpropagation.]
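A minimal forward pass for this 4-2-3 MLP in numpy; the weights are random and tanh/softmax are chosen here as example step1/step2 functions:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 5))  # hidden layer: 2 neurons, each with 4 inputs + bias
W2 = rng.normal(size=(3, 3))  # output layer: 3 neurons, each with 2 hidden inputs + bias

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.array([1.0, 0.2, -0.1, 0.4, 0.3])  # bias 1 prepended to the 4 features
h = np.tanh(W1 @ x)                       # step1(Z1), step1(Z2)
h = np.concatenate(([1.0], h))            # bias neuron of the hidden layer
y_hat = softmax(W2 @ h)                   # step2(Z3), step2(Z4), step2(Z5)
print(y_hat)                              # 3 outputs that sum to 1
```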
18. 5- Multi Layer Perceptron: Backpropagation
● It is a generalization of the delta rule. After a forward pass, a backward pass is applied to update the weights and back-propagate the errors, using the gradient descent procedure.
● These forward/backward passes are repeated until the error function is minimized.
● The formula (of the generalized delta rule) is:
$$\Delta w_{k,h} = \eta\, o_k\, \delta_h$$
where $\Delta w_{k,h}$ is the update value for the weight of the connection between the neuron $k$ and the neuron $h$, $o_k$ is the output of the neuron $k$, and
$$\delta_h = \begin{cases} f'_{\text{act}}(\mathrm{net}_h)\cdot(y_h - \hat{y}_h) & h \text{ is an output neuron} \\ f'_{\text{act}}(\mathrm{net}_h)\cdot\sum_{l\in L} \delta_l\, w_{h,l} & h \text{ is a neuron of a hidden layer} \end{cases}$$
Here $f'_{\text{act}}$ is the derivative of the activation function of the neuron $h$, $\mathrm{net}_h$ is the input of the neuron $h$ (the result of the propagation function at $h$), $y_h$ is the true label, $\hat{y}_h$ is the predicted label, and $L$ is the set of neurons of the layer following $h$: the connection with the weight to update goes from $k$ to $h$, and for a hidden $h$ the deltas of the following neurons $l_1, l_2, l_3 \in L$ are propagated back to it.
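Continuing the forward-pass sketch above, here is one application of the generalized delta rule; it assumes tanh hidden activations (so $f'(\mathrm{net}) = 1 - \tanh(\mathrm{net})^2$) and, as a simplification, takes the output derivative as 1:

```python
y_true = np.array([0.0, 1.0, 0.0])  # one-hot target (example)
eta = 0.1

delta_out = y_true - y_hat  # delta_h for the output neurons (output derivative taken as 1)

# Hidden deltas: back-propagate the output deltas through W2 (bias column dropped).
net_hidden = W1 @ x
delta_hid = (1 - np.tanh(net_hidden) ** 2) * (W2[:, 1:].T @ delta_out)

W2 += eta * np.outer(delta_out, h)  # Δw_{k,h} = eta * o_k * delta_h (output layer)
W1 += eta * np.outer(delta_hid, x)  # same rule applied to the hidden layer
```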
19. 5- Multi Layer Perceptron: Example
● In the example, one activation function is set for the second (first hidden) layer, and another for the third (output) layer.
● The TanSig formula is:
$$y = \frac{2}{1+e^{-2n}} - 1$$
● The training accuracy is equal to 1.0 (100%), but with the test data we got worse results than with the SLP ⇒ over-fitting.
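A hedged sketch of such an MLP in neurolab, reusing X, T, y and minmax from the single-layer sketch; the hidden layer uses TanSig and the output layer SoftMax, while the layer sizes and training parameters are assumptions:

```python
net = nl.net.newff(minmax, [2, 3],
                   transf=[nl.trans.TanSig(), nl.trans.SoftMax()])
net.train(X, T, epochs=500, show=100, goal=0.01)
pred = net.sim(X).argmax(axis=1)
print((pred == y).mean())  # a perfect training score with worse test scores => over-fitting
```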
20. 6- ANN topologies: Feed-Forward Neural Networks
● We have already seen feedforward neural networks (SLP and MLP):
● One input layer + n hidden layers + one output layer (n ≥ 1).
● Connections are only allowed to neurons of the following layer.
● They can have shortcut connections: connections that are not set to the following layer, but to subsequent layers.
[Figure: a 7-neuron feed-forward network (neurons 1-2, 3-4, and 5-7 form its layers) and the corresponding Hinton diagram, in which the white squares correspond to the existing connections; the first neuron, for example, has only connections to the third and the fourth neuron.]
21. 6- ANN topologies: Recurrent Networks
● Direct recurrence: multiple layers, with connections allowed to neurons of the following layers; a neuron can also be connected to itself.
● Indirect recurrence: multiple layers, with connections allowed to neurons of the following layers; connections are also allowed between neurons and their preceding layers.
[Figure: two 7-neuron networks illustrating direct and indirect recurrence; for visualization purposes, the recurrent connections are represented by smaller squares in the Hinton diagrams.]
22. 6- ANN topologies: Fully Connected Neural Network
● Multiple layers, with connections allowed from any neuron to any other neuron.
● Direct recurrence is not allowed.
● Connections must be symmetric.
23. References
● Prateek Joshi. Artificial Intelligence with Python. Packt Publishing, 2017.
● Jake VanderPlas. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, Inc, 2017.
● Neural Networks – II, Version 2, CSE IIT Kharagpur, available at: https://nptel.ac.in/courses/106105078/pdf/Lesson%2038.pdf
● Isaac Changhau. Loss Functions in Neural Networks, 2017, available at: https://isaacchanghau.github.io/post/loss_functions/
● Sebastian Seung. The Delta Rule, MIT Department of Brain and Cognitive Sciences, Introduction to Neural Networks, 2005, available at: https://ocw.mit.edu/courses/brain-and-cognitive-sciences/9-641j-introduction-to-neural-networks-spring-2005/lecture-notes/lec19_delta.pdf
● NeuPy, Neural Networks in Python, http://neupy.com/pages/home.html