Optimization of Number of Neurons in the Hidden Layer in Feed Forward Neural ... - IJERA Editor
The architecture of an Artificial Neural Network (ANN) depends on the problem domain; it is applied during the 'training phase' on sample data and then used to infer results for the remaining data in the testing phase. Normally the architecture consists of three layers: an input layer, with one node for each known value on hand; an output layer, with one node for each result to be computed from the values of the input and hidden nodes; and a hidden layer. The number of nodes in the hidden layer is decided heuristically, so that the optimum value is obtained in a reasonable number of iterations with the other parameters at their default values. This study focuses on Cascade-Correlation Neural Networks (CCNN) trained with the Back-Propagation (BP) algorithm, which determine the number of hidden neurons during the training phase itself by appending one neuron per iteration until the error condition is satisfied, giving a promising result for the optimum number of neurons in the hidden layer.
How to create a neural network that detects people wearing masks: a complete, A-to-Z workflow for building a neural network that recognizes images.
A short intro to the paper: https://blog.fulcrum.rocks/neural-network-image-recognition
An Artificial Intelligence Approach to Ultra High Frequency Path Loss Modelling - ijtsrd
This study proposes Artificial Intelligence (AI)-based path loss prediction models for the suburban areas of Abuja, Nigeria. The AI-based models were created on the basis of two deep learning networks, namely the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Generalized Radial Basis Function Neural Network (RBF-NN). These prediction models were created, trained, validated and tested for path loss prediction using path loss data recorded at 1800 MHz from multiple Base Transceiver Stations (BTSs) distributed across the areas under investigation. Results indicate that the ANFIS- and RBF-NN-based models, with Root Mean Squared Error (RMSE) values of 5.30 dB and 5.31 dB respectively, offer greater prediction accuracy than the widely used empirical COST 231 Hata model, which has an RMSE of 8.18 dB. Deme C. Abraham, "An Artificial Intelligence Approach to Ultra-High Frequency Path Loss Modelling of the Suburban Areas of Abuja, Nigeria", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4, Issue-2, February 2020.
URL: https://www.ijtsrd.com/papers/ijtsrd30227.pdf
Paper Url : https://www.ijtsrd.com/computer-science/artificial-intelligence/30227/an-artificial-intelligence-approach-to-ultra-high-frequency-path-loss-modelling-of-the-suburban-areas-of-abuja-nigeria/deme-c-abraham
Simulation of Single and Multilayer Artificial Neural Networks using Verilog - ijsrd.com
Artificial neural networks play an important role in VLSI circuits for finding and diagnosing multiple faults in digital circuits. In this paper, examples of single-layer and multi-layer neural networks are first discussed; those structures are then implemented in Verilog code, and the same idea is implemented in MATLAB to obtain the number of iterations, while the Verilog code gives the time taken to adjust the weights until the error becomes almost zero. The proposal aims at reducing resource requirements, without much compromise on speed, so that the neural network can be realized on a single chip at lower cost.
In this deck, Huihuo Zheng from Argonne National Laboratory presents: Data Parallel Deep Learning.
"The Argonne Training Program on Extreme-Scale Computing (ATPESC) provides intensive, two weeks of training on the key skills, approaches, and tools to design, implement, and execute computational science and engineering applications on current high-end computing systems and the leadership-class computing systems of the future."
Watch the video: https://wp.me/p3RLHQ-lsl
Learn more: https://extremecomputingtraining.anl.gov/archive/atpesc-2019/agenda-2019/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...) - Anirbit Mukherjee
This is a slightly expanded version of the talk I gave at the 2018 ISMP (International Symposium on Mathematical Programming). This SIAM talk has some more introductory material than the ISMP talk.
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems - Anirbit Mukherjee
This survey talk at the Vector Institute is a much more extended version of my overview talks at the ISMP 2018 and the SIAM Annual Meeting 2018. This gives a lot more details about background concepts and proof strategies.
In this deck, Pieter Abbeel from UC Berkeley describes his group's research into making robots learn.
Watch the video: https://wp.me/p3RLHQ-hf7
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Improving Artificial Neural Networks' Performance by Using GPUs: A Survey - csandit
In this paper we study the improvement in the performance of Artificial Neural Networks (ANN) obtained by using parallel programming on GPU or FPGA architectures. It is well known that ANNs can be parallelized according to particular characteristics of the training algorithm. We discuss both approaches: software (GPU) and hardware (FPGA). Different training strategies are discussed: the Perceptron training unit, Support Vector Machines (SVM), and Spiking Neural Networks (SNN). The approaches are evaluated by training speed and performance. The algorithms were coded by the authors on hardware such as Nvidia cards, FPGAs, or sequential circuits, depending on the methodology used, to compare learning time between GPU and CPU. The main applications are in pattern recognition, such as acoustic speech, odor recognition and clustering. According to the literature, the GPU has a great advantage over the CPU in learning time, except when image rendering is involved, across several architectures of Nvidia cards and CPUs. The survey also includes a brief description of the types of ANN and their execution techniques, related to the research results.
Artificial Neural Network Based Object Recognizing Robot - Jaison Sabu
Main Project Presentation - Computer Science Department, College of Engineering Chengannur 2003-2007, Affiliated to Cochin University of Science and Technology, Kerala, India
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2021/02/new-methods-for-implementation-of-2-d-convolution-for-convolutional-neural-networks-a-presentation-from-santa-clara-university/
Tokunbo Ogunfunmi, Professor of Electrical Engineering and Director of the Signal Processing Research Laboratory at Santa Clara University, presents the “New Methods for Implementation of 2-D Convolution for Convolutional Neural Networks” tutorial at the September 2020 Embedded Vision Summit.
The increasing usage of convolutional neural networks (CNNs) in various applications on mobile and embedded devices and in data centers has led researchers to explore application specific hardware accelerators for CNNs. CNNs typically consist of a number of convolution, activation and pooling layers, with convolution layers being the most computationally demanding. Though popular for accelerating CNN training and inference, GPUs are not ideal for embedded applications because they are not energy efficient.
ASIC and FPGA accelerators have the potential to run CNNs in a highly efficient manner. Ogunfunmi presents two new methods for 2-D convolution that offer significant reduction in power consumption and computational complexity. The first method computes convolution results using row-wise inputs, as opposed to traditional tile-based processing, yielding considerably reduced latency. The second method, single partial product 2-D (SPP2D) convolution, avoids recalculation of partial weights and reduces input reuse. Hardware implementation results are presented.
Understanding Deep Learning & Parameter Tuning with the MXNet and H2O Packages in R - Manish Saraswat
A simple guide that explains deep learning and neural networks, with hands-on experience in R using the MXNet and H2O packages. It also explains gradient descent and the backpropagation algorithm.
Complete tutorial: http://blog.hackerearth.com/understanding-deep-learning-parameter-tuning-with-mxnet-h2o-package-r
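Gradient descent, which the guide above explains, can be made concrete with a toy example (this sketch is illustrative and not taken from the tutorial): repeatedly step a parameter against the derivative of the loss until the minimum is reached.

```python
# Minimize f(w) = (w - 3)^2 by stepping against its derivative
# f'(w) = 2 * (w - 3); the minimum is at w = 3.

def gradient_descent(lr=0.1, steps=100):
    w = 0.0                  # arbitrary starting point
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)^2 at the current w
        w -= lr * grad       # step downhill, scaled by the learning rate
    return w
```

With a learning rate of 0.1, the distance to the minimum shrinks by a factor of 0.8 every step, so 100 steps land essentially on w = 3.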
Deep Learning - The Past, Present and Future of Artificial Intelligence - Lukas Masuch
In the last couple of years, deep learning techniques have transformed the world of artificial intelligence. One by one, the abilities and techniques that humans once imagined were uniquely our own have begun to fall to the onslaught of ever more powerful machines. Deep neural networks are now better than humans at tasks such as face recognition and object recognition. They have mastered the ancient game of Go and thrashed the best human players. “The pace of progress in artificial general intelligence is incredibly fast” (Elon Musk, CEO of Tesla & SpaceX), leading to an AI that “would be either the best or the worst thing ever to happen to humanity” (Stephen Hawking, physicist).
What sparked this new hype? How is Deep Learning different from previous approaches? Let’s look behind the curtain and unravel the reality. This talk will introduce the core concept of deep learning, explore why Sundar Pichai (CEO Google) recently announced that “machine learning is a core transformative way by which Google is rethinking everything they are doing” and explain why “deep learning is probably one of the most exciting things that is happening in the computer industry“ (Jen-Hsun Huang – CEO NVIDIA).
Neural Networks, Spark MLlib, Deep Learning - Asim Jalis
What are neural networks? How do you use the neural network algorithm in Apache Spark MLlib? What is deep learning? Presented at the Data Science Meetup at Galvanize on 2/17/2016.
For code see IPython/Jupyter/Toree notebook at http://nbviewer.jupyter.org/gist/asimjalis/4f911882a1ab963859ce
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutorial - Simplilearn
This Deep Learning presentation will help you understand what deep learning is, why we need it, and its applications, along with a detailed explanation of neural networks and how they work. Deep learning is inspired by the structure and function of the human brain, realized through artificial neural networks. These networks, which mimic the decision-making process of the brain, use complex algorithms that process data in a non-linear way, learning in an unsupervised manner to make choices based on the input. This Deep Learning tutorial is ideal for professionals with beginner to intermediate levels of experience. Now, let us dive into this topic and understand what deep learning actually is.
Below topics are explained in this Deep Learning Presentation:
1. What is Deep Learning?
2. Why do we need Deep Learning?
3. Applications of Deep Learning
4. What is Neural Network?
5. Activation Functions
6. Working of Neural Network
Simplilearn’s Deep Learning course will transform you into an expert in deep learning techniques using TensorFlow, the open-source software library designed to conduct machine learning & deep neural network research. With our deep learning course, you’ll master deep learning and TensorFlow concepts, learn to implement algorithms, build artificial neural networks and traverse layers of data abstraction to understand the power of data and prepare you for your new role as deep learning scientist.
Why Deep Learning?
TensorFlow is one of the most popular software platforms used for deep learning and contains powerful tools to help you build and implement artificial neural networks.
Advancements in deep learning are being seen in smartphone applications, creating efficiencies in the power grid, driving advancements in healthcare, improving agricultural yields, and helping us find solutions to climate change. With this Tensorflow course, you’ll build expertise in deep learning models, learn to operate TensorFlow to manage neural networks and interpret the results.
You can gain in-depth knowledge of Deep Learning by taking our Deep Learning certification training course. With Simplilearn’s Deep Learning course, you will prepare for a career as a Deep Learning engineer as you master concepts and techniques including supervised and unsupervised learning, mathematical and heuristic aspects, and hands-on modeling to develop algorithms.
There is booming demand for skilled deep learning engineers across a wide range of industries, making this deep learning course with TensorFlow training well-suited for professionals at the intermediate to advanced level of experience. We recommend this deep learning online course particularly for the following professionals:
1. Software engineers
2. Data scientists
3. Data analysts
4. Statisticians with an interest in deep learning
Introduction to ANN Principles and its Applications in Solar Energy Technology - Ali Al-Waeli
I presented these slides in 2022 at SERI, UKM. The aim of the presentation is to provide an overview of AI, machine learning and ANNs, and to introduce their applications in solar energy technologies.
VGGFace Transfer Learning and Siamese Network for Face Recognition - ijtsrd
Traditionally, data mining and machine learning algorithms are engineered to approach problems in isolation: a model is trained separately on a specific feature space and distribution, for a specific task dictated by the business case. A widespread assumption in machine learning is that training data and test data must share the same feature space and underlying distribution. In the real world this assumption may not hold, so models need to be rebuilt from scratch whenever the features or distribution change, and collecting related training data and rebuilding models is an arduous process. In such cases, transferring knowledge from disparate domains, i.e. transfer learning, is desirable. Transfer learning is a method of reusing a pre-trained model's knowledge for another task; it can be used for classification, regression and clustering problems. This paper uses one of the pre-trained models, VGGFace, with a deep convolutional neural network to classify images. Aafaq Altaf | Rahee Jan | Andleeb Qadir | Shahid Mohid-ud-din Bhat, "VGGFace Transfer Learning and Siamese Network for Face Recognition", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6, Issue-1, December 2021, URL: https://www.ijtsrd.com/papers/ijtsrd47869.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/47869/vggface-transfer-learning-and-siamese-network-for-face-recognition/aafaq-altaf
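The core idea of transfer learning described above can be sketched in a few lines. This is a conceptual toy, not the paper's actual VGGFace pipeline: a "pretrained" feature extractor is frozen, and only a small new classifier head is trained on the target task. All names and numbers here are illustrative assumptions.

```python
import math

def pretrained_features(x):
    # Stand-in for a frozen backbone (e.g. a VGGFace-style network):
    # its parameters are fixed and never updated during training.
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]

def train_head(data, lr=0.1, epochs=200):
    # Logistic-regression head trained on top of the frozen features.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)        # frozen forward pass
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))    # sigmoid output
            g = p - y                         # log-loss gradient
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, w, b):
    f = pretrained_features(x)
    z = w[0] * f[0] + w[1] * f[1] + b
    return 1.0 / (1.0 + math.exp(-z))
```

Only `w` and `b` are updated; the backbone is reused as-is, which is exactly what makes transfer learning cheap compared to retraining from scratch.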
This is an introduction to deep learning presented to Plymouth University students. The introduction explains how a neural network works. The practical section shows how to use TensorFlow to build simple models. Finally, the case studies show how to use deep learning in real-world applications.
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf - GetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source) Copilot?
How can we build one?
Architecture and evaluation
Techniques to optimize the PageRank algorithm usually fall into two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged can save iteration time. Skipping in-identical vertices, which have the same in-links, helps avoid duplicate computations and thus could reduce iteration time. Road networks often have chains which can be short-circuited before the PageRank computation to improve performance, since the final ranks of chain nodes can be calculated easily; this could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which could reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
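The "skip converged vertices" optimization mentioned above can be sketched on top of plain power-iteration PageRank. This is a minimal illustration on a tiny hypothetical graph (assumed to have no dangling nodes), not the STICD implementation itself:

```python
def pagerank(out_links, d=0.85, tol=1e-10, max_iter=100):
    # out_links: dict mapping each vertex to its list of out-neighbors.
    # Assumes no dangling nodes (every vertex has at least one out-link).
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}
    converged = {v: False for v in out_links}
    # Precompute in-links so each vertex can pull rank from its sources.
    in_links = {v: [] for v in out_links}
    for u, outs in out_links.items():
        for v in outs:
            in_links[v].append(u)
    for _ in range(max_iter):
        new_rank = {}
        active = False
        for v in out_links:
            if converged[v]:          # skip already-converged vertices
                new_rank[v] = rank[v]
                continue
            s = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1 - d) / n + d * s
            if abs(new_rank[v] - rank[v]) < tol:
                converged[v] = True   # stop recomputing this vertex
            else:
                active = True
        rank = new_rank
        if not active:                # every vertex has converged
            break
    return rank
```

On a 3-cycle graph `{"a": ["b"], "b": ["c"], "c": ["a"]}` every vertex ends up with rank 1/3, and the ranks sum to 1. The per-vertex convergence flag is the "reduce work per iteration" idea; the other techniques (chain short-circuiting, component-wise topological ordering) would be layered on top of this same loop.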
1. Introduction to Neural Networks and Deep Learning
ISSAM A. AL-ZINATI, Outreach & Technical Advisor, UCAS Technology Incubator
ISSAM A. AL-ZINATI - UCASTI 1
3. Artificial Intelligence vs Machine Learning
Artificial Intelligence is the replication of human intelligence in computers.
Machine Learning refers to the ability of a machine to learn using large data sets instead of hard-coded rules.
4. Supervised Learning vs Unsupervised Learning
Supervised Learning involves using labelled data sets that have inputs and expected outputs.
Unsupervised Learning is the task of machine learning using data sets with no specified structure.
5. What is Deep Learning
It is a neural network.
6. What is Deep Learning
Neuron: runs a small, specific mathematical task.
7. What is Deep Learning
Edge: connects neurons and holds the weights that adjust the inputs.
8. What is Deep Learning
It is a neural network with more layers.
9. What is Deep Learning
...and with more neurons.
10. What is Deep Learning
Deep Learning is a machine learning method. It allows us to train an AI to predict outputs, given a set of inputs. Both supervised and unsupervised learning can be used to train the AI.
11. How it works – The Magic
12. How it works – No Magic
A deep neural network is not magic, but it is very good at finding patterns.
“The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep, with many layers. For this reason, we call this approach to AI deep learning.” (Ian Goodfellow)
Deep Learning is hierarchical feature learning.
13. How does the human brain work, exactly?
15. How does a perceptron work as an artificial neuron? (Feed-forward neural network)
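The feed-forward step the slide illustrates can be sketched in a few lines (the sigmoid activation is an assumption on my part; the slide's figure is not reproduced in this transcript):

```python
import math

def sigmoid(z):
    # Common activation choice; squashes z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def perceptron_forward(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, passed through the
    # activation function: the basic feed-forward computation.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)
```

For example, with all-zero inputs and zero bias the weighted sum is 0, so the perceptron outputs exactly 0.5, the midpoint of the sigmoid.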
17. What is a weight in a Neural Network?
A weight refers to the strength of the connection between nodes; its unsigned value (without + or -) reflects how strongly the nodes are connected to each other.
A weight can be positive or negative: positive means the connection is more likely to transmit data, a strong link between neurons, while negative is the opposite. Initially we select the weights randomly, but for a reasonable result it is better to normalize the input data, where X is the input data:
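The slide's normalization formula is cut off in this transcript, so the sketch below assumes min-max scaling, one common choice, alongside the random weight initialization the slide describes:

```python
import random

def init_weights(n, lo=-0.5, hi=0.5):
    # Random initial weights, which may come out positive or negative.
    return [random.uniform(lo, hi) for _ in range(n)]

def min_max_normalize(X):
    # Scale each input value of X into [0, 1] (an assumed formula;
    # the slide's own normalization is not shown in the transcript).
    x_min, x_max = min(X), max(X)
    return [(x - x_min) / (x_max - x_min) for x in X]
```

Normalizing the inputs keeps the weighted sums in a range where the activation function is sensitive, which makes the random starting weights "reasonable" in the slide's sense.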
18. What is the Activation Function's role in a Neural Network?
The activation function acts (roughly speaking) as a polarizing and stabilizing step, squashing a neuron's output into a bounded range.
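Assuming the sigmoid used elsewhere in this deck, the stabilizing effect is easy to see: any real input, however large, is squashed into (0, 1), and the derivative needed later for backpropagation has a simple closed form.

```python
import math

def sigmoid(z):
    # Bounded, smooth activation: large positive z -> ~1,
    # large negative z -> ~0, and z = 0 -> exactly 0.5.
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    # Used in backpropagation: s'(z) = s(z) * (1 - s(z)).
    s = sigmoid(z)
    return s * (1.0 - s)
```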
19. How does backward propagation work?
In backward propagation, to find the optimal weights we differentiate the sigmoid function and move inversely from right to left, finding new values for the weights:
(1) output_node′ = Sigmoid′ (hidden_sigma) * margin
(2) weight_2′ = (output_node′ / hidden_node) + weight_2
(3) hidden_node′ = (output_node′ / weight_2) * Sigmoid′ (input_sigma)
(4) weight_1′ = (hidden_node′ / input_node) + weight_1
(5) We repeat steps 1 to 4 with the new weights and compare the current margin error with the previous one; if the current error is less than the previous one, we are moving in the right direction.
(6) We iterate steps 1 to 5 until the margin error is close to our target "Y".
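The steps above can be restated as standard gradient-descent backpropagation. This tiny one-input, one-hidden-neuron network is an illustrative reformulation using chain-rule derivatives, not the slide's exact notation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(x, y, lr=0.5, epochs=5000):
    # Forward pass, compute the margin (error at the output), then
    # propagate the error right-to-left through each sigmoid.
    w1, w2 = 0.5, 0.5
    for _ in range(epochs):
        h = sigmoid(w1 * x)              # hidden activation
        o = sigmoid(w2 * h)              # network output
        margin = o - y                   # output error
        d_o = margin * o * (1 - o)       # error at the output node
        d_w2 = d_o * h                   # gradient for weight_2
        d_h = d_o * w2 * h * (1 - h)     # error pushed back to hidden node
        d_w1 = d_h * x                   # gradient for weight_1
        w2 -= lr * d_w2                  # weight updates, repeated until
        w1 -= lr * d_w1                  # the margin error is small
    return w1, w2
```

Each epoch is one pass of steps (1)-(4); the loop itself is steps (5)-(6), iterating until the output approaches the target y.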
23. Why Now – Scale
[Chart: performance vs. data size (small, medium, large)]
The more data you feed the model, the better results you get.
25. Why Now – Scale
[Chart: performance vs. model size (small, medium, large)]
A bigger model can achieve better results. GPUs help to train those models much faster, up to 20x!
26. Why Now – vs Others
What about other kinds of machine learning algorithms, e.g. SVM, DT, Boosting, ...? Would they do better if they got more data and power?
27. Why Now – vs Others
[Chart: performance of NN vs. others, based on model size and data amount; curves for others, small NN, medium NN, large NN]
28. Why Now – End-To-End
The usual machine learning approach contains a pipeline of stages that are responsible for feature extraction. Each stage passes along a set of engineered features that help the model better understand the case it works on. This approach is complex and prone to errors.
29. Why Now – End-To-End
Speech Recognition Pipeline: Data (Audio) -> Audio Features -> Phonemes -> Language Model -> Transcript
30. Why Now – End-To-End
Speech Recognition with DL: Data (Audio) -> [single deep network replacing Audio Features, Phonemes and Language Model] -> Transcript
31. Why Now – End-To-End
Speech Recognition with DL: Data (Audio) -> [single deep network replacing Audio Features, Phonemes and Language Model: "The Magic"] -> Transcript
32. Deep Learning Models
General model: FC (fully connected)
Sequence model: RNN, LSTM
Image model: CNN
Other models: Unsupervised, RL
33. Deep Learning Models
General model: FC; Sequence model: RNN, LSTM; Image model: CNN; Other models: Unsupervised, RL (a hot research topic)
34. Advanced Deep Learning Models – VGGNet / ResNet
Achieved 7.3% on the ImageNet-2014 classification challenge, coming in first place. It used 120 million parameters.
35. Advanced Deep Learning Models – Google Inception V3
Achieved 5.64% on the ImageNet-2015 classification challenge, coming in second place.
36. Advanced Deep Learning Models – Google Inception V3
Based on the ConvNet concept with the addition of an inception module. The network has a computational cost of 5 billion multiply-adds per inference and uses fewer than 25 million parameters.
37. Deep Learning Applications – Deep Voice
Baidu Research presents Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks.
[Audio samples: Ground Truth vs. Generated Voice]
38. Deep Learning Applications – Image Captioning
A multimodal recurrent neural architecture generates sentence descriptions from images.
"man in black shirt is playing guitar." "two young girls are playing with lego toy."
39. Deep Learning Applications – Generating Videos
This approach uses an adversarial network to:
1) Generate videos
2) Conditionally generate videos from static images