Getting started with neural networks
using Keras and TensorFlow
By Mohamed Gamal
Deep Learning
Contents
▪ Anatomy of a neural network.
▪ A closer look at TensorFlow, Keras, and their relationship.
▪ Setting up a deep learning workspace.
▪ Example.
Anatomy of a neural network
Training a neural network revolves around the following objects:
1) Layers, which are combined into a network (or model).
2) The input data and corresponding targets.
3) The loss function, which defines the feedback signal used for learning.
4) The optimizer, which determines how learning proceeds.
1) Layers
— A layer is a data-processing module that takes as input one or more tensors and outputs one
or more tensors.
— Different layers are appropriate for different tensor formats and different types of data
processing.
▪ For example, simple vector data, stored in 2D tensors (samples, features)
✓ often processed by densely connected (aka fully connected or dense) layers.
✓ For example, (samples=1000 houses, features=[size, no. of rooms, location]).
▪ Sequence data (text, audio, etc.), stored in 3D tensors (samples, timesteps, features)
✓ typically processed by recurrent layers.
✓ For example, (samples=500 text documents, timesteps=100 words, features=dimensionality of the
word embeddings, e.g., 300).
▪ Image data, stored in 4D tensors (samples, height, width, channels)
✓ usually processed by 2D convolution layers (Conv2D) to extract features.
✓ For example, (samples=10,000 images, height=28 pixels, width=28 pixels, channels=1).
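The pairing of tensor layouts and layer types above can be sketched in Keras. This is a minimal illustration, not a recommended architecture: the sample counts are shrunk to keep it fast, the layer sizes (8, 16, 4) are arbitrary assumptions, and `SimpleRNN` stands in for recurrent layers in general.

```python
import tensorflow as tf
from tensorflow import keras

# Toy inputs in the three layouts described above (random values, small sizes).
vector_data = tf.random.uniform((1000, 3))        # (samples, features)
sequence_data = tf.random.uniform((5, 100, 300))  # (samples, timesteps, features)
image_data = tf.random.uniform((10, 28, 28, 1))   # (samples, height, width, channels)

# Each kind of data goes through the layer type that expects its layout.
dense_out = keras.layers.Dense(8)(vector_data)
rnn_out = keras.layers.SimpleRNN(16)(sequence_data)
conv_out = keras.layers.Conv2D(filters=4, kernel_size=3)(image_data)

print(dense_out.shape)  # (1000, 8)
print(rnn_out.shape)    # (5, 16)
print(conv_out.shape)   # (10, 26, 26, 4)
```

Each layer keeps the leading samples axis and transforms the remaining axes.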
1) Layers (Cont.)
— Some layers in a neural network are stateless, meaning they do not have any internal state
or memory. They perform computations solely based on the input data.
— However, many layers have a state, which refers to the layer's weights.
▪ These weights are learned during the training process using techniques like stochastic gradient
descent (SGD).
▪ The weights contain the network's knowledge and are adjusted iteratively during training to
minimize the loss function.
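One way to see the difference between stateless and stateful layers is to inspect their weights. The sketch below assumes a `Flatten` layer as the stateless example and a `Dense` layer as the stateful one; the input shape and the unit count are arbitrary.

```python
import tensorflow as tf
from tensorflow import keras

x = tf.random.uniform((4, 28, 28))  # hypothetical batch of 4 small inputs

flatten = keras.layers.Flatten()    # only reshapes its input: nothing to learn
dense = keras.layers.Dense(10)      # holds a kernel matrix and a bias vector

flat = flatten(x)                   # shape (4, 784)
out = dense(flat)                   # the weights are created on the first call

print(len(flatten.weights))                     # 0
print([tuple(w.shape) for w in dense.weights])  # [(784, 10), (10,)]
```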
2) Models (networks of layers)
— Common network topologies include the following:
▪ Two-branch networks
▪ Multihead networks
▪ Inception blocks
— The outputs of parallel branches are often combined in some way (e.g., concatenated
or added) to produce the final output of the network.
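A two-branch network can be sketched with the Keras functional API. The input names, feature counts, and layer sizes here are hypothetical, and concatenation is only one of several ways to merge branches.

```python
import tensorflow as tf
from tensorflow import keras

# Two hypothetical inputs, each processed by its own branch.
input_a = keras.Input(shape=(16,), name="input_a")
input_b = keras.Input(shape=(8,), name="input_b")

branch_a = keras.layers.Dense(32, activation="relu")(input_a)
branch_b = keras.layers.Dense(32, activation="relu")(input_b)

# Merge the branch outputs, then map them to a single final output.
merged = keras.layers.Concatenate()([branch_a, branch_b])
output = keras.layers.Dense(1)(merged)

model = keras.Model(inputs=[input_a, input_b], outputs=output)

pred = model([tf.random.uniform((4, 16)), tf.random.uniform((4, 8))])
print(pred.shape)  # (4, 1)
```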
3) Loss functions and optimizers
— Once the network architecture is defined, you still have to choose two more things:
▪ Loss function (objective function):
✓ The quantity that will be minimized during training.
✓ It represents a measure of success for the task at hand.
▪ Optimizer:
✓ Determines how the network will be updated based on the loss function.
✓ It implements a specific variant of stochastic gradient descent (SGD),
e.g., RMSprop or Adam.
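In Keras, the loss function and optimizer are typically chosen when the model is compiled. The sketch below assumes a small regression model trained on synthetic data; mean squared error and RMSprop with a learning rate of 1e-3 are illustrative choices, not recommendations.

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),
])

# The loss defines what is minimized; the optimizer defines how weights are updated.
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss="mse",
)

# Synthetic data: the target is the sum of the four input features.
x = np.random.rand(64, 4).astype("float32")
y = x.sum(axis=1, keepdims=True)

history = model.fit(x, y, epochs=2, batch_size=16, verbose=0)
print(history.history["loss"])  # one loss value per epoch
```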
What is TensorFlow?
▪ Python-based, free, open-source machine learning platform.
• Developed primarily by Google.
• Automatically computes the gradient (i.e., the derivative) of any differentiable
expression (highly suitable for DL).
• Runs on CPU, GPU, and TPU.
• Computations can be distributed across many machines.
• Supports other programming languages (e.g., C++, JavaScript, etc.).
What is Keras?
▪ Keras is a deep learning API for Python, built on top of TensorFlow.
• GPU: Graphics Processing Unit.
• CPU: Central Processing Unit.
• TPU: Tensor Processing Unit (by Google).
Setting up a deep learning workspace
— Prefer using GPU instead of CPU when possible (5-10x faster).
▪ GPU options:
✓ Buy one ($1500 – $2000).
✓ Rent one (AWS, Google Cloud, Paperspace, etc.) for about $2.50 per hour.
✓ Use a free one (Google Colab or Paperspace Gradient).
— Use a Unix-like OS (e.g., Linux) for compatibility with deep learning libraries.
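Whichever option is chosen, it can help to confirm that TensorFlow actually sees a GPU. A minimal check, assuming TensorFlow 2.x is installed:

```python
import tensorflow as tf

# Returns a (possibly empty) list of GPU devices visible to TensorFlow.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", len(gpus))
```

An empty list means computations will fall back to the CPU.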
1) Jupyter notebooks
— The preferred way to run deep learning experiments.
— A way to edit code in your browser.
— Can add markdown text for annotation/comments.
— Can run blocks of code independently while maintaining the current state in
memory (no need to rerun the whole script).
2) Google Colab
— Free Jupyter notebook service.
— No installation needed.
— Runs entirely in the cloud.
— Free (but limited) GPU/TPU runtime.
Link: https://colab.research.google.com
First Steps with TensorFlow
— Low–level tensor manipulation
▪ Tensors/Variables
▪ Tensor operations (ReLU, matmul)
▪ Backpropagation (tf.GradientTape)
— High–level (i.e., Keras APIs)
▪ Layers
▪ Loss function
▪ Optimizer
▪ Metrics
▪ Training loop
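As a small illustration of the low-level side, `tf.GradientTape` records operations so that gradients can be computed afterwards. The sketch differentiates y = x² at x = 3; the variable and its value are chosen only for illustration.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Operations executed inside the tape context are recorded for differentiation.
with tf.GradientTape() as tape:
    y = tf.square(x)

grad = tape.gradient(y, x)  # dy/dx = 2x
print(float(grad))          # 6.0
```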
Constant Tensors/Variables
▪ All-ones tensors
▪ All-zeros tensors
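The slide's original code is not preserved here; a minimal sketch of creating such tensors, with an arbitrary shape of (2, 1), might look like this:

```python
import tensorflow as tf

ones = tf.ones(shape=(2, 1))    # every element is 1.0
zeros = tf.zeros(shape=(2, 1))  # every element is 0.0

print(ones.numpy().tolist())   # [[1.0], [1.0]]
print(zeros.numpy().tolist())  # [[0.0], [0.0]]
```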
Random Tensors
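Random tensors can be drawn from standard distributions. The sketch below assumes a normal and a uniform distribution with arbitrary shapes and parameters.

```python
import tensorflow as tf

# Samples from a normal distribution with mean 0 and standard deviation 1.
normal = tf.random.normal(shape=(3, 1), mean=0.0, stddev=1.0)

# Samples from a uniform distribution over [0, 1).
uniform = tf.random.uniform(shape=(3, 1), minval=0.0, maxval=1.0)

print(normal.shape, uniform.shape)
```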
NumPy arrays are assignable
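The contrast this slide presumably draws is that NumPy arrays can be modified in place, while ordinary TensorFlow tensors cannot. A minimal sketch (the `tensor_assignable` flag is a hypothetical name used only to record the outcome):

```python
import numpy as np
import tensorflow as tf

# NumPy: in-place item assignment works.
x = np.ones(shape=(2, 2))
x[0, 0] = 0.0

# TensorFlow: tensors are immutable, so item assignment raises an error.
t = tf.ones(shape=(2, 2))
try:
    t[0, 0] = 0.0
    tensor_assignable = True
except TypeError:
    tensor_assignable = False

print(x[0, 0], tensor_assignable)  # 0.0 False
```

Mutable state in TensorFlow is held in `tf.Variable` objects instead.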
▪ Creating a TensorFlow variable
▪ Assigning a value to a TensorFlow variable
▪ Assigning a value to a subset of a TensorFlow variable
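These variable operations can be sketched as follows. The shape and the assigned values are arbitrary, and `assign_add` is included as one further in-place update that `tf.Variable` offers.

```python
import tensorflow as tf

# Create a variable from an initial value.
v = tf.Variable(initial_value=tf.zeros(shape=(3, 1)))

# Assign a new value to the whole variable.
v.assign(tf.ones((3, 1)))

# Assign a value to a subset (here, a single element).
v[0, 0].assign(3.0)

# In-place addition, equivalent to "+=".
v.assign_add(tf.ones((3, 1)))

print(v.numpy().ravel().tolist())  # [4.0, 2.0, 2.0]
```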
Tensor operations: Doing Math in TensorFlow
▪ A few basic math operations
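A few such operations, sketched on small tensors chosen only for illustration:

```python
import tensorflow as tf

a = tf.ones((2, 2))
b = tf.square(a)     # element-wise square
c = tf.sqrt(a)       # element-wise square root
d = b + c            # element-wise addition
e = tf.matmul(a, b)  # matrix product of two tensors
e *= d               # element-wise multiplication

# ReLU sets negative entries to zero and leaves the rest unchanged.
r = tf.nn.relu(tf.constant([-1.0, 2.0]))

print(e.numpy().tolist())  # [[4.0, 4.0], [4.0, 4.0]]
print(r.numpy().tolist())  # [0.0, 2.0]
```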
Regression Example
The Boston Housing Price dataset
We’ll attempt to predict the median price of homes in a given Boston
suburb in the mid-1970s, given data points about the suburb at the
time, such as the crime rate, the local property tax rate, and so on.
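A regression model for this kind of task could be sketched as follows. This is an assumption-laden illustration rather than the deck's own code: the architecture (two hidden layers of 64 units), the `build_model` helper, and the training settings are hypothetical. Synthetic data with 13 features per sample stands in for the real dataset so that the sketch runs offline.

```python
import numpy as np
from tensorflow import keras

def build_model():
    """Small fully connected regression network (hypothetical architecture)."""
    model = keras.Sequential([
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(1),  # single linear unit: predicts one continuous value
    ])
    model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
    return model

# Synthetic stand-in: 13 features per sample, like the Boston data.
# With network access, the real data could instead be loaded with
# keras.datasets.boston_housing.load_data() in Keras versions that ship it.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(128, 13)).astype("float32")
y_train = rng.uniform(5.0, 50.0, size=(128,)).astype("float32")

# Feature-wise normalization: zero mean and unit variance per feature.
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)
x_train = (x_train - mean) / std

model = build_model()
model.fit(x_train, y_train, epochs=2, batch_size=16, verbose=0)

predictions = model.predict(x_train[:3], verbose=0)
print(predictions.shape)  # (3, 1)
```

The last layer has no activation, so the output is not constrained to any range. Mean squared error is the loss, and mean absolute error is tracked as an easier-to-read metric.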