Deep learning is one of the most exciting areas of machine learning and AI. This presentation covers the very basics of deep neural networks, from the core concepts through to applications, and explains why this technology is so popular in today's business landscape.
This presentation is provided by the Tesseract Academy, which provides executive education for deep technical subjects such as data science and blockchain. For a video of the presentation please visit https://www.youtube.com/watch?v=RiYGluH_cx0&t=0s&list=PLVce3C5Hi9BBfabvhEzYQTQDYEg2vtuxH&index=2
For an associated blog post about deep learning, visit http://thedatascientist.com/what-deep-learning-is-and-isnt/
4. Why is it popular?
Unprecedented performance on many tasks
1. Machine translation
2. Speech recognition
3. Computer vision
4. Reinforcement learning
5. Natural language processing
5. Machine translation: Before deep learning
Rule-based machine translation (1970s)
◦ Bilingual dictionary and linguistic rules
◦ Interlingua
◦ Find a ‘universal language’ as a middle layer
◦ Impossible task, can’t handle exceptions
Example-based machine translation (1980s)
◦ 1984, Makoto Nagao (University of Tokyo)
◦ Learn through translations
Statistical machine translation (1990s)
◦ Use corpora to extract statistical relationships
6. Machine translation: Deep learning
Paper in 2014 by Bengio’s Lab
◦ Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
◦ https://arxiv.org/abs/1406.1078
Basic idea: Recurrent Neural Network Encoder-Decoder
7. Machine translation: Deep learning
27 September, 2016
A Neural Network for Machine Translation, at Production Scale
◦ https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html
A few years ago we started using Recurrent Neural Networks (RNNs) to directly learn the mapping
between an input sequence (e.g. a sentence in one language) to an output sequence (that same
sentence in another language) [2].
Whereas Phrase-Based Machine Translation (PBMT) breaks an input sentence into words and phrases
to be translated largely independently, Neural Machine Translation (NMT) considers the entire input
sentence as a unit for translation.
The advantage of this approach is that it requires fewer engineering design choices than previous
Phrase-Based translation systems. When it first came out, NMT showed accuracy equivalent to
existing Phrase-Based translation systems on modest-sized public benchmark data sets.
15. Word embeddings
Turn words into numeric vectors
◦ Word2Vec
Perform arithmetic on the vectors (e.g. king − man + woman ≈ queen)
Based on shallow neural networks (their output is often used as input to deep neural networks)
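The vector arithmetic can be sketched with a toy example. The 3-dimensional "embeddings" below are hand-crafted for illustration; real Word2Vec vectors are learned from large corpora and typically have 100–300 dimensions.

```python
import numpy as np

# Toy, hand-crafted 3-dimensional "embeddings"; the dimensions loosely
# encode [royalty, maleness, femaleness]. Real embeddings are learned.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The classic analogy: king - man + woman should land near queen
target = vectors["king"] - vectors["man"] + vectors["woman"]
best = max(vectors, key=lambda w: cosine(target, vectors[w]))
print(best)  # queen
```

With learned embeddings the same nearest-neighbour search recovers analogies like capital cities and verb tenses.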
21. Recurrent neural networks
Useful for sequences where the past can affect the future
◦ Natural language
◦ Time series (e.g. finance)
Provide ‘memory’ to neural networks
LSTM (Long Short-Term Memory)
◦ Captures longer dependencies
GRU (Gated Recurrent Unit)
◦ A simpler alternative to the LSTM
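The 'memory' idea can be shown with a minimal vanilla RNN cell in plain NumPy (weights are random here, i.e. untrained; LSTMs and GRUs add gating on top of this same recurrence):

```python
import numpy as np

rng = np.random.default_rng(0)

# A minimal vanilla RNN cell: the hidden state h is the network's 'memory',
# carried from one time step to the next.
input_size, hidden_size = 4, 3
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))  # input weights
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size)) # recurrent weights
b_h = np.zeros(hidden_size)

def rnn_step(x, h):
    # The new hidden state depends on the current input AND the previous state,
    # so earlier elements of the sequence can affect later outputs.
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

# Run a sequence of 5 inputs through the cell, threading the state along
h = np.zeros(hidden_size)
sequence = rng.normal(size=(5, input_size))
for x in sequence:
    h = rnn_step(x, h)

print(h.shape)  # (3,)
```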
22. RNN: Neural machine translation
Seq2Seq model
◦ Deep recurrent architecture
◦ Je suis étudiant -> I am a student
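A Seq2Seq encoder-decoder can be sketched in NumPy. All sizes and weights below are toy values and the weights are random (untrained), so the output is arbitrary; the point is the architecture: the encoder compresses the source sentence into one context vector, and the decoder generates target tokens from it.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

emb, hid, vocab = 4, 6, 10  # toy sizes

# Encoder RNN: compresses the source sentence into one context vector
We = rng.normal(scale=0.3, size=(hid, emb))
Ue = rng.normal(scale=0.3, size=(hid, hid))
# Decoder RNN: generates target tokens from that context
Wd = rng.normal(scale=0.3, size=(hid, emb))
Ud = rng.normal(scale=0.3, size=(hid, hid))
Wout = rng.normal(scale=0.3, size=(vocab, hid))
embed = rng.normal(size=(vocab, emb))  # toy embedding table

def encode(src_ids):
    h = np.zeros(hid)
    for t in src_ids:                 # e.g. token ids for "Je suis étudiant"
        h = np.tanh(We @ embed[t] + Ue @ h)
    return h                          # the 'context vector'

def decode(context, steps=4):
    h, tok, out = context, 0, []      # 0 = start-of-sentence id
    for _ in range(steps):
        h = np.tanh(Wd @ embed[tok] + Ud @ h)
        tok = int(np.argmax(softmax(Wout @ h)))  # greedy decoding
        out.append(tok)
    return out

translation = decode(encode([3, 7, 2]))
print(translation)  # 4 token ids (arbitrary here, since nothing is trained)
```

In a real system both RNNs are trained jointly on sentence pairs, and the greedy step is usually replaced by beam search.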
23. RNN: Text generation
Feed a sequence of characters
◦ Predict the next character
◦ Recurrent units keep the context
Then feed the output back into itself!
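The feedback loop can be illustrated with a character bigram table standing in for the recurrent model (a real generator uses recurrent units to keep much longer context, but the generation loop is identical: predict a character, then feed the output back in as the next input):

```python
from collections import defaultdict
import random

random.seed(0)

# Learn character-to-next-character counts from a tiny corpus
corpus = "the theory of the thing then the"
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def next_char(c):
    # Sample the next character in proportion to how often it followed c
    chars, weights = zip(*counts[c].items())
    return random.choices(chars, weights=weights)[0]

text = "t"
for _ in range(20):
    text += next_char(text[-1])  # feed the output back into itself

print(text)
```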
24. Convolutional neural networks
Use a sliding window to capture parts of an image
◦ Then use pooling
◦ E.g. keep only 1 pixel out of 9, or average their values
Allows the extraction of higher level features
◦ By utilising feature locality
◦ And ignoring noise
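The pooling step above can be sketched directly: 3×3 average pooling with stride 3 collapses each 3×3 patch of the image to a single value ('average their values'); swapping in `np.max` instead keeps only the strongest response per patch.

```python
import numpy as np

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 'image'

def pool(img, k=3, op=np.mean):
    # Slide a k x k window over the image with stride k and
    # reduce each patch to one value with the given operation
    h, w = img.shape[0] // k, img.shape[1] // k
    return np.array([[op(img[i*k:(i+1)*k, j*k:(j+1)*k])
                      for j in range(w)] for i in range(h)])

print(pool(image).shape)         # (2, 2): every 9 pixels reduced to 1
print(pool(image, op=np.max))    # max pooling variant
```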
27. Reinforcement learning
Deep Q-learning
Approach pioneered by Google DeepMind
◦ An AI company based in London
Create AI that can play video games
◦ Goal to extend to real environments
Current evolution
◦ Networks play against each other
◦ Managed to beat professional Go players
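The core update behind Q-learning can be shown in tabular form on a toy 4-state corridor, where the agent starts at state 0 and gets reward 1 for reaching state 3. Deep Q-learning replaces the table below with a neural network so it can handle huge state spaces, such as the raw pixels of a video game; all hyperparameter values here are illustrative.

```python
import random
import numpy as np

random.seed(0)

n_states, actions = 4, (-1, +1)        # actions: move left / move right
Q = np.zeros((n_states, len(actions))) # the Q-table: expected future reward
alpha, gamma, eps = 0.5, 0.9, 0.2      # learning rate, discount, exploration

for _ in range(200):                   # training episodes
    s = 0
    while s != 3:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore
        a = random.randrange(2) if random.random() < eps else int(np.argmax(Q[s]))
        s2 = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s2 == 3 else 0.0
        # The Q-learning update rule
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

print(np.argmax(Q, axis=1)[:3])  # learned policy: move right in every state
```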
35. Tools for deep learning
https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software
TensorFlow
◦ Google
◦ Very flexible
PyTorch
◦ Open source
◦ Developed by Facebook, Nvidia, Twitter, and other companies
◦ Useful for research
Keras
◦ Higher-level Python interface for TensorFlow
Caffe
◦ Berkeley AI Research
◦ Useful for computer vision
36. Commoditised services
Google Cloud AI
◦ https://cloud.google.com/products/machine-learning/
◦ Vision, speech-to-text, text-to-speech, translation, and others
IBM
◦ https://www.ibm.com/watson/products-services/
◦ Visual recognition, translation, sentiment analysis, entity extraction
Microsoft Azure
◦ https://azure.microsoft.com/en-gb/solutions/
◦ Vision, NLP, etc.
37. So when to use deep learning?
Amazing for anything relating to
◦ Audio
◦ Computer vision
◦ NLP
Drawbacks
◦ Loads of data
◦ Lots of processing power
◦ Thousands of hyperparameters
◦ Months of training
When to use
◦ Classical ML or statistics is better for many problems (especially when datasets are smaller)
◦ For computer vision, audio, or NLP problems, deep learning is the best bet
◦ Try a commoditised service before developing your own
◦ Developing your own solution can be cost-effective in the long run (and you keep the IP)
38. Learn more
Tesseract Academy
◦ http://tesseract.academy
◦ https://www.youtube.com/playlist?list=PLVce3C5Hi9BBfabvhEzYQTQDYEg2vtuxH
◦ Data science, big data and blockchain for executives and managers.
The Data Scientist
◦ Personal blog
◦ Covers data science, analytics, blockchain, tokenomics and many other subjects
◦ http://thedatascientist.com/what-deep-learning-is-and-isnt/