The document discusses recurrent neural networks (RNNs) and their use for sequence problems. It begins with an agenda covering sequence problems, RNNs, vanishing gradients and LSTMs, machine translation with bidirectionality and attention, CTC loss, the pros and cons of RNNs, and a preview of non-recurrent sequence models. It then motivates RNNs by examining the problems with applying feedforward networks to variable-length sequences and the need for memory over time.
Lecture 3: RNNs - Full Stack Deep Learning - Spring 2021
1. Full Stack Deep Learning - UW Spring 2020 - Sergey Karayev - with content by Pieter Abbeel, Josh Tobin
Recurrent Neural Networks
2. Agenda
1. Sequence Problems
2. RNNs
3. Vanishing gradients and LSTMs
4. Case study: Machine Translation
(Bidirectionality and Attention)
5. CTC loss
6. Pros and Cons
7. A preview of non-recurrent sequence models
3. Agenda
1. Sequence Problems
2. RNNs
3. Vanishing gradients and LSTMs
4. Case study: Machine Translation
(Bidirectionality and Attention)
5. CTC loss
6. Pros and Cons
7. A preview of non-recurrent sequence models
4. Sequence problems
• Time series forecasting: time series → predicted next value
• Sentiment classification: review text → predicted sentiment
• Translation: English text → French text
• Speech recognition and generation: audio waveform ↔ text
• Text or music generation: Ø → text or music
• Image captioning: image → description (e.g., "The quick brown fox jumped over the lazy dog")
• Question answering: text → text
5. Types of sequence problems
1. Why RNNs?
(From http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
7.–10. Why not use feedforward networks?
1. Why RNNs?
[Figure, built up across slides 7–10 for a many-to-many problem: concatenate the input timesteps into one vector, pass it through a fully connected layer to produce the output, then reshape the output back into a sequence.]
11. Problem 1: variable-length inputs
1. Why RNNs?
(many to many) We can deal with this by padding all sequences to the max length, but…
12. Problem 2: memory scaling
1. Why RNNs?
(many to many) The memory requirement scales linearly in the number of timesteps: concatenating a k-dim feature over t timesteps gives a k * t dim input, and a fully connected layer to a d-dim output needs a k * t * d dim weight matrix.
13. Problem 3: overkill
1. Why RNNs?
(many to many) That k * t * d dim matrix has to learn patterns everywhere they may occur in the sequence! This ignores the nature of the problem: patterns over time.
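As a rough illustration of the scaling argument above (a sketch with made-up dimensions, not code from the lecture), the parameter counts can be compared directly: the fully connected layer over the padded, concatenated sequence needs k * t * d weights, while a simple RNN cell reuses the same weights at every timestep, so its count does not depend on t.

```python
def fc_params(k: int, t: int, d: int) -> int:
    # Weight matrix mapping the concatenated (k*t)-dim input to a d-dim output.
    return k * t * d

def rnn_params(k: int, h: int, d: int) -> int:
    # Input-to-hidden (k*h), hidden-to-hidden (h*h), hidden-to-output (h*d);
    # the same weights are applied at every timestep, so t does not appear.
    return k * h + h * h + h * d

# Doubling the sequence length doubles the FC layer's weight count...
print(fc_params(k=128, t=100, d=10))  # 128000
print(fc_params(k=128, t=200, d=10))  # 256000
# ...but leaves the RNN's weight count unchanged.
print(rnn_params(k=128, h=64, d=10))  # 12928
```

The hidden size h is a free choice here; the point is only that the RNN's cost is fixed per timestep rather than growing with the padded length.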
15. Agenda
1. Sequence Problems
2. RNNs
3. Vanishing gradients and LSTMs
4. Case study: Machine Translation
(Bidirectionality and Attention)
5. CTC loss
6. Pros and Cons
7. A preview of non-recurrent sequence models
21. RNNs for many-to-one problems
2. Review of RNNs
Architecture: Input → RNN encoder → state at last timestep (e.g., [0.5, 0.2, -0.1, -0.3, 0.4, 1.2]) → FC classifier → Output
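The many-to-one pattern can be sketched in a few lines of pure Python (toy dimensions and made-up weights, not the lecture's code): the same cell is applied at every timestep, and however long the input is, the final hidden state has a fixed size, ready for a classifier.

```python
import math

def rnn_step(x, h, w_xh, w_hh, b):
    """One vanilla RNN step: h' = tanh(W_xh x + W_hh h + b), with plain lists."""
    new_h = []
    for i in range(len(h)):
        acc = b[i]
        acc += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        acc += sum(w_hh[i][j] * h[j] for j in range(len(h)))
        new_h.append(math.tanh(acc))
    return new_h

def encode(seq, w_xh, w_hh, b):
    """Many-to-one: run the cell over the whole sequence, keep the last state."""
    h = [0.0] * len(b)
    for x in seq:
        h = rnn_step(x, h, w_xh, w_hh, b)
    return h

# Toy 1-dim inputs, 2-dim hidden state (hypothetical weights).
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

short = encode([[1.0], [2.0]], w_xh, w_hh, b)
long = encode([[1.0], [2.0], [3.0], [0.5]], w_xh, w_hh, b)
# Sequences of different lengths map to states of the same fixed size.
print(len(short), len(long))  # 2 2
```

Note how this answers Problems 1–3 at once: no padding is needed, the weight count is independent of length, and the same weights see every timestep.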
24. RNNs for one-to-many problems
2. Review of RNNs
Encoder-decoder architecture: Input (image) → ConvNet (e.g.) encoder → hidden state (e.g., [0.5, 0.2, -0.1, -0.3, 0.4, 1.2]) → RNN decoder → Output ("The quick brown fox jumped over the lazy dog")
25. RNNs for one-to-many problems
2. Review of RNNs
[Diagram: the ConvNet's output vector is used as the decoder RNN's initial hidden state h0.]
32. RNNs for many-to-many problems
2. Review of RNNs
Encoder-decoder architecture: Input ("I am a student") → RNN encoder → hidden state (e.g., [0.5, 0.2, -0.1, -0.3, 0.4, 1.2]) → RNN decoder → Output ("Je suis étudiant")
33. RNNs for many-to-many problems
2. Review of RNNs
[Diagram: the encoder reads "I am a student <s>"; the decoder emits "Je suis étudiant <s>".]
34. RNNs for many-to-many problems
2. Review of RNNs
[Same diagram.] All the information in the input sentence is condensed into one hidden state vector! (In practice, we need more tricks for this to work -- explained soon.)
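The control flow of that encoder-decoder loop can be sketched in pure Python. The step functions below are hypothetical stand-ins for trained RNN cells (the "decoder" just replays a canned translation); the sketch only shows the data flow: the entire source sequence is folded into one state, which then seeds token-by-token generation until the stop symbol.

```python
def seq2seq(src_tokens, encoder_step, decoder_step, start_token, stop_token, max_len=20):
    # Encoder: fold the whole source sequence into a single hidden state.
    h = None
    for tok in src_tokens:
        h = encoder_step(tok, h)
    # Decoder: seeded with that one state, emit tokens until <s> (or max_len).
    out, tok = [], start_token
    while len(out) < max_len:
        tok, h = decoder_step(tok, h)
        if tok == stop_token:
            break
        out.append(tok)
    return out

# Toy stand-ins: the "state" is the tuple of tokens seen, and the decoder
# looks up a canned translation and emits its next word.
table = {("I", "am", "a", "student"): ["Je", "suis", "étudiant"]}

def enc(tok, h):
    return (h or ()) + (tok,)

def dec(tok, h):
    sent = table[h]
    i = sent.index(tok) + 1 if tok in sent else 0
    return (sent[i] if i < len(sent) else "<s>"), h

print(seq2seq(["I", "am", "a", "student"], enc, dec, start_token="<s>", stop_token="<s>"))
# ['Je', 'suis', 'étudiant']
```

The bottleneck the slide warns about is visible in the signature: however long the source is, the decoder only ever sees the single state h.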
35. Agenda
1. Sequence Problems
2. RNNs
3. Vanishing gradients and LSTMs
4. Case study: Machine Translation
(Bidirectionality and Attention)
5. CTC loss
6. Pros and Cons
7. A preview of non-recurrent sequence models
36. RNN Desiderata
• Goal: handle long sequences
• Connect events from the past to outcomes in the future
• i.e., long-term dependencies
• e.g., remember the name of a character from the first sentence
3. Vanishing gradients
37. Vanilla RNNs: the reality
• Can’t handle more than 10-20 timesteps
• Longer-term dependencies get lost
• Why? Vanishing gradients
3. Vanishing gradients
https://bair.berkeley.edu/blog/2018/08/06/recurrent/
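A back-of-the-envelope illustration of why gradients vanish (a scalar sketch with made-up numbers, not from the slides): backpropagating through a vanilla RNN multiplies the gradient by roughly w_hh * tanh'(a_t) at every timestep, and whenever that factor is below 1 in magnitude, the gradient shrinks geometrically with distance in time.

```python
import math

def grad_through_time(w_hh, pre_activations):
    """|d h_T / d h_0| for a scalar vanilla RNN: the product over timesteps
    of w_hh * tanh'(a_t), where tanh'(a) = 1 - tanh(a)**2."""
    g = 1.0
    for a in pre_activations:
        g *= w_hh * (1.0 - math.tanh(a) ** 2)
    return abs(g)

# With a recurrent weight of 0.9 and modest pre-activations, the factor per
# step is about 0.7, so the signal from 50 steps back is effectively zero.
print(grad_through_time(0.9, [0.5] * 5))   # still noticeable
print(grad_through_time(0.9, [0.5] * 50))  # vanishingly small
```

This is the 10-20 timestep limit in miniature; LSTMs address it by giving the gradient an additive path (the cell state) instead of this purely multiplicative one.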