Automatic text generation enables computers to summarize text, to hold conversations in customer service and other settings, and to customize content based on the characteristics and goals of the human interlocutor. Using neural networks to generate text automatically is appealing because they can be trained from examples, with no need to manually specify what should be said when. In this talk, we will provide an overview of the existing algorithms used in neural text generation, such as sequence-to-sequence models, reinforcement learning, variational methods, and generative adversarial networks. We will also discuss existing work on controlling the content of generated text by manipulating a latent code. The talk will conclude with a discussion of current challenges and shortcomings of neural text generation.
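All of these approaches share a common decoding loop: repeatedly sample the next token from the model's output distribution until an end-of-sequence marker appears. A minimal sketch, with an illustrative hand-written distribution standing in for a trained model:

```python
import random

random.seed(0)                          # reproducible toy run
vocab = ["hello", "world", "<eos>"]
probs = [0.5, 0.3, 0.2]                 # illustrative distribution, not a trained net

def sample_token():
    # Draw one token from the categorical distribution over the vocabulary.
    r = random.random()
    cum = 0.0
    for tok, p in zip(vocab, probs):
        cum += p
        if r < cum:
            return tok
    return vocab[-1]

tokens = []
while True:                             # decode until end-of-sequence (or a cap)
    tok = sample_token()
    tokens.append(tok)
    if tok == "<eos>" or len(tokens) >= 10:
        break
```

The methods in the talk differ mainly in how the distribution `probs` is produced and trained, not in this sampling loop.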
Part 2 of the Deep Learning Fundamentals Series, this session discusses Tuning Training (including hyperparameters and overfitting/underfitting), Training Algorithms (including learning rates and backpropagation), Optimization (including stochastic gradient descent, momentum, Nesterov Accelerated Gradient, RMSprop, and adaptive algorithms such as Adam and Adadelta), and a primer on Convolutional Neural Networks. The demos included in these slides run on Keras with a TensorFlow backend on Databricks.
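As a rough illustration of the optimizers named above, here is a minimal sketch of SGD with momentum and Adam minimizing a toy quadratic; the learning rates and decay constants are illustrative defaults, not values from the slides:

```python
def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2, minimized at w = 3.
    return 2.0 * (w - 3.0)

# --- SGD with momentum ---
w, v = 0.0, 0.0
lr, beta = 0.1, 0.9
for _ in range(200):
    v = beta * v + grad(w)     # velocity accumulates past gradients
    w -= lr * v
sgd_result = w

# --- Adam ---
w, m, s = 0.0, 0.0, 0.0
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 201):
    g = grad(w)
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    s = b2 * s + (1 - b2) * g * g      # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)          # bias correction for zero initialization
    s_hat = s / (1 - b2 ** t)
    w -= lr * m_hat / (s_hat ** 0.5 + eps)
adam_result = w
```

Both runs converge near the minimum at 3; the session's demos do the same thing at scale through Keras optimizer classes.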
This presentation on Recurrent Neural Networks will help you understand what a neural network is, which neural networks are popular, why we need recurrent neural networks, what a recurrent neural network is, how an RNN works, what the vanishing and exploding gradient problems are, and what LSTM is; you will also see a use-case implementation of LSTM (Long Short-Term Memory). The neural networks used in deep learning consist of layers connected to each other, modeled on the structure and function of the human brain. A network learns from huge volumes of data, using complex algorithms for training. A recurrent neural network works on the principle of saving the output of a layer and feeding it back to the input in order to predict the layer's output at the next step. Now let's dive into this presentation and understand what an RNN is and how it actually works.
Below topics are explained in this recurrent neural networks tutorial:
1. What is a neural network?
2. Popular neural networks
3. Why recurrent neural network?
4. What is a recurrent neural network?
5. How does an RNN work?
6. Vanishing and exploding gradient problem
7. Long short term memory (LSTM)
8. Use case implementation of LSTM
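The feedback principle described above, and the vanishing-gradient issue of topic 6, can be sketched in a few lines; the scalar weights here are illustrative, not trained values:

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    # h_t = tanh(W_x * x_t + W_h * h_{t-1} + b): the previous hidden
    # state is fed back in alongside the new input.
    return math.tanh(w_x * x + w_h * h_prev + b)

h = 0.0
for x in [1.0, 0.5, -0.3]:          # a toy input sequence
    h = rnn_step(x, h)

# Vanishing-gradient intuition (topic 6): backpropagation through time
# multiplies by w_h (and tanh' factors <= 1) once per step, so a long
# sequence shrinks the influence of early inputs toward zero.
influence = 0.8 ** 50
```

LSTM addresses exactly this decay by routing information through an additively updated cell state instead of repeated multiplications.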
Simplilearn’s Deep Learning course will transform you into an expert in deep learning techniques using TensorFlow, the open-source software library designed for machine learning and deep neural network research. With our deep learning course, you'll master deep learning and TensorFlow concepts, learn to implement algorithms, build artificial neural networks, and traverse layers of data abstraction to understand the power of data, preparing you for your new role as a deep learning scientist.
Why Deep Learning?
TensorFlow is one of the most popular software platforms used for deep learning and contains powerful tools to help you build and implement artificial neural networks.
Advancements in deep learning are being seen in smartphone applications, creating efficiencies in the power grid, driving advancements in healthcare, improving agricultural yields, and helping us find solutions to climate change. With this TensorFlow course, you’ll build expertise in deep learning models, learn to operate TensorFlow to manage neural networks, and interpret the results.
According to payscale.com, the median salary for engineers with deep learning skills tops $120,000 per year.
You can gain in-depth knowledge of Deep Learning by taking our Deep Learning certification training course. With Simplilearn’s Deep Learning course, you will prepare for a career as a Deep Learning engineer as you master concepts and techniques including supervised and unsupervised learning, mathematical and heuristic aspects, and hands-on modeling to develop algorithms.
Learn more at: https://www.simplilearn.com/
What is Machine Learning | Introduction to Machine Learning | Machine Learning... (Simplilearn)
This presentation on Introduction to Machine Learning will explain what Machine Learning is and how it works. By the end of this presentation, you will understand the types of Machine Learning, Machine Learning algorithms, and some of the breakthroughs in the Machine Learning industry. You will also learn what Machine Learning has to offer in terms of career opportunities.
This Machine Learning presentation will cover the following topics:
1. Real life applications of Machine Learning
2. Machine Learning Challenges
3. How did Machine Learning evolve?
4. Why Machine Learning / Machine Learning benefits
5. What is Machine Learning?
6. Types of Machine Learning (Supervised, Unsupervised & Reinforcement Learning)
7. Machine Learning algorithms
8. Breakthroughs in Machine Learning
9. Machine Learning Future
10. Machine Learning Career
11. Machine Learning job trends
- - - - - - - -
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection, and even self-driving cars. This Machine Learning course prepares engineers, data scientists, and other professionals with the knowledge and hands-on skills required for certification and job competency in Machine Learning.
- - - - - - -
Why learn Machine Learning?
Machine Learning is taking over the world, and with that comes a growing need among companies for professionals who know the ins and outs of Machine Learning.
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
- - - - - -
What skills will you learn from this Machine Learning course?
By the end of this Machine Learning course, you will be able to:
1. Master the concepts of supervised, unsupervised, and reinforcement learning, as well as modeling.
2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach that includes working on 28 projects and one capstone project.
3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning.
4. Understand the concepts and operation of support vector machines, kernel SVM, Naive Bayes, decision tree classifiers, random forest classifiers, logistic regression, K-nearest neighbors, K-means clustering, and more.
5. Model a wide variety of robust Machine Learning algorithms, including deep learning, clustering, and recommendation systems.
- - - - - - -
These slides cover machine learning models, specifically classification algorithms: Logistic Regression, Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Trees, Random Forests and Boosting, Support Vector Machines (SVM), and Neural Networks.
Support Vector Machine - How Support Vector Machine works | SVM in Machine Learning (Simplilearn)
This Support Vector Machine (SVM) presentation will help you understand the Support Vector Machine algorithm, a supervised machine learning algorithm that can be used for both classification and regression problems. It will help you learn where and when to use the SVM algorithm, how the algorithm works, what hyperplanes and support vectors are in SVM, how the distance margin helps in optimizing the hyperplane, how kernel functions in SVM transform data, and the advantages of the SVM algorithm. At the end, we will also implement the Support Vector Machine algorithm in Python to differentiate crocodiles from alligators in a given dataset.
Below topics are explained in this Support Vector Machine presentation:
1. What is Machine Learning?
2. Why support vector machine?
3. What is support vector machine?
4. Understanding support vector machine
5. Advantages of support vector machine
6. Use case in Python
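The hyperplane and margin ideas listed above can be sketched with a tiny linear SVM trained by sub-gradient descent on the hinge loss; the 2-D toy data and hyperparameters are illustrative, not taken from the presentation:

```python
# Linearly separable toy data: (features, label) with labels in {+1, -1}.
data = [((2.0, 2.0), 1), ((3.0, 3.5), 1), ((-2.0, -1.5), -1), ((-3.0, -2.0), -1)]
w = [0.0, 0.0]
b = 0.0
lr, lam = 0.01, 0.01   # learning rate and regularization strength

for _ in range(200):
    for (x1, x2), y in data:
        margin = y * (w[0] * x1 + w[1] * x2 + b)
        if margin < 1:                       # inside the margin: hinge-loss gradient
            w[0] += lr * (y * x1 - lam * w[0])
            w[1] += lr * (y * x2 - lam * w[1])
            b += lr * y
        else:                                # outside the margin: only shrink w
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]

def predict(x1, x2):
    # Classify by which side of the hyperplane w.x + b = 0 the point falls on.
    return 1 if w[0] * x1 + w[1] * x2 + b >= 0 else -1
```

Only the points with margin below 1 ever move `w`, which is the sub-gradient view of "support vectors"; kernel SVMs replace the dot product with a kernel function.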
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutorial (Simplilearn)
This Deep Learning presentation will help you understand what Deep Learning is, why we need Deep Learning, and the applications of Deep Learning, along with a detailed explanation of neural networks and how they work. Deep Learning is inspired by the structure and function of the human brain, realized in artificial neural networks. These networks, which mimic the brain's decision-making process, use complex algorithms that process data in a non-linear way, learning in an unsupervised manner to make choices based on the input. This Deep Learning tutorial is ideal for professionals with beginner to intermediate levels of experience. Now, let us dive deep into this topic and understand what Deep Learning actually is.
Below topics are explained in this Deep Learning Presentation:
1. What is Deep Learning?
2. Why do we need Deep Learning?
3. Applications of Deep Learning
4. What is a Neural Network?
5. Activation Functions
6. Working of Neural Network
There is booming demand for skilled deep learning engineers across a wide range of industries, making this deep learning course with TensorFlow training well-suited for professionals at the intermediate to advanced level of experience. We recommend this deep learning online course particularly for the following professionals:
1. Software engineers
2. Data scientists
3. Data analysts
4. Statisticians with an interest in deep learning
This talk is about how we applied deep learning techniques to achieve state-of-the-art results in various NLP tasks, such as sentiment analysis and aspect identification, and how we deployed these models at Flipkart.
Presentation at Advanced Intelligent Systems for Sustainable Development (AISSD 2021), 20-22 August 2021, organized by the Scientific Research Group in Egypt in collaboration with the Faculty of Computers and AI, Cairo University, and the Chinese University in Egypt.
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding (MLconf)
The Brain’s Guide to Dealing with Context in Language Understanding
Like the visual cortex, the regions of the brain involved in understanding language represent information hierarchically. But whereas the visual cortex organizes things into a spatial hierarchy, the language regions encode information into a hierarchy of timescales. This organization is key to our uniquely human ability to integrate semantic information across narratives. More and more, deep learning-based approaches to natural language understanding embrace models that incorporate contextual information at varying timescales. This has not only led to state-of-the-art performance on many difficult natural language tasks, but also to breakthroughs in our understanding of brain activity.
In this talk, we will discuss the important connection between language understanding and context at different timescales. We will explore how different deep learning architectures capture timescales in language and how closely their encodings mimic the brain. Along the way, we will uncover some surprising discoveries about what depth does and doesn’t buy you in deep recurrent neural networks. And we’ll describe a new, more flexible way to think about these architectures and ease design space exploration. Finally, we’ll discuss some of the exciting applications made possible by these breakthroughs.
Building a Neural Machine Translation System From Scratch (Natasha Latysheva)
Human languages are complex, diverse and riddled with exceptions – translating between different languages is therefore a highly challenging technical problem. Deep learning approaches have proved powerful in modelling the intricacies of language, and have surpassed all statistics-based methods for automated translation. This session begins with an introduction to the problem of machine translation and discusses the two dominant neural architectures for solving it – recurrent neural networks and transformers. A practical overview of the workflow involved in training, optimising and adapting a competitive neural machine translation system is provided. Attendees will gain an understanding of the internal workings and capabilities of state-of-the-art systems for automatic translation, as well as an appreciation of the key challenges and open problems in the field.
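The recurrent encoder-decoder workflow described above can be sketched schematically: an encoder folds the source sequence into a context vector, and a decoder unrolls from that context. The scalar weights below are illustrative and untrained; a real system learns them from parallel corpora:

```python
import math

def encode(tokens, w_x=0.3, w_h=0.7):
    # The encoder folds the whole source sequence into one context value.
    h = 0.0
    for x in tokens:
        h = math.tanh(w_x * x + w_h * h)
    return h

def decode(context, steps=3, w_h=0.9):
    # The decoder unrolls from the context, emitting one value per step.
    h = context
    out = []
    for _ in range(steps):
        h = math.tanh(w_h * h)
        out.append(h)
    return out

context = encode([1.0, 2.0, 3.0])       # a toy "source sentence"
translation = decode(context)           # a toy "target sequence"
```

The bottleneck of squeezing the whole source into one context vector is exactly what attention, and ultimately the transformer, was introduced to relieve.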
Demystifying NLP Transformers: Understanding the Power and Architecture behind... (NILESH VERMA)
In this SlideShare presentation, we delve into the intricate world of NLP Transformers, exploring their underlying architecture and uncovering their immense power in Natural Language Processing (NLP). Join us as we demystify the complexities and provide a comprehensive overview of how Transformers revolutionize tasks such as machine translation, sentiment analysis, question answering, and more. Gain valuable insights into the transformer model, attention mechanisms, self-attention, and the transformer encoder-decoder structure. Whether you're an NLP enthusiast or a beginner, this presentation will equip you with a solid foundation to comprehend and harness the potential of NLP Transformers.
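The scaled dot-product self-attention at the heart of the transformer can be sketched in plain Python; the tiny token vectors are illustrative, and real models add learned query/key/value projections and multiple heads:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(queries, keys, values):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)          # how much this position attends to each other position
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three toy 2-D token vectors
y = attention(x, x, x)                      # self-attention: Q = K = V = x
```

Each output row is a weighted mixture of all token vectors, which is how every position sees context from the whole sequence in a single step, with no recurrence.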
DotNet 2019 | Pablo Doval - Recurrent Neural Networks with TF2.0 (Plain Concepts)
In this session we will explore Recurrent Neural Networks (RNNs) - a type of neural network specially designed to process sequences - and their applications to time series and text processing (NLP). To make the session even more interesting, all the code will be developed using the latest version of TensorFlow 2.0; the implementation of the models will be used to discuss the major changes with respect to the 1.x versions of the deep learning framework, and the session will leverage MLflow within Azure Databricks as a development platform and for model serving.
Word embeddings have received a lot of attention since Tomas Mikolov published word2vec in 2013 and showed that the embeddings the neural network learned by “reading” a large corpus of text preserved semantic relations between words. As a result, this type of embedding started being studied in more detail and applied to more serious NLP and IR tasks such as summarization, query expansion, etc. More recently, researchers and practitioners alike have come to appreciate the power of this type of approach and have started a cottage industry of adapting Mikolov’s original approach to many different areas.
In this talk we will cover the implementation and mathematical details underlying tools like word2vec and some of the applications word embeddings have found in various areas. Starting from an intuitive overview of the main concepts and algorithms underlying the neural network architecture used in word2vec we will proceed to discussing the implementation details of the word2vec reference implementation in tensorflow. Finally, we will provide a birds eye view of the emerging field of “2vec" (dna2vec, node2vec, etc...) methods that use variations of the word2vec neural network architecture.
This (short) version of the Tutorial was presented at #AIWTB https://ai.withthebest.com/. See https://bmtgoncalves.github.io/word2vec-and-friends/ for further details on future (and longer) editions and sign up to http://tinyletter.com/dataforscience for related news and updates.
This (long) version of the Tutorial was presented at #O'Reilly AI 2017 in San Francisco. See https://bmtgoncalves.github.io/word2vec-and-friends/ for further details.
Neural networks for word embeddings have received a lot of attention since some Googlers published word2vec in 2013. They showed that the internal state (embeddings) that the neural network learned by "reading" a large corpus of text preserved semantic relations between words.
As a result, this type of embedding started being studied in more detail and applied to more serious Natural Language Processing (NLP) and IR tasks such as summarization, query expansion, etc.
In this talk we will cover the intuitions and algorithms underlying the word2vec family of algorithms. In the second half of the presentation we will quickly review the basics of TensorFlow and analyze in detail the TensorFlow reference implementation of word2vec.
Word embedding, vector space model, language modelling, neural language model, Word2Vec, GloVe, fastText, ELMo, BERT, DistilBERT, RoBERTa, SBERT, Transformer, Attention
(date is wrong, sorry, it was January 28th, 2023)
So many intelligent robots have come and gone, failing to become a commercial success. We’ve lost Aibo, Romo, Jibo, Baxter, Scout, and even Alexa is reducing staff. I posit that they failed because you can’t talk to them, not really. AI has recently made substantial progress, speech recognition now actually works, and we have neural networks such as ChatGPT that produce astounding natural language. But you can’t just throw a neural network into a robot because there isn’t anybody home in those networks—they are just purposeless mimicry agents with no understanding of what they are saying. This lack of understanding means that they can’t be relied upon because the mistakes they make are ones that no human would make, not even a child. Like a child first learning about the world, a robot needs a mental model of the current real-world situation and must use that model to understand what you say and generate a meaningful response. This model must be composable to represent the infinite possibilities of our world, and to keep up in a conversation, it must be built on the fly using human-like, immediate learning. This talk will cover what is required to build that mental model so that robots can begin to understand the world as well as human children do. Intelligent robots won’t be a commercial success based on their form—they aren’t as cute and cuddly as cats and dogs—but robots can be much more if we can talk to them.
Moving Your Machine Learning Models to Production with TensorFlow Extended (Jonathan Mugan)
ML is great fun, but now we want it to solve real problems. To do this, we need a way of keeping track of all of our data and models, and we need to know when our models fail and why. This talk will cover how to move ML to production with TensorFlow Extended (TFX). TFX is used by Google internally for machine-learning model development and deployment, and it has recently been made public. TFX consists of multiple pipeline elements and associated components, and this talk will cover them all, but three elements are particularly interesting: TensorFlow Data Validation, TensorFlow Model Analysis, and the What-If Tool.
The TensorFlow Data Validation library analyses incoming data and computes distributions over the feature values. This can show us which features may not be useful, maybe because they always have the same value, or which features may contain bugs. TensorFlow Model Analysis allows us to understand how well our model performs on different slices of the data. For example, we may find that our predictive models are more accurate for events that happen on Tuesdays, and such knowledge can be used to help us better understand our data and our business. The What-If Tool is an interactive tool that allows you to change data and see what the model would say if a particular record had a particular feature value. It lets you probe your model, and it can automatically find the closest record with a different predicted label, which allows you to learn what the model is homing in on. Machine learning is growing up.
There are lots of frameworks for building chatbots, but those abstractions can obscure understanding and hinder application development. In this talk, we will cover building chatbots from the ground up in Python. This can be done with either classic NLP or deep learning. We will cover both approaches, but this talk will focus on how one can build a chatbot using spaCy, pattern matching, and context-free grammars.
From Natural Language Processing to Artificial Intelligence (Jonathan Mugan)
Overview of natural language processing (NLP) from both symbolic and deep learning perspectives. Covers tf-idf, sentiment analysis, LDA, WordNet, FrameNet, word2vec, and recurrent neural networks (RNNs).
Deep Learning for Natural Language Processing (Jonathan Mugan)
Deep Learning represents a significant advance in artificial intelligence because it enables computers to represent concepts using vectors instead of symbols. Representing concepts using vectors is particularly useful in natural language processing, and this talk will elucidate those benefits and provide an understandable introduction to the technologies that make up deep learning. The talk will outline ways to get started in deep learning, and it will conclude with a discussion of the gaps that remain between our current technologies and true computer understanding.
What Deep Learning Means for Artificial Intelligence (Jonathan Mugan)
Describes deep learning as applied to natural language processing, computer vision, and robot actions. Also discusses what deep learning still can't do.
2. Computers are Illiterate
Reading requires mapping the words on a page to shared concepts in our culture. Writing requires mapping those shared concepts into other words on a page.
We currently don’t know how to endow computer systems with a conceptual system rich enough to represent even what a small child knows.
However, AI researchers have developed a method that can
1. encode representations of our world, such as images or words on a page, into a space of meanings (proto-read)
2. convert points in that meaning space into language (proto-write)
This method uses neural networks, so we call it neural text generation.
3. Overview of neural text generation
[Diagram: world state → encoder → meaning space → decoder 1 … decoder n → generated text]
Neural text generation is useful for
• Machine translation
• Image captioning
• Style manipulation
4. Overview of neural text generation
[Diagram: world state → encoder → meaning space → decoder 1 … decoder n → generated text]
• Imagine having the text on your website customized for each user.
• Imagine being able to have your boring HR manual translated into the style of Cormac McCarthy.
The coffee in the break room is available to all employees, like some primordial soup handed out to the huddled and wasted damned.
5. Limited by a lack of common sense
[Diagram: world state → encoder → meaning space → decoder 1 … decoder n → generated text]
• It lacks precision: if you ask it to make you a reservation on Tuesday, it may make it on Wednesday because the word vectors are similar (also see [0]).
• It also makes mistakes that no human would make: in image captioning it can mistake a toothbrush for a baseball bat [1].
[0] A Random Walk Through EMNLP 2017 http://approximatelycorrect.com/2017/09/26/a-random-walk-through-emnlp-2017/
[1] Deep Visual-Semantic Alignments for Generating Image Descriptions http://cs.stanford.edu/people/karpathy/deepimagesent/
The beautiful: as our neural networks get richer, the meaning space will get closer to being able to represent the concepts a small child can understand, and we will get closer to human-level literacy. We are taking baby steps toward real intelligence.
7. Outline
• Introduction
• Learning from pairs of sentences
• Overview
• Encoders
• Decoders
• Attention
• Uses
• Mapping to and from meaning space
• Adversarial text generation
• Conclusion
8. Sequence-to-sequence models
A sequence-to-sequence (seq2seq) model
1. encodes a sequence of tokens, such as a sentence, into a single vector
2. decodes that vector into another sequence of tokens
Both the encoding and the decoding are often done using recurrent neural networks (RNNs).
The initial big application for this was machine translation, for example where the source sentences are English and the target sentences are Spanish.
10. Decoding the sentence into Spanish
[Diagram: decoder RNN emitting “La” (h1), “mujer” (h2), “comió” (h3), “pizza” (h4), “.” (h5)]
The model begins with the last hidden state from the encoder and generates words until a special stop symbol is generated.
Note that the lengths of the source and target sentences do not need to be the same, because decoding continues until the stop symbol is generated.
12. Outline
• Introduction
• Learning from pairs of sentences
• Overview
• Encoders
• Decoders
• Attention
• Uses
• Mapping to and from meaning space
• Adversarial text generation
• Conclusion
13. Concrete example of an RNN encoder
[Diagram: encoder RNN reading “The” (h1), “woman” (h2), “ate” (h3), “pizza” (h4), “.” (h5)]
Let’s look at a simple cell so we can get a grounded understanding.
Vocabulary table (each row is actually 300 dimensions):
aardvark 1.32 1.56 0.31 –1.21 0.31
ate 0.36 –0.26 0.31 –1.99 0.11
... –0.69 0.33 0.77 0.22 –1.29
zoology 0.41 0.21 –0.32 0.31 0.22
• V: vocabulary table of size 50,000 by 300
• v_t: the word vector for the current word, a row from V
• W_h: transition matrix of size 300 by 300
• W_x: input matrix of size 300 by 300
• b: a bias vector of size 300
• h_t = tanh(h_{t−1} W_h + v_t W_x + b)
Parameters start off as random values and are updated through backpropagation.
See Christopher Olah’s blog post for a great presentation of more complex cells like LSTMs: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
14. Convolutional Neural Network (CNN) Text Encoder
Example sentence: “Tom ate with his brother”
[Tables: the word vector for each word in the sentence, taken from the vocabulary table (each row is actually 300 dimensions), and an example filter three words wide (size 300 by 3)]
1) For each word in the sentence, get its word vector from the vocabulary table.
2) Slide each filter over the sentence and element-wise multiply and add (dot product); then take the max of the values seen.
• 128 filters of width 3, 4, and 5; a total of 384 filters
• The result is a sentence vector of length 384 that represents the sentence.
3) To decide which class: each class has a vector of weights of length 384; the class vector is multiplied element-wise by the sentence vector and summed (dot product). The class with the highest sum wins.
• See the code by Denny Britz https://github.com/dennybritz/cnn-text-classification-tf
• README.md contains a reference to his blog post and the paper by Y. Kim
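Steps 1)–3) can be sketched in NumPy with toy sizes (4-dimensional word vectors instead of 300, and 2 filters per width instead of 128, so the sentence vector has length 6 instead of 384); the sentence and class vectors here are random placeholders, not learned values.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4            # embedding size (300 in the slides)
widths = [3, 4, 5]
n_per_width = 2  # 128 per width in the slides

# 1) Word vectors for the sentence "Tom ate with his brother" (random here).
sentence = rng.normal(size=(5, D))

# One bank of filters per width; each filter is width x D.
filters = {w: rng.normal(size=(n_per_width, w, D)) for w in widths}

def encode(sent):
    feats = []
    for w, bank in filters.items():
        for f in bank:
            # 2) Slide the filter over the sentence: element-wise multiply
            # and add (dot product) at each position, then take the max.
            scores = [np.sum(sent[i:i + w] * f)
                      for i in range(len(sent) - w + 1)]
            feats.append(max(scores))
    return np.array(feats)  # sentence vector (length 6 here, 384 in the slides)

s = encode(sentence)

# 3) Classification: dot the sentence vector with each class's weight vector;
# the class with the highest sum wins.
class_vecs = rng.normal(size=(3, len(s)))
predicted = int(np.argmax(class_vecs @ s))
```

Max-over-positions pooling is what makes the sentence vector a fixed length regardless of how many words the sentence has.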
15. Outline
• Introduction
• Learning from pairs of sentences
• Overview
• Encoders
• Decoders
• Attention
• Uses
• Mapping to and from meaning space
• Adversarial text generation
• Conclusion
16. Concrete example of an RNN decoder
[Diagram: decoder RNN emitting “La” (h1), “mujer” (h2), “comió” (h3), “pizza” (h4), “.” (h5), starting from the encoder’s last hidden state]
• V: vocabulary table of size 50,000 by 300
• v_{t−1}: the word vector for the previously generated word, a row from V; v_2 is “mujer”
• W′_h: transition matrix of size 300 by 300
• W′_x: input matrix of size 300 by 300
• b: a bias vector of size 300
• h_t = tanh(h_{t−1} W′_h + v_{t−1} W′_x + b)
• W′_o: output matrix of size 300 by 50,000
• p(words) = softmax(h_t W′_o): a probability distribution over next words
• The softmax function makes them all sum to 1: softmax(x_i) = e^{x_i} / Σ_j e^{x_j}
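One decoder step can be sketched in NumPy with toy sizes (4-dimensional states instead of 300, a 6-word vocabulary instead of 50,000); the weights are random stand-ins for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
D, V_SIZE = 4, 6  # 300 and 50,000 in the slides

W_h = rng.normal(size=(D, D))       # W'_h: transition matrix
W_x = rng.normal(size=(D, D))       # W'_x: input matrix
W_o = rng.normal(size=(D, V_SIZE))  # W'_o: output matrix
b = np.zeros(D)

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def decoder_step(h_prev, v_prev):
    """One step: a new hidden state and a distribution over next words."""
    h = np.tanh(h_prev @ W_h + v_prev @ W_x + b)
    p = softmax(h @ W_o)
    return h, p

h0 = rng.normal(size=D)  # stand-in for the encoder's last hidden state
v_start = np.zeros(D)    # stand-in embedding for a start symbol
h1, p = decoder_step(h0, v_start)
```

Softmax guarantees that `p` is a valid probability distribution: every entry is non-negative and the entries sum to 1.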
17. Models are trained with paired sentences to minimize cost
[Diagram: decoder RNN emitting “La” (h1), “mujer” (h2), “comió” (h3), “pizza” (h4), “.” (h5)]
At time t = 6 we know the correct word is “comió”, so the cost is
cost = −log(p(comió))
So the more likely “comió” is, the lower the cost.
Also, note that regardless of what the model says at time t − 1, we feed in “mujer” at time t. This is called teacher forcing.
As before, p(words) = softmax(h_t W′_o), using the output matrix W′_o of size 300 by 50,000, and the softmax makes the probabilities sum to 1: softmax(x_i) = e^{x_i} / Σ_j e^{x_j}
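The per-step cost is just the negative log-probability of the correct word. A tiny sketch, using a made-up distribution over a 4-word toy vocabulary:

```python
import numpy as np

def step_cost(p, target_index):
    """cost = -log p(correct word) for one time step."""
    return -np.log(p[target_index])

# A toy distribution over a 4-word vocabulary; suppose index 2 is "comió".
p_good = np.array([0.1, 0.2, 0.6, 0.1])
cost = step_cost(p_good, 2)

# The more likely the correct word, the lower the cost:
p_bad = np.array([0.4, 0.3, 0.2, 0.1])
worse = step_cost(p_bad, 2)
```

During training these per-step costs are summed over the target sentence, and with teacher forcing the true previous word (not the model's guess) is fed in at each step.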
18. When it comes time to generate text
When you are actually using the model to generate text, you don’t know what the right answer is.
You could do a greedy decoder (too cheap)
• Take the most likely first word, then given that word, take the most likely next word, and so on.
• But that could lead you down a suboptimal path.
You could look at all sequences of words up to a given length (too expensive)
• Choose the sequence with the lowest cost.
• But there are far too many.
A compromise is beam search
• Start generating with the first word.
• Keep the k (around 10) most promising sequences.
Beam search can sometimes lack diversity, and there have been proposals to address this; see Li and Jurafsky, https://arxiv.org/pdf/1601.00372v2.pdf and Ashwin et al., https://arxiv.org/pdf/1610.02424v1.pdf
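The compromise above can be sketched as a small beam search. The language model here is a hypothetical hand-written table of next-word probabilities (not a trained network), chosen so that the greedy path is suboptimal:

```python
import math

def beam_search(next_probs, k=2, max_len=4, stop="</s>"):
    """Keep the k most promising sequences by total log-probability.

    next_probs(prefix) -> {word: probability} for the next word.
    """
    beams = [([], 0.0)]  # (sequence so far, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, lp in beams:
            for word, p in next_probs(seq).items():
                cand = (seq + [word], lp + math.log(p))
                (finished if word == stop else candidates).append(cand)
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
        if not beams:
            break
    best = max(finished + beams, key=lambda c: c[1])
    return best[0]

# A made-up toy language model where greedy decoding goes wrong:
def next_probs(seq):
    if not seq:
        return {"la": 0.6, "una": 0.4}
    if seq[-1] == "la":
        return {"mujer": 0.5, "</s>": 0.5}
    if seq[-1] == "una":
        return {"mujer": 0.9, "</s>": 0.1}
    return {"</s>": 1.0}

best = beam_search(next_probs, k=2)
```

A greedy decoder would commit to “la” (probability 0.6) at the first step, but the beam keeps “una” alive, and “una mujer” (0.4 × 0.9 = 0.36) beats every sequence starting with “la” (at most 0.6 × 0.5 = 0.30).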
19. Outline
• Introduction
• Learning from pairs of sentences
• Overview
• Encoders
• Decoders
• Attention
• Uses
• Mapping to and from meaning space
• Adversarial text generation
• Conclusion
20. Attention helps the model consider more information
The network learns how to determine which previous memory is most relevant at each decision point [Bahdanau et al., 2014].
[Diagram: encoder states h1–h5 for “The woman ate tacos .” feeding, via attention, into decoder states h1–h5 emitting “La mujer comió tacos .”]
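The core of attention can be sketched in NumPy. Note that Bahdanau et al. use a small learned network to score encoder states; this sketch substitutes a simpler dot-product score to show the mechanism, and the states are random placeholders:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_h, encoder_hs):
    """Score each encoder state against the decoder state, then take a
    probability-weighted average of the encoder states (the context)."""
    scores = encoder_hs @ decoder_h  # one relevance score per source word
    weights = softmax(scores)        # normalize to a distribution
    context = weights @ encoder_hs   # weighted sum of encoder memories
    return context, weights

rng = np.random.default_rng(0)
encoder_hs = rng.normal(size=(5, 4))  # h1..h5 for "The woman ate tacos ."
decoder_h = rng.normal(size=4)        # current decoder state
context, weights = attend(decoder_h, encoder_hs)
```

The context vector is then fed into the decoder step alongside the previous word, so the decoder is no longer limited to the single vector the encoder produced.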
22. Outline
• Introduction
• Learning from pairs of sentences
• Overview
• Encoders
• Decoders
• Attention
• Uses
• Mapping to and from meaning space
• Adversarial text generation
• Conclusion
23. Seq2seq works on all kinds of sentence pairs
For example, take a bunch of dialogs and use machine learning to predict what the next statement will be given the last statement.
• Movie and TV subtitles: OpenSubtitles http://opus.lingfil.uu.se/OpenSubtitles.php
• Mined Twitter data: look for replies to tweets using the API
• Ubuntu Dialog Corpus: dialogs of people wanting technical support https://arxiv.org/abs/1506.08909
Another example is text summarization, e.g., learning to generate headlines for news articles. See https://memkite.com/deeplearningkit/2016/04/23/deep-learning-for-text-summarization/
24. Style and more
• You can even do style transfer [0].
• They had Mechanical Turk workers write different styles of sentences for training.
• We need to get them on the Cormac McCarthy task.
[0] Seq2seq style transfer: Dear Sir or Madam, May I Introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer https://arxiv.org/pdf/1803.06535.pdf
25. Can even automatically caption images
[Diagram: a convolutional neural network encodes the image; its output feeds an RNN decoder emitting “Weird” (h1), “dude” (h2), “on” (h3), “table” (h4), “.” (h5)]
Karpathy and Fei-Fei, 2015 http://cs.stanford.edu/people/karpathy/deepimagesent/
Vinyals et al., 2014 http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html
Chavo protects my office from Evil spirits.
Seq2seq takeaway: works well if you have lots of pairs of things to train on.
26. Outline
• Introduction
• Learning from pairs of sentences
• Mapping to and from meaning space
• Meaning space should be smooth
• Variational methods ensure smoothness
• Navigating the meaning space
• Moving beyond pairs of sentences
• Conclusion
27. Outline
• Introduction
• Learning from pairs of sentences
• Mapping to and from meaning space
• Meaning space should be smooth
• Variational methods ensure smoothness
• Navigating the meaning space
• Moving beyond pairs of sentences
• Conclusion
28. Meaning Space Should be Smooth
Meaning space: a mapping of concepts and ideas to points in a continuous, high-dimensional grid. We want:
• Semantically meaningful things to be near each other.
• To be able to interpolate between sentences.
• It could be the last state of the RNN or the output of the CNN.
• But we want to use the space as a representation of meaning.
• We can get more continuity in the space when we use a bias to encourage the neural network to represent the space compactly [0].
[Diagram: meaning space containing the example sentences “This is an outrage!”, “tacos are me”, “I love tacos”, “we are all tacos”]
[0] Generating Sentences from a Continuous Space, https://arxiv.org/abs/1511.06349
29. Meaning Space Should be Smooth
• Get the bias by adding a prior probability distribution as an organizational constraint.
• The prior wants to put all points at (0,0,...,0) unless there is a good reason not to.
• That “good reason” is the meaning we want to capture.
• By forcing the system to maintain that balance, we encourage it to organize such that concepts that are near each other in space have similar meaning.
[Diagram: meaning space containing the example sentences “This is an outrage!”, “tacos are me”, “I love tacos”, “we are all tacos”]
30. Outline
• Introduction
• Learning from pairs of sentences
• Mapping to and from meaning space
• Meaning space should be smooth
• Variational methods ensure smoothness
• Navigating the meaning space
• Moving beyond pairs of sentences
• Conclusion
31. 3131
Variational methods provide structure to meaning space
Variational methods can provide this
organizational constraint in the form of a prior.
• In the movie Arrival, variational principles in
physics allowed one to know the future.
• In our world, "variational" comes from the
calculus of variations: optimization over a space
of functions.
• Variational inference means finding a
simpler distribution that closely approximates the
true one.
[Figure: meaning space with example sentences]
32.
Variational methods provide structure to meaning space
[Figure: meaning space with example sentences]
Our simpler function for p(h|x) is
q(h|x) = N(μ, σ).
We use a neural network (RNN) to
output μ and σ.
To maximize this approximation, it turns out that we
maximize the evidence lower bound (ELBO):
ℒ(x, q) = E_{q(h|x)}[log p(x|h)] − KL(q(h|x) || p(h)).
See Kingma and Welling https://arxiv.org/pdf/1312.6114.pdf
In practice, we only take
one sample and we use
the reparameterization
trick to keep it
differentiable.
33.
Variational methods provide structure to meaning space
Note that the equation
ℒ(x, q) = E_{q(h|x)}[log p(x|h)] − KL(q(h|x) || p(h))
has the prior p(h).
The prior p(h) = N(0, I) forces the algorithm to use the space efficiently,
which forces it to put semantically similar latent representations
together. Without this prior, the algorithm could put h values anywhere
in the latent space that it wanted, willy-nilly.
[Figure: the N(0, I) prior pulling latent points together in meaning space, with example sentences]
34.
Trained using an autoencoder
The prior acts like regularization, so we are
1. maximizing the reconstruction fidelity 𝑝 𝑥 ℎ and
2. minimizing the difference of ℎ from the prior 𝑁(0, 𝐼).
Can be tricky to balance these things. Use KL annealing,
see Bowman et al., https://arxiv.org/pdf/1511.06349.pdf
[Diagram: text x → encoder → q(h|x) near prior N(0, I) → decoder → text x]
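The KL annealing trick from Bowman et al. can be sketched as a simple weight schedule; the warmup length is an assumption here, tuned per dataset in practice.

```python
# KL annealing: early in training the KL term's weight ramps from 0 to 1,
# so the decoder first learns to reconstruct, and only then is the latent
# space gradually regularized toward the prior.

def kl_weight(step, warmup_steps=10000):
    return min(1.0, step / warmup_steps)

def vae_loss(reconstruction_nll, kl, step):
    # total loss = reconstruction term + annealed KL term
    return reconstruction_nll + kl_weight(step) * kl
```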
35.
We can also do classification over meaning space
When we train with variational
methods to regenerate the input text,
this is called a variational
autoencoder.
This can be the basis for learning with less
labeled training data (semi-supervised learning).
See Kingma et al., https://arxiv.org/pdf/1406.5298.pdf
[Figure: a classifier applied over the meaning space, with the N(0, I) prior and example sentences]
36.
We can place a hierarchical structure on the meaning space
[Figure: hierarchical meaning space with clusters: Food (tacos, candy), computers (GPUs, VIC-20), Cars (old, self-driving)]
In the vision domain, Goyal et al. allow
one to learn a hierarchy on the prior space
using a hierarchical Dirichlet process
https://arxiv.org/pdf/1703.07027.pdf
Videos on understanding Dirichlet processes and the related
Chinese restaurant processes by Tamara Broderick
part I https://www.youtube.com/watch?v=kKZkNUvsJ4M
part II https://www.youtube.com/watch?v=oPcv8MaESGY
part III https://www.youtube.com/watch?v=HpcGlr19vNk
The Dirichlet process allows for an infinite
number of clusters. As the world changes and new
concepts emerge, you can keep adding clusters.
37.
Outline
• Introduction
• Seq2seq models and aligned data
• Representing text in meaning space
• Meaning space should be smooth
• Variational methods ensure smoothness
• Navigating the meaning space
• Adversarial text generation
• Conclusion
38.
We can learn functions to find desired parts of the meaning space
In the vision domain, Engel et al. allow
one to translate a point in meaning
space to a new point with desired
characteristics
https://arxiv.org/pdf/1711.05772.pdf
[Figure: moving from "Bob" to "Bob w/ glasses" in meaning space via f(Bob)]
39.
We can learn functions to find desired parts of the meaning space
In the text domain, [0] provides a way to
rewrite sentences to change them
according to some function, such as
• making the sentiment more positive or
• making a Shakespearean sentence
sound more modern.
[0] Sequence to Better Sequence: Continuous Revision of Combinatorial Structures, http://www.mit.edu/~jonasm/info/Seq2betterSeq.pdf
40.
We can learn functions to find desired parts of the meaning space
Since the space is smooth, we might be able to hill climb to get whatever digital
artifacts we wanted, such as computer programs.
[see Liang et al., https://arxiv.org/pdf/1611.00020.pdf and Yang et al., https://arxiv.org/pdf/1711.06744.pdf]
This movie looks pretty good, but make it a little more steampunk ...
[0] Sequence to Better Sequence: Continuous Revision of Combinatorial Structures, http://www.mit.edu/~jonasm/info/Seq2betterSeq.pdf
Method [0]
1. Encode sentence 𝑠 using a
variational autoencoder to get ℎ.
2. Look around the meaning space to find
the point ℎ∗ of the nearby space that
maximizes your evaluation function.
3. Decode ℎ∗ to create a modified
version of sentence 𝑠
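Step 2 of the method above can be sketched as simple local search ("hill climbing") around a latent point. The encoder, decoder, and evaluation function are toy stand-ins invented for illustration; the real system scores candidates with a learned model.

```python
import random

def hill_climb(h, score, steps=200, radius=0.1, seed=0):
    """Randomly perturb h, keeping only moves that raise score(h)."""
    rng = random.Random(seed)
    best, best_score = list(h), score(h)
    for _ in range(steps):
        candidate = [x + rng.uniform(-radius, radius) for x in best]
        s = score(candidate)
        if s > best_score:  # keep only improving moves
            best, best_score = candidate, s
    return best

# Toy evaluation function: pretend "positivity" peaks at the point (1, 1).
score = lambda h: -((h[0] - 1.0) ** 2 + (h[1] - 1.0) ** 2)
h_star = hill_climb([0.0, 0.0], score)
# Decoding h_star would then give the revised sentence.
```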
42.
Outline
• Introduction
• Learning from pairs of sentences
• Mapping to and from meaning space
• Moving beyond pairs of sentences
• Bridging meanings through shared space
• Adversarial text generation
• Learning through reinforcement learning
• Conclusion
43.
You have pairs of languages, but not the right ones
• You may have pairs of languages or styles but not the pairs you want.
• For example, you have lots of translations from English to Spanish and from English
to Tagalog but you want to translate directly from Spanish to Tagalog.
Google faced this scenario with language translation [0].
• Use a single encoder and decoder for all languages.
• Use word segments so there is one vocabulary table; and
the encoder can encode all languages to the same space.
• Add the target language as the first symbol of the source
text, so the decoder knows the language to use.
[0] Zero-Shot Translation with Google’s Multilingual Neural Machine Translation System, https://research.googleblog.com/2016/11/zero-shot-translation-with-googles.html
[Diagram: source-language text plus target-language symbol → encoder → meaning space → decoder → target-language text]
Afterward, they could
translate between pairs of
languages even when
there was no pairwise
training data.
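The input convention above is easy to sketch: prepend a token naming the target language, so one shared encoder/decoder can serve every pair. The exact token format here (`<2xx>`) is an illustrative assumption, not necessarily the one Google used.

```python
# Prepend a target-language symbol so the decoder knows which language
# to produce; the rest of the input is ordinary source-language tokens.

def make_input(source_tokens, target_lang):
    return ["<2{}>".format(target_lang)] + source_tokens

inp = make_input(["la", "mujer", "comió"], "tl")
# Because all languages share one encoder and one meaning space, zero-shot
# pairs (e.g. Spanish → Tagalog) work without direct parallel data.
```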
44.
You have no pairs at all
[Diagram: meaning space → decoder → generated text, one decoder per style]
Imagine you have text written in two styles, but it is not tied together in pairs.
For each style, you can learn a language model,
which gives a probability distribution over the next
word given the previous ones.
Language models are the technology
behind those posts of the form:
"I forced my computer to watch
1000 hours of Hee Haw and this
was the result."
But how do you associate your language model with a meaning space?
How do you get the probability of the first word?
45.
Assign a meaning space using an autoencoder!
You can associate a language model with a meaning space using an autoencoder.
But how can you line up the meaning spaces of the different languages and styles so
you can translate between them?
[Diagram: text x → encoder → q(h|x) near prior N(0, I) → decoder → text x]
46.
1. Train languages 𝐴 and 𝐵 with autoencoders using
same encoder and different decoders.
2. Start with a sentence 𝑏 ∈ 𝐵 (start with 𝐵 because we want to train in the other direction)
3. Encode 𝑏 and decode it into 𝑎 ∈ 𝐴
4. You now have a synthetic pair (𝑎, 𝑏)
5. Train using the synthetic pairs
Bridging meaning spaces with synthetic pairs [0]
[Diagram: language A or B → shared encoder → meaning space → decoder A / decoder B → generated A / generated B]
[0] Unsupervised Neural Machine Translation
https://arxiv.org/pdf/1710.11041.pdf
Creating pairs this way is called backtranslation
Improving Neural Machine Translation Models with Monolingual Data,
http://www.aclweb.org/anthology/P16-1009
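The backtranslation recipe above is mostly data plumbing, which a few lines make concrete. The "translator" below is a trivial placeholder (it just reverses word order); the point is the flow from monolingual sentences to synthetic training pairs, not the model.

```python
def backtranslate(monolingual_b, translate_b_to_a):
    """Turn unpaired sentences in language B into synthetic (a, b) pairs."""
    return [(translate_b_to_a(b), b) for b in monolingual_b]

# Toy stand-in for "encode b, then decode with decoder A".
fake_b_to_a = lambda sent: list(reversed(sent))

pairs = backtranslate([["we", "are", "tacos"]], fake_b_to_a)
# The synthetic pairs can now train the A -> B direction as if real.
```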
47.
Bridging meaning spaces with round-trip cost [0]
1. Start with sentence 𝑎 ∈ 𝐴
2. Backprop with autoencoder cost 𝑎 → 𝑎
3. Backprop with round trip cost 𝑎 → 𝑏 → 𝑎
4. Repeat with a sentence 𝑏 ∈ 𝐵
[0] Improved Neural Text Attribute Transfer with Non-
parallel Data https://arxiv.org/pdf/1711.09395.pdf
• For example, for a Yelp review, they automatically converted the positive sentiment phrase
“love the southwestern burger” to the negative sentiment phrase “avoid the grease burger.”
• Notice that the word “burger” is in both, as it should be. Maintaining semantic meaning can
sometimes be a challenge with these systems.
• In their work [0], they add an additional constraint that nouns should be maintained.
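The two costs on this slide can be sketched with a toy mismatch cost; a real system scores reconstructions with token log-likelihoods, and the stand-in functions below are hypothetical.

```python
def token_mismatch_cost(x, y):
    # Toy cost: count differing tokens plus any length difference.
    return sum(1 for a, b in zip(x, y) if a != b) + abs(len(x) - len(y))

def total_cost(a, reconstruction, round_trip):
    # autoencoder cost (a -> a) plus round-trip cost (a -> b -> a)
    return (token_mismatch_cost(a, reconstruction(a))
            + token_mismatch_cost(a, round_trip(a)))

identity = lambda s: s
# Cost is 0 when both the reconstruction and the round trip are perfect.
cost = total_cost(["love", "the", "burger"], identity, identity)
```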
48.
Making the most of data with dual learning [0]
[Diagram: Text A → Encoder A → meaning space → Decoder B → Text B (the model from A to B), with a language model for B; symmetrically, Text B → Encoder B → meaning space → Decoder A → Text A (the model from B to A), with a language model for A]
Policy gradient methods to maximize
1. communication reward, which is the fidelity of a
round-trip translation of A back to A (and B back to B)
2. language model reward, which for A → B is how
likely B is in the monolingual language model for B
(and likewise for B → A).
[0] Achieving Human Parity on Automatic Chinese to English
News Translation https://arxiv.org/pdf/1803.05567.pdf
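The two rewards above can be combined as a simple weighted sum; the weighting α and the log-probability values below are illustrative assumptions, since in the real system they come from the reverse translator and a monolingual language model.

```python
def dual_reward(round_trip_logprob, lm_logprob, alpha=0.5):
    # communication reward: how well A -> B -> A recovers the original
    # language-model reward: how fluent the intermediate B text is
    return alpha * round_trip_logprob + (1 - alpha) * lm_logprob

r = dual_reward(round_trip_logprob=-1.0, lm_logprob=-2.0)
```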
49.
Outline
• Introduction
• Learning from pairs of sentences
• Mapping to and from meaning space
• Moving beyond pairs of sentences
• Bridging meanings through shared space
• Adversarial text generation
• Learning through reinforcement learning
• Conclusion
50.
Examples on movie reviews:
(“negative”, “past”)
the acting was also kind of hit or miss .
(“positive”, “past”)
his acting was impeccable
(“negative”, “present”)
the movie is very close to the show in plot and characters
(“positive”, “present”)
this is one of the better dance films
Using a discriminator [0]
[Diagram: specify a desired feature; text → encoder → N(0, 1) latent code → decoder → generated text with the feature, checked by a learned discriminator on that feature]
[0] Hu et al. allow one to manipulate single
values of the meaning vector to change what
is written https://arxiv.org/abs/1703.00955
51.
Generative Adversarial networks (GANs)
[Diagram: generator → synthetic data → discriminator ← real data; the discriminator sends guidance back to improve the generator]
• Generative Adversarial Networks
(GANs) are built on deep learning.
• A GAN consists of a generator
neural network and a
discriminator neural network
training together in an adversarial
relationship.
Goodfellow et al., Generative Adversarial
Networks, https://arxiv.org/abs/1406.2661
52.
GAN for text generation
[Diagram: text → encoder → meaning space → decoder → generated text, with a learned discriminator on the text]
Problem: Generating text in this way is not differentiable.
Three solutions
1. Soft selection (take weighted average of words) (Hu does this.)
2. Run discriminator on continuous meaning space or
continuous decoder state. E.g., Professor Forcing [0]
3. Use reinforcement learning.
The discriminator could
try to differentiate
generated from real text.
[0] Professor Forcing: A New Algorithm for Training Recurrent Networks
https://arxiv.org/pdf/1610.09038.pdf
[1] Adversarial Learning for Neural Dialogue Generation
https://arxiv.org/pdf/1701.06547.pdf
[Note, encoding could be a line in dialog, like in [1]].
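Solution 1 (soft selection) is easy to sketch: instead of sampling a discrete word, take a softmax-weighted average of word embeddings, so gradients can flow from the discriminator back into the generator. The two-word vocabulary and its embeddings are toy values.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def soft_word(logits, embeddings):
    """Differentiable 'word': probability-weighted average of embeddings."""
    probs = softmax(logits)
    dim = len(embeddings[0])
    return [sum(p * emb[d] for p, emb in zip(probs, embeddings))
            for d in range(dim)]

embeddings = [[1.0, 0.0], [0.0, 1.0]]    # two-word toy vocabulary
vec = soft_word([2.0, 0.0], embeddings)  # leans toward word 0, stays smooth
```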
53.
Outline
• Introduction
• Sequence-to-sequence models: learning from pairs of
sentences
• Mapping meaning to and from meaning space
• Moving beyond pairs of sentences
• Bridging meanings through shared space
• Adversarial text generation
• Learning through reinforcement learning
• Conclusion
54.
Reinforcement learning is a gradual stamping in of behavior
• Reinforcement learning (RL) was studied in
animals by Thorndike [1898].
• Became the study of Behaviorism [Skinner, 1953]
(see Skinner box on the right).
• Later formalized in artificial intelligence; see the
leading book by Sutton and Barto, 1998.
(new draft available online
http://incompleteideas.net/book/bookdraft2018jan1.pdf)
By Andreas1 (Adapted from Image:Boite skinner.jpg) [GFDL
(http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0
(http://creativecommons.org/licenses/by-sa/3.0/)], via
Wikimedia Commons
A “Skinner box”
55.
Beginning with random exploration
In reinforcement learning, the agent begins by
randomly exploring until it reaches its goal.
56.
• When it reaches the goal, credit is propagated back to its previous
states.
• The agent learns the function Q^π(s, a), which gives the cumulative
expected discounted reward of being in state s, taking action
a, and acting according to policy π thereafter.
Reaching the goal
57.
Eventually, the agent learns the value of being in each state and
taking each action and can therefore always do the best thing in
each state.
Learning the behavior
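The "credit propagates back from the goal" idea on these slides can be made concrete with tabular Q-learning on a tiny corridor. The environment, rewards, and hyperparameters are toy choices, not from the talk.

```python
import random

def q_learning(n_states=4, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Random exploration on a corridor with the goal at the right end."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            a = rng.randrange(2)  # explore randomly
            s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
            r = 1.0 if s2 == n_states - 1 else 0.0
            # credit for reaching the goal propagates back through max(Q[s2])
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning()
# After training, "right" beats "left" in every non-terminal state.
```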
58.
RL is now like supervised learning over actions
With the rise of deep learning, we can now more effectively work in high-
dimensional and continuous domains.
When domains have large action spaces, the kind of reinforcement learning that
works best is policy gradient methods.
• D. Silver http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/pg.pdf
• A. Karpathy http://karpathy.github.io/2016/05/31/rl/
Policy gradient methods learn a policy directly on top of a neural network.
Basically softmax over actions on top of a neural network.
Reinforcement learning starts to look more like supervised learning, where the
labels are actions.
Instead of doing gradient descent on the error function as in supervised learning,
you are doing gradient ascent on the expected reward.
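A minimal sketch of the policy-gradient idea above: a softmax policy over actions, updated by gradient ascent on expected reward (the REINFORCE update). The two-action "bandit" and its rewards are toy stand-ins for choosing among words.

```python
import math
import random

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce(rewards, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(prefs)
        a = rng.choices(range(len(rewards)), weights=probs)[0]
        r = rewards[a]
        # grad of log pi(a) w.r.t. preferences: one-hot(a) - probs
        for i in range(len(prefs)):
            grad = (1.0 if i == a else 0.0) - probs[i]
            prefs[i] += lr * r * grad  # ascent on expected reward
    return softmax(prefs)

probs = reinforce([1.0, 0.2])  # action 0 pays more, so it gains probability
```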
59.
Searching forward to get this reward
You have to do a search forward to
get the reward.
[Figure: RNN decoder writing "La mujer comió", with hidden states h1, h2, h3]
Since we are not doing teacher forcing where
we know what the next word should be, we
do a forward search: If we were to write
“comió” and follow our current policy after
that, how good would the final reward be?
That “how good” is the estimate we use to
determine whether writing “comió” should be
the action this network takes in the future.
[see Yu et al., https://arxiv.org/pdf/1609.05473v3.pdf]
Called Monte Carlo search
This is similar to how game-playing AI is done, for example the recent success in Go
[see Silver et al., https://arxiv.org/pdf/1712.01815.pdf ]
In the language domain, this kind of training can be very slow: at each step there are
around 50,000 choices (the size of the vocabulary).
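The Monte Carlo search described above can be sketched directly: to score a candidate next word, finish the sentence several times with the current policy and average the final rewards. The continuation policy and reward function below are toy stubs invented for illustration.

```python
import random

def rollout_value(prefix, candidate, sample_continuation, reward,
                  n=16, seed=0):
    """Average reward over n rollouts that start with prefix + candidate."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        sentence = sample_continuation(prefix + [candidate], rng)
        total += reward(sentence)
    return total / n

# Toy pieces: continuations append one random word; reward likes "comió".
words = ["comió", "durmió"]
sample = lambda pfx, rng: pfx + [rng.choice(words)]
reward = lambda sent: 1.0 if "comió" in sent else 0.0

v_comio = rollout_value(["La", "mujer"], "comió", sample, reward)
v_durmio = rollout_value(["La", "mujer"], "durmió", sample, reward)
# The candidate with the higher estimate is reinforced as the next action.
```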
61.
Conclusion: How to Get Started?
GPT2 https://gpt2.apps.allenai.org/?text=Joel%20is
https://github.com/huggingface/pytorch-pretrained-BERT
If you have pairs, OpenNMT http://opennmt.net/
Texar (Built on TensorFlow) https://github.com/asyml/texar
62.
Conclusion: when should you use current neural
text generation?
If you can enumerate all of the patterns for the things you want the computer to
understand and say: use context-free grammars for understanding and templates for
language generation.
If you can’t enumerate what you want the computer to say and you have a lot of data,
neural text generation might be the way to go.
• Machine translation is an excellent example.
• Or, if you want your text converted to different styles.
• Machines will be illiterate until we can teach them common sense.
• But as algorithms get better at controlling and navigating the meaning space, neural
text generation has the potential to be transformative.
• Few things are more enjoyable than reading a story by your favorite author, and if
we could have all of our information given to us with a style tailored to our tastes,
the world could be enchanting.
Text can’t be long, and
it won’t be exact.