1. Neural networks are a type of machine learning model that can learn highly non-linear functions to map inputs to outputs. They consist of interconnected layers of nodes that mimic biological neurons.
2. Backpropagation is an algorithm that allows neural networks to be trained using gradient descent by efficiently computing the gradient of the loss function with respect to the network parameters. It works by propagating gradients from the output layer back through the network using the chain rule.
3. There are many design decisions that go into building a neural network architecture, such as the number of hidden layers and nodes, the choice of activation functions, the objective function, and the training algorithm, such as stochastic gradient descent. Common activation functions are the sigmoid, tanh, and rectified linear unit (ReLU). A minimal worked sketch of these pieces follows below.
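To make points 2 and 3 concrete, here is a minimal NumPy sketch (an illustration of the ideas, not code taken from any of the decks below) of a one-hidden-layer network with sigmoid activations, trained by gradient descent on a squared-error loss; the backward pass is simply the chain rule applied layer by layer.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # 100 examples, 3 input features
y = (X[:, 0] * X[:, 1] > 0).astype(float)    # a non-linear target (for illustration)

W1 = rng.normal(scale=0.1, size=(3, 8))      # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(8, 1))      # hidden -> output weights

lr = 0.5
for step in range(2000):
    # forward pass
    h = sigmoid(X @ W1)                      # hidden activations
    y_hat = sigmoid(h @ W2).ravel()          # predicted probability
    loss = np.mean((y_hat - y) ** 2)         # squared-error objective

    # backward pass (chain rule, propagated from the output back to the input)
    d_yhat = 2 * (y_hat - y) / len(y)        # dLoss/dy_hat
    d_z2 = d_yhat * y_hat * (1 - y_hat)      # through the output sigmoid
    dW2 = h.T @ d_z2[:, None]                # dLoss/dW2
    d_h = d_z2[:, None] @ W2.T               # dLoss/dh
    d_z1 = d_h * h * (1 - h)                 # through the hidden sigmoid
    dW1 = X.T @ d_z1                         # dLoss/dW1

    # gradient-descent update
    W1 -= lr * dW1
    W2 -= lr * dW2
```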
Deep learning uses neural networks with many hidden layers to learn representations of data with multiple levels of abstraction. It has been shown to outperform simpler models with fewer layers on complex tasks like image and speech recognition. Deep learning works by defining a set of candidate functions (neural networks) and using gradient descent to optimize the network parameters to minimize loss on training data. Deeper networks with more parameters generally perform better but require large datasets and computational resources to train effectively.
Machine learning techniques can be applied in formal verification in several ways:
1) To enhance current formal verification tools by automating tasks like debugging, specification mining, and theorem proving.
2) To enable the development of new formal verification tools by applying machine learning to problems like SAT solving, model checking, and property checking.
3) Specific applications include using machine learning for debugging and root cause identification, learning specifications from runtime traces, aiding theorem proving by selecting heuristics, and tuning SAT solver parameters and selection.
Deep Learning Made Easy with Deep Features (Turi, Inc.)
Deep learning models can learn hierarchical feature representations from raw input data. These learned features can then be used to build simple classifiers that achieve high accuracy, even when training data is limited. Transfer learning involves using features extracted from a model pre-trained on a large dataset to build classifiers for other related problems. This approach has been shown to outperform traditional feature engineering with hand-designed features. Deep features extracted from neural networks trained on large image or text datasets have proven to work well as general purpose features for other visual and language problems.
This document discusses deepnets, which are a type of supervised learning algorithm for classification and regression. Deepnets build upon logistic regression by adding hidden layers between the input and output layers. This allows deepnets to model more complex nonlinear relationships than logistic regression. While deepnets have powerful representational abilities, their success depends on finding the optimal network structure for a given problem. The document outlines how BigML uses metalearning and network search techniques to automate this process and make deepnets more accessible for users. Deepnets work best for problems where computational resources allow exploring many network structures to find the best performing one.
Deep learning for molecules, introduction to Chainer Chemistry (Kenta Oono)
1) The document introduces machine learning and deep learning techniques for predicting chemical properties, including rule-based approaches versus learning-based approaches using neural message passing algorithms.
2) It discusses several graph neural network models like NFP, GGNN, WeaveNet and SchNet that can be applied to molecular graphs to predict characteristics. These models update atom representations through message passing and graph convolution operations.
3) Chainer Chemistry is introduced as a deep learning framework that can be used with these graph neural network models for chemical property prediction tasks. Examples of tasks include drug discovery and molecular generation.
This document summarizes a research paper on convolutional restricted Boltzmann machines (CRBMs) for feature learning. The paper proposes using CRBMs to learn hierarchical local feature detectors in an unsupervised and generative manner. CRBMs extend regular restricted Boltzmann machines to incorporate spatial locality. The learned features are evaluated on handwritten digit and human detection tasks, achieving results comparable to state-of-the-art. The paper contributes an approach to generative feature learning using CRBMs that can capture spatial relationships in images.
Artificial Intelligence, Machine Learning and Deep Learning (Sujit Pal)
Slides for talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group at San Francisco. My part of the talk is covered from slides 19-34.
Machine Learning Crash Course by Sebastian Raschka (PawanJayarathna1)
Sebastian Raschka gave a machine learning workshop at Michigan State University on February 21, 2018. The workshop covered topics including linear regression, classification, feature preprocessing, dimensionality reduction, and model evaluation. Raschka discussed different machine learning algorithms like logistic regression and K-nearest neighbors. He demonstrated concepts and algorithms using Python and scikit-learn in Jupyter notebooks.
Machine Learning, Deep Learning and Data Analysis Introduction (Te-Yen Liu)
The document provides an introduction and overview of machine learning, deep learning, and data analysis. It discusses key concepts like supervised and unsupervised learning. It also summarizes the speaker's experience taking online courses and studying resources to learn machine learning techniques. Examples of commonly used machine learning algorithms and neural network architectures are briefly outlined.
This document discusses how machines can make decisions using machine learning approaches. It provides an overview of machine learning vocabulary and techniques including supervised learning methods like regression and classification. It also discusses unsupervised learning and examples of clustering emails. The document then demonstrates simple linear and logistic regression models to predict outputs for given inputs. It discusses evaluating models through error measurement and mentions some other machine learning techniques. Finally, it provides an overview of neural networks including feedforward networks and different types like convolutional and recurrent neural networks.
This document discusses how machines can make decisions using machine learning approaches. It provides an overview of machine learning vocabulary and techniques including supervised learning methods like regression and classification. It also discusses unsupervised learning and examples of clustering emails. The document then demonstrates simple linear and logistic regression models to predict outputs given inputs. It discusses evaluating models through error measurement and mentions several other machine learning techniques. Finally, it provides an overview of neural networks including feedforward networks and different types like convolutional and recurrent neural networks.
230208 MLOps Getting from Good to Great.pptx (Arthur240715)
1) MLOps is the process of maintaining machine learning models in production environments. It involves monitoring model performance over time and retraining models if needed due to data or concept drift.
2) The MLOps pipeline includes stages for data engineering, modelling, deployment, and monitoring. Key aspects are ensuring reproducibility, managing data processing pipelines, and defining deployment and monitoring strategies.
3) Successful MLOps requires automating model deployment, monitoring model and data metrics over time, and retraining models when performance degrades to keep models performing well as data evolves in production.
These slides discuss some milestone results in image classification using deep convolutional neural networks and talk about our results on obscenity detection in images using deep convolutional neural networks and transfer learning on ImageNet models.
Learn to Build an App to Find Similar Images using Deep Learning - Piotr Teterwak (PyData)
This document discusses using deep learning and deep features to build an app that finds similar images. It begins with an overview of deep learning and how neural networks can learn complex patterns in data. The document then discusses how pre-trained neural networks can be used as feature extractors for other domains through transfer learning. This reduces data and tuning requirements compared to training new deep learning models. The rest of the document focuses on building an image similarity service using these techniques, including training a model with GraphLab Create and deploying it as a web service with Dato Predictive Services.
This document summarizes a study of deep learning models and Bayesian statistics. It discusses the history of artificial intelligence and machine learning before introducing restricted Boltzmann machines, deep belief networks, and Bayesian statistics. It describes experiments applying restricted Boltzmann machines to classify movies and generate images, and using a deep belief network to classify images from multiple datasets with 100% accuracy. The conclusion states that deep learning has advanced artificial intelligence by allowing algorithms to perform multiple tasks and taken us closer to the original goal of general artificial intelligence.
- How to tackle an object detection competition
- Schwert's 6th-place solution on Open Images Challenge 2019
- presented at the lunch workshop of the 26th Symposium on Sensing via Image Information (2020).
AILABS Lecture Series - Is AI The New Electricity. Topic - Deep Learning - Ev... (AILABS Academy)
Artificial Intelligence is transforming the world. Deep Learning, an integral part of this new Artificial Intelligence paradigm, is becoming one of the most sought after skills. Learn more about Deep Learning and its Evolution.
Network Metrics and Measurements in the Era of the Digital Economies (Pavel Loskot)
Rapidly evolving socio-technical systems require radically new approaches to measure and monitor their performance. System complexity and the desire for autonomy require moving from simple metrics to whole metrics frameworks. The metrics and measurement of complex systems is emerging as a new discipline.
This document provides an overview of deep learning and its applications. It discusses how deep learning can be used for image classification and how neural networks learn hierarchical representations from data. The document highlights some of the challenges of deep learning, such as the large amounts of data and computation required. It also covers how deep learning models can be deployed in production using services like Amazon Web Services to ensure low latency, high availability, and continuous learning.
Morichetta, A., Casas, P., & Mellia, M. (2019). EXPLAIN-IT: Towards explainable AI for unsupervised network traffic analysis. In Proceedings of the 3rd ACM CoNEXT Workshop on Big DAta, Machine Learning and Artificial Intelligence for Data Communication Networks (pp. 22–28).
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2021/02/introducing-machine-learning-and-how-to-teach-machines-to-see-a-presentation-from-tryolabs/
Facundo Parodi, Research and Machine Learning Engineer at Tryolabs, presents the “Introduction to Machine Learning and How to Teach Machines to See” tutorial at the September 2020 Embedded Vision Summit.
What is machine learning? How can machines distinguish a cat from a dog in an image? What’s the magic behind convolutional neural networks? These are some of the questions Parodi answers in this introductory talk on machine learning in computer vision.
Parodi introduces machine learning and explores the different types of problems it can solve. He explains the main components of practical machine learning, from data gathering and training to deployment. He then focuses on deep learning as an important machine learning technique and provides an introduction to convolutional neural networks and how they can be used to solve image classification problems. Parodi also touches on recent advancements in deep learning and how they have revolutionized the entire field of computer vision.
- The document provides an introduction to machine learning concepts and practical examples using neural networks. It discusses different machine learning categories and algorithms. It then demonstrates how to build and train simple feedforward neural networks to classify points and recognize handwritten digits. Code examples are provided using Python libraries like Keras. References are included for further reading.
This document discusses modern recommender systems used on large content websites. It begins by providing examples of recommender systems used on PIXNET and describes how basic and advanced recommender systems are built. It then explains the two-phase recommender system approach involving an item retrieving phase to narrow down millions of items to hundreds for each user, followed by an item ranking phase. Several item retrieving and ranking methods are discussed and compared. The document concludes by discussing recommender systems on data management platforms and providing an example of calculating pairwise distances between item vectors at scale.
https://telecombcn-dl.github.io/2017-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair (Claire Le Goues)
In this talk we present lessons learned, good ideas, and thoughts on the future, with an eye toward informing junior researchers about the realities and opportunities of a long-running project. We highlight some notions from the original paper that stood the test of time, some that were not as prescient, and some that became more relevant as industrial practice advanced. We place the work in context, highlighting perceptions from software engineering and evolutionary computing, then and now, of how program repair could possibly work. We discuss the importance of measurable benchmarks and reproducible research in bringing scientists together and advancing the area. We give our thoughts on the role of quality requirements and properties in program repair. From testing to metrics to scalability to human factors to technology transfer, software repair touches many aspects of software engineering, and we hope a behind-the-scenes exploration of some of our struggles and successes may benefit researchers pursuing new projects.
Build applications with generative AI on Google Cloud (Márton Kodok)
We will explore Vertex AI Model Garden powered experiences and learn more about the integration of these generative AI APIs. We will see in action how developers can use the Gemini family of generative models to build and deploy AI-driven applications. Vertex AI includes a suite of foundation models, referred to as the PaLM and Gemini families of generative AI models, which come in different versions. We will cover how to use the API to: execute prompts in text and chat; cover multimodal use cases with image prompts; fine-tune and distill models to improve knowledge domains; and run function calls with foundation models to optimize them for specific tasks. At the end of the session, developers will understand how to innovate with generative AI and develop apps that follow current generative AI industry trends.
5. Using gradient ascent for linear classifiers
Key idea behind today's lecture:
1. Define a linear classifier (logistic regression)
2. Define an objective function (likelihood)
3. Optimize it with gradient descent to learn the parameters
4. Predict the class with highest probability under the model
6. Using gradient ascent for linear classifiers
This decision function isn't differentiable: sign(x).
Use a differentiable function instead: logistic(u) = 1 / (1 + e^(-u))
7. Using gradient ascent for linear classifiers
This decision function isn't differentiable: sign(x).
Use a differentiable function instead: logistic(u) = 1 / (1 + e^(-u))
8. Logistic Regression
Data: Inputs are continuous vectors of length K. Outputs are discrete.
Model: Logistic function applied to the dot product of the parameters with the input vector.
Learning: Finds the parameters that minimize some objective function.
Prediction: Output the most probable class.
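As a rough illustration of this recipe (a sketch of my own, not code from the slides), the following NumPy snippet trains a logistic regression classifier by gradient ascent on the log-likelihood and then predicts the most probable class:

```python
import numpy as np

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))

def fit_logistic_regression(X, y, lr=0.1, steps=1000):
    """Gradient ascent on the log-likelihood of a logistic regression model.

    X: (N, K) continuous input vectors, y: (N,) labels in {0, 1}.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = logistic(X @ theta)          # model probability of class 1
        grad = X.T @ (y - p)             # gradient of the log-likelihood
        theta += lr * grad / len(y)      # ascend: step *with* the gradient
    return theta

def predict(theta, X):
    # Prediction: output the most probable class under the model.
    return (logistic(X @ theta) >= 0.5).astype(int)
```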
13. Why is everyone talking about Deep Learning? (Motivation)
• Because a lot of money is invested in it...
– DeepMind: acquired by Google for $400 million
– DNNResearch: a three-person startup (including Geoff Hinton) acquired by Google for an unknown price tag
– Enlitic, Ersatz, MetaMind, Nervana, Skylab: deep learning startups commanding millions of VC dollars
• Because it made the front page of the New York Times
14. Why is everyone talking about Deep Learning? (Motivation)
Deep learning:
– has won numerous pattern recognition competitions
– does so with minimal feature engineering
This wasn't always the case! (Timeline: 1960s, 1980s, 1990s, 2006, 2016.)
Since the 1980s the form of the models hasn't changed much, but there are lots of new tricks:
– more hidden units
– better (online) optimization
– new nonlinear functions (ReLUs)
– faster computers (CPUs and GPUs)
15. A Recipe for Machine Learning (Background)
1. Given training data (examples: images labeled "Face", "Face", "Not a face")
2. Choose each of these:
– Decision function (examples: linear regression, logistic regression, neural network)
– Loss function (examples: mean squared error, cross-entropy)
16. A Recipe for Machine Learning (Background)
1. Given training data
2. Choose each of these:
– Decision function
– Loss function
3. Define goal: find the parameters that minimize the loss on the training data
4. Train with SGD (take small steps opposite the gradient)
17. A Recipe for Machine Learning (Background)
1. Given training data
2. Choose each of these:
– Decision function
– Loss function
3. Define goal
4. Train with SGD (take small steps opposite the gradient)
Gradients: Backpropagation can compute this gradient! And it's a special case of a more general algorithm called reverse-mode automatic differentiation that can compute the gradient of any differentiable function efficiently!
18. A Recipe for Machine Learning (Background)
1. Given training data
2. Choose each of these:
– Decision function
– Loss function
3. Define goal
4. Train with SGD (take small steps opposite the gradient)
Goals for today's lecture:
1. Explore a new class of decision functions (neural networks)
2. Consider variants of this recipe for training
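A minimal sketch of step 4 of this recipe, assuming the loss gradient for the chosen decision function is supplied by the caller (the function names here are illustrative, not from the slides):

```python
import numpy as np

def sgd(theta0, data, loss_grad, lr=0.01, epochs=10, seed=0):
    """Stochastic gradient descent: repeatedly take a small step opposite
    the gradient of the loss on a single randomly chosen training example."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            x_i, y_i = data[i]
            theta -= lr * loss_grad(theta, x_i, y_i)   # step 4 of the recipe
    return theta

# Example loss gradient: squared error for a linear decision function f(x) = theta . x
def squared_error_grad(theta, x, y):
    return 2 * (theta @ x - y) * x
```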
43. Deeper Networks (Decision Functions)
Next lecture: making the neural networks deeper.
(Diagram: Input → Hidden Layer 1 → Hidden Layer 2 → Hidden Layer 3 → Output.)
44. Different Levels of Abstraction (Decision Functions)
• We don't know the "right" levels of abstraction
• So let the model figure it out!
(Example from Honglak Lee, NIPS 2010.)
45. Different Levels of Abstraction (Decision Functions)
Face recognition: a deep network can build up increasingly higher levels of abstraction: lines, parts, regions.
(Example from Honglak Lee, NIPS 2010.)
48. Neural Network Architectures
Even for a basic neural network, there are many design decisions to make:
1. Number of hidden layers (depth)
2. Number of units per hidden layer (width)
3. Type of activation function (nonlinearity)
4. Form of objective function
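For illustration, the following sketch (assuming TensorFlow's Keras API, which is not part of these slides) makes the four design decisions explicit in code: depth, width, activation, and objective.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(20,)),               # K = 20 input features (arbitrary choice)
    layers.Dense(64, activation="relu"),     # hidden layer 1: width 64, ReLU nonlinearity
    layers.Dense(64, activation="relu"),     # hidden layer 2: depth = 2 here
    layers.Dense(10, activation="softmax"),  # output layer for 10 classes
])

model.compile(
    optimizer="sgd",                         # train with stochastic gradient descent
    loss="categorical_crossentropy",         # form of the objective function
    metrics=["accuracy"],
)
```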
51. Activation Functions
So far, we've assumed that the activation function (nonlinearity) is always the sigmoid function.
Sigmoid / logistic function: logistic(u) = 1 / (1 + e^(-u))
52. Activation Functions
• A new change: modifying the nonlinearity
– The logistic is not widely used in modern ANNs
Alternate 1: tanh. Like the logistic function, but shifted to the range [-1, +1].
(Slide from William Cohen)
54. Activation Functions
• A new change: modifying the nonlinearity
– ReLU is often used in vision tasks
Alternate 2: rectified linear unit (ReLU). Linear with a cutoff at zero. (Implementation: clip the gradient when you pass zero.)
(Slide from William Cohen)
55. Activation Functions
• A new change: modifying the nonlinearity
– ReLU is often used in vision tasks
Alternate 2: rectified linear unit (ReLU). Soft version: log(exp(x) + 1). Doesn't saturate (at one end), sparsifies outputs, and helps with the vanishing gradient problem.
(Slide from William Cohen)
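A minimal NumPy sketch of the activation functions discussed in slides 51-55 (an illustration, not code from the deck):

```python
import numpy as np

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))      # sigmoid: saturates at 0 and 1

def tanh(u):
    return np.tanh(u)                     # like the logistic, but in the range [-1, +1]

def relu(u):
    return np.maximum(0.0, u)             # linear with a cutoff at zero

def softplus(u):
    return np.logaddexp(0.0, u)           # soft version of ReLU: log(exp(u) + 1)
```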
56. Objective Functions for NNs
• Regression:
– Use the same objective as linear regression
– Quadratic loss (i.e. mean squared error)
• Classification:
– Use the same objective as logistic regression
– Cross-entropy (i.e. negative log-likelihood)
– This requires probabilities, so we add an additional "softmax" layer at the end of our network
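For illustration, the two objectives, together with the softmax layer that cross-entropy requires, can be sketched in NumPy as follows (not code from the deck):

```python
import numpy as np

def mean_squared_error(y_hat, y):
    # Regression objective: quadratic loss.
    return np.mean((y_hat - y) ** 2)

def softmax(scores):
    # Convert raw network outputs into probabilities (numerically stabilised).
    z = scores - scores.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, y):
    # Classification objective: negative log-likelihood of the true class.
    # probs: (N, C) softmax outputs, y: (N,) integer class labels.
    return -np.mean(np.log(probs[np.arange(len(y)), y]))
```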
60. A Recipe for Machine Learning (Background)
1. Given training data
2. Choose each of these:
– Decision function
– Loss function
3. Define goal
4. Train with SGD (take small steps opposite the gradient)
61. Objective Functions (Matching Quiz)
Suppose you are given a neural net with a single output, y, and one hidden layer.
1) Minimizing sum of squared errors...
2) Minimizing sum of squared errors plus squared Euclidean norm of weights...
3) Minimizing cross-entropy...
4) Minimizing hinge loss...
...gives...
5) ...MLE estimates of weights assuming the target follows a Bernoulli with parameter given by the output value
6) ...MAP estimates of weights assuming the weight priors are zero-mean Gaussian
7) ...estimates with a large margin on the training data
8) ...MLE estimates of weights assuming zero-mean Gaussian noise on the output value
Answer choices:
A. 1=5, 2=7, 3=6, 4=8
B. 1=5, 2=7, 3=8, 4=6
C. 1=7, 2=5, 3=5, 4=7
E. 1=8, 2=6, 3=5, 4=7
F. 1=8, 2=6, 3=8, 4=6
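One standard way to see two of these correspondences (a sketch, not part of the slide): with zero-mean Gaussian noise on the output, maximizing the likelihood is equivalent to minimizing the sum of squared errors, and a zero-mean Gaussian prior on the weights turns that MLE into a MAP estimate with a squared-norm penalty.

```latex
% MLE with zero-mean Gaussian output noise  <=>  minimizing sum of squared errors
\hat{\theta}_{\text{MLE}}
  = \arg\max_{\theta} \sum_i \log \mathcal{N}\!\left(y_i \mid f_\theta(x_i), \sigma^2\right)
  = \arg\min_{\theta} \sum_i \left(y_i - f_\theta(x_i)\right)^2

% MAP with a zero-mean Gaussian prior on the weights adds an L2 penalty
\hat{\theta}_{\text{MAP}}
  = \arg\min_{\theta} \sum_i \left(y_i - f_\theta(x_i)\right)^2 + \lambda \,\lVert\theta\rVert_2^2
```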
63. A Recipe for Machine Learning (Background)
1. Given training data
2. Choose each of these:
– Decision function
– Loss function
3. Define goal
4. Train with SGD (take small steps opposite the gradient)
64. Backpropagation (Training)
• Question 1: When can we compute the gradients of the parameters of an arbitrary neural network?
• Question 2: When can we make the gradient computation efficient?
67. Chain Rule (Training)
Chain Rule: Given: ...
Backpropagation:
1. Instantiate the computation as a directed acyclic graph, where each intermediate quantity is a node.
2. At each node, store (a) the quantity computed in the forward pass and (b) the partial derivative of the goal with respect to that node's intermediate quantity.
3. Initialize all partial derivatives to 0.
4. Visit each node in reverse topological order. At each node, add its contribution to the partial derivatives of its parents.
This algorithm is also called automatic differentiation in the reverse mode.
74. Chain Rule (Training)
Chain Rule: Given: ...
Backpropagation:
1. Instantiate the computation as a directed acyclic graph, where each intermediate quantity is a node.
2. At each node, store (a) the quantity computed in the forward pass and (b) the partial derivative of the goal with respect to that node's intermediate quantity.
3. Initialize all partial derivatives to 0.
4. Visit each node in reverse topological order. At each node, add its contribution to the partial derivatives of its parents.
This algorithm is also called automatic differentiation in the reverse mode.
75. Chain Rule (Training)
Chain Rule: Given: ...
Backpropagation:
1. Instantiate the computation as a directed acyclic graph, where each node represents a Tensor.
2. At each node, store (a) the quantity computed in the forward pass and (b) the partial derivatives of the goal with respect to that node's Tensor.
3. Initialize all partial derivatives to 0.
4. Visit each node in reverse topological order. At each node, add its contribution to the partial derivatives of its parents.
This algorithm is also called automatic differentiation in the reverse mode.
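The four steps above can be sketched directly as scalar reverse-mode automatic differentiation (an illustration of the algorithm, not code from the deck; the Node class and helper names are my own):

```python
import math

class Node:
    def __init__(self, value, parents=(), local_grads=()):
        self.value = value                # (a) quantity computed in the forward pass
        self.parents = parents            # nodes this node was computed from
        self.local_grads = local_grads    # d(self)/d(parent) for each parent
        self.grad = 0.0                   # (b) partial of the goal w.r.t. this node, initialized to 0

def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))

def mul(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))

def sigmoid(a):
    s = 1.0 / (1.0 + math.exp(-a.value))
    return Node(s, (a,), (s * (1.0 - s),))

def backprop(goal):
    # Build a reverse topological order of the DAG rooted at `goal`.
    order, seen = [], set()
    def visit(n):
        if id(n) not in seen:
            seen.add(id(n))
            for p in n.parents:
                visit(p)
            order.append(n)
    visit(goal)

    goal.grad = 1.0
    for node in reversed(order):
        for parent, local in zip(node.parents, node.local_grads):
            parent.grad += local * node.grad   # add contribution to the parents

# Example: y = sigmoid(w * x + b); all partials come from one backward pass.
x, w, b = Node(2.0), Node(-1.5), Node(0.5)
y = sigmoid(add(mul(w, x), b))
backprop(y)
print(y.value, w.grad, x.grad, b.grad)
```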
77. A Recipe for Machine Learning (Background)
1. Given training data
2. Choose each of these:
– Decision function
– Loss function
3. Define goal
4. Train with SGD (take small steps opposite the gradient)
Gradients: Backpropagation can compute this gradient! And it's a special case of a more general algorithm called reverse-mode automatic differentiation that can compute the gradient of any differentiable function efficiently!
78. Summary
1. Neural networks...
– provide a way of learning features
– are highly nonlinear prediction functions
– (can be) a highly parallel network of logistic regression classifiers
– discover useful hidden representations of the input
2. Backpropagation...
– provides an efficient way to compute gradients
– is a special case of reverse-mode automatic differentiation