This document provides an overview of a course on machine learning. It defines machine learning and artificial intelligence, discusses applications of machine learning such as speech recognition, robotics, and computer vision, and outlines the topics that will be covered in the course, including classifiers, regression, neural networks, and learning theory. The course aims to give students the tools and foundations of machine learning, including optimization, statistics, and computer science, to solve problems in areas like natural language processing, computer vision, robotics, and medicine.
Mozfest 2018 session slides: Let's fool modern A.I. systems with stickers (anant90)
The goal of the session is to demystify machine learning for participants and show them a real machine learning system in action. A secondary goal is to show that machine learning is itself just another tool, one susceptible to adversarial attacks. Such attacks can have huge implications, especially in a world with self-driving cars and other automation. The session aims to be highly collaborative and audience-driven, and can be adjusted to suit the participants' familiarity with machine learning and coding.
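The core idea behind such attacks can be sketched in a few lines. The example below is illustrative only (the session's sticker attack targets real image classifiers in the physical world, which this does not reproduce): it uses a toy logistic model with made-up weights and an FGSM-style perturbation, nudging the input in the direction that increases the model's loss until the predicted label flips.

```python
import math

# Toy logistic "classifier" with made-up weights (NOT a real vision model).
w, b = [2.0, -1.0], 0.0

def predict(x):
    """Probability that x belongs to class 1."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

x = [1.0, 0.5]          # clean input: confidently class 1
eps = 0.6               # perturbation budget (the "sticker" strength)

# FGSM-style step: for the logistic loss of true class 1, the loss gradient
# w.r.t. x points along -w, so move each coordinate by eps in that direction.
x_adv = [xi + eps * (-1 if wi > 0 else 1) for xi, wi in zip(x, w)]

print(predict(x) > 0.5, predict(x_adv) > 0.5)  # True False: the label flips
```

A small, targeted change to the input is enough to flip the decision, which is exactly what a physical sticker does to an image classifier.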
The importance of model fairness and interpretability in AI systems (Francesca Lazzeri, PhD)
Machine learning model fairness and interpretability are critical for data scientists, researchers, and developers to explain their models and understand the value and accuracy of their findings. Interpretability is also important for debugging machine learning models and making informed decisions about how to improve them.
In this session, Francesca will go over a few methods and tools that enable you to "unpack" machine learning models, gain insights into how and why they produce specific results, assess your AI systems' fairness, and mitigate any observed fairness issues.
Using open-source fairness and interpretability packages, attendees will learn how to:
- Explain model predictions by generating feature importance values for the entire model and/or individual data points.
- Achieve model interpretability on real-world datasets at scale, during training and inference.
- Use an interactive visualization dashboard to discover patterns in data and explanations at training time.
- Leverage additional interactive visualizations to assess which groups of users might be negatively impacted by a model and compare multiple models in terms of their fairness and performance.
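As an illustration of the first bullet, here is a minimal from-scratch sketch of permutation feature importance, one common way to generate such values (the toy model and data are invented for the example): a feature's importance is measured as the accuracy drop after shuffling that feature's column.

```python
import random

# Toy data: feature 0 determines the label, feature 1 is pure noise.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(500)]
y = [1 if row[0] > 0.5 else 0 for row in X]

def model(row):
    """Stand-in for a trained classifier: thresholds feature 0."""
    return 1 if row[0] > 0.5 else 0

def accuracy(X, y):
    return sum(model(r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(X, y, feature):
    """Importance of a feature = accuracy drop after shuffling its column."""
    base = accuracy(X, y)
    col = [row[feature] for row in X]
    random.shuffle(col)
    X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, col)]
    return base - accuracy(X_perm, y)

imp0 = permutation_importance(X, y, 0)   # informative feature: large drop
imp1 = permutation_importance(X, y, 1)   # noise feature: no drop at all
print(imp0 > imp1)  # True
```

Libraries such as those the session covers compute the same quantity at scale; the point here is only that the idea fits in a few lines.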
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018 (Sri Ambati)
This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg
The good news is that building fair, accountable, and transparent machine learning systems is possible. The bad news is that it's harder than many blogs and software package docs would have you believe. The truth is that nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!
This talk aims to make your interpretable machine learning project a success by describing the fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining viable techniques for debugging, explaining, and testing machine learning models.
Mateusz is a software developer who loves all things distributed and machine learning, and hates buzzwords. His favourite hobby is data juggling.
He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at ECE Paris in France and worked on distributed flight booking systems. After graduation he moved to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, and he is still based there.
Artificial Intelligence
What is AI?
History
Foundations of AI
Types of AI
Applications of AI
Machine learning and applications
AI vs. machine learning
Deep learning: advantages and disadvantages
Applications of deep learning
Why is deep learning better than machine learning?
Deep learning vs. machine learning
Artificial Neural Network (ANN)
Architecture of ANN
Types of ANN
Applications of ANN
ANN software and its applications
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to progressively improve their performance on a specific task. Machine learning algorithms build a mathematical model of sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in applications such as email filtering, detection of network intruders, and computer vision, where it is infeasible to develop an algorithm of specific instructions for performing the task. Machine learning is closely related to computational statistics, which focuses on making predictions using computers, and the study of mathematical optimization delivers methods, theory, and application domains to the field. Data mining is a field of study within machine learning that focuses on exploratory data analysis through unsupervised learning. Applied to business problems, machine learning is the study of computer systems that learn from data and experience. It is used in an incredibly wide variety of application areas, from medicine to advertising and from military to consumer settings. Any area in which you need to make sense of data is a potential customer of machine learning.
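The phrase "without being explicitly programmed" can be made concrete with a tiny example (illustrative; the data and labels are made up): a 1-nearest-neighbour classifier whose behaviour is induced entirely from training data rather than written as hand-coded rules.

```python
# Made-up "training data": (feature value, label) pairs. The classifier's
# behaviour comes from these examples, not from explicit if/else rules.
train = [(1.0, "spam"), (1.2, "spam"), (3.9, "ham"), (4.1, "ham")]

def predict(x, train):
    """1-nearest-neighbour: return the label of the closest training example."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

print(predict(1.1, train))  # spam
print(predict(4.0, train))  # ham
```

Changing the training examples changes the predictions, with no change to the code: that is the sense in which the behaviour is learned rather than programmed.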
Machine Learning and AI: Core Methods and Applications (QuantUniversity)
This session was presented at the CFA Institute on May 6, 2020.
This deep-dive session discusses core methods and applications to provide an understanding of supervised and unsupervised machine learning. Participants are introduced to advanced topics including time series analysis, reinforcement learning, anomaly detection, and natural language processing. Case studies also examine how to predict interest rates and credit risk with alternative data sets, and how to analyze earnings calls from EDGAR using natural language processing techniques.
Hot Topics in Machine Learning for Research and Thesis (WriteMyThesis)
Machine Learning is a hot topic for research, and there are various good thesis topics in it. WriteMyThesis provides thesis support in Machine Learning along with proper guidance in this field. Find the list of thesis topics in this document.
http://www.writemythesis.org/master-thesis-topics-in-machine-learning/
The slides cover the following points:
1. Introduction to Machine Learning
2. What are the challenges in acceptance of Machine Learning in Banks
3. How to overcome the challenges in adoption of Machine Learning in Banks
4. How to find new use cases of Machine Learning
5. A few current interesting use cases of Machine Learning
Please contact me (shekup@gmail.com) or connect with me on LinkedIn (https://www.linkedin.com/in/shekup/) for more explanation on ML and how it may help your business.
The slides are inspired by:
A survey and interviews I conducted with bankers and technology professionals
A presentation from Google NEXT 2017
A presentation by DATUM on YouTube
The Royal Society's work on machine learning
The Big Data & Social Analytics course from MIT & GetSmarter
Machine Learning for Dummies (without mathematics) (ActiveEon)
An introduction to the basic concepts of machine learning, without mathematics: a short presentation for beginners, by Andrews Cordolino Sobral, Ph.D., Computer Vision and Machine Learning Researcher, Activeeon.
These slides give a brief overview of supervised, unsupervised, and reinforcement learning. Algorithms discussed include Naive Bayes, k-nearest neighbours, SVMs, decision trees, and Markov models.
They also cover the difference between regression and classification, the difference between supervised and reinforcement learning, the iterative operation of Markov models, and machine learning applications.
AI professionals use top machine learning algorithms to automate models that analyze larger and more complex data than was possible with older machine learning algorithms.
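To make one of the listed algorithms concrete, here is a from-scratch sketch of Naive Bayes with add-one (Laplace) smoothing on a made-up toy corpus (the messages and labels are invented for the example):

```python
import math
from collections import Counter, defaultdict

# Toy labeled corpus: classify short messages as spam or ham by word counts.
docs = [("free prize now", "spam"), ("meeting at noon", "ham"),
        ("free meeting", "ham"), ("win a free prize", "spam")]

class_counts = Counter(label for _, label in docs)
word_counts = defaultdict(Counter)
vocab = set()
for text, label in docs:
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def predict(text):
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        # log P(label) + sum of log P(word | label) with add-one smoothing
        score = math.log(class_counts[label] / len(docs))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("free prize"))      # spam
print(predict("meeting at noon")) # ham
```

Working in log-probabilities avoids numeric underflow, and the add-one smoothing keeps unseen words from zeroing out a class.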
Understanding Inductive Bias in Machine Learning (SUTEJAS)
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
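The effect described above can be demonstrated with a small invented example: two learners with different inductive biases on the same 1-D data, where the labels form an interval that a single-threshold rule cannot represent but a nearest-neighbour rule can.

```python
# Labels are 1 on the interval [4, 7]: not representable by one threshold.
X = [0, 1, 4, 5, 6, 7, 10, 11]
y = [0, 0, 1, 1, 1, 1, 0, 0]

def best_threshold_acc(X, y):
    """Best accuracy of any rule 'predict 1 iff x >= t' (or its mirror)."""
    best = 0.0
    for t in X + [max(X) + 1]:
        for positive_side in (True, False):
            preds = [(x >= t) == positive_side for x in X]
            acc = sum(int(p) == label for p, label in zip(preds, y)) / len(y)
            best = max(best, acc)
    return best

def knn_loo_acc(X, y):
    """Leave-one-out accuracy of 1-nearest-neighbour (a local, flexible bias)."""
    correct = 0
    for i, x in enumerate(X):
        rest = [(abs(x - xj), yj)
                for j, (xj, yj) in enumerate(zip(X, y)) if j != i]
        correct += min(rest)[1] == y[i]
    return correct / len(y)

print(best_threshold_acc(X, y))  # 0.75: the threshold bias caps performance
print(knn_loo_acc(X, y))         # 1.0: a bias that suits this data
```

Neither bias is better in general; the threshold learner would generalize better from few examples when the data really are linearly separable. That trade-off is the point of studying inductive bias.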
6th International Conference on Machine Learning & Applications (CMLA 2024) (ClaraZara1)
The 6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in the theory, methodology, and applications of Machine Learning.
ML.pdf
1. Course Overview
10-301/10-601 Introduction to Machine Learning
Matt Gormley
Lecture 1, Jan. 18, 2023
Machine Learning Department, School of Computer Science, Carnegie Mellon University
3. Artificial Intelligence
The basic goal of AI is to develop intelligent machines. This consists of many sub-goals:
• Perception
• Reasoning
• Control / Motion / Manipulation
• Planning
• Communication
• Creativity
• Learning
[Venn diagram: Machine Learning as a subfield of Artificial Intelligence]
[“Deep Style” from https://deepdreamgenerator.com/#gallery]
11. What is Machine Learning?
The goal of this course is to provide you with a toolbox:
[Diagram: a toolbox labeled Machine Learning, containing Optimization, Statistics, Probability, and Computer Science]
12. What is ML?
[Diagram: Machine Learning at the intersection of Optimization, Statistics, Probability, Calculus, Linear Algebra, Measure Theory, and Computer Science, applied to a Domain of Interest]
14. Speech Recognition
1. Learning to recognize spoken words
THEN: “…the SPHINX system (e.g. Lee 1989) learns speaker-specific strategies for recognizing the primitive sounds (phonemes) and words from the observed speech signal…neural network methods…hidden Markov models…” (Mitchell, 1997)
NOW: Figure from https://botpenguin.com/alexa-vs-siri-vs-google-assistant/
15. Robotics
2. Learning to drive an autonomous vehicle
THEN: “…the ALVINN system (Pomerleau 1989) has used its learned strategies to drive unassisted at 70 miles per hour for 90 miles on public highways among other cars…” (Mitchell, 1997)
NOW: waymo.com
16.-17. Robotics (continued)
NOW: aurora.tech (figure from https://www.bloomberg.com/news/articles/2019-02-07/aurora-self-driving-startup-gets-funding-from-sequoia-amazon); locomation.ai (figure from https://locomation.ai/)
18. Games / Reasoning
3. Learning to beat the masters at board games
THEN: “…the world’s top computer program for backgammon, TD-GAMMON (Tesauro, 1992, 1995), learned its strategy by playing over one million practice games against itself…” (Mitchell, 1997)
NOW
19. Computer Vision
4. Learning to recognize images
[Screenshot from LeCun et al., 1995, “LeRec: Hybrid for On-Line Handwriting Recognition”. Figure 2: Convolutional neural network character recognizer. This architecture is robust to local translations and distortions, with subsampling, shared weights, and local receptive fields.]
“…The recognizer is a
convolution network that
can be spatially replicated.
From the network output, a
hidden Markov model
produces word scores. The
entire system is globally
trained to minimize word-
level errors.…”
(LeCun et al., 1995)
THEN NOW
Figure from https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
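The excerpt above notes that once the number of subsampling layers and the kernel sizes are chosen, the sizes of all layers are determined unambiguously. A minimal sketch of that bookkeeping (the 28x28 input size and the particular layer sequence here are hypothetical, not from the paper):

```python
# Sketch: once kernel sizes and subsampling rates are fixed, every layer's
# size follows mechanically. A 'valid' k x k convolution shrinks each
# dimension by k-1; non-overlapping 2x2 subsampling halves it.

def conv_out(size, k):
    """Output size of a 'valid' convolution with a k x k kernel."""
    return size - k + 1

def subsample_out(size, rate=2):
    """Output size after non-overlapping subsampling (pooling)."""
    return size // rate

h = w = 28                                 # hypothetical input map size
h, w = conv_out(h, 3), conv_out(w, 3)      # 3x3 convolution -> 26x26
h, w = subsample_out(h), subsample_out(w)  # 2x2 subsampling -> 13x13
h, w = conv_out(h, 5), conv_out(w, 5)      # 5x5 convolution -> 9x9
print(h, w)
```

This is why the paper can say only the number of feature maps and their connectivity remain to be selected: the spatial sizes are forced by the kernel and subsampling choices.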
20. Learning Theory
• 5. In what cases and how well can we learn?
21
Sample Complexity Results
Realizable / Agnostic
Four Cases we care about…
1. How many examples do we need
to learn?
2. How do we quantify our ability to
generalize to unseen data?
3. Which algorithms are better
suited to specific learning
settings?
22. What is Machine Learning?
The goal of this
course is to provide
you with a toolbox:
23
Machine Learning
Optimization
Statistics
Probability
Computer
Science
To solve all the
problems above
and more
23. Societal Impacts of ML
What ethical responsibilities do we have as machine learning experts?
24
1) Search results for news are optimized
for ad revenue.
http://bing.com/
2) An autonomous vehicle is permitted
to drive unassisted on the road.
http://arstechnica.com/
Question: What are the possible societal impacts of machine learning for
each case below?
Answer:
3) A doctor is prompted by an intelligent
system with a plausible diagnosis for her
patient.
https://flic.kr/p/HNJUzV
24. Societal Impacts of ML
25
Figure from https://www.washingtonpost.com/dc-md-va/2022/12/28/beyer-student-artificial-intelligence-degree/
25. ML Big Picture
26
Learning Paradigms:
What data is available and
when? What form of prediction?
• supervised learning
• unsupervised learning
• semi-supervised learning
• reinforcement learning
• active learning
• imitation learning
• domain adaptation
• online learning
• density estimation
• recommender systems
• feature learning
• manifold learning
• dimensionality reduction
• ensemble learning
• distant supervision
• hyperparameter optimization
Problem Formulation:
What is the structure of our output prediction?
boolean Binary Classification
categorical Multiclass Classification
ordinal Ordinal Classification
real Regression
ordering Ranking
multiple discrete Structured Prediction
multiple continuous (e.g. dynamical systems)
both discrete & cont. (e.g. mixed graphical models)
Theoretical Foundations:
What principles guide learning?
• probabilistic
• information theoretic
• evolutionary search
• ML as optimization
Facets of Building ML
Systems:
How to build systems that are
robust, efficient, adaptive,
effective?
1. Data prep
2. Model selection
3. Training (optimization /
search)
4. Hyperparameter tuning on
validation data
5. (Blind) Assessment on test
data
Big Ideas in ML:
Which are the ideas driving
development of the field?
• inductive bias
• generalization / overfitting
• bias-variance decomposition
• generative vs. discriminative
• deep nets, graphical models
• PAC learning
• distant rewards
Application
Areas
Key
challenges?
NLP,
Speech,
Computer
Vision,
Robotics,
Medicine,
Search
26. Topics
• Foundations
– Probability
– MLE, MAP
– Optimization
• Classifiers
– KNN
– Naïve Bayes
– Logistic Regression
– Perceptron
– SVM
• Regression
– Linear Regression
• Important Concepts
– Kernels
– Regularization and Overfitting
– Experimental Design
• Unsupervised Learning
– K-means / Lloyd’s method
– PCA
– EM / GMMs
• Neural Networks
– Feedforward Neural Nets
– Basic architectures
– Backpropagation
– CNNs, LSTMs
• Graphical Models
– Bayesian Networks
– HMMs
– Learning and Inference
• Learning Theory
– Statistical Estimation (covered right
before midterm)
– PAC Learning
• Other Learning Paradigms
– Matrix Factorization
– Reinforcement Learning
– Information Theory
27
28. Well-Posed Learning Problems
Three components <T,P,E>:
1. Task, T
2. Performance measure, P
3. Experience, E
Definition of learning:
A computer program learns if its performance
at task T, as measured by P, improves with
experience E.
29
Definition from (Mitchell, 1997)
31. Capturing the Knowledge of Experts
32
Give me directions to Starbucks
If: “give me directions to X”
Then: directions(here, nearest(X))
How do I get to Starbucks?
If: “how do i get to X”
Then: directions(here, nearest(X))
Where is the nearest Starbucks?
If: “where is the nearest X”
Then: directions(here, nearest(X))
1990 2000
1980 2010
Solution #1: Expert Systems
• Over 20 years ago, we
had rule-based systems:
1. Put a bunch of linguists
in a room
2. Have them think about
the structure of their
native language and
write down the rules
they devise
32. Capturing the Knowledge of Experts
33
Give me directions to Starbucks
If: “give me directions to X”
Then: directions(here, nearest(X))
How do I get to Starbucks?
If: “how do i get to X”
Then: directions(here, nearest(X))
Where is the nearest Starbucks?
If: “where is the nearest X”
Then: directions(here, nearest(X))
I need directions to Starbucks
If: “I need directions to X”
Then: directions(here, nearest(X))
Is there a Starbucks nearby?
If: “Is there an X nearby”
Then: directions(here, nearest(X))
Starbucks directions
If: “X directions”
Then: directions(here, nearest(X))
1990 2000
1980 2010
Solution #1: Expert Systems
• Over 20 years ago, we
had rule-based systems:
1. Put a bunch of linguists
in a room
2. Have them think about
the structure of their
native language and
write down the rules
they devise
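The If/Then rules on this slide can be sketched as pattern matching in code (the regex patterns and the `interpret` helper are illustrative, not from the lecture). The sketch also shows the brittleness of Solution #1: any phrasing the rule writers didn't anticipate matches no rule.

```python
import re

# Hand-written If/Then rules, one regex per phrasing the linguists thought of.
RULES = [
    (re.compile(r"give me directions to (.+)", re.I), "directions(here, nearest({0}))"),
    (re.compile(r"how do i get to (.+)", re.I),       "directions(here, nearest({0}))"),
    (re.compile(r"where is the nearest (.+)", re.I),  "directions(here, nearest({0}))"),
]

def interpret(utterance):
    """Return the action for the first matching rule, else None."""
    for pattern, action in RULES:
        m = pattern.match(utterance)
        if m:
            return action.format(m.group(1).rstrip("?"))
    return None  # brittleness: unanticipated phrasings fall through

print(interpret("Give me directions to Starbucks"))
# A phrasing the rule writers didn't anticipate matches no rule:
print(interpret("Starbucks, please"))
```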
33. Capturing the Knowledge of Experts
34
Solution #2: Annotate Data and Learn
• Experts:
– Very good at answering questions about specific
cases
– Not very good at telling HOW they do it
• 1990s: So why not just have them tell you what
they do on SPECIFIC CASES and then let
MACHINE LEARNING tell you how to come to
the same decisions that they did
1990 2000
1980 2010
34. Capturing the Knowledge of Experts
35
Solution #2: Annotate Data and Learn
1. Collect raw sentences {x(1), …, x(n)}
2. Experts annotate their meaning {y(1), …, y(n)}
x(2): Show me the closest Starbucks
y(2): map(nearest(Starbucks))
x(3): Send a text to John that I’ll be late
y(3): txtmsg(John, I’ll be late)
x(1): How do I get to Starbucks?
y(1): directions(here,
nearest(Starbucks))
x(4): Set an alarm for seven in the morning
y(4): setalarm(7:00AM)
1990 2000
1980 2010
35. Example Learning Problems
Learning to respond to voice commands (Siri)
1. Task, T:
predicting action from speech
2. Performance measure, P:
percent of correct actions taken in user pilot
study
3. Experience, E:
examples of (speech, action) pairs
36
36. Problem Formulation
• Often, the same task can be formulated in more than one way:
• Ex: Loan applications
– creditworthiness/score (regression)
– probability of default (density estimation)
– loan decision (classification)
37
Problem Formulation:
What is the structure of our output prediction?
boolean Binary Classification
categorical Multiclass Classification
ordinal Ordinal Classification
real Regression
ordering Ranking
multiple discrete Structured Prediction
multiple continuous (e.g. dynamical systems)
both discrete & cont. (e.g. mixed graphical models)
37. Well-posed Learning Problems
In-Class Exercise
1. Select a task, T
2. Identify performance
measure, P
3. Identify experience, E
4. Report ideas back to
rest of class
38
Example Tasks
• Identify objects in an image
• Translate from one human language
to another
• Recognize speech
• Assess risk (e.g. in loan application)
• Make decisions (e.g. in loan
application)
• Assess potential (e.g. in admission
decisions)
• Categorize a complex situation (e.g.
medical diagnosis)
• Predict outcome (e.g. medical
prognosis, stock prices, inflation,
temperature)
• Predict events (default on loans,
quitting school, war)
• Plan ahead under perfect knowledge
(chess)
• Plan ahead under partial knowledge
(poker, bridge)
Examples from Roni Rosenfeld
39. Building a Trash Classifier
• Suppose CMU is asked to
build a robot for collecting trash along
Pittsburgh’s rivers
• You are tasked with building a classifier that
detects whether an object is a piece of trash
(+) or not a piece of trash (-)
• The robot can detect an object’s color,
sound, and weight
• You manually annotate the following
dataset based on objects you find
40
trash? color sound weight
+ green crinkly high
- brown crinkly low
- grey none high
+ clear none low
- green none low
40. WARNING!
Like many fields, Machine Learning
is riddled with copious amounts of
technical jargon!
For many terms we’ll define in this
class, you’ll find four or five
different terms in the literature
that refer to the same thing.
41
41. Supervised Binary Classification
42
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Labeled Dataset:
features
index color sound weight
1 brown none high
2 clear crinkly low
3 brown none low
Unlabeled Dataset:
label features
trash? color sound weight
- brown none high
One example:
• Def: an example contains a
label (aka. class) and features
(aka. point or attributes)
• Def: a labeled dataset consists
of rows, where each row is an
example
• Def: an unlabeled dataset only
has features
42. Supervised Binary Classification
• Def: an example contains a
label (aka. class) and features
(aka. point or attributes)
• Def: a labeled dataset consists
of rows, where each row is an
example
• Def: an unlabeled dataset only
has features
43
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Labeled Dataset:
features
index color sound weight
1 brown none high
2 clear crinkly low
3 brown none low
Unlabeled Dataset:
label features
trash? color sound weight
- brown none high
One example:
43. Supervised Binary Classification
• Def: an example contains a
label (aka. class) and features
(aka. point or attributes)
• Def: a labeled dataset consists
of rows, where each row is an
example
• Def: an unlabeled dataset only
has features
44
label features
index trash? color sound weight
1 + green crinkly high
2 - brown crinkly low
3 - grey none high
4 + clear none low
5 - green none low
Training Dataset:
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Test Dataset:
• Def: a training dataset is a
labeled dataset used to learn
a classifier
• Def: a classifier is a function
that takes in features and
predicts a label
• Def: a test dataset is a labeled
dataset used to evaluate a
classifier
Classifier
features → label
44. Supervised Binary Classification
• Def: predictions are the
output of a trained classifier
• Def: error rate is the
proportion of examples on
which we predicted the
wrong label
45
features
index color sound weight
1 brown none high
2 clear crinkly low
3 brown none low
(Unlabeled) Test Dataset:
• Def: a classifier is a function
that takes in features and
predicts a label
• Def: a training dataset is a
labeled dataset used to learn
a classifier
• Def: a test dataset is a labeled
dataset used to evaluate a
classifier
index trash?
1 +
2 +
3 -
Test Predictions:
predictions
Classifier
features → label
45. Supervised Binary Classification
• Def: predictions are the
output of a trained classifier
• Def: error rate is the
proportion of examples on
which we predicted the
wrong label
46
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
(Labeled) Test Dataset:
• Def: a classifier is a function
that takes in features and
predicts a label
• Def: a training dataset is a
labeled dataset used to learn
a classifier
• Def: a test dataset is a labeled
dataset used to evaluate a
classifier
error rate = 1/3
index trash?
1 +
2 +
3 -
Test Predictions:
predictions
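The error-rate definition above is a one-liner in code. Here it is applied to the test predictions and labels from the slide:

```python
def error_rate(predictions, labels):
    """Proportion of examples on which the predicted label is wrong."""
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)

# Test predictions vs. the labeled test dataset from the slide:
predictions = ["+", "+", "-"]
labels      = ["-", "+", "-"]
print(error_rate(predictions, labels))  # 1/3, as on the slide
```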
46. Supervised Binary Classification
• Step 1: training
– Given: labeled training dataset
– Goal: learn a classifier from the
training dataset
• Step 2: prediction
– Given: unlabeled test dataset
– Given: learned classifier
– Goal: predict a label for each
instance
• Step 3: evaluation
– Given: predictions from Step 2
– Given: labeled test dataset
– Goal: compute the test error
rate (i.e. error rate on the test
dataset)
48
error rate = 1/3
47. Supervised Binary Classification
• Step 1: training
– Given: labeled training dataset
– Goal: learn a classifier from the
training dataset
• Step 2: prediction
– Given: unlabeled test dataset
– Given: learned classifier
– Goal: predict a label for each
instance
• Step 3: evaluation
– Given: predictions from Step 2
– Given: labeled test dataset
– Goal: compute the test error
rate (i.e. error rate on the test
dataset)
49
error rate = 1/3
48. Supervised Binary Classification
• Step 1: training
– Given: labeled training dataset
– Goal: learn a classifier from the
training dataset
• Step 2: prediction
– Given: unlabeled test dataset
– Given: learned classifier
– Goal: predict a label for each
instance
• Step 3: evaluation
– Given: predictions from Step 2
– Given: labeled test dataset
– Goal: compute the test error
rate (i.e. error rate on the test
dataset)
50
error rate = 1/3
“train time”
“test time”
49. Supervised Binary Classification
• Step 1: training
– Given: labeled training dataset
– Goal: learn a classifier from the
training dataset
• Step 2: prediction
– Given: unlabeled test dataset
– Given: learned classifier
– Goal: predict a label for each
instance
• Step 3: evaluation
– Given: predictions from Step 2
– Given: labeled test dataset
– Goal: compute the test error
rate (i.e. error rate on the test
dataset)
51
error rate = 1/3
Key question in
Machine Learning:
How do we learn the
classifier from data?
50. Random Classifier
The random classifier takes
in the features and always
predicts a random label.
…this is a terrible idea. It
completely ignores the
training data!
52
label features
index trash? color sound weight
1 + green crinkly high
2 - brown crinkly low
3 - grey none high
4 + clear none low
5 - green none low
Training Dataset:
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Test Dataset:
error rate = 2/3
index trash?
1 -
2 -
3 +
Test Predictions:
predictions
Classifier
features → random!
51. Random Classifier
The random classifier takes
in the features and always
predicts a random label.
…this is a terrible idea. It
completely ignores the
training data!
53
label features
index trash? color sound weight
1 + green crinkly high
2 - brown crinkly low
3 - grey none high
4 + clear none low
5 - green none low
Training Dataset:
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Test Dataset:
error rate = 1/3
index trash?
1 +
2 +
3 -
Test Predictions:
predictions
Classifier
features → random!
52. Random Classifier
The random classifier takes
in the features and always
predicts a random label.
…this is a terrible idea. It
completely ignores the
training data!
54
label features
index trash? color sound weight
1 + green crinkly high
2 - brown crinkly low
3 - grey none high
4 + clear none low
5 - green none low
Training Dataset:
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Test Dataset:
error rate = 3/3
index trash?
1 +
2 -
3 +
Test Predictions:
predictions
Classifier
features → random!
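A minimal sketch of the random classifier on the slides' test dataset. The error rate is different on every run, which is exactly why the three slides show 2/3, 1/3, and 3/3 on three different draws:

```python
import random

def random_classifier(features):
    """Ignores its input entirely and predicts a coin flip."""
    return random.choice(["+", "-"])

# Test dataset from the slides: (color, sound, weight) features, +/- labels.
test_X = [("brown", "none", "high"),
          ("clear", "crinkly", "low"),
          ("brown", "none", "low")]
test_y = ["-", "+", "-"]

# Each run draws fresh random predictions, so the error rate varies.
for run in range(3):
    predictions = [random_classifier(x) for x in test_X]
    wrong = sum(p != y for p, y in zip(predictions, test_y))
    print(f"run {run}: error rate = {wrong}/{len(test_y)}")
```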
53. Majority Vote Classifier
The majority vote classifier
takes in the features and always
predicts the most common label
in the training dataset.
…this is still a pretty bad idea. It
completely ignores the features!
55
label features
index trash? color sound weight
1 + green crinkly high
2 - brown crinkly low
3 - grey none high
4 + clear none low
5 - green none low
Training Dataset:
label features
index trash? color sound weight
1 - brown none high
2 + clear crinkly low
3 - brown none low
Test Dataset:
error rate = 1/3
index trash?
1 -
2 -
3 -
Test Predictions:
predictions
Classifier
features → always predict “-”
54. Majority Vote Classifier
The majority vote classifier
takes in the features and always
predicts the most common label
in the training dataset.
…this is still a pretty bad idea. It
completely ignores the features!
56
label features
index trash? color sound weight
1 + green crinkly high
2 - brown crinkly low
3 - grey none high
4 + clear none low
5 - green none low
Training Dataset:
error rate = 2/5
index trash?
1 -
2 -
3 -
4 -
5 -
Train Predictions:
predictions
Classifier
features → always predict “-”
The majority vote classifier even
ignores the features if it’s making
predictions on the training dataset!
55. Majority Vote Classifier
• Step 1: training
– Given: labeled training dataset
– Goal: learn a classifier from the
training dataset
• Step 2: prediction
– Given: unlabeled test dataset
– Given: learned classifier
– Goal: predict a label for each
instance
• Step 3: evaluation
– Given: predictions from Step II
– Given: labeled test dataset
– Goal: compute the test error
rate (i.e. error rate on the test
dataset)
57
error rate = 1/3
Classifier
features → always predict “-”
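The full three-step pipeline (train, predict, evaluate) with the majority vote classifier is short enough to write out on the slides' datasets, reproducing the 1/3 test error:

```python
from collections import Counter

# Datasets from the slides: (color, sound, weight) features with +/- labels.
train_y = ["+", "-", "-", "+", "-"]
test_X  = [("brown", "none", "high"), ("clear", "crinkly", "low"),
           ("brown", "none", "low")]
test_y  = ["-", "+", "-"]

# Step 1 (training): learn the most common label in the training dataset.
majority_label = Counter(train_y).most_common(1)[0][0]

# Step 2 (prediction): ignore the features, always predict that label.
predictions = [majority_label for _ in test_X]

# Step 3 (evaluation): error rate on the test dataset.
error = sum(p != y for p, y in zip(predictions, test_y)) / len(test_y)
print(majority_label, error)  # "-" and 1/3, matching the slide
```

Note that only `train_y` is used in Step 1 and the test features are never looked at in Step 2, which is precisely the slide's complaint about this classifier.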
57. Syllabus Highlights
The syllabus is located on the course webpage:
http://www.cs.cmu.edu/~mgormley/courses/10601
or
http://mlcourse.org
The course policies are required reading.
59
58. Syllabus Highlights
• Grading: 50% homework, 15%
exam 1, 15% exam 2, 15% exam 3,
5% participation
• Exam 1: evening, Thu, Feb. 16
• Exam 2: evening, Thu, Mar. 30
• Exam 3: final exam week, date
TBD by registrar
• Homework: 3 written and 6
written + programming (Python)
– 6 grace days for homework
assignments
– Late submissions: 75% day 1, 50%
day 2, 25% day 3
– No submissions accepted after 3
days w/o extension; HW3, HW6,
HW9 only 2 days
– Extension requests: for
emergencies, see syllabus
• Recitations: Fridays, same
time/place as lecture (optional,
interactive sessions)
• Readings: required, online PDFs,
recommended for after lecture
• Technologies: Piazza (discussion),
Gradescope (homework), Google
Forms (polls)
• Academic Integrity:
– Collaboration encouraged, but
must be documented
– Solutions must always be written
independently
– No re-use of found code / past
assignments
– Severe penalties (e.g. -100%)
• Office Hours: posted on Google
Calendar on “Office Hours” page
60
59. Lectures
• You should ask lots of questions
– Interrupting (by raising a hand) to ask your question
is strongly encouraged
– Asking questions later (or in real time) on Piazza is
also great
• When I ask a question…
– I want you to answer
– Even if you don’t answer, think it through as though
I’m about to call on you
• Interaction improves learning (both in-class and
at my office hours)
61
64. In-Class Polls
66
Q: How do these In-Class Polls work?
A: Don’t worry about it for today. We won’t
use them until the second week of class, i.e.
the third lecture.
Details are on the syllabus.
66. Prerequisites
What they are:
• Significant programming experience (15-122)
– Written programs of 100s of lines of code
– Comfortable learning a new language
• Probability and statistics (36-217, 36-225,
etc.)
• Mathematical maturity: discrete
mathematics (21-127, 15-151), linear algebra,
and calculus
68
67. Prerequisites
What if you need additional review?
• Consider first taking 10-606/607: Mathematical/Computational
Foundations for Machine Learning
• More details here:
https://www.cs.cmu.edu/~pvirtue/10606/
69
How to describe 606/607 to a friend
606/607 is…
– a formal presentation of mathematics and computer science…
– motivated by (carefully chosen) real-world problems that arise in machine
learning…
– where the broader picture of how those problems arise is treated
somewhat informally.
69. Oh, the Places You’ll Use Probability!
Supervised Classification
• Naïve Bayes
• Logistic regression
72
Note: This is just
motivation – we’ll cover
these topics later!
Logistic regression (softmax):
$P(Y = y \mid X = x; \theta) = p(y \mid x; \theta) = \dfrac{\exp(\theta_y \cdot \phi(x))}{\sum_{y'} \exp(\theta_{y'} \cdot \phi(x))}$
Naïve Bayes:
$p(y \mid x_1, x_2, \ldots, x_n) = \dfrac{1}{Z}\, p(y) \prod_{i=1}^{n} p(x_i \mid y)$
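As a numeric sketch of the two posteriors on this slide (all weights, priors, and likelihood values below are made up for illustration):

```python
import math

def logistic_regression_posterior(scores):
    """Softmax: p(y|x) proportional to exp(theta_y . phi(x))."""
    exp_scores = {y: math.exp(s) for y, s in scores.items()}
    Z = sum(exp_scores.values())
    return {y: e / Z for y, e in exp_scores.items()}

def naive_bayes_posterior(prior, likelihoods, x):
    """p(y|x1..xn) = (1/Z) p(y) * prod_i p(xi|y)."""
    unnormalized = {y: prior[y] * math.prod(likelihoods[y][xi] for xi in x)
                    for y in prior}
    Z = sum(unnormalized.values())
    return {y: u / Z for y, u in unnormalized.items()}

# Hypothetical dot products theta_y . phi(x) for two classes:
print(logistic_regression_posterior({"+": 1.0, "-": -1.0}))

# Hypothetical prior and per-class feature likelihoods:
print(naive_bayes_posterior(
    prior={"+": 0.4, "-": 0.6},
    likelihoods={"+": {"crinkly": 0.5, "low": 0.5},
                 "-": {"crinkly": 0.3, "low": 0.6}},
    x=["crinkly", "low"]))
```

In both cases $Z$ is just the sum over classes, so each posterior is a proper distribution.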
70. Oh, the Places You’ll Use Probability!
ML Theory
(Example: Sample Complexity)
73
PAC/SLT models for Supervised Learning
• Algo sees training sample S: (x1, c*(x1)), …, (xm, c*(xm)), xi i.i.d. from D
• Does optimization over S, finds hypothesis h ∈ H.
• Goal: h has small error over D.
Training error (how often h(x) ≠ c*(x) over training instances):
$err_S(h) = \frac{1}{m} \sum_{i} I[h(x_i) \neq c^*(x_i)]$
True error (how often h(x) ≠ c*(x) over future instances drawn at random from D):
$err_D(h) = \Pr_{x \sim D}(h(x) \neq c^*(x))$
• But, we can only measure the training error.
Sample complexity: bound $err_D(h)$ in terms of $err_S(h)$
Note: This is just
motivation – we’ll cover
these topics later!
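The gap between $err_S(h)$ and $err_D(h)$ is easy to see numerically. A sketch with a made-up one-dimensional target concept and hypothesis (both hypothetical, chosen so the true error is 0.05): the error on a small sample S fluctuates, while the error on a very large sample concentrates near the true error.

```python
import random

def c_star(x):
    """Target concept: label is + iff x >= 0.5."""
    return x >= 0.5

def h(x):
    """A slightly-off hypothesis, wrong exactly on [0.45, 0.5)."""
    return x >= 0.45

def empirical_error(hypothesis, xs):
    """Fraction of points where the hypothesis disagrees with c*."""
    return sum(hypothesis(x) != c_star(x) for x in xs) / len(xs)

random.seed(0)
S   = [random.random() for _ in range(20)]       # small training sample from D
big = [random.random() for _ in range(100_000)]  # proxy for the distribution D

print("err_S(h) ~", empirical_error(h, S))
print("err_D(h) ~", empirical_error(h, big))     # concentrates near 0.05
```

Sample complexity results bound how far apart these two numbers can be as a function of the sample size m.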
71. Oh, the Places You’ll Use Probability!
Deep Learning
(Example: Deep Bi-directional RNN)
74
[Figure: a deep bi-directional RNN unrolled over four time steps, with inputs x1…x4, forward and backward hidden states h1…h4, and outputs y1…y4.]
Note: This is just
motivation – we’ll cover
these topics later!
72. Oh, the Places You’ll Use Probability!
Graphical Models
• Hidden Markov Model (HMM)
• Conditional Random Field (CRF)
75
[Figure: factor graph for the sentence “time flies like an arrow” with part-of-speech tags n v p d n, a <START> state, and potential functions ψ0…ψ9 linking the tags.]
Note: This is just
motivation – we’ll cover
these topics later!
73. Prerequisites
What if I’m not sure whether I meet them?
• Don’t worry: we’re not sure either
• However, we’ve designed a way to assess
your background knowledge so that you
know what to study!
76
74. Syllabus Highlights
Background Test
• When: Fri, Jan 20, in-class
• Where: this lecture hall
• What: prerequisite material
(probability, statistics, linear
algebra, calculus, geometry,
computer science, programming)
• Why:
– an assessment tool to show you
what prereq topics to brush up on
– to save you some time on HW1 if
you already know it all
• How:
– α= % of points on Background Test
– β= % of points on Background
Exercises
– Grade: γ = α + (1−α)β
77
Your grade on HW1 will give you very little
information about which topics to study.
Hopefully, the Background Test does.
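The grading rule γ = α + (1−α)β can be read directly off the code below; note it means the Background Exercises can only raise your Background Test score, never lower it. The example scores are hypothetical.

```python
def background_grade(alpha, beta):
    """Grade = alpha + (1 - alpha) * beta.

    alpha: fraction of points on the Background Test.
    beta:  fraction of points on the Background Exercises.
    The exercises recover a beta-fraction of whatever was lost on the test.
    """
    return alpha + (1 - alpha) * beta

# Hypothetical scores: 70% on the test, 90% on the exercises.
print(background_grade(0.7, 0.9))  # 0.7 + 0.3 * 0.9 = 0.97
```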
75. Syllabus Highlights
Background Test
• When: Fri, Jan 20, in-class
• Where: this lecture hall
• What: prerequisite material
(probability, statistics, linear
algebra, calculus, geometry,
computer science, programming)
• Why:
– an assessment tool to show you
what prereq topics to brush up on
– to save you some time on HW1 if
you already know it all
• How:
– α= % of points on Background Test
– β= % of points on Background
Exercises
– Grade: γ = α + (1−α)β
78
Correlation between Homework Average
and Midterm Exam:
• Pearson: 0.32 (weak - moderate)
• Spearman: 0.25 (weak)
Correlation between Background Test and
Midterm Exam:
• Pearson: 0.46 (moderate)
• Spearman: 0.43 (moderate)
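Correlations like the ones on this slide can be computed as follows; the five (homework, exam) score pairs below are made up for illustration, and the simple rank function assumes no tied scores.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation: covariance normalized by the std. deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks (no ties)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(xs), ranks(ys))

# Hypothetical homework averages and exam scores for five students:
hw   = [92, 85, 78, 95, 60]
exam = [80, 75, 82, 90, 55]
print(round(pearson(hw, exam), 2), round(spearman(hw, exam), 2))
```

Pearson measures linear association of the raw scores; Spearman measures monotone association via ranks, which is why the two numbers on the slide differ.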
76. Reminders
• Background Test
– Fri, Jan 20, in-class
• Homework 1: Background
– Out: Fri, Jan 20
– Due: Wed, Jan 25 at 11:59pm
– Two parts:
1. written part to Gradescope
2. programming part to Gradescope
79
77. Learning Objectives
You should be able to…
1. Formulate a well-posed learning problem for a real-
world task by identifying the task, performance
measure, and training experience
2. Describe common learning paradigms in terms of the
type of data available, when it’s available, the form of
prediction, and the structure of the output prediction
3. Implement Decision Tree training and prediction
(w/simple scoring function)
4. Explain the difference between memorization and
generalization [CIML]
5. Identify examples of the ethical responsibilities of an
ML expert
81