Machine Learning: An Overview
Sources
• AAAI. Machine Learning.
http://www.aaai.org/Pathfinder/html/machine.html
• Dietterich, T. (2003). Machine Learning. Nature Encyclopedia of
Cognitive Science.
• Doyle, P. Machine Learning.
http://www.cs.dartmouth.edu/~brd/Teaching/AI/Lectures/Summaries/learning.html
• Dyer, C. (2004). Machine Learning.
http://www.cs.wisc.edu/~dyer/cs540/notes/learning.html
• Mitchell, T. (1997). Machine Learning.
• Nilsson, N. (2004). Introduction to Machine Learning.
http://robotics.stanford.edu/people/nilsson/mlbook.html
• Russell, S. (1997). Machine Learning. Handbook of Perception and
Cognition, Vol. 14, Chap. 4.
• Russell, S. (2002). Artificial Intelligence: A Modern Approach, Chap. 18-20.
http://aima.cs.berkeley.edu
What is Learning?
• “Learning denotes changes in a system that ... enable
a system to do the same task … more efficiently the
next time.” - Herbert Simon
• “Learning is constructing or modifying representations
of what is being experienced.” - Ryszard Michalski
• “Learning is making useful changes in our minds.” -
Marvin Minsky
“Machine learning refers to a system capable of the
autonomous acquisition and integration of knowledge.”
Why Machine Learning?
• No human experts
• industrial/manufacturing control
• mass spectrometer analysis, drug design, astronomic
discovery
• Black-box human expertise
• face/handwriting/speech recognition
• driving a car, flying a plane
• Rapidly changing phenomena
• credit scoring, financial modeling
• diagnosis, fraud detection
• Need for customization/personalization
• personalized news reader
• movie/book recommendation
Related Fields
Machine learning is primarily concerned with the
accuracy and effectiveness of the computer system.
[Diagram: machine learning at the center, surrounded by related fields: psychological models, data mining, cognitive science, decision theory, information theory, databases, neuroscience, statistics, evolutionary models, control theory]
Machine Learning Paradigms
• rote learning
• learning by being told (advice-taking)
• learning from examples (induction)
• learning by analogy
• speed-up learning
• concept learning
• clustering
• discovery
• …
Architecture of a Learning System
[Diagram: components: learning element, critic, problem generator, performance element, ENVIRONMENT; arrow labels: feedback, changes, learning goals, actions, percepts, performance standard, knowledge]
Learning Element
Design affected by:
• performance element used
• e.g., utility-based agent, reactive agent, logical
agent
• functional component to be learned
• e.g., classifier, evaluation function, perception-action function
• representation of functional component
• e.g., weighted linear function, logical theory, HMM
• feedback available
• e.g., correct action, reward, relative preferences
Dimensions of Learning Systems
• type of feedback
• supervised (labeled examples)
• unsupervised (unlabeled examples)
• reinforcement (reward)
• representation
• attribute-based (feature vector)
• relational (first-order logic)
• use of knowledge
• empirical (knowledge-free)
• analytical (knowledge-guided)
Outline
• Supervised learning
• empirical learning (knowledge-free)
• attribute-value representation
• logical representation
• analytical learning (knowledge-guided)
• Reinforcement learning
• Unsupervised learning
• Performance evaluation
• Computational learning theory
Inductive (Supervised) Learning
Basic Problem: Induce a representation of a function (a
systematic relationship between inputs and outputs)
from examples.
• target function f: X → Y
• example (x,f(x))
• hypothesis g: X → Y such that g(x) = f(x)
x = set of attribute values (attribute-value representation)
x = set of logical sentences (first-order representation)
Y = set of discrete labels (classification)
Y = ℝ (regression)
Decision Trees
Should I wait at this restaurant?
Decision Tree Induction
(Recursively) partition examples according to the most
important attribute.
Key Concepts
• entropy
• impurity of a set of examples (entropy = 0 if perfectly
homogeneous)
• (#bits needed to encode class of an arbitrary example)
• information gain
• expected reduction in entropy caused by partitioning
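As a rough illustration of the two key concepts above, the following Python sketch computes entropy and information gain for a tiny data set. The toy examples and the attribute names `hungry` and `raining` are made up for this sketch and do not come from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, labels, attribute):
    """Expected reduction in entropy from partitioning on `attribute`.

    `examples` is a list of dicts mapping attribute names to values.
    """
    partitions = {}
    for example, label in zip(examples, labels):
        partitions.setdefault(example[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: four restaurant visits.
examples = [
    {"hungry": "yes", "raining": "no"},
    {"hungry": "yes", "raining": "yes"},
    {"hungry": "no", "raining": "no"},
    {"hungry": "no", "raining": "yes"},
]
labels = ["wait", "wait", "leave", "leave"]

print(information_gain(examples, labels, "hungry"))   # splits the classes perfectly
print(information_gain(examples, labels, "raining"))  # leaves both subsets mixed
```

A tree learner in this style would pick the attribute with the largest gain, here `hungry`, and recurse on each resulting subset.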
Decision Tree Induction: Attribute Selection
Intuitively: A good attribute splits the examples
into subsets that are (ideally) all positive or all
negative.
Decision Tree Induction: Decision Boundary
(Artificial) Neural Networks
• Motivation: human brain
• massively parallel (10^11 neurons, ~20 types)
• small computational units with simple low-bandwidth communication (10^14 synapses, 1-10 ms cycle time)
• Realization: neural network
• units (≈ neurons) connected
by directed weighted links
• activation function from
inputs to output
Neural Networks (continued)
• neural network = parameterized family of nonlinear functions
• types
• feed-forward (acyclic): single-layer perceptrons, multi-layer networks
• recurrent (cyclic): Hopfield networks, Boltzmann machines
[≈ connectionism, parallel distributed processing]
Neural Network Learning
Key Idea: Adjusting the weights changes the function
represented by the neural network (learning =
optimization in weight space).
Iteratively adjust weights to reduce error (difference
between network output and target output).
• Weight Update
• perceptron training rule
• linear programming
• delta rule
• backpropagation
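The perceptron training rule listed above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the threshold unit, the learning rate of 1.0, the epoch count, and the AND data set are all assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0

def train_perceptron(data, learning_rate=1.0, epochs=20):
    """Perceptron training rule: w <- w + rate * (target - output) * x."""
    n_inputs = len(data[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

# Logical AND is linearly separable, so a single-layer perceptron can learn it.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_data)
print([predict(weights, bias, x) for x, _ in and_data])
```

Each update moves the weights only when the output disagrees with the target, which is the sense in which learning is a search through weight space.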
Neural Network Learning: Decision Boundary
[Figure panels: single-layer perceptron; multi-layer network]
Support Vector Machines
Kernel Trick: Map data to higher-dimensional
space where they will be linearly separable.
Learning a Classifier
• optimal linear separator is one that has the
largest margin between positive examples on
one side and negative examples on the other
• = quadratic programming optimization
Support Vector Machines (continued)
Key Concept: Training data enters optimization problem
in the form of dot products of pairs of points.
• support vectors
• weights associated with data points are zero except for those
points nearest the separator (i.e., the support vectors)
• kernel function K(x_i, x_j)
• function that can be applied to pairs of points to evaluate dot
products in the corresponding (higher-dimensional) feature
space F (without having to directly compute F(x) first)
efficient training and complex functions!
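To make the kernel idea concrete, here is a small Python sketch using the quadratic kernel K(x, z) = (x · z)^2 on 2-D inputs. The choice of this particular kernel, its explicit 3-D feature map, and the sample points are assumptions for illustration; the slides do not name a specific kernel.

```python
import math

def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def feature_map(x):
    """Explicit map F from 2-D input space to a 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def kernel(x, z):
    """Quadratic kernel K(x, z) = (x . z)^2, computed in the input space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 0.5)
print(kernel(x, z))                         # dot product via the kernel
print(dot(feature_map(x), feature_map(z)))  # same value via explicit features
```

Both lines print the same value up to floating-point rounding, but the kernel never builds the feature vectors, which is what keeps training efficient when the feature space is very large.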
Support Vector Machines: Decision Boundary
Bayesian Networks
Network topology reflects
direct causal influence
Basic Task: Compute
probability distribution
for unknown variables
given observed values
of other variables.
[belief networks, causal networks]
A B A B A B A B
C 0.9 0.3 0.5 0.1
C 0.1 0.7 0.5 0.9
conditional probability table
for NeighbourCalls
Bayesian Network Learning
Key Concepts
• nodes (attributes) = random variables
• conditional independence
• an attribute is conditionally independent of its non-descendants,
given its parents
• conditional probability table
• conditional probability distribution of an attribute
given its parents
• Bayes Theorem
• P(h|D) = P(D|h)P(h) / P(D)
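A short worked example of Bayes' theorem in Python may help here. The prior and likelihood values are invented for illustration, and the function name `posterior` is an assumption of this sketch.

```python
def posterior(prior, likelihood, likelihood_given_not_h):
    """P(h|D) = P(D|h) P(h) / P(D), with P(D) summed over h and not-h."""
    evidence = likelihood * prior + likelihood_given_not_h * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: a rare hypothesis (1%) and fairly reliable evidence.
p = posterior(prior=0.01, likelihood=0.9, likelihood_given_not_h=0.05)
print(round(p, 3))
```

Even with a likelihood of 0.9, the posterior stays well below one half because the prior is so small.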
Bayesian Network Learning (continued)
Find most probable hypothesis given the data.
In theory: Use posterior probabilities to weight
hypotheses. (Bayes optimal classifier)
In practice: Use single, maximum a posteriori (most
probable) hypothesis.
Settings
• known structure, fully observable (parameter learning)
• unknown structure, fully observable (structural
learning)
• known structure, hidden variables (EM algorithm)
• unknown structure, hidden variables (?)
Nearest Neighbor Models
Key Idea: Properties of an input x are likely to be similar
to those of points in the neighborhood of x.
Basic Idea: Find (k) nearest neighbor(s) of x and infer
target attribute value(s) of x based on corresponding
attribute value(s).
Form of non-parametric learning where hypothesis
complexity grows with data (learned model ≈ all
examples seen so far)
[instance-based learning, case-based reasoning, analogical reasoning]
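The basic idea above can be sketched as a k-nearest-neighbor classifier with majority voting. Euclidean distance, k = 3, and the toy points are assumptions of this example rather than details from the slides.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set with two well-separated classes.
training = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((6, 6), "b"), ((7, 6), "b"), ((6, 7), "b"),
]
print(knn_classify(training, (2, 2)))
print(knn_classify(training, (6, 5)))
```

Note that there is no training step: the "model" is simply the stored examples, so prediction cost grows with the amount of data.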
Nearest Neighbor Model: Decision Boundary
Learning Logical Theories
Logical Formulation of Supervised Learning
• attribute → unary predicate
• instance x → logical sentence
• positive/negative classifications → sentences
Q(x_i), ¬Q(x_i)
• training set → conjunction of all description and
classification sentences
Learning Task: Find an equivalent logical expression for
the goal predicate Q to classify examples correctly.
Hypothesis ∧ Descriptions ⊨ Classifications
Learning Logic Theories: Example
Input
• Father(Philip,Charles), Father(Philip,Anne), …
• Mother(Mum,Margaret), Mother(Mum,Elizabeth), …
• Married(Diana,Charles), Married(Elizabeth,Philip), …
• Male(Philip),Female(Anne),…
• Grandparent(Mum,Charles), Grandparent(Elizabeth,Beatrice),
¬Grandparent(Mum,Harry), ¬Grandparent(Spencer,Pete), …
Output
• Grandparent(x,y) ⇔
[∃z Mother(x,z) ∧ Mother(z,y)] ∨ [∃z Mother(x,z) ∧ Father(z,y)] ∨
[∃z Father(x,z) ∧ Mother(z,y)] ∨ [∃z Father(x,z) ∧ Father(z,y)]
Learning Logic Theories
Key Concepts
• specialization
• triggered by false positives (goal: exclude negative examples)
• achieved by adding conditions, dropping disjuncts
• generalization
• triggered by false negatives (goal: include positive examples)
• achieved by dropping conditions, adding disjuncts
Learning
• current-best-hypothesis: incrementally improve single
hypothesis (e.g., sequential covering)
• least-commitment search: maintain all hypotheses
consistent with examples seen so far (e.g., version
space)
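The generalization and specialization steps described above can be sketched for purely conjunctive hypotheses. This is a simplified illustration: hypotheses are dicts of required attribute values, disjuncts are not modeled, and the attribute names and examples are hypothetical.

```python
def covers(hypothesis, example):
    """A conjunctive hypothesis covers an example if every condition holds."""
    return all(example.get(attr) == value for attr, value in hypothesis.items())

def generalize(hypothesis, false_negative):
    """Drop the conditions that the misclassified positive example violates."""
    return {attr: value for attr, value in hypothesis.items()
            if false_negative.get(attr) == value}

def specialize(hypothesis, false_positive, candidate_conditions):
    """Add the first candidate condition that excludes the false positive."""
    for attr, value in candidate_conditions:
        if false_positive.get(attr) != value:
            return {**hypothesis, attr: value}
    return hypothesis  # no candidate helps in this simplified setting

# Hypothetical examples and starting hypothesis.
positive = {"hungry": "yes", "raining": "yes", "patrons": "some"}
negative = {"hungry": "yes", "raining": "no", "patrons": "none"}
h = {"hungry": "yes", "raining": "no"}

h = generalize(h, positive)                         # false negative: drop a condition
h = specialize(h, negative, [("patrons", "some")])  # false positive: add a condition
print(h, covers(h, positive), covers(h, negative))
```

A current-best-hypothesis learner would repeat these two moves as each new example arrives, backtracking when no consistent revision exists.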
Learning Logic Theories: Decision Boundary
Analytical Learning
Prior Knowledge in Learning
Recall:
Grandparent(x,y) ⇔
[∃z Mother(x,z) ∧ Mother(z,y)] ∨ [∃z Mother(x,z) ∧ Father(z,y)] ∨
[∃z Father(x,z) ∧ Mother(z,y)] ∨ [∃z Father(x,z) ∧ Father(z,y)]
• Suppose initial theory also included:
• Parent(x,y) ⇔ [Mother(x,y) ∨ Father(x,y)]
• Final Hypothesis:
• Grandparent(x,y) ⇔ [∃z Parent(x,z) ∧ Parent(z,y)]
Background knowledge can dramatically reduce the size of
the hypothesis (greatly simplifying the learning problem).
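The effect of the Parent background predicate can be checked on a small fact base. In this Python sketch the facts are a hypothetical subset loosely following the family example, relations are sets of pairs, and both function names are assumptions of the illustration.

```python
# Hypothetical fact base, loosely following the family example.
mother = {("Mum", "Elizabeth"), ("Elizabeth", "Charles"), ("Elizabeth", "Anne")}
father = {("Philip", "Charles"), ("Philip", "Anne"), ("Charles", "Harry")}

# Background knowledge: Parent(x,y) <=> Mother(x,y) or Father(x,y)
parent = mother | father

def grandparent_via_parent(parent):
    """Grandparent(x,y) <=> exists z: Parent(x,z) and Parent(z,y)."""
    return {(x, y) for (x, z1) in parent for (z2, y) in parent if z1 == z2}

def grandparent_via_four_cases(mother, father):
    """The longer hypothesis: one disjunct per Mother/Father combination."""
    result = set()
    for first in (mother, father):
        for second in (mother, father):
            result |= {(x, y) for (x, z1) in first
                       for (z2, y) in second if z1 == z2}
    return result

short = grandparent_via_parent(parent)
long_form = grandparent_via_four_cases(mother, father)
print(sorted(short))
print(short == long_form)
```

Both definitions pick out the same pairs, but the one built on Parent needs a single conjunction instead of four disjuncts.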
Explanation-Based Learning
Amazed crowd of cavemen observe Zog roasting a
lizard on the end of a pointed stick (“Look what Zog
do!”) and thereafter abandon roasting with their
bare hands.
Basic Idea: Generalize by explaining observed instance.
• form of speedup learning
• doesn’t learn anything factually new from the observation
• instead converts first-principles theories into useful special-
purpose knowledge
• utility problem
• cost of determining if learned knowledge is applicable may
outweigh benefits from its application
Relevance-Based Learning
Mary travels to Brazil and meets her first Brazilian
(Fernando), who speaks Portuguese. She concludes
that all Brazilians speak Portuguese but not that all
Brazilians are named Fernando.
Basic Idea: Use knowledge of what is relevant to infer
new properties about a new instance.
• form of deductive learning
• learns a new general rule that explains observations
• does not create knowledge outside logical content of prior
knowledge and observations
Knowledge-Based Inductive Learning
Medical student observes consulting session
between doctor and patient at the end of which the
doctor prescribes a particular medication. Student
concludes that the medication is effective
treatment for a particular type of infection.
Basic Idea: Use prior knowledge to guide hypothesis
generation.
• benefits in inductive logic programming
• only hypotheses consistent with prior knowledge and
observations are considered
• prior knowledge supports smaller (simpler) hypotheses
Reinforcement Learning
k-armed bandit problem:
Agent is in a room with k gambling machines (one-armed bandits).
When an arm is pulled, the machine pays off 1 or 0, according to
some unknown probability distribution. Given a fixed number of pulls,
what is the agent’s (optimal) strategy?
Basic Task: Find a policy π, mapping states to actions, that
maximizes (long-term) reward.
Model (Markov Decision Process)
• set of states S
• set of actions A
• reward function R : S × A → ℝ
• state transition function T : S × A → Π(S)
• T(s,a,s') = probability of reaching s' when a is executed in s
Reinforcement Learning (continued)
• Settings
• fully vs. partially observable environment
• deterministic vs. stochastic environment
• model-based vs. model-free
• rewards in goal state only or in any state
value of a state: expected infinite discounted sum of reward the
agent will gain if it starts from that state and executes the optimal
policy
Solving MDP when the model is known
• value iteration: find optimal value function (derive optimal policy)
• policy iteration: find optimal policy directly (derive value function)
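When the model is known, value iteration can be sketched as below. The two-state MDP, its transition probabilities and rewards, the discount factor of 0.9, and the stopping tolerance are all hypothetical values chosen for this illustration.

```python
def q_value(V, T, R, gamma, s, a):
    """Expected return of taking action a in state s, then following V."""
    return R[(s, a)] + gamma * sum(p * V[s2] for s2, p in T[(s, a)].items())

def value_iteration(states, actions, T, R, gamma=0.9, tolerance=1e-6):
    """Iterate V(s) <- max_a [R(s,a) + gamma * sum_s' T(s,a,s') V(s')]."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(V, T, R, gamma, s, a) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tolerance:
            return V

def greedy_policy(states, actions, T, R, V, gamma=0.9):
    """Derive a policy by acting greedily with respect to V."""
    return {s: max(actions, key=lambda a: q_value(V, T, R, gamma, s, a))
            for s in states}

# Hypothetical MDP: staying in B pays 1 per step; A pays nothing.
states, actions = ["A", "B"], ["stay", "move"]
T = {
    ("A", "stay"): {"A": 1.0},
    ("A", "move"): {"B": 0.8, "A": 0.2},
    ("B", "stay"): {"B": 1.0},
    ("B", "move"): {"A": 1.0},
}
R = {("A", "stay"): 0.0, ("A", "move"): 0.0,
     ("B", "stay"): 1.0, ("B", "move"): 0.0}

V = value_iteration(states, actions, T, R)
print(V, greedy_policy(states, actions, T, R, V))
```

The optimal policy is read off the converged value function, matching the "find optimal value function, derive optimal policy" order given for value iteration.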
Reinforcement Learning (continued)
Reinforcement learning is concerned with finding an
optimal policy for an MDP when the model (transition,
reward) is unknown.
exploration/exploitation tradeoff
model-free reinforcement learning
• learn a controller without learning a model first
• e.g., adaptive heuristic critic (TD(λ)), Q-learning
model-based reinforcement learning
• learn a model first
• e.g., Dyna, prioritized sweeping, RTDP
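As an illustration of the model-free case, here is a minimal tabular Q-learning sketch. The four-state chain environment, the reward of 1 at the goal, the learning rate, and the discount factor are hypothetical. For simplicity the agent explores with uniformly random actions, which sidesteps the exploration/exploitation tradeoff; this works because Q-learning is off-policy.

```python
import random

N_STATES = 4                 # states 0..3; state 3 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Hypothetical deterministic chain: reward 1 on reaching the goal."""
    if action == "right":
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn Q(s, a) from sampled transitions, without a model of T or R."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # pure exploration
            next_state, reward = step(state, action)
            best_next = max(Q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = next_state
    return Q

Q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The learner only ever calls `step`, so it never sees the transition or reward functions directly, yet the greedy policy extracted from Q heads for the goal.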
Unsupervised Learning
Learn patterns from (unlabeled) data.
Approaches
• clustering (similarity-based)
• density estimation (e.g., EM algorithm)
Performance Tasks
• understanding and visualization
• anomaly detection
• information retrieval
• data compression
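One common similarity-based clustering method is k-means, which the slides do not name explicitly; it is used here only as a representative example. The 1-D data points, the initial centers, and the iteration count are hypothetical.

```python
def k_means(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points (1-D data)."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = k_means(points, centers=[0.0, 5.0])
print(centers, clusters)
```

No labels are involved at any point; the grouping emerges from distances between the points alone.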
Performance Evaluation
• Randomly split examples into training set U
and test set V.
• Use training set to learn a hypothesis H.
• Measure % of V correctly classified by H.
• Repeat for different random splits and average
results.
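The evaluation procedure above can be sketched as a small helper. The function names, the 30% test fraction, the number of repeats, and the toy data and learners are assumptions of this example.

```python
import random

def holdout_accuracy(examples, learn, test_fraction=0.3, repeats=10, seed=0):
    """Average test-set accuracy over several random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        test, train = shuffled[:n_test], shuffled[n_test:]
        hypothesis = learn(train)  # H is learned from the training set only
        correct = sum(1 for x, y in test if hypothesis(x) == y)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

def learn_majority(train):
    """Hypothetical baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def learn_rule(train):
    """Hypothetical learner that happens to know the true rule."""
    return lambda x: "pos" if x % 4 else "neg"

# Hypothetical data: 30 positive and 10 negative examples.
examples = [(i, "pos" if i % 4 else "neg") for i in range(40)]
print(holdout_accuracy(examples, learn_majority))
print(holdout_accuracy(examples, learn_rule))
```

Comparing a learner against a trivial baseline such as `learn_majority` shows how much of its accuracy comes from the class distribution alone.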
Performance Evaluation: Learning Curves
[Figure: learning curves; x-axis: #training examples; y-axes: classification accuracy and classification error]
Performance Evaluation: ROC Curves
[Figure: ROC curve; axes: false positives, false negatives]
Performance Evaluation: Accuracy/Coverage
[Figure: accuracy/coverage curve; axes: coverage, classification accuracy]
Triple Tradeoff in Empirical Learning
• size/complexity of
learned classifier
• amount of training data
• generalization accuracy
bias-variance tradeoff
Computational Learning Theory
probably approximately correct (PAC) learning
With probability ≥ 1 - δ, error will be ≤ ε.
Basic principle: Any hypothesis that is seriously wrong
will almost certainly be found out with high probability
after a small number of examples.
Key Concepts
• examples drawn from same distribution (stationarity
assumption)
• sample complexity is a function of confidence, error,
and size of hypothesis space
$$m \geq \frac{1}{\epsilon}\left(\ln\frac{1}{\delta} + \ln|H|\right)$$
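To get a feel for the numbers, the sketch below evaluates the standard sample-complexity bound for a finite hypothesis space, m ≥ (1/ε)(ln(1/δ) + ln|H|). The function name and the example values of ε, δ, and |H| are assumptions chosen for illustration.

```python
import math

def pac_sample_size(epsilon, delta, hypothesis_space_size):
    """Smallest integer m with m >= (1/epsilon) * (ln(1/delta) + ln|H|)."""
    bound = (math.log(1 / delta) + math.log(hypothesis_space_size)) / epsilon
    return math.ceil(bound)

# Hypothetical setting: error at most 0.1 with probability at least 0.95,
# over a hypothesis space of 1000 candidates.
print(pac_sample_size(epsilon=0.1, delta=0.05, hypothesis_space_size=1000))
```

Because |H| enters only through its logarithm, even a much larger hypothesis space raises the required number of examples quite slowly, while halving ε doubles it.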
Current Machine Learning Research
• Representation
• data sequences
• spatial/temporal data
• probabilistic relational models
• …
• Approaches
• ensemble methods
• cost-sensitive learning
• active learning
• semi-supervised learning
• collective classification
• …

More Related Content

Similar to ML_Overview.ppt

Topic_6
Topic_6Topic_6
Topic_6butest
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning IntroductionDong Guo
 
Week_1 Machine Learning introduction.pptx
Week_1 Machine Learning introduction.pptxWeek_1 Machine Learning introduction.pptx
Week_1 Machine Learning introduction.pptxmuhammadsamroz
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksFrancesco Collova'
 
Generalization abstraction
Generalization abstractionGeneralization abstraction
Generalization abstractionEdward Blurock
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with TensorflowShubham Sharma
 
Introduction to Neural Network
Introduction to Neural NetworkIntroduction to Neural Network
Introduction to Neural NetworkYan Xu
 
Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3butest
 
AI Unit 5 machine learning
AI Unit 5 machine learning AI Unit 5 machine learning
AI Unit 5 machine learning Narayan Dhamala
 
Distant Supervision with Imitation Learning
Distant Supervision with Imitation LearningDistant Supervision with Imitation Learning
Distant Supervision with Imitation LearningIsabelle Augenstein
 
Artificial Intelligence for Health Professional Education
Artificial Intelligence for Health Professional EducationArtificial Intelligence for Health Professional Education
Artificial Intelligence for Health Professional EducationVaikunthan Rajaratnam
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.butest
 
1. Introduction to deep learning.pptx
1. Introduction to deep learning.pptx1. Introduction to deep learning.pptx
1. Introduction to deep learning.pptxKv Sagar
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfSisayNegash4
 
Machine Learning : why we should know and how it works
Machine Learning : why we should know and how it worksMachine Learning : why we should know and how it works
Machine Learning : why we should know and how it worksKevin Lee
 
Natural-Inspired_Amany_Final.pptx
Natural-Inspired_Amany_Final.pptxNatural-Inspired_Amany_Final.pptx
Natural-Inspired_Amany_Final.pptxamanyarafa1
 
Big Data Palooza Talk: Aspects of Semantic Processing
Big Data Palooza Talk: Aspects of Semantic ProcessingBig Data Palooza Talk: Aspects of Semantic Processing
Big Data Palooza Talk: Aspects of Semantic ProcessingNa'im Tyson
 

Similar to ML_Overview.ppt (20)

ppt
pptppt
ppt
 
Topic_6
Topic_6Topic_6
Topic_6
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
 
Week_1 Machine Learning introduction.pptx
Week_1 Machine Learning introduction.pptxWeek_1 Machine Learning introduction.pptx
Week_1 Machine Learning introduction.pptx
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural Networks
 
Generalization abstraction
Generalization abstractionGeneralization abstraction
Generalization abstraction
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
 
Introduction to Neural Network
Introduction to Neural NetworkIntroduction to Neural Network
Introduction to Neural Network
 
Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3
 
AI Unit 5 machine learning
AI Unit 5 machine learning AI Unit 5 machine learning
AI Unit 5 machine learning
 
Distant Supervision with Imitation Learning
Distant Supervision with Imitation LearningDistant Supervision with Imitation Learning
Distant Supervision with Imitation Learning
 
Artificial Intelligence for Health Professional Education
Artificial Intelligence for Health Professional EducationArtificial Intelligence for Health Professional Education
Artificial Intelligence for Health Professional Education
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.
 
Mini_Project
Mini_ProjectMini_Project
Mini_Project
 
1. Introduction to deep learning.pptx
1. Introduction to deep learning.pptx1. Introduction to deep learning.pptx
1. Introduction to deep learning.pptx
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdf
 
Machine Learning : why we should know and how it works
Machine Learning : why we should know and how it worksMachine Learning : why we should know and how it works
Machine Learning : why we should know and how it works
 
Natural-Inspired_Amany_Final.pptx
Natural-Inspired_Amany_Final.pptxNatural-Inspired_Amany_Final.pptx
Natural-Inspired_Amany_Final.pptx
 
Big Data Palooza Talk: Aspects of Semantic Processing
Big Data Palooza Talk: Aspects of Semantic ProcessingBig Data Palooza Talk: Aspects of Semantic Processing
Big Data Palooza Talk: Aspects of Semantic Processing
 
machine learning
machine learningmachine learning
machine learning
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 

ML_Overview.ppt

  • 2. Sources • AAAI. Machine Learning. http://www.aaai.org/Pathfinder/html/machine.html • Dietterich, T. (2003). Machine Learning. Nature Encyclopedia of Cognitive Science. • Doyle, P. Machine Learning. http://www.cs.dartmouth.edu/~brd/Teaching/AI/Lectures/Summaries/lear ning.html • Dyer, C. (2004). Machine Learning. http://www.cs.wisc.edu/~dyer/cs540/notes/learning.html • Mitchell, T. (1997). Machine Learning. • Nilsson, N. (2004). Introduction to Machine Learning. http://robotics.stanford.edu/people/nilsson/mlbook.html • Russell, S. (1997). Machine Learning. Handbook of Perception and Cognition, Vol. 14, Chap. 4. • Russell, S. (2002). Artificial Intelligence: A Modern Approach, Chap. 18- 20. http://aima.cs.berkeley.edu
  • 3. What is Learning? • “Learning denotes changes in a system that ... enable a system to do the same task … more efficiently the next time.” - Herbert Simon • “Learning is constructing or modifying representations of what is being experienced.” - Ryszard Michalski • “Learning is making useful changes in our minds.” - Marvin Minsky “Machine learning refers to a system capable of the autonomous acquisition and integration of knowledge.”
  • 4. Why Machine Learning? • No human experts • industrial/manufacturing control • mass spectrometer analysis, drug design, astronomic discovery • Black-box human expertise • face/handwriting/speech recognition • driving a car, flying a plane • Rapidly changing phenomena • credit scoring, financial modeling • diagnosis, fraud detection • Need for customization/personalization • personalized news reader • movie/book recommendation
  • 5. Related Fields Machine learning is primarily concerned with the accuracy and effectiveness of the computer system. psychological models data mining cognitive science decision theory information theory databases machine learning neuroscience statistics evolutionary models control theory
  • 6. Machine Learning Paradigms • rote learning • learning by being told (advice-taking) • learning from examples (induction) • learning by analogy • speed-up learning • concept learning • clustering • discovery • …
  • 7. Architecture of a Learning System learning element critic problem generator performance element ENVIRONMENT feedback changes learning goals actions percepts performance standard knowledge
  • 8. Learning Element Design affected by: • performance element used • e.g., utility-based agent, reactive agent, logical agent • functional component to be learned • e.g., classifier, evaluation function, perception-action function • representation of functional component • e.g., weighted linear function, logical theory, HMM • feedback available • e.g., correct action, reward, relative preferences
  • 9. Dimensions of Learning Systems • type of feedback • supervised (labeled examples) • unsupervised (unlabeled examples) • reinforcement (reward) • representation • attribute-based (feature vector) • relational (first-order logic) • use of knowledge • empirical (knowledge-free) • analytical (knowledge-guided)
  • 10. Outline • Supervised learning • empirical learning (knowledge-free) • attribute-value representation • logical representation • analytical learning (knowledge-guided) • Reinforcement learning • Unsupervised learning • Performance evaluation • Computational learning theory
  • 11. Inductive (Supervised) Learning Basic Problem: Induce a representation of a function (a systematic relationship between inputs and outputs) from examples. • target function f: X → Y • example (x, f(x)) • hypothesis g: X → Y such that g(x) = f(x) x = set of attribute values (attribute-value representation) x = set of logical sentences (first-order representation) Y = set of discrete labels (classification) Y = ℝ (regression)
  • 12. Decision Trees Should I wait at this restaurant?
  • 13. Decision Tree Induction (Recursively) partition examples according to the most important attribute. Key Concepts • entropy • impurity of a set of examples (entropy = 0 if perfectly homogeneous) • (#bits needed to encode class of an arbitrary example) • information gain • expected reduction in entropy caused by partitioning
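The two key concepts on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the function names and the toy restaurant-style data (attributes `hungry` and `raining`) are assumptions made up for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits; zero for a perfectly homogeneous set."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Expected reduction in entropy caused by partitioning on `attribute`."""
    subsets = {}
    for example, label in zip(examples, labels):
        subsets.setdefault(example[attribute], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical toy data: "hungry" separates the classes perfectly, "raining" does not.
examples = [
    {"hungry": "yes", "raining": "no"},
    {"hungry": "yes", "raining": "yes"},
    {"hungry": "no", "raining": "no"},
    {"hungry": "no", "raining": "yes"},
]
labels = ["wait", "wait", "leave", "leave"]

gain_hungry = information_gain(examples, labels, "hungry")
gain_raining = information_gain(examples, labels, "raining")
```

A tree learner would split on the attribute with the larger gain (here `hungry`) and recurse on each subset.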
  • 14. Decision Tree Induction: Attribute Selection Intuitively: A good attribute splits the examples into subsets that are (ideally) all positive or all negative.
  • 15. Decision Tree Induction: Attribute Selection Intuitively: A good attribute splits the examples into subsets that are (ideally) all positive or all negative.
  • 16. Decision Tree Induction: Decision Boundary
  • 17. Decision Tree Induction: Decision Boundary
  • 18. Decision Tree Induction: Decision Boundary
  • 19. Decision Tree Induction: Decision Boundary
  • 20. (Artificial) Neural Networks • Motivation: human brain • massively parallel (10^11 neurons, ~20 types) • small computational units with simple low-bandwidth communication (10^14 synapses, 1-10 ms cycle time) • Realization: neural network • units (≈ neurons) connected by directed weighted links • activation function from inputs to output
  • 21. Neural Networks (continued) • neural network = parameterized family of nonlinear functions • types • feed-forward (acyclic): single-layer perceptrons, multi-layer networks • recurrent (cyclic): Hopfield networks, Boltzmann machines [connectionism, parallel distributed processing]
  • 22. Neural Network Learning Key Idea: Adjusting the weights changes the function represented by the neural network (learning = optimization in weight space). Iteratively adjust weights to reduce error (difference between network output and target output). • Weight Update • perceptron training rule • linear programming • delta rule • backpropagation
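As one concrete instance of the weight-update methods listed here, the perceptron training rule can be sketched as follows. This is a minimal sketch under stated assumptions: a single threshold unit, a learning rate of 1.0, and the Boolean AND function as hypothetical training data. None of these choices come from the slides.

```python
def train_perceptron(samples, targets, learning_rate=1.0, epochs=20):
    """Perceptron training rule: w <- w + rate * (target - output) * x."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            activation = bias + sum(w * xi for w, xi in zip(weights, x))
            output = 1 if activation > 0 else 0
            error = target - output  # zero when the example is already correct
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

def predict(weights, bias, x):
    """Threshold unit: output 1 when the weighted sum is positive."""
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

# Hypothetical training data: Boolean AND, which is linearly separable.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
```

Each update moves the weights only when the unit misclassifies an example, which is the sense in which learning is a search through weight space.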
  • 23. Neural Network Learning: Decision Boundary single-layer perceptron multi-layer network
  • 24. Support Vector Machines Kernel Trick: Map data to higher-dimensional space where they will be linearly separable. Learning a Classifier • optimal linear separator is one that has the largest margin between positive examples on one side and negative examples on the other • = quadratic programming optimization
  • 25. Support Vector Machines (continued) Key Concept: Training data enters optimization problem in the form of dot products of pairs of points. • support vectors • weights associated with data points are zero except for those points nearest the separator (i.e., the support vectors) • kernel function K(xi,xj) • function that can be applied to pairs of points to evaluate dot products in the corresponding (higher-dimensional) feature space F (without having to directly compute F(x) first) efficient training and complex functions!
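The claim that a kernel evaluates dot products in feature space without computing F(x) can be checked numerically. The sketch below assumes one specific kernel, K(x, z) = (x · z)^2 on 2-D inputs, whose explicit feature map is known; the slides do not name a particular kernel, so this choice is illustrative.

```python
import math

def quadratic_kernel(x, z):
    """K(x, z) = (x . z)^2, computed entirely in the 2-D input space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit 3-D feature map F(x) corresponding to the quadratic kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical pair of points.
x, z = (1.0, 2.0), (3.0, -1.0)

kernel_value = quadratic_kernel(x, z)                  # no feature map needed
explicit_value = dot(feature_map(x), feature_map(z))   # same quantity, computed the long way
```

Both values agree up to floating-point rounding, but the kernel never builds the higher-dimensional vectors.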
  • 26. Support Vector Machines: Decision Boundary Ф
  • 27. Bayesian Networks Network topology reflects direct causal influence. Basic Task: Compute probability distribution for unknown variables given observed values of other variables. [belief networks, causal networks] Conditional probability table for NeighbourCalls (C) given its parents A and B, over the four combinations of parent values: P(C) = 0.9, 0.3, 0.5, 0.1; P(¬C) = 0.1, 0.7, 0.5, 0.9
  • 28. Bayesian Network Learning Key Concepts • nodes (attributes) = random variables • conditional independence • an attribute is conditionally independent of its non-descendants, given its parents • conditional probability table • conditional probability distribution of an attribute given its parents • Bayes' Theorem • P(h|D) = P(D|h)P(h) / P(D)
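A small worked example may help make Bayes' theorem concrete. The sketch below assumes a binary hypothesis, so that P(D) can be expanded as P(D|h)P(h) + P(D|¬h)P(¬h); the probabilities are invented for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not_h):
    """P(h|D) = P(D|h) P(h) / P(D), with P(D) expanded over h and not-h."""
    evidence = likelihood * prior + likelihood_given_not_h * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: P(h) = 0.01, P(D|h) = 0.9, P(D|not h) = 0.1.
p_h_given_d = posterior(prior=0.01, likelihood=0.9, likelihood_given_not_h=0.1)
```

With these numbers the posterior is 0.009 / 0.108, about 0.083: the data raise the probability of h well above its prior, but h remains unlikely.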
  • 29. Bayesian Network Learning (continued) Find most probable hypothesis given the data. In theory: Use posterior probabilities to weight hypotheses. (Bayes optimal classifier) In practice: Use single, maximum a posteriori (most probable) hypothesis. Settings • known structure, fully observable (parameter learning) • unknown structure, fully observable (structural learning) • known structure, hidden variables (EM algorithm) • unknown structure, hidden variables (?)
  • 30. Nearest Neighbor Models Key Idea: Properties of an input x are likely to be similar to those of points in the neighborhood of x. Basic Idea: Find (k) nearest neighbor(s) of x and infer target attribute value(s) of x based on corresponding attribute value(s). Form of non-parametric learning where hypothesis complexity grows with data (learned model = all examples seen so far) [instance-based learning, case-based reasoning, analogical reasoning]
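The basic idea can be sketched as a k-nearest-neighbor classifier. This is an illustrative sketch only: Euclidean distance, majority voting, k = 3, and the two-cluster toy data are all assumptions, not details from the slides.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training set: two well-separated clusters.
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

label_near_a = knn_classify(training, (1.2, 1.4))
label_near_b = knn_classify(training, (6.4, 6.1))
```

Note that there is no training step: the stored examples are the model, which is why hypothesis complexity grows with the data.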
  • 31. Nearest Neighbor Model: Decision Boundary
  • 32. Learning Logical Theories Logical Formulation of Supervised Learning • attribute → unary predicate • instance x → logical sentence • positive/negative classifications → sentences Q(xi), ¬Q(xi) • training set → conjunction of all description and classification sentences Learning Task: Find an equivalent logical expression for the goal predicate Q to classify examples correctly. Hypothesis ∧ Descriptions ⊨ Classifications
  • 33. Learning Logic Theories: Example Input • Father(Philip,Charles), Father(Philip,Anne), … • Mother(Mum,Margaret), Mother(Mum,Elizabeth), … • Married(Diana,Charles), Married(Elizabeth,Philip), … • Male(Philip), Female(Anne), … • Grandparent(Mum,Charles), Grandparent(Elizabeth,Beatrice), ¬Grandparent(Mum,Harry), ¬Grandparent(Spencer,Pete), … Output • Grandparent(x,y) ⇔ [∃z Mother(x,z) ∧ Mother(z,y)] ∨ [∃z Mother(x,z) ∧ Father(z,y)] ∨ [∃z Father(x,z) ∧ Mother(z,y)] ∨ [∃z Father(x,z) ∧ Father(z,y)]
  • 34. Learning Logic Theories Key Concepts • specialization • triggered by false positives (goal: exclude negative examples) • achieved by adding conditions, dropping disjuncts • generalization • triggered by false negatives (goal: include positive examples) • achieved by dropping conditions, adding disjuncts Learning • current-best-hypothesis: incrementally improve single hypothesis (e.g., sequential covering) • least-commitment search: maintain all hypotheses consistent with examples seen so far (e.g., version space)
  • 35. Learning Logic Theories: Decision Boundary
  • 36. Learning Logic Theories: Decision Boundary
  • 37. Learning Logic Theories: Decision Boundary
  • 38. Learning Logic Theories: Decision Boundary
  • 39. Learning Logic Theories: Decision Boundary
  • 40. Analytical Learning Prior Knowledge in Learning Recall: Grandparent(x,y) ⇔ [∃z Mother(x,z) ∧ Mother(z,y)] ∨ [∃z Mother(x,z) ∧ Father(z,y)] ∨ [∃z Father(x,z) ∧ Mother(z,y)] ∨ [∃z Father(x,z) ∧ Father(z,y)] • Suppose initial theory also included: • Parent(x,y) ⇔ [Mother(x,y) ∨ Father(x,y)] • Final Hypothesis: • Grandparent(x,y) ⇔ [∃z Parent(x,z) ∧ Parent(z,y)] Background knowledge can dramatically reduce the size of the hypothesis (greatly simplifying the learning problem).
  • 41. Explanation-Based Learning Amazed crowd of cavemen observe Zog roasting a lizard on the end of a pointed stick (“Look what Zog do!”) and thereafter abandon roasting with their bare hands. Basic Idea: Generalize by explaining observed instance. • form of speedup learning • doesn’t learn anything factually new from the observation • instead converts first-principles theories into useful special-purpose knowledge • utility problem • cost of determining if learned knowledge is applicable may outweigh benefits from its application
  • 42. Relevance-Based Learning Mary travels to Brazil and meets her first Brazilian (Fernando), who speaks Portuguese. She concludes that all Brazilians speak Portuguese but not that all Brazilians are named Fernando. Basic Idea: Use knowledge of what is relevant to infer new properties about a new instance. • form of deductive learning • learns a new general rule that explains observations • does not create knowledge outside logical content of prior knowledge and observations
  • 43. Knowledge-Based Inductive Learning Medical student observes consulting session between doctor and patient at the end of which the doctor prescribes a particular medication. Student concludes that the medication is effective treatment for a particular type of infection. Basic Idea: Use prior knowledge to guide hypothesis generation. • benefits in inductive logic programming • only hypotheses consistent with prior knowledge and observations are considered • prior knowledge supports smaller (simpler) hypotheses
  • 44. Reinforcement Learning k-armed bandit problem: Agent is in a room with k gambling machines (one-armed bandits). When an arm is pulled, the machine pays off 1 or 0, according to some unknown probability distribution. Given a fixed number of pulls, what is the agent’s (optimal) strategy? Basic Task: Find a policy π, mapping states to actions, that maximizes (long-term) reward. Model (Markov Decision Process) • set of states S • set of actions A • reward function R : S × A → ℝ • state transition function T : S × A → Π(S) • T(s,a,s') = probability of reaching s' when a is executed in s
  • 45. Reinforcement Learning (continued) • Settings • fully vs. partially observable environment • deterministic vs. stochastic environment • model-based vs. model-free • rewards in goal state only or in any state value of a state: expected infinite discounted sum of reward the agent will gain if it starts from that state and executes the optimal policy Solving MDP when the model is known • value iteration: find optimal value function (derive optimal policy) • policy iteration: find optimal policy directly (derive value function)
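Value iteration for a known model can be sketched as below. The two-state MDP (states `low` and `high`, actions `stay` and `move`), the discount factor 0.9, and the stopping tolerance are hypothetical choices made for the example; only the algorithm's outline comes from the slide.

```python
def q_value(state, action, values, transition, reward, gamma):
    """One-step lookahead: immediate reward plus discounted expected value."""
    expected = sum(p * values[s2] for s2, p in transition(state, action).items())
    return reward(state, action) + gamma * expected

def value_iteration(states, actions, transition, reward, gamma=0.9, tolerance=1e-6):
    """Find the optimal value function, then derive a greedy policy from it."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, values, transition, reward, gamma) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tolerance:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, values, transition, reward, gamma))
        for s in states
    }
    return values, policy

# Hypothetical deterministic MDP: staying in "high" pays 1, everything else pays 0.
states = ["low", "high"]
actions = ["stay", "move"]

def transition(state, action):
    """T(s, a) as a dict mapping each successor state to its probability."""
    if action == "stay":
        return {state: 1.0}
    return {"high" if state == "low" else "low": 1.0}

def reward(state, action):
    return 1.0 if state == "high" and action == "stay" else 0.0

values, policy = value_iteration(states, actions, transition, reward)
```

In this toy problem the values converge to roughly 10 for `high` and 9 for `low`, and the derived policy moves out of `low` and stays in `high`.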
  • 46. Reinforcement Learning (continued) Reinforcement learning is concerned with finding an optimal policy for an MDP when the model (transition, reward) is unknown. exploration/exploitation tradeoff model-free reinforcement learning • learn a controller without learning a model first • e.g., adaptive heuristic critic (TD(λ)), Q-learning model-based reinforcement learning • learn a model first • e.g., Dyna, prioritized sweeping, RTDP
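Q-learning, one of the model-free methods named here, can be sketched in tabular form. The sketch assumes an epsilon-greedy exploration scheme and a hypothetical two-state environment exposed only through a `step` function, so the learner never sees the transition or reward model directly. The parameter values are illustrative.

```python
import random

def q_learning(states, actions, step, start, steps=5000,
               alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration on a continuing task."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in states for a in actions}
    state = start
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)                        # explore
        else:
            action = max(actions, key=lambda a: q[(state, a)])  # exploit
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in actions)
        # Move the estimate toward the sampled one-step return.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q

# Hypothetical environment: staying in "high" pays 1, everything else pays 0.
states = ["low", "high"]
actions = ["stay", "move"]

def step(state, action):
    """Return (next_state, reward); the learner treats this as a black box."""
    if action == "stay":
        next_state = state
    else:
        next_state = "high" if state == "low" else "low"
    reward = 1.0 if state == "high" and action == "stay" else 0.0
    return next_state, reward

q = q_learning(states, actions, step, start="low")
greedy_policy = {s: max(actions, key=lambda a: q[(s, a)]) for s in states}
```

The `epsilon` parameter is where the exploration/exploitation tradeoff shows up: with no exploration the agent in this toy problem would never leave `low`.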
  • 47. Unsupervised Learning Learn patterns from (unlabeled) data. Approaches • clustering (similarity-based) • density estimation (e.g., EM algorithm) Performance Tasks • understanding and visualization • anomaly detection • information retrieval • data compression
  • 48. Performance Evaluation • Randomly split examples into training set U and test set V. • Use training set to learn a hypothesis H. • Measure % of V correctly classified by H. • Repeat for different random splits and average results.
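The four steps above can be sketched as a repeated random-split evaluation. The learner (1-nearest-neighbor on a single numeric attribute), the 30% test fraction, and the synthetic threshold data are assumptions chosen to keep the example self-contained.

```python
import random

def holdout_accuracy(examples, learn, test_fraction=0.3, repeats=10, seed=0):
    """Average test-set accuracy over several random training/test splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test_set, training_set = shuffled[:n_test], shuffled[n_test:]
        hypothesis = learn(training_set)          # learn H from the training set only
        correct = sum(hypothesis(x) == y for x, y in test_set)
        accuracies.append(correct / len(test_set))
    return sum(accuracies) / len(accuracies)

def learn_one_nearest_neighbor(training_set):
    """Hypothetical learner: predict the label of the closest training example."""
    def hypothesis(x):
        return min(training_set, key=lambda example: abs(example[0] - x))[1]
    return hypothesis

# Hypothetical data: the label is True exactly when x >= 50.
examples = [(x, x >= 50) for x in range(100)]

mean_accuracy = holdout_accuracy(examples, learn_one_nearest_neighbor)
```

Keeping the test set out of the learner's view is the essential point; averaging over splits only reduces the variance of the estimate.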
  • 49. Performance Evaluation: Learning Curves #training examples classification accuracy classification error
  • 50. Performance Evaluation: ROC Curves false positives false negatives
  • 52. Triple Tradeoff in Empirical Learning • size/complexity of learned classifier • amount of training data • generalization accuracy bias-variance tradeoff
  • 53. Computational Learning Theory probably approximately correct (PAC) learning With probability ≥ 1 − δ, error will be ≤ ε. Basic principle: Any hypothesis that is seriously wrong will almost certainly be found out with high probability after a small number of examples. Key Concepts • examples drawn from same distribution (stationarity assumption) • sample complexity is a function of confidence, error, and size of hypothesis space: m ≥ (1/ε)(ln(1/δ) + ln|H|)
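For a finite hypothesis space H, the standard PAC bound m ≥ (1/ε)(ln(1/δ) + ln|H|) is easy to evaluate numerically. The sketch below plugs in hypothetical values (ε = 0.1, δ = 0.05, |H| = 2^10) to show the order of magnitude involved.

```python
import math

def pac_sample_size(epsilon, delta, hypothesis_space_size):
    """Smallest integer m with m >= (1/epsilon) * (ln(1/delta) + ln|H|)."""
    bound = (math.log(1.0 / delta) + math.log(hypothesis_space_size)) / epsilon
    return math.ceil(bound)

# Hypothetical setting: error at most 0.1 with probability at least 0.95, |H| = 2**10.
m = pac_sample_size(epsilon=0.1, delta=0.05, hypothesis_space_size=2 ** 10)
```

Here m comes out to 100. Because |H| enters only through its logarithm, doubling the hypothesis space adds just ln(2)/ε, about 7 examples in this setting.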
  • 54. Current Machine Learning Research • Representation • data sequences • spatial/temporal data • probabilistic relational models • … • Approaches • ensemble methods • cost-sensitive learning • active learning • semi-supervised learning • collective classification • …