The Foundations of (Machine) Learning
Stefan Kühn
codecentric AG
CSMLS Meetup Hamburg - February 26th, 2015
Stefan Kühn (codecentric AG) Unconstrained Optimization 25.02.2015 1 / 18
Contents
1 Supervised Machine Learning
2 Optimization Theory
3 Concrete Optimization Methods
1 Supervised Machine Learning
Setting
Supervised Learning Approach
Use labeled training data in order to fit a given model to the data, i.e. to
learn from the given data.
Typical Problems:
  Classification - discrete output
    Logistic Regression
    Neural Networks
    Support Vector Machines
  Regression - continuous output
    Linear Regression
    Support Vector Regression
    Generalized Linear/Additive Models
Training and Learning
Ingredients:
Training Data Set
Model, e.g. Logistic Regression
Error Measure, e.g. Mean Squared Error
Learning Procedure:
Derive objective function from Model and Error Measure
Initialize Model parameters
Find a good fit!
Iterate with other initial parameters
What is Learning in this context?
Learning is nothing but the application of an algorithm for unconstrained
optimization to the given objective function.
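The learning procedure above can be sketched in a few lines. This is a minimal illustration, not the speaker's code: it assumes a one-dimensional linear model y = w*x + b with Mean Squared Error as the error measure, and it uses a plain fixed-step gradient descent as a stand-in for the optimization algorithm. The names `make_mse_objective` and `learn`, the step size, and the toy data are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Params = List[float]


def make_mse_objective(
    xs: Sequence[float], ys: Sequence[float]
) -> Tuple[Callable[[Params], float], Callable[[Params], Params]]:
    """Derive objective and gradient from model (y = w*x + b) and error measure (MSE)."""
    n = len(xs)

    def objective(theta: Params) -> float:
        w, b = theta
        return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

    def gradient(theta: Params) -> Params:
        w, b = theta
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        return [grad_w, grad_b]

    return objective, gradient


def learn(xs, ys, theta0, step=0.05, iterations=2000) -> Params:
    """'Learning' = unconstrained minimization of the objective (fixed-step descent here)."""
    _, gradient = make_mse_objective(xs, ys)
    theta = list(theta0)  # initialize model parameters
    for _ in range(iterations):
        g = gradient(theta)
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
objective, _ = make_mse_objective(xs, ys)
theta = learn(xs, ys, theta0=[0.0, 0.0])
print(theta, objective(theta))  # parameters should end up close to w = 2, b = 1
```

The fixed step size only works here because the toy problem is well scaled; the stepsize rules discussed later in the talk remove that manual choice.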
2 Optimization Theory
Unconstrained Optimization
Higher-order methods
  Newton’s method (fast local convergence)
Gradient-based methods
  Gradient Descent / Steepest Descent (globally convergent)
  Conjugate Gradient (globally convergent)
  Gauss-Newton, Levenberg-Marquardt, Quasi-Newton
  Krylov subspace methods
Derivative-free methods, direct search
  Secant method (locally convergent)
  Regula Falsi and successors (global convergence, typically slow)
  Nelder-Mead / Downhill-Simplex
    unconventional method, creates a moving simplex
    driven by reflection/contraction/expansion of the corner points
    often effective in practice, but convergence to a critical point is not guaranteed in general, even for smooth functions $f \in C^1$ (McKinnon's counterexample)
General Iterative Algorithmic Scheme
Goal: Minimize a given function $f$:
$\min f(x), \quad x \in \mathbb{R}^n$
Iterative Algorithms
Starting from a given point, an iterative algorithm tries to minimize the objective function step by step.
Preparation: $k = 0$
  Initialization: Choose initial points and parameters
Iterate until convergence: $k = 1, 2, 3, \dots$
  Termination criterion: Check optimality of the current iterate
  Descent Direction: Find a reasonable search direction
  Stepsize: Determine the length of the step in the given direction
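The scheme above maps directly onto a small loop with pluggable parts. The following is a sketch under assumptions: the function name `minimize`, the callback signatures, and the test function are illustrative choices and not part of the talk.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def norm(v: Vector) -> float:
    return sum(c * c for c in v) ** 0.5


def minimize(
    grad: Callable[[Vector], Vector],
    direction: Callable[[Vector, Vector], Vector],
    stepsize: Callable[[Vector, Vector], float],
    x0: Vector,
    tol: float = 1e-8,
    maxiter: int = 1000,
) -> Tuple[Vector, int]:
    x = list(x0)            # Initialization: starting point
    iterations = 0          # Preparation: k = 0
    while iterations < maxiter:
        g = grad(x)
        if norm(g) < tol:   # Termination criterion: gradient close to zero
            break
        d = direction(x, g)  # Descent direction
        t = stepsize(x, d)   # Stepsize
        x = [xi + t * di for xi, di in zip(x, d)]
        iterations += 1
    return x, iterations


# Hypothetical test function f(x) = (x0 - 1)^2 + 2 (x1 + 3)^2, given by its gradient.
def grad_f(x: Vector) -> Vector:
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_min, n_iter = minimize(
    grad=grad_f,
    direction=lambda x, g: [-gi for gi in g],  # negative gradient
    stepsize=lambda x, d: 0.2,                 # constant step, fine for this toy problem
    x0=[0.0, 0.0],
)
print(x_min, n_iter)  # x_min should be close to [1, -3]
```

Keeping direction and stepsize as separate callbacks mirrors the structure of the slides: the concrete methods discussed later differ only in how these two pieces are chosen.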
Termination criteria
Critical points $x^*$:
$\nabla f(x^*) = 0$
Gradient: Should converge to zero
$\|\nabla f(x_k)\| < \mathrm{tol}$
Iterates: Distance between $x_k$ and $x_{k+1}$ should converge to zero
$\|x_k - x_{k+1}\| < \mathrm{tol}$
Function Values: Difference between $f(x_k)$ and $f(x_{k+1})$ should converge to zero
$|f(x_k) - f(x_{k+1})| < \mathrm{tol}$
Number of iterations: Terminate after maxiter iterations
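The four criteria can be bundled into one helper that reports which test fired. This is a sketch only: the function name, the order in which the tests are checked, and the use of a single shared tolerance are assumptions made for illustration.

```python
from typing import List, Optional

Vector = List[float]


def norm(v: Vector) -> float:
    return sum(c * c for c in v) ** 0.5


def termination_reason(
    x_old: Vector,
    x_new: Vector,
    f_old: float,
    f_new: float,
    grad_new: Vector,
    k: int,
    tol: float = 1e-8,
    maxiter: int = 1000,
) -> Optional[str]:
    """Return the name of the first criterion that is met, or None to keep iterating."""
    if norm(grad_new) < tol:                                # gradient close to zero
        return "gradient"
    if norm([a - b for a, b in zip(x_old, x_new)]) < tol:   # iterates stopped moving
        return "iterates"
    if abs(f_old - f_new) < tol:                            # function values stagnate
        return "function values"
    if k >= maxiter:                                        # iteration budget exhausted
        return "maxiter"
    return None


print(termination_reason([0.0], [1.0], 5.0, 2.0, [3.0], k=1))
```

In practice the three tolerance-based tests are often given separate tolerances, since gradients, iterates, and function values can live on very different scales.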
Descent direction
Geometric interpretation
$d$ is a descent direction if and only if the angle $\alpha$ between the gradient $\nabla f(x)$ and $d$ lies in a certain range:
$\frac{\pi}{2} = 90^\circ < \alpha < 270^\circ = \frac{3\pi}{2}$
Algebraic equivalent
The sign of the scalar product between two vectors $a$ and $b$ is determined by the cosine of the angle $\alpha$ between $a$ and $b$:
$\langle a, b \rangle = a^T b = \|a\| \, \|b\| \cos \alpha(a, b)$
$d$ is a descent direction if and only if:
$d^T \nabla f(x) < 0$
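Both views of the descent condition are easy to check numerically. The sketch below uses hypothetical helper names; note that `math.acos` returns the unoriented angle in [0°, 180°], so the geometric condition shows up as an angle strictly larger than 90°.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(v: Vector) -> float:
    return math.sqrt(dot(v, v))


def is_descent_direction(d: Vector, grad: Vector) -> bool:
    """Algebraic test: the scalar product of d and the gradient is negative."""
    return dot(d, grad) < 0.0


def angle_degrees(a: Vector, b: Vector) -> float:
    """Unoriented angle between a and b in degrees (range 0..180)."""
    cos_alpha = dot(a, b) / (norm(a) * norm(b))
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_alpha))


grad = [1.0, 0.0]  # hypothetical gradient at the current point
for d in ([-1.0, 0.0], [-1.0, 1.0], [0.0, 1.0], [1.0, 1.0]):
    print(d, angle_degrees(d, grad), is_descent_direction(d, grad))
```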
Stepsize
Armijo’s rule
Takes two parameters $0 < \sigma < 1$ and $0 < \rho < 0.5$
For $\ell = 0, 1, 2, \dots$ test the Armijo condition:
$f(p + \sigma^\ell d) < f(p) + \rho \, \sigma^\ell \, d^T \nabla f(p)$
Accepted stepsize
The first $\ell$ that passes this test determines the accepted stepsize
$t = \sigma^\ell$
Standard Armijo implies that the accepted stepsize always satisfies $t \le 1$, which is only semi-efficient.
Technical detail: Widening
Test whether some $t > 1$ satisfies the Armijo condition, i.e. check $\ell = -1, -2, \dots$ as well; this ensures efficiency.
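A possible implementation of Armijo's rule with optional widening is sketched below. It rests on assumptions: the function name, the default values σ = 0.5 and ρ = 0.25, the cap on the number of tests, and in particular the widening policy (keep enlarging the step while the Armijo condition still holds) are one reasonable reading of the slide, not the speaker's code.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def armijo_stepsize(
    f: Callable[[Vector], float],
    grad_p: Vector,
    p: Vector,
    d: Vector,
    sigma: float = 0.5,
    rho: float = 0.25,
    widen: bool = True,
    max_tests: int = 50,
) -> float:
    slope = dot(d, grad_p)  # directional derivative, must be negative
    if slope >= 0.0:
        raise ValueError("d is not a descent direction")
    f_p = f(p)

    def passes(exponent: int) -> bool:
        """Armijo condition for the trial stepsize t = sigma ** exponent."""
        t = sigma ** exponent
        trial = [pi + t * di for pi, di in zip(p, d)]
        return f(trial) < f_p + rho * t * slope

    exponent = 0
    while not passes(exponent):  # standard rule: exponents 0, 1, 2, ...
        exponent += 1
        if exponent > max_tests:
            raise RuntimeError("no admissible stepsize found")
    if widen and exponent == 0:  # widening: also try exponents -1, -2, ...
        while exponent > -max_tests and passes(exponent - 1):
            exponent -= 1
    return sigma ** exponent


def square(x: Vector) -> float:
    return x[0] ** 2  # hypothetical 1-D test function f(x) = x^2


print(armijo_stepsize(square, [2.0], [1.0], [-2.0]))               # full gradient step
print(armijo_stepsize(square, [2.0], [1.0], [-0.2]))               # short direction, widened
print(armijo_stepsize(square, [2.0], [1.0], [-0.2], widen=False))  # same without widening
```

The second and third calls show why widening matters: when the search direction is badly scaled (too short), the standard rule can never return more than t = 1.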
3 Concrete Optimization Methods
Gradient Descent
Descent direction
Direction of Steepest Descent, the negative gradient:
$d = -\nabla f(x)$
Motivation:
  corresponds to $\alpha = 180^\circ = \pi$
  obvious choice, always a descent direction, no test needed
  guarantees the quickest win locally
  works with inexact line search, e.g. Armijo’s rule
  works for functions $f \in C^1$
  always solves the auxiliary optimization problem (up to normalization)
  $\min s^T \nabla f(x), \quad s \in \mathbb{R}^n, \ \|s\| = 1$
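Gradient descent with an inexact line search can be sketched as follows. Assumptions: the backtracking routine is a plain Armijo rule without widening, the parameter defaults (σ = 0.5, ρ = 0.25) and the quadratic test function are illustrative, and all names are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(v: Vector) -> float:
    return dot(v, v) ** 0.5


def backtracking(f, x: Vector, d: Vector, g: Vector,
                 sigma: float = 0.5, rho: float = 0.25, max_tests: int = 60) -> float:
    """Armijo backtracking: shrink t until sufficient decrease is achieved."""
    f_x, slope, t = f(x), dot(d, g), 1.0
    for _ in range(max_tests):
        trial = [xi + t * di for xi, di in zip(x, d)]
        if f(trial) < f_x + rho * t * slope:
            return t
        t *= sigma
    return t


def gradient_descent(
    f: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    x0: Vector,
    tol: float = 1e-8,
    maxiter: int = 10000,
) -> Tuple[Vector, int]:
    x = list(x0)
    for k in range(maxiter):
        g = grad(x)
        if norm(g) < tol:          # termination: gradient close to zero
            return x, k
        d = [-gi for gi in g]      # steepest descent direction
        t = backtracking(f, x, d, g)
        x = [xi + t * di for xi, di in zip(x, d)]
    return x, maxiter


# Hypothetical test problem: f(x) = (x0 - 1)^2 + 10 (x1 + 2)^2, minimum at (1, -2).
def f(x: Vector) -> float:
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad_f(x: Vector) -> Vector:
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min, n_iter = gradient_descent(f, grad_f, [0.0, 0.0])
print(x_min, n_iter)
```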
Conjugate Gradient
Motivation: Quadratic Model Problem, minimize
$f(x) = \|Ax - b\|^2$
Optimality condition:
$\nabla f(x^*) = 2A^T (Ax^* - b) = 0$
Obvious approach: Solve system of linear equations
$A^T A x = A^T b$
Descent direction
Consecutive directions $d_i, \dots, d_{i+k}$ satisfy certain orthogonality or conjugacy conditions, with $M = A^T A$ symmetric positive definite:
$d_i^T M d_j = 0, \quad i \ne j$
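For the quadratic model problem, the linear conjugate gradient method applied to the normal equations makes the conjugacy condition tangible. The sketch below is a textbook formulation in pure Python, not code from the talk; the small matrix A, the vector b, and all function names are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def matvec(M: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in M]


def conjugate_gradient(M: Matrix, c: Vector, x0: Vector,
                       tol: float = 1e-12) -> Tuple[Vector, List[Vector]]:
    """Solve M x = c for symmetric positive definite M; also return the directions used."""
    x = list(x0)
    r = [ci - mi for ci, mi in zip(c, matvec(M, x))]  # residual c - M x
    d = list(r)
    directions: List[Vector] = []
    for _ in range(len(c)):  # at most n steps in exact arithmetic
        rr = dot(r, r)
        if rr ** 0.5 < tol:
            break
        Md = matvec(M, d)
        directions.append(d)
        t = rr / dot(d, Md)  # exact line search for a quadratic
        x = [xi + t * di for xi, di in zip(x, d)]
        r = [ri - t * mdi for ri, mdi in zip(r, Md)]
        beta = dot(r, r) / rr
        d = [ri + beta * di for ri, di in zip(r, d)]
    return x, directions


# Hypothetical least-squares problem: minimize ||A x - b||^2.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
At = [list(col) for col in zip(*A)]
M = [[dot(ri, rj) for rj in At] for ri in At]  # M = A^T A
c = [dot(ri, b) for ri in At]                  # c = A^T b

x, directions = conjugate_gradient(M, c, [0.0, 0.0])
conjugacy = dot(directions[0], matvec(M, directions[1]))
print(x, conjugacy)  # x close to [1, 2], conjugacy close to 0
```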
Nonlinear Conjugate Gradient
Initial Steps:
  start at point $x_0$ with $d_0 = -\nabla f(x_0)$
  perform exact line search, find
  $t_0 = \arg\min_{t > 0} f(x_0 + t d_0)$
  set $x_1 = x_0 + t_0 d_0$.
Iteration:
  set $\Delta_k = -\nabla f(x_k)$
  compute $\beta_k$ via one of the available formulas (next slide)
  update conjugate search direction $d_k = \Delta_k + \beta_k d_{k-1}$
  perform exact line search, find
  $t_k = \arg\min_{t > 0} f(x_k + t d_k)$
  set $x_{k+1} = x_k + t_k d_k$
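The iteration can be sketched as below. Several assumptions are made: the exact line search is approximated numerically by a bracketing step followed by golden-section search (which presumes the function is unimodal along the search direction), the β rule is passed in as a function, a reset to the negative gradient is added as a safeguard whenever the new direction is not a descent direction, and the example uses a Polak-Ribière rule clipped at zero. All names and the test function are hypothetical.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def line_search(phi: Callable[[float], float], t_init: float = 1.0) -> float:
    """Approximate argmin of phi(t), t > 0: bracket by doubling, then golden section."""
    hi = t_init
    for _ in range(60):
        if phi(2.0 * hi) >= phi(hi):
            break
        hi *= 2.0
    a, b = 0.0, 2.0 * hi
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c = b - ratio * (b - a)
        e = a + ratio * (b - a)
        if phi(c) < phi(e):
            b = e
        else:
            a = c
    return 0.5 * (a + b)


def nonlinear_cg(f, grad, x0: Vector, beta_rule,
                 tol: float = 1e-6, maxiter: int = 200) -> Tuple[Vector, int]:
    x = list(x0)
    delta = [-g for g in grad(x)]  # Delta_0 = -grad f(x_0)
    d = list(delta)                # d_0 = Delta_0
    for k in range(maxiter):
        if math.sqrt(dot(delta, delta)) < tol:
            return x, k
        t = line_search(lambda s: f([xi + s * di for xi, di in zip(x, d)]))
        x = [xi + t * di for xi, di in zip(x, d)]
        delta_new = [-g for g in grad(x)]
        beta = beta_rule(delta_new, delta)
        d = [dn + beta * di for dn, di in zip(delta_new, d)]
        if dot(d, delta_new) <= 0.0:  # safeguard: not a descent direction, reset
            d = list(delta_new)
        delta = delta_new
    return x, maxiter


def beta_pr_plus(delta_new: Vector, delta_old: Vector) -> float:
    diff = [a - b for a, b in zip(delta_new, delta_old)]
    return max(0.0, dot(delta_new, diff) / dot(delta_old, delta_old))


# Hypothetical test problem: f(x) = (x0 - 1)^2 + 10 (x1 + 2)^2, minimum at (1, -2).
def f(x: Vector) -> float:
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad_f(x: Vector) -> Vector:
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min, n_iter = nonlinear_cg(f, grad_f, [0.0, 0.0], beta_pr_plus)
print(x_min, n_iter)
```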
Nonlinear Conjugate Gradient
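Formulas for $\beta_k$:
Fletcher-Reeves
$\beta_k^{FR} = \frac{\Delta_k^T \Delta_k}{\Delta_{k-1}^T \Delta_{k-1}}$
Polak-Ribière
$\beta_k^{PR} = \frac{\Delta_k^T (\Delta_k - \Delta_{k-1})}{\Delta_{k-1}^T \Delta_{k-1}}$
Hestenes-Stiefel
$\beta_k^{HS} = -\frac{\Delta_k^T (\Delta_k - \Delta_{k-1})}{s_{k-1}^T (\Delta_k - \Delta_{k-1})}$
Dai-Yuan
$\beta_k^{DY} = -\frac{\Delta_k^T \Delta_k}{s_{k-1}^T (\Delta_k - \Delta_{k-1})}$
Here $s_{k-1}$ denotes the previous search direction $d_{k-1}$.
Reasonable choice with automatic direction reset:
$\beta = \max\{0, \beta^{PR}\}$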
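The four formulas translate almost literally into code. In the sketch below the function names are hypothetical, `delta_k` and `delta_prev` stand for the negative gradients at the current and previous iterate, and `s_prev` is the previous search direction. The example vectors are chosen so that the new gradient is orthogonal to both the previous gradient and the previous direction, as it would be after an exact line search on the first step; in that situation all four formulas give the same value.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def beta_fr(delta_k: Vector, delta_prev: Vector) -> float:
    """Fletcher-Reeves."""
    return dot(delta_k, delta_k) / dot(delta_prev, delta_prev)


def beta_pr(delta_k: Vector, delta_prev: Vector) -> float:
    """Polak-Ribiere."""
    return dot(delta_k, sub(delta_k, delta_prev)) / dot(delta_prev, delta_prev)


def beta_hs(delta_k: Vector, delta_prev: Vector, s_prev: Vector) -> float:
    """Hestenes-Stiefel."""
    diff = sub(delta_k, delta_prev)
    return -dot(delta_k, diff) / dot(s_prev, diff)


def beta_dy(delta_k: Vector, delta_prev: Vector, s_prev: Vector) -> float:
    """Dai-Yuan."""
    return -dot(delta_k, delta_k) / dot(s_prev, sub(delta_k, delta_prev))


def beta_pr_plus(delta_k: Vector, delta_prev: Vector) -> float:
    """Polak-Ribiere with automatic reset: beta = max(0, beta_PR)."""
    return max(0.0, beta_pr(delta_k, delta_prev))


# Hypothetical vectors mimicking the first CG step (s_prev equals delta_prev).
delta_prev = [0.0, 2.0]
s_prev = [0.0, 2.0]
delta_k = [1.0, 0.0]

print(beta_fr(delta_k, delta_prev), beta_pr(delta_k, delta_prev),
      beta_hs(delta_k, delta_prev, s_prev), beta_dy(delta_k, delta_prev, s_prev))
# A negative Polak-Ribiere value is clipped to zero, which resets to steepest descent.
print(beta_pr_plus([1.0, 1.0], [2.0, 2.0]))
```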
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 

SKuehn_MachineLearningAndOptimization_2015

  • 1. The Foundations of (Machine) Learning Stefan Kühn codecentric AG CSMLS Meetup Hamburg - February 26th, 2015 Stefan Kühn (codecentric AG) Unconstrained Optimization 25.02.2015 1 / 18
  • 2. Contents: 1 Supervised Machine Learning; 2 Optimization Theory; 3 Concrete Optimization Methods.
  • 3. 1 Supervised Machine Learning; 2 Optimization Theory; 3 Concrete Optimization Methods.
  • 4. Setting. Supervised Learning Approach: Use labeled training data in order to fit a given model to the data, i.e. to learn from the given data. Typical Problems: Classification (discrete output): Logistic Regression, Neural Networks, Support Vector Machines. Regression (continuous output): Linear Regression, Support Vector Regression, Generalized Linear/Additive Models.
  • 5. Training and Learning. Ingredients: Training Data Set; Model, e.g. Logistic Regression; Error Measure, e.g. Mean Squared Error. Learning Procedure: derive the objective function from Model and Error Measure; initialize the Model parameters; find a good fit; iterate with other initial parameters. What is Learning in this context? Learning is nothing but the application of an algorithm for unconstrained optimization to the given objective function.
  • 6. 1 Supervised Machine Learning; 2 Optimization Theory; 3 Concrete Optimization Methods.
  • 7. Unconstrained Optimization. Higher-order methods: Newton's method (fast local convergence). Gradient-based methods: Gradient Descent / Steepest Descent (globally convergent); Conjugate Gradient (globally convergent); Gauß-Newton, Levenberg-Marquardt, Quasi-Newton; Krylov subspace methods. Derivative-free methods, direct search: Secant method (locally convergent); Regula Falsi and successors (global convergence, typically slow); Nelder-Mead / Downhill-Simplex (unconventional method, creates a moving simplex driven by reflection/contraction/expansion of the corner points; globally convergent for differentiable functions $f \in C^1$).
  • 8. General Iterative Algorithmic Scheme. Goal: Minimize a given function $f$: $\min f(x),\ x \in \mathbb{R}^n$. Iterative Algorithms: Starting from a given point, an iterative algorithm tries to minimize the objective function step by step. Preparation: $k = 0$. Initialization: choose initial points and parameters. Iterate until convergence, $k = 1, 2, 3, \dots$: Termination criterion: check optimality of the current iterate. Descent direction: find a reasonable search direction. Stepsize: determine the length of the step in the given direction.
  • 9. Termination criteria. Critical points $x^*$: $\nabla f(x^*) = 0$. Gradient: should converge to zero, $\|\nabla f(x_k)\| < \text{tol}$. Iterates: the distance between $x_k$ and $x_{k+1}$ should converge to zero, $\|x_k - x_{k+1}\| < \text{tol}$. Function values: the difference between $f(x_k)$ and $f(x_{k+1})$ should converge to zero, $|f(x_k) - f(x_{k+1})| < \text{tol}$. Number of iterations: terminate after maxiter iterations.
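A minimal sketch of how the four criteria on this slide might be combined into one check. The function name `termination_reason`, the default values of `tol` and `maxiter`, and the order in which the tests are tried are assumptions made for illustration; the slides do not prescribe them.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def termination_reason(grad, x_old, x_new, f_old, f_new, k,
                       tol=1e-8, maxiter=1000):
    """Return the name of the first satisfied criterion, or None to continue.

    grad is the gradient at x_new; k is the current iteration count.
    """
    if norm(grad) < tol:                                    # gradient test
        return "gradient"
    if norm([a - b for a, b in zip(x_old, x_new)]) < tol:   # iterate test
        return "iterates"
    if abs(f_old - f_new) < tol:                            # function-value test
        return "function values"
    if k >= maxiter:                                        # iteration budget
        return "maxiter"
    return None
```

In practice the gradient test is the one tied directly to the critical-point condition; the other three mainly guard against stagnation and runaway loops.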
  • 10. Descent direction
  • 11. Descent direction. Geometric interpretation: $d$ is a descent direction if and only if the angle $\alpha$ between the gradient $\nabla f(x)$ and $d$ is in a certain range: $\frac{\pi}{2} = 90^\circ < \alpha < 270^\circ = \frac{3\pi}{2}$. Algebraic equivalent: The sign of the scalar product between two vectors $a$ and $b$ is determined by the cosine of the angle $\alpha$ between $a$ and $b$: $\langle a, b \rangle = a^T b = \|a\| \, \|b\| \cos \alpha(a, b)$. $d$ is a descent direction if and only if: $d^T \nabla f(x) < 0$.
  • 12. Stepsize
  • 13. Stepsize. Armijo's rule: takes two parameters $0 < \sigma < 1$ and $0 < \rho < 0.5$. For $\ell = 0, 1, 2, \dots$ test the Armijo condition: $f(p + \sigma^\ell d) < f(p) + \rho \sigma^\ell d^T \nabla f(p)$. Accepted stepsize: the first $\ell$ that passes this test determines the accepted stepsize $t = \sigma^\ell$. Standard Armijo implies that the accepted stepsize always satisfies $t \le 1$, which is only semi-efficient. Technical detail, widening: test whether some $t > 1$ satisfies the Armijo condition, i.e. check $\ell = -1, -2, \dots$ as well; this ensures efficiency.
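A minimal sketch of the standard Armijo backtracking loop described on this slide, without the widening step. The function name `armijo_stepsize`, the defaults `sigma=0.5` and `rho=0.1`, and the cap `max_ell` are assumptions chosen for illustration.

```python
def dot(a, b):
    """Scalar product a^T b of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def armijo_stepsize(f, grad_p, p, d, sigma=0.5, rho=0.1, max_ell=60):
    """Return t = sigma**ell for the first ell = 0, 1, 2, ... that satisfies
    f(p + t d) < f(p) + rho * t * d^T grad f(p)."""
    slope = dot(d, grad_p)   # d^T grad f(p), negative for a descent direction
    f_p = f(p)
    for ell in range(max_ell + 1):
        t = sigma ** ell
        trial = [pi + t * di for pi, di in zip(p, d)]
        if f(trial) < f_p + rho * t * slope:
            return t
    raise RuntimeError("no stepsize satisfied the Armijo condition")


# Illustration: f(x) = x^2 at p = 1 with the steepest-descent direction d = -2.
f = lambda x: x[0] ** 2
t = armijo_stepsize(f, grad_p=[2.0], p=[1.0], d=[-2.0])
print(t)  # 0.5: the full step t = 1 overshoots to x = -1 and is rejected
```

Because the loop starts at $\ell = 0$, the returned stepsize can never exceed 1, which is exactly the limitation the widening remark addresses.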
  • 14. 1 Supervised Machine Learning; 2 Optimization Theory; 3 Concrete Optimization Methods.
  • 15. Gradient Descent. Descent direction: direction of steepest descent, the negative gradient: $d = -\nabla f(x)$. Motivation: corresponds to $\alpha = 180^\circ = \pi$; obvious choice, always a descent direction, no test needed; guarantees the quickest win locally; works with inexact line search, e.g. Armijo's rule; works for functions $f \in C^1$; always solves the auxiliary optimization problem $\min s^T \nabla f(x),\ s \in \mathbb{R}^n,\ \|s\| = 1$.
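Putting the pieces together, the sketch below runs gradient descent with an Armijo backtracking line search and a gradient-norm termination test. The function name `gradient_descent`, all default parameter values, and the quadratic test function are assumptions for illustration only.

```python
import math


def dot(a, b):
    """Scalar product a^T b of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def gradient_descent(f, grad, x0, tol=1e-8, maxiter=10000,
                     sigma=0.5, rho=0.1, max_ell=60):
    """Minimize f from x0; returns (x, number of iterations used)."""
    x = list(x0)
    for k in range(maxiter):
        g = grad(x)
        if math.sqrt(dot(g, g)) < tol:      # termination: gradient norm
            return x, k
        d = [-gi for gi in g]               # steepest-descent direction
        slope = dot(d, g)                   # d^T grad f(x) = -|g|^2 < 0
        f_x = f(x)
        t = 1.0
        for _ in range(max_ell):            # Armijo backtracking
            trial = [xi + t * di for xi, di in zip(x, d)]
            if f(trial) < f_x + rho * t * slope:
                break
            t *= sigma
        x = [xi + t * di for xi, di in zip(x, d)]
    return x, maxiter


# Hypothetical test problem with minimizer (1, -0.5).
f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]
x_min, iters = gradient_descent(f, grad, [0.0, 0.0])
print(x_min, iters)
```

No descent test is needed inside the loop because the negative gradient always satisfies $d^T \nabla f(x) < 0$ away from critical points.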
  • 16. Conjugate Gradient. Motivation: quadratic model problem, minimize $f(x) = \|Ax - b\|^2$. Optimality condition: $\nabla f(x^*) = 2A^T(Ax^* - b) = 0$. Obvious approach: solve the system of linear equations $A^T A x = A^T b$. Descent direction: consecutive directions $d_i, \dots, d_{i+k}$ satisfy certain orthogonality or conjugacy conditions, with $M = A^T A$ symmetric positive definite: $d_i^T M d_j = 0,\ i \ne j$.
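As a sketch of the idea on this slide, the code below forms the normal equations $A^T A x = A^T b$ and solves them with the textbook linear conjugate gradient iteration, recording the search directions so that the conjugacy condition can be checked. The helper names (`normal_equations`, `conjugate_gradient`) and the small example matrix are assumptions for illustration; forming $A^T A$ explicitly is done here only for clarity.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, v) for row in M]


def normal_equations(A, b):
    """Return M = A^T A and c = A^T b."""
    n = len(A[0])
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    c = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    return M, c


def conjugate_gradient(M, c, x0, tol=1e-12):
    """Solve M x = c for symmetric positive definite M.

    Returns the solution and the list of search directions used."""
    x = list(x0)
    r = [ci - mi for ci, mi in zip(c, mat_vec(M, x))]   # residual c - M x
    d = list(r)
    directions = []
    for _ in range(len(c)):          # at most n steps in exact arithmetic
        rr = dot(r, r)
        if math.sqrt(rr) < tol:
            break
        Md = mat_vec(M, d)
        t = rr / dot(d, Md)          # exact minimizer along d
        x = [xi + t * di for xi, di in zip(x, d)]
        r = [ri - t * mdi for ri, mdi in zip(r, Md)]
        directions.append(d)
        beta = dot(r, r) / rr
        d = [ri + beta * di for ri, di in zip(r, d)]
    return x, directions


A = [[2.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
b = [1.0, 2.0, 3.0]
M, c = normal_equations(A, b)
x, dirs = conjugate_gradient(M, c, [0.0, 0.0])
print(x)                                   # least-squares solution
print(dot(dirs[0], mat_vec(M, dirs[1])))   # d_0^T M d_1, close to zero
```

For this example the normal equations are $5x_1 + x_2 = 4$ and $x_1 + 2x_2 = 5$, so the solution is $(1/3,\ 7/3)$ and is reached after two steps up to rounding.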
  • 17. Nonlinear Conjugate Gradient. Initial steps: start at point $x_0$ with $d_0 = -\nabla f(x_0)$; perform exact line search, find $t_0 = \arg\min f(x_0 + t d_0),\ t > 0$; set $x_1 = x_0 + t_0 d_0$. Iteration: set $\Delta_k = -\nabla f(x_k)$; compute $\beta_k$ via one of the available formulas (next slide); update the conjugate search direction $d_k = \Delta_k + \beta_k d_{k-1}$; perform exact line search, find $t_k = \arg\min f(x_k + t d_k),\ t > 0$; set $x_{k+1} = x_k + t_k d_k$.
  • 18. Nonlinear Conjugate Gradient. Formulas for $\beta_k$: Fletcher-Reeves: $\beta_k^{FR} = \frac{\Delta_k^T \Delta_k}{\Delta_{k-1}^T \Delta_{k-1}}$. Polak-Ribière: $\beta_k^{PR} = \frac{\Delta_k^T (\Delta_k - \Delta_{k-1})}{\Delta_{k-1}^T \Delta_{k-1}}$. Hestenes-Stiefel: $\beta_k^{HS} = -\frac{\Delta_k^T (\Delta_k - \Delta_{k-1})}{s_{k-1}^T (\Delta_k - \Delta_{k-1})}$. Dai-Yuan: $\beta_k^{DY} = -\frac{\Delta_k^T \Delta_k}{s_{k-1}^T (\Delta_k - \Delta_{k-1})}$. Reasonable choice with automatic direction reset: $\beta = \max\{0, \beta^{PR}\}$.
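The iteration from the last two slides can be sketched as follows, using the Polak-Ribière formula with the reset $\beta = \max\{0, \beta^{PR}\}$. Two deliberate deviations from the slides are assumptions of this sketch: the exact line search is replaced by Armijo backtracking, which is simpler to write for a general $f$, and the direction falls back to the negative gradient whenever the updated $d_k$ fails the descent test. The names `nonlinear_cg` and `armijo` and all default values are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def armijo(f, x, d, slope, sigma=0.5, rho=0.1, max_ell=60):
    """Backtracking stepsize t = sigma**ell satisfying the Armijo condition."""
    f_x = f(x)
    for ell in range(max_ell + 1):
        t = sigma ** ell
        trial = [xi + t * di for xi, di in zip(x, d)]
        if f(trial) < f_x + rho * t * slope:
            return t
    raise RuntimeError("no stepsize satisfied the Armijo condition")


def nonlinear_cg(f, grad, x0, tol=1e-8, maxiter=1000):
    x = list(x0)
    delta = [-g for g in grad(x)]          # Delta_0 = -grad f(x_0)
    d = list(delta)                        # d_0 = Delta_0
    for _ in range(maxiter):
        if math.sqrt(dot(delta, delta)) < tol:
            break
        # slope d^T grad f(x) equals -d^T Delta
        t = armijo(f, x, d, -dot(d, delta))
        x = [xi + t * di for xi, di in zip(x, d)]
        delta_new = [-g for g in grad(x)]
        diff = [a - b for a, b in zip(delta_new, delta)]
        # Polak-Ribiere with automatic reset: beta = max(0, beta_PR)
        beta = max(0.0, dot(delta_new, diff) / dot(delta, delta))
        d = [dn + beta * di for dn, di in zip(delta_new, d)]
        if dot(d, delta_new) <= 0.0:       # not a descent direction: restart
            d = list(delta_new)
        delta = delta_new
    return x


# Hypothetical test problem with minimizer (1, -0.5).
f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]
print(nonlinear_cg(f, grad, [0.0, 0.0]))
```

With an inexact line search the conjugacy of successive directions is no longer guaranteed, which is why the descent test and restart are included here as a safeguard.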