Maximum Likelihood Calibration of the Hercules Data Set
Maximum Likelihood Calibration Techniques
Chris Garling
Haverford College
Last updated March 7, 2016
Contents
1 Introduction
2 Construction of the Gaussian Probability Distribution Function
3 Derivation of the Cost Function
3.1 The Foundation
3.2 Introduction of Parameters
3.3 Preparation of Cost Function for Python Optimizer
1 Introduction
In this document, I will put forth the justification and derivation of the equations
used to calibrate the Hercules data set. These observations were obtained in the
SDSS g and r bands, and throughout this document I will name variables after
the g band. Adapt the variables as needed for your project. You may also find
that you need to include spatial parameters in your calibration to account for
pixel variation across the field of view–this was unnecessary for DECam data.
2 Construction of the Gaussian Probability Distribution Function
To use this method, we require stars for which we have both raw instrumental
magnitudes and properly measured, calibrated magnitudes. In this case, the
calibrated magnitudes come from the Sloan Digital Sky Survey (SDSS) and serve
as our data points, and we construct a model magnitude out of our instrumental
magnitude and other parameters. Conceptually, we are using a likelihood
function: by definition, the likelihood L of a set of parameter values α given
a data point x is equal to the probability of measuring x given α. In
mathematical terms,
$$\mathcal{L}(\alpha \mid x) = P(x \mid \alpha) \qquad (1)$$
We have the data point x: for each star, this is the SDSS measured mag-
nitude. The set of parameters α, however, will contain terms that we do not
know. Thus, we seek to find the values of the parameter set α that maximize
the likelihood of obtaining the data point x. In this document I will prepare
equations for a Python optimizer which will do the work of determining the
optimum values of the parameters that make up α.
3 Derivation of the Cost Function
3.1 The Foundation
Because we have an entire set of data, we will use a probability distribution
function to model the spread of our data set. For this calibration, we use a
Gaussian. So now we set up the basic form of our probability function:
$$\mathcal{L} = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
where L is the likelihood function of obtaining the data point x given the
parameters α (µ, σ) where µ is the expected value of the data point, and σ is
the standard deviation of our Gaussian probability distribution function. At
the current state of this equation, we are looking to maximize L . However, we
have a good bit of work to do on this equation before we can write it into an
optimization program. Let’s get into it.
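As a quick illustration of the Gaussian likelihood above, here is a minimal sketch in Python. The function name `gaussian_likelihood` and the numbers are made up for this example and are not part of the calibration code.

```python
import math

def gaussian_likelihood(x, mu, sigma):
    """Gaussian probability density of x for expected value mu and width sigma."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Illustrative values only: a data point at magnitude 20.0 with sigma = 0.05.
at_peak = gaussian_likelihood(20.0, 20.0, 0.05)   # expected value equals the data point
off_peak = gaussian_likelihood(20.0, 20.1, 0.05)  # expected value is off by 0.1 mag
```

The likelihood is largest when the expected value µ coincides with the data point x, which is why we adjust the parameters that build µ.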
3.2 Introduction of Parameters
The next step is to write x, µ, and σ in terms of the parameters and data sets that
we wish to include in the calibration.
1. Our data point, x, we will rewrite as gSDSS,j. This might confuse you,
as the SDSS magnitude is not what we actually measured. However, the
SDSS measured magnitude is the data point because it was measured by
someone, and what we are looking to do is determine values for parameters
that maximize the likelihood of obtaining that data point.
2. µ we will rename to gmod, as this is our “model magnitude”:
$$g_{\mathrm{mod},ijn} = g_{\mathrm{inst},ij} + g_{o,i} + c_g\,(g-r)_{\mathrm{SDSS},j} - k_n x_i$$
• where the subscripts i, j, and n signify that the variable has discrete
values for each exposure, star, or night, respectively,
• ginst is the instrumental magnitude of a star,
• go is the zeropoint offset,
• cg is the color term,
• (g − r)SDSS,j is the color of the star as defined by SDSS magnitudes,
• kn is a positive extinction correction parameter,
• and x is the airmass at the time the exposure was taken.
So as you can see, this model magnitude has three free parameters:
one that corrects for a scalar discrepancy between the measured mag-
nitude and the expected, one that accounts for the color of the star in
question, and one that adjusts for the airmass at the time the expo-
sure was taken. The construction of this model magnitude equation
is critical to the quality of your calibration–take time to decide pre-
cisely which parameters you wish to include.
To elaborate on why we are using SDSS magnitudes as our data
points: the terms of our “expected value” are the ones explained
above, and our measured, raw instrumental magnitude is one of those
terms. gmod is our expected value because the purpose of this
calibration is to obtain the values of the three free parameters that
create an “expected value” that maximizes the likelihood (L) of
obtaining the data point x.
3. And we will go ahead and define
$$\sigma_{ij}^{2} = \sigma_{\mathrm{SDSS},j}^{2} + \sigma_{\mathrm{inst},ij}^{2} + \left[c_g\,(\sigma_{g,j} - \sigma_{r,j})\right]^{2}$$
because we will need it when we begin to simplify our equation, which is
our next step.
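The model magnitude and the combined variance defined in this list can be sketched as two small Python functions. The function names, argument names, and input numbers below are hypothetical and chosen only for illustration; the functions work on plain floats or on NumPy arrays.

```python
def model_magnitude(g_inst, g_o, c_g, color_sdss, k_n, airmass):
    """g_mod = g_inst + g_o + c_g * (g - r)_SDSS - k_n * x."""
    return g_inst + g_o + c_g * color_sdss - k_n * airmass

def total_variance(sig_sdss, sig_inst, c_g, sig_g, sig_r):
    """sigma_ij^2 = sigma_SDSS^2 + sigma_inst^2 + [c_g * (sigma_g - sigma_r)]^2."""
    return sig_sdss ** 2 + sig_inst ** 2 + (c_g * (sig_g - sig_r)) ** 2

# Made-up inputs for a single star in a single exposure.
g_mod = model_magnitude(g_inst=-5.0, g_o=25.0, c_g=0.1,
                        color_sdss=0.5, k_n=0.2, airmass=1.2)
var = total_variance(sig_sdss=0.03, sig_inst=0.04, c_g=0.1,
                     sig_g=0.03, sig_r=0.01)
```

Keeping the two pieces separate makes it easier to add or remove terms from the model magnitude later without touching the error budget.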
3.3 Preparation of Cost Function for Python Optimizer
Now, we will take the negative natural logarithm of our probability equation,
as it will make the equation easier to input to our Python optimizer. We will
call this new function the “cost function” (C).
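$$C = -\ln(\mathcal{L}) = -\ln\left[\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\right] = -\ln\left(\frac{1}{\sqrt{2\pi}\,\sigma}\right) - \ln\left(e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\right)$$
$$C = \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2} - \ln\left(\frac{1}{\sqrt{2\pi}}\right) - \ln\left(\frac{1}{\sigma}\right)$$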
We can neglect constant terms, so:
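$$C = \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2} - \ln\left(\frac{1}{\sigma}\right) = \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2} + \ln(\sigma) = \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2} + \frac{\ln(\sigma^{2})}{2}$$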
And so finally, before putting any of our parameters in, our reduced cost
function (with the constant term dropped) is:
$$C = \frac{1}{2}\left[\frac{(x-\mu)^{2}}{\sigma^{2}} + \ln(\sigma^{2})\right]$$
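One way to convince yourself that dropping the constant is harmless is to compare the full negative log-likelihood with the reduced cost function numerically. The sketch below uses made-up test values and hypothetical function names; the difference should equal ln(√(2π)) no matter which x, µ, and σ are chosen, so both functions have their minimum at the same parameters.

```python
import math

def neg_log_likelihood(x, mu, sigma):
    """Full -ln(L) for a single Gaussian data point."""
    like = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    return -math.log(like)

def reduced_cost(x, mu, sigma):
    """Reduced cost: 0.5 * [ (x - mu)^2 / sigma^2 + ln(sigma^2) ]."""
    return 0.5 * ((x - mu) ** 2 / sigma ** 2 + math.log(sigma ** 2))

# Arbitrary (x, mu, sigma) triples used only for this check.
cases = [(20.0, 20.02, 0.05), (18.5, 18.4, 0.1), (21.0, 21.0, 0.02)]
offsets = [neg_log_likelihood(*c) - reduced_cost(*c) for c in cases]
constant = 0.5 * math.log(2.0 * math.pi)  # ln(sqrt(2*pi)), the dropped term
```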
Next, we will plug gmod, gSDSS, and σ²ij into this equation.
$$C = \frac{1}{2}\left[\frac{(g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})^{2}}{\sigma_{ij}^{2}} + \ln(\sigma_{ij}^{2})\right]$$
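And now, we will plug in all parameters:
$$C = \frac{1}{2}\,\frac{\left[g_{\mathrm{SDSS},j} - \left(g_{\mathrm{inst},ij} + g_{o,i} + (g-r)_{\mathrm{SDSS},j}\,c_g - k_n x_i\right)\right]^{2}}{\sigma_{\mathrm{SDSS},j}^{2} + \sigma_{\mathrm{inst},ij}^{2} + \left[(\sigma_{g,j} - \sigma_{r,j})\,c_g\right]^{2}} + \frac{1}{2}\ln\left(\sigma_{\mathrm{SDSS},j}^{2} + \sigma_{\mathrm{inst},ij}^{2} + \left[(\sigma_{g,j} - \sigma_{r,j})\,c_g\right]^{2}\right)$$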
But we want to evaluate this cost function for all of our data, so we have to
change the equation to account for that.
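$$C = \frac{1}{2}\sum_{i}\sum_{j}\sum_{n}\left\{\frac{\left[g_{\mathrm{SDSS},j} - \left(g_{\mathrm{inst},ij} + g_{o,i} + (g-r)_{\mathrm{SDSS},j}\,c_g - k_n x_i\right)\right]^{2}}{\sigma_{\mathrm{SDSS},j}^{2} + \sigma_{\mathrm{inst},ij}^{2} + \left[(\sigma_{g,j} - \sigma_{r,j})\,c_g\right]^{2}} + \ln\left(\sigma_{\mathrm{SDSS},j}^{2} + \sigma_{\mathrm{inst},ij}^{2} + \left[(\sigma_{g,j} - \sigma_{r,j})\,c_g\right]^{2}\right)\right\}$$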
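The summed cost function might be written with NumPy roughly as follows. This is a sketch under an assumed data layout, not the layout of the actual Hercules catalog: every array holds one entry per measurement (one star in one exposure), and the hypothetical index arrays `exp_idx` and `night_idx` say which exposure and which night each measurement belongs to. All numbers are invented.

```python
import numpy as np

def total_cost(g_o, c_g, k, g_sdss, g_inst, color, airmass,
               sig_sdss, sig_inst, sig_g, sig_r, exp_idx, night_idx):
    """Cost summed over all measurements; g_o has one entry per exposure, k one per night."""
    g_mod = g_inst + g_o[exp_idx] + c_g * color - k[night_idx] * airmass
    var = sig_sdss ** 2 + sig_inst ** 2 + (c_g * (sig_g - sig_r)) ** 2
    return 0.5 * np.sum((g_sdss - g_mod) ** 2 / var + np.log(var))

# Two made-up measurements taken in the same exposure on the same night.
g_sdss = np.array([20.0, 19.0])
g_inst = np.array([-5.0, -6.0])
color = np.array([0.5, 0.3])
airmass = np.array([1.2, 1.2])
sig_sdss = np.array([0.03, 0.03])
sig_inst = np.array([0.04, 0.04])
sig_g = np.array([0.03, 0.03])
sig_r = np.array([0.01, 0.01])
exp_idx = np.array([0, 0])
night_idx = np.array([0, 0])
g_o, c_g, k = np.array([25.0]), 0.1, np.array([0.2])

cost = total_cost(g_o, c_g, k, g_sdss, g_inst, color, airmass,
                  sig_sdss, sig_inst, sig_g, sig_r, exp_idx, night_idx)
```

Using index arrays instead of nested loops over i, j, and n keeps the sum vectorized and copes naturally with stars that do not appear in every exposure.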
We are now ready to take the partial derivatives of the cost function with
respect to each of our free parameters, which we supply to the Python optimizer
along with the cost function. We will use this expanded version of the cost
function only when necessary; otherwise, we will use σ²ij and gmod to keep the
equations neater.
$$\frac{\partial C}{\partial g_{o,i}} = -\sum_{j}\sum_{n}\frac{g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn}}{\sigma_{ij}^{2}}$$
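The derivative ∂C/∂cg is more difficult, so I will take this step by step. We first
use the quotient and chain rules to set up our partial derivative, then we will
solve for all constituents of the function and substitute.
$$\frac{\partial C}{\partial c_g} = \frac{1}{2}\sum_{i}\sum_{j}\sum_{n}\left[\frac{\sigma_{ij}^{2}\,\frac{\partial (g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})^{2}}{\partial c_g} - (g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})^{2}\,\frac{\partial \sigma_{ij}^{2}}{\partial c_g}}{\sigma_{ij}^{4}} + \frac{\frac{\partial \sigma_{ij}^{2}}{\partial c_g}}{\sigma_{ij}^{2}}\right]$$
$$\frac{\partial (g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})^{2}}{\partial c_g} = -2\,(g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})\,(g-r)_{\mathrm{SDSS},j}$$
$$\frac{\partial \sigma_{ij}^{2}}{\partial c_g} = 2c_g\,(\sigma_{g,j} - \sigma_{r,j})^{2}$$
$$\sigma_{ij}^{2} = \sigma_{\mathrm{SDSS},j}^{2} + \sigma_{\mathrm{inst},ij}^{2} + \left[(\sigma_{g,j} - \sigma_{r,j})\,c_g\right]^{2}$$
and substituting,
$$\frac{\partial C}{\partial c_g} = \frac{1}{2}\sum_{i}\sum_{j}\sum_{n}\left[\frac{-2\sigma_{ij}^{2}\,(g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})\,(g-r)_{\mathrm{SDSS},j} - 2c_g\,(g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})^{2}\,(\sigma_{g,j} - \sigma_{r,j})^{2}}{\sigma_{ij}^{4}} + \frac{2c_g\,(\sigma_{g,j} - \sigma_{r,j})^{2}}{\sigma_{ij}^{2}}\right]$$
and finally, reducing,
$$\frac{\partial C}{\partial c_g} = -\sum_{i}\sum_{j}\sum_{n}\left[\frac{c_g\,(g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})^{2}\,(\sigma_{g,j} - \sigma_{r,j})^{2}}{\sigma_{ij}^{4}} + \frac{(g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn})\,(g-r)_{\mathrm{SDSS},j}}{\sigma_{ij}^{2}} - \frac{c_g\,(\sigma_{g,j} - \sigma_{r,j})^{2}}{\sigma_{ij}^{2}}\right]$$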
Next, the partial derivative with respect to kn:
$$\frac{\partial C}{\partial k_n} = \sum_{i}\sum_{j}\frac{g_{\mathrm{SDSS},j} - g_{\mathrm{mod},ijn}}{\sigma_{ij}^{2}}\,x_i$$
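Hand-derived gradients are easy to get wrong, so it can help to compare them against central finite differences before handing them to an optimizer. The sketch below does this on invented data. The dictionary layout, the function names, and all numbers are assumptions made for this example. In the cg derivative, the σ⁴ term carries the squared residual, as the quotient rule gives.

```python
import numpy as np

def cost_and_grad(g_o, c_g, k, d):
    """Return the cost and its analytic partial derivatives for data dictionary d."""
    resid = d["g_sdss"] - (d["g_inst"] + g_o[d["exp"]] + c_g * d["color"]
                           - k[d["night"]] * d["airmass"])
    s = (d["sig_g"] - d["sig_r"]) ** 2
    var = d["sig_sdss"] ** 2 + d["sig_inst"] ** 2 + c_g ** 2 * s
    cost = 0.5 * np.sum(resid ** 2 / var + np.log(var))
    # dC/dg_o,i: minus the sum of resid/var over measurements in exposure i.
    grad_go = -np.bincount(d["exp"], weights=resid / var, minlength=len(g_o))
    # dC/dc_g: note the squared residual in the var**2 (sigma^4) term.
    grad_cg = -np.sum(c_g * resid ** 2 * s / var ** 2
                      + resid * d["color"] / var - c_g * s / var)
    # dC/dk_n: sum of resid * airmass / var over measurements on night n.
    grad_k = np.bincount(d["night"], weights=resid * d["airmass"] / var,
                         minlength=len(k))
    return cost, grad_go, grad_cg, grad_k

# Invented data set: 3 exposures over 2 nights, 40 measurements.
rng = np.random.default_rng(0)
n = 40
exp = rng.integers(0, 3, n)
night_of_exp = np.array([0, 0, 1])
airmass_of_exp = np.array([1.1, 1.4, 1.2])
d = {"exp": exp, "night": night_of_exp[exp], "airmass": airmass_of_exp[exp],
     "color": rng.uniform(0.0, 1.5, n), "g_sdss": rng.uniform(17.0, 21.0, n),
     "sig_sdss": rng.uniform(0.01, 0.03, n), "sig_inst": rng.uniform(0.01, 0.05, n),
     "sig_g": rng.uniform(0.01, 0.03, n), "sig_r": rng.uniform(0.01, 0.03, n)}
d["g_inst"] = (d["g_sdss"] - 25.0 - 0.1 * d["color"] + 0.2 * d["airmass"]
               + rng.normal(0.0, 0.03, n))

def numeric_grad(theta, eps=1e-6):
    """Central finite differences of the cost; theta = [g_o (3), c_g, k (2)]."""
    def f(t):
        return cost_and_grad(t[:3], t[3], t[4:], d)[0]
    out = np.zeros_like(theta)
    for m in range(len(theta)):
        step = np.zeros_like(theta)
        step[m] = eps
        out[m] = (f(theta + step) - f(theta - step)) / (2.0 * eps)
    return out

theta = np.array([24.9, 25.0, 25.1, 0.12, 0.18, 0.22])
_, g_go, g_cg, g_k = cost_and_grad(theta[:3], theta[3], theta[4:], d)
analytic = np.concatenate([g_go, [g_cg], g_k])
numeric = numeric_grad(theta)
max_rel_err = np.max(np.abs(analytic - numeric) / (np.abs(numeric) + 1.0))
```

A small `max_rel_err` indicates that the analytic derivatives agree with the numerical ones at the test point.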
Now all that remains is to input the cost function and its partial derivatives
into an optimizer to find the values of the unknown parameters that maximize
the likelihood, or minimize the cost. In Python, we use scipy.optimize.fmin_l_bfgs_b.
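A minimal sketch of the optimizer call on synthetic data is given below. It is not the actual calibration script: the data layout, the constants `N_EXP` and `N_NIGHT`, the "true" parameter values, and the starting guess are all invented. The cost function returns both the cost and its gradient, which `fmin_l_bfgs_b` accepts when no separate `fprime` is given. In this toy data set each exposure has a single airmass, so only the combination g_o,i − k_n x_i is constrained for each exposure; the check therefore looks at that combination and at cg rather than at g_o,i and k_n separately.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

N_EXP, N_NIGHT = 3, 2
night_of_exp = np.array([0, 0, 1])          # made-up mapping of exposure to night
airmass_of_exp = np.array([1.1, 1.5, 1.3])  # made-up airmass of each exposure

def cost_and_grad(theta, d):
    """theta = [g_o (one per exposure), c_g, k (one per night)]."""
    g_o, c_g, k = theta[:N_EXP], theta[N_EXP], theta[N_EXP + 1:]
    night = night_of_exp[d["exp"]]
    x = airmass_of_exp[d["exp"]]
    resid = d["g_sdss"] - (d["g_inst"] + g_o[d["exp"]] + c_g * d["color"] - k[night] * x)
    s = (d["sig_g"] - d["sig_r"]) ** 2
    var = d["sig_sdss"] ** 2 + d["sig_inst"] ** 2 + c_g ** 2 * s
    cost = 0.5 * np.sum(resid ** 2 / var + np.log(var))
    grad = np.concatenate([
        -np.bincount(d["exp"], weights=resid / var, minlength=N_EXP),
        [-np.sum(c_g * resid ** 2 * s / var ** 2 + resid * d["color"] / var - c_g * s / var)],
        np.bincount(night, weights=resid * x / var, minlength=N_NIGHT),
    ])
    return cost, grad

def effective_zeropoint(theta):
    """Per-exposure combination g_o,i - k_n * x_i that the toy data constrain."""
    return theta[:N_EXP] - theta[N_EXP + 1:][night_of_exp] * airmass_of_exp

# Synthetic catalog generated from known ("true") parameters.
rng = np.random.default_rng(1)
n = 300
truth = np.array([25.0, 25.1, 24.9, 0.10, 0.20, 0.15])
exp = rng.integers(0, N_EXP, n)
d = {"exp": exp, "color": rng.uniform(0.0, 1.5, n),
     "g_sdss": rng.uniform(17.0, 21.0, n),
     "sig_sdss": np.full(n, 0.02), "sig_inst": np.full(n, 0.03),
     "sig_g": np.full(n, 0.02), "sig_r": np.full(n, 0.015)}
noise = rng.normal(0.0, np.sqrt(0.02 ** 2 + 0.03 ** 2), n)
d["g_inst"] = (d["g_sdss"] - truth[exp] - truth[N_EXP] * d["color"]
               + truth[N_EXP + 1 + night_of_exp[exp]] * airmass_of_exp[exp] + noise)

# Minimize the cost starting from a rough guess.
x0 = np.array([24.5, 24.5, 24.5, 0.0, 0.1, 0.1])
theta_fit, cost_fit, info = fmin_l_bfgs_b(cost_and_grad, x0, args=(d,))
```

If the zeropoints and extinction terms need to be separated, the data must break that degeneracy, for example by sharing a zeropoint across exposures taken at different airmasses.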