2. Subsequently, after we have made the measurement, we compute
a new a posteriori estimate by fusing the a priori prediction with
this latest measurement.
In our examples we will assume trivial dynamics: i.e., our prediction of
the next state is the current state. Of course, when your system dynamics
are more complex, the prediction in step 1 must reflect this complexity, but
this does not change step 2, i.e., the procedure of assimilating/fusing a
prediction with a measurement.
For more in-depth treatment of the Kalman filter, see the texts of Brown
and Hwang (1997), Maybeck (1982), and, especially for neuroscience
applications, Schiff (2012). The website from Welch and Bishop listed in
the References is also very useful.
19.2 INTRODUCTORY TERMINOLOGY
This section briefly summarizes the most important terminology
required to develop the Kalman filter procedure in Section 19.3.
19.2.1 Normal or Gaussian Distribution
Although Kalman filter versions that deal with non-Gaussian noise
processes exist, the noise components in the Kalman filter approach
described in this chapter are Gaussian white noise terms with zero mean.
Recall that the probability density function (PDF) of the normal or
Gaussian distribution is
f(x) = [1/√(2πs²)] e^(−(x − m)²/(2s²))  (19.1)

where m = mean (zero in our case), and s = standard deviation.
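For reference, Eq. (19.1) can be evaluated directly; a minimal Python sketch (the function name gaussian_pdf and the test values are ours, not from the chapter):

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the normal PDF of Eq. (19.1) at x."""
    return (1.0 / math.sqrt(2.0 * math.pi * sigma**2)) * \
        math.exp(-(x - mu)**2 / (2.0 * sigma**2))

# At the mean of a standard normal, the density is 1/sqrt(2*pi) ~ 0.3989
peak = gaussian_pdf(0.0)
```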
19.2.2 Minimum Mean-Square Error
An optimal fit of a series of predictions with a series of observations can
be achieved in several ways. Using the minimum mean-square error
(MMSE) approach, one computes the difference between an estimate ŷ
and a target value y. The square of this difference, (ŷ − y)², gives an
impression of how well the two fit. In the case of a series of N estimates and
target values, one gets a quantification of the error E by determining the
sum of the squares of their differences, E = Σᵢ₌₁ᴺ (ŷᵢ − yᵢ)², and a common
technique is to minimize this expression with respect to the parameter
19. KALMAN FILTER
362
of interest. For example, if our model is ŷᵢ = a·xᵢ, then a is the parameter
we need to find. The minimum of E with respect to parameter a can be
found by differentiating the expression for the sum of squares with
respect to this parameter and setting this derivative equal to zero, i.e.,
∂E/∂a = 0. In Section 19.3, the Kalman filter is developed using the
MMSE technique. For other examples of this minimization approach see
Chapters 5 and 8.
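For the model ŷᵢ = a·xᵢ, carrying out this differentiation gives the closed-form solution a = Σxᵢyᵢ / Σxᵢ². A short Python sketch of this minimization (the data values are invented for illustration):

```python
def mmse_slope(x, y):
    """Minimize E = sum_i (a*x_i - y_i)^2.
    Setting dE/da = 0 gives a = sum(x_i*y_i) / sum(x_i^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi**2 for xi in x)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]   # roughly y = 2x, with some scatter
a = mmse_slope(x, y)
# a is the MMSE-optimal slope for these points
```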
19.2.3 Recursive Algorithm
Assume we measure the resting potential of a neuron based on a series
of measurements: z1, z2, …, zn. It seems reasonable to use the estimated
mean m̂ₙ of the n measurements as the estimate of the resting potential:

m̂ₙ = (z₁ + z₂ + … + zₙ)/n  (19.2)
For several reasons, this equation isn't an optimal basis for the development
of an algorithm. For each new measurement we need more
memory: i.e., we need n memory locations for measurements z₁ … zₙ. In
addition, we need to complete all measurements before we have the
estimate.
With a recursive approach (as is used in the Kalman filter algorithm)
one estimates the resting potential after each measurement, using an
update per measurement:

1st measurement: m̂₁ = z₁  (19.3)

2nd measurement: m̂₂ = (1/2)m̂₁ + (1/2)z₂  (19.4)

3rd measurement: m̂₃ = (2/3)m̂₂ + (1/3)z₃  (19.5)
and so on.
Now we update the estimate at each measurement and we only need
memory for the previous estimate and the new measurement to make a
new estimate.
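The pattern above can be written compactly: the nth estimate follows from the previous one as m̂ₙ = ((n − 1)·m̂ₙ₋₁ + zₙ)/n. A Python sketch of this recursion (the measurement values are invented for illustration):

```python
def update_mean(m_prev, z_new, n):
    """Recursive mean update: fold the n-th measurement z_new into the
    previous estimate m_prev; no storage of past measurements is needed."""
    return ((n - 1) * m_prev + z_new) / n

z = [-70.2, -69.5, -70.8, -70.1]   # simulated resting-potential readings (mV)
m = 0.0
for n, zn in enumerate(z, start=1):
    m = update_mean(m, zn, n)
# m now equals the batch mean (z1 + ... + zn)/n
```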
19.2.4 Data Assimilation
Suppose we have two measurements, y1 and y2, and we would like to
combine/fuse these observations into a single estimate y. Further, the
uncertainty of the two measurements is given by their variance
(= standard deviation squared), s₁ and s₂, respectively. If we follow the
fusion principle used in the Kalman approach, which is based on MMSE,
we will get
y = [s₂/(s₁ + s₂)]·y₁ + [s₁/(s₁ + s₂)]·y₂  (19.6)

for the new estimate, and its variance s can be obtained from

1/s = 1/s₁ + 1/s₂  (19.7)
By following this procedure, we can obtain an improved estimate
based on two separate ones (Fig. 19.1).
In the example in Fig. 19.1, we have two observations (black and blue)
associated with two distributions with mean values of 5 and 10 and
standard deviations of 1 and 3. Using Eqs. (19.6) and (19.7) we find that
the best estimate is reflected by a distribution with a mean of 5.5 and a
standard deviation of √0.9 ≈ 0.95 (red). From this example it is obvious
that the measurement with less uncertainty (black curve) has more weight
in the estimate, while the one with more uncertainty (blue curve) only
contributes a little. In the context of the Kalman filter, the terms assimi-
lation and blending are sometimes used instead of data fusion in order to
describe the combination of estimate and measurement.
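Eqs. (19.6) and (19.7) can be checked numerically for the example of Fig. 19.1; a Python sketch (note that s1 and s2 denote variances, as in the text):

```python
def fuse(y1, s1, y2, s2):
    """Fuse two estimates y1, y2 with variances s1, s2 (Eqs. 19.6 and 19.7)."""
    y = (s2 / (s1 + s2)) * y1 + (s1 / (s1 + s2)) * y2
    s = 1.0 / (1.0 / s1 + 1.0 / s2)   # equivalently s1*s2/(s1 + s2)
    return y, s

# Fig. 19.1: means 5 and 10, standard deviations 1 and 3 (variances 1 and 9)
y, s = fuse(5.0, 1.0**2, 10.0, 3.0**2)
# y = 5.5 and s = 0.9, so the fused standard deviation is sqrt(0.9) ~ 0.95
```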
19.2.5 State Model
One way of characterizing the dynamics of a system is to describe how
it evolves through a number of states. If we characterize state as a vector of
FIGURE 19.1 An example of the fusion procedure.
variables in a state space, we can describe the system's dynamics by its
path through state space. A simple example is a moving object. If we take
an example where there is no acceleration, its state can be described by the
coordinates of its position x⃗ in three dimensions: i.e., the 3-D space is its
state space. The dynamics can be seen by plotting position versus time,
which also gives its velocity dx⃗/dt (i.e., the time derivative of its
position).
Let’s take another example, a second-order ordinary differential
equation (ODE)
ẍ₁ + bẋ₁ + cx₁ = 0  (19.8)

Now we define

ẋ₁ = x₂, and rewrite the ODE as  (19.9a)

ẋ₂ + bx₂ + cx₁ = 0  (19.9b)
In this way we rewrote the single second-order ODE in Eq. (19.8) as two
first-order ODEs: Eqs. (19.9a) and (19.9b) (see also Section 9.5). Now we
can define a system state vector
x⃗ = [x₁; x₂]  (19.10)
Note that, unlike the previous example, this vector is not the
position of a physical object but a vector that defines the state of the
system. The equations in (19.9a) and (19.9b) can now be compactly
written as
dx⃗/dt = A·x⃗, with A = [0 1; −c −b]  (19.11)
Here, the dynamics is expressed as an operation of matrix A on state
vector x⃗. This notation can be used as an alternative representation of
any linear ODE.
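As a quick sanity check of Eq. (19.11), applying A to the state vector should reproduce Eqs. (19.9a) and (19.9b); a Python sketch with illustrative coefficients b and c:

```python
def deriv(state, b, c):
    """dx/dt = A x with A = [[0, 1], [-c, -b]] (Eq. 19.11)."""
    x1, x2 = state
    return [x2,                 # Eq. (19.9a): dx1/dt = x2
            -c * x1 - b * x2]   # Eq. (19.9b): dx2/dt = -c*x1 - b*x2

# Example: b = 0.5, c = 2.0, state (x1, x2) = (1.0, 0.0)
dx = deriv([1.0, 0.0], b=0.5, c=2.0)
# dx1/dt = 0.0 and dx2/dt = -2.0, as required by the two first-order ODEs
```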
19.2.6 Bayesian Analysis
In the Kalman filter approach we update the probability density
function (i.e., the mean and variance) of the state x of a system based on
measurement z, i.e., p(x|z). Conditional probabilities and distributions are
the topic of Bayesian statistics and therefore the Kalman approach is a
form of Bayesian analysis.
19.3 DERIVATION OF A KALMAN FILTER FOR
A SIMPLE CASE
In this section, we derive a simple version of a Kalman filter application
following the procedure in Brown and Hwang (1997). They show a
derivation for a vector observation; here we show the equivalent
simplification for a scalar. All equations with a border are used in the
example algorithm of the MATLAB® script pr19_1, and are also shown in
the flowchart in Fig. 19.2.
We start with a simple process x, for example, a neuron’s resting
potential: i.e., its membrane potential in the absence of overt stimulation.
We assume that the update rule for the state of the process is
xₖ₊₁ = xₖ + w  (19.12)
The process measurement is z and is defined (as simply as possible) as

zₖ = xₖ + v  (19.13)
FIGURE 19.2 Flowchart of the Kalman filter procedure derived in this section and implemented in the MATLAB® script pr19_1.m. In this MATLAB® script, steps 1–2 are indicated. Step 1 (blending of prediction and measurement; the a posteriori estimates): Eqs. (19.19), (19.15), and (19.20); Step 2 (projecting the estimates for the next step; the a priori estimates): Eqs. (19.21) and (19.22). The equations in the diagram are derived in the text.
In the update procedure, w and v represent Gaussian white noise variables
that model the process and measurement noise, respectively. For the
sake of this example, they are assumed stationary and ergodic (Chapter 3).
The mean for both is zero and the associated variance values are sw and sv.
The covariance of w and v is zero because they are independent.
Now assume we have an a priori estimate xₖ⁻. Note that in most texts
authors include a "hat," i.e., x̂ₖ⁻, to indicate one is dealing with an estimate.
To simplify notation, hats to denote estimates are omitted here. The a
priori error we make with this estimate is xₖ − xₖ⁻ and the variance sₖ⁻
associated with this error is given by

sₖ⁻ = E[(xₖ − xₖ⁻)²]  (19.14)
Now we perform a measurement zk and assimilate this measurement
with our a priori estimate to obtain an a posteriori estimate.
Note: If we integrate a noise term w (drawn from a zero-mean normal
distribution with variance s²) over time from t₁ to t₂, i.e., ∫ w dt, the
result has variance s²(t₂ − t₁). So, in the case of a large interval between
measurements at t₁ and t₂, the uncertainty grows with a factor (t₂ − t₁)
and, consequently, the uncertainty sₖ₊₁⁻ will become very large, and the
prediction has almost no value in the fusion of prediction and
measurement.
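This growth of uncertainty can be illustrated numerically: the sum of k independent zero-mean noise samples, each with variance s², has a variance close to k·s². A Python sketch (the sample counts and seed are arbitrary choices, not from the chapter):

```python
import random

random.seed(1)
k, trials = 10, 20000           # k noise increments per trial
sums = [sum(random.gauss(0.0, 1.0) for _ in range(k)) for _ in range(trials)]
# Sample variance of the accumulated (zero-mean) noise;
# the expected value is k * s^2 = 10 for s = 1.
var = sum(x**2 for x in sums) / trials
```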
The a posteriori estimate xₖ⁺, determined by the fusion of the a priori
prediction and the new measurement, is given by:

xₖ⁺ = xₖ⁻ + Kₖ(zₖ − xₖ⁻)  (19.15)
The factor Kₖ in Eq. (19.15) is the blending factor that determines to
what extent the measurement and a priori estimate affect the new a
posteriori estimate.
The a posteriori error we make with this estimate is xₖ − xₖ⁺ and the
variance sₖ⁺ associated with this error is given by

sₖ⁺ = E[(xₖ − xₖ⁺)²]  (19.16)
Substitution of Eq. (19.13) into Eq. (19.15) gives
xₖ⁺ = xₖ⁻ + Kₖ(xₖ + v − xₖ⁻). If we plug this result into Eq. (19.16), we get:

sₖ⁺ = E[((xₖ − xₖ⁻) − Kₖ(xₖ + v − xₖ⁻))²]  (19.17)
We can expand this expression and take the expectations. Recall that
xₖ − xₖ⁻ is the a priori estimation error and uncorrelated with measurement
noise v. The outcome of this is (in case this isn't obvious, details of
these steps are summarized in Appendix 19.1):

sₖ⁺ = sₖ⁻ − 2Kₖsₖ⁻ + Kₖ²sₖ⁻ + Kₖ²sv

This is usually written as:

sₖ⁺ = sₖ⁻(1 − Kₖ)² + Kₖ²sv,  (19.18)

or

sₖ⁺ = sₖ⁻ − 2Kₖsₖ⁻ + (sₖ⁻ + sv)Kₖ²
To optimize the a posteriori estimate using optimal blending in Eq.
(19.15), we must minimize the variance sₖ⁺ (= square of the estimated
error) with respect to Kₖ, i.e., we differentiate the expression for sₖ⁺ with
respect to Kₖ and set the result equal to zero:

−2sₖ⁻ + 2(sₖ⁻ + sv)Kₖ = 0

This generates the expression for an optimal Kₖ,

Kₖ = sₖ⁻/(sₖ⁻ + sv)  (19.19)
Note: At this point it is interesting to evaluate the result. Substituting the
expression for Kₖ in Eq. (19.15), you can see what this optimized result
means: xₖ⁺ = xₖ⁻ + [sₖ⁻/(sₖ⁻ + sv)](zₖ − xₖ⁻). If the a priori estimate is
unreliable, indicated by a large variance sₖ⁻ → ∞, i.e., Kₖ → 1, we
ignore the estimate and believe the measurement: xₖ⁺ → zₖ. On
the other hand, if the measurement is completely unreliable, sv → ∞,
i.e., Kₖ → 0, we ignore the measurement and believe the a priori estimate:
xₖ⁺ → xₖ⁻.
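These two limiting cases of Eq. (19.19) are easy to verify numerically; a Python sketch with illustrative variances:

```python
def blending_factor(s_minus, sv):
    """Kalman blending factor of Eq. (19.19): K = s_minus / (s_minus + sv)."""
    return s_minus / (s_minus + sv)

k_bad_prior = blending_factor(1e9, 1.0)   # unreliable prediction  -> K ~ 1
k_bad_meas = blending_factor(1.0, 1e9)    # unreliable measurement -> K ~ 0
k_equal = blending_factor(1.0, 1.0)       # equal uncertainty      -> K = 0.5
```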
Now we proceed and use the result in Eq. (19.19) to obtain an
expression for sv:

sv = (sₖ⁻ − Kₖsₖ⁻)/Kₖ
Substitution of this result in Eq. (19.18) gives an expression to find the a
posteriori error based on an optimized blending factor Kₖ:

sₖ⁺ = sₖ⁻(1 − Kₖ)² + (Kₖ²sₖ⁻ − Kₖ³sₖ⁻)/Kₖ

With a little bit of algebra we get

sₖ⁺ = sₖ⁻ − 2Kₖsₖ⁻ + Kₖ²sₖ⁻ + Kₖsₖ⁻ − Kₖ²sₖ⁻,

which we simplify to:

sₖ⁺ = sₖ⁻(1 − Kₖ)  (19.20)
Here it should be noted that depending on how one simplifies
Eq. (19.18), one might get different expressions. Some work better than
others, depending on the situation. We use Eq. (19.20) here because of its
simplicity.
Finally, we need to use this information to make a projection towards
the next time step, occurring at tₖ₊₁ (Fig. 19.2, Step 2). This procedure
depends on the model for the process we are measuring from! Since we
employ a simple model in this example, Eq. (19.12), our a priori estimate
of x is simply the a posteriori estimate of the previous time step:

xₖ₊₁⁻ = xₖ⁺  (19.21)

Here we ignore the noise term w in Eq. (19.12) since the expectation of
w is zero. In many applications this step can become more complicated, or
even an extremely complex procedure, depending on the dynamics
between tₖ and tₖ₊₁.
The a priori estimate of the variance at tₖ₊₁ is based on the error of the
prediction in Eq. (19.21),

eₖ₊₁⁻ = xₖ₊₁ − xₖ₊₁⁻.

If we substitute Eqs. (19.12) and (19.21) into this expression, we get
eₖ₊₁⁻ = xₖ + w − xₖ⁺. Since the a posteriori error at tₖ is eₖ⁺ = xₖ − xₖ⁺,
we can plug this into the above expression and we get

eₖ₊₁⁻ = eₖ⁺ + w
Using this expression for the a priori error, we see that the associated a
priori variance at tₖ₊₁ depends on the a posteriori error at tₖ and the
variance of the noise process w:

sₖ₊₁⁻ = E[(eₖ⁺ + w)²].

Because w and the error in the estimate are uncorrelated, we finally
have

sₖ₊₁⁻ = sₖ⁺ + sw  (19.22)
19.4 MATLAB® EXAMPLE
The example MATLAB® script pr19_1 (listed below) uses the above
derivation to demonstrate how this could be applied to a recording of a
membrane potential of a nonstimulated neuron. In this example we
assume that the resting potential Vm = −70 mV and that the standard
deviations of the process noise and the measurement noise are 1 and
2 mV, respectively. The program follows the flowchart in Fig. 19.2 and a
typical result is depicted in Fig. 19.3. Due to the random aspect of the
procedure, your result will vary slightly from the example shown here.
FIGURE 19.3 Example of the Kalman filter application created with the attached MATLAB® script pr19_1, plotted as membrane potential (mV) versus time (AU). The true values are green, the measurements are represented by the red dots, and the Kalman filter output is the blue line. Although the result is certainly not perfect, it can be seen that the Kalman filter output is much closer to the real values than the measurements.
MATLAB® Script pr19_1.m that performs Kalman filtering

% pr19_1
%
% Example using a Kalman filter to
% estimate the resting membrane potential of a neuron
% (Algorithm isn't optimized)

clear
close all

% Parameters
N=1000;          % number of measurements
Vm=-70;          % the membrane potential (Vm) value
sw=1^2;          % variance of the process noise
sv=2^2;          % variance of the measurement noise

% Initial Estimates
x_apriori(1)=Vm; % first estimate of Vm
                 % This is the a priori estimate
s_apriori(1)=0;  % a priori estimate of the error; here variance = 0
                 % i.e., we assume that first estimate is perfect

% create a simulated measurement vector
% -------------------------------------
for i=1:N
    tru(i)=Vm+randn*sqrt(sw);    % scale noise by the standard deviation
    z(i)=tru(i)+randn*sqrt(sv);
end

% Perform the Kalman filter
% -------------------------
% Note: MATLAB indices start at 1, and not
% at 0 as in most textbook derivations
% Equations are numbered as in Ch. 19
for i=1:N
    % step 1: Blending of Prediction and Measurement
    % Compute blending factor K(i)
    K(i)=s_apriori(i)/(s_apriori(i)+sv);                     % Eq (19.19)
    % Update Estimate using the measurement
    % this is the a posteriori estimate
    x_aposteriori(i)=x_apriori(i)+K(i)*(z(i)-x_apriori(i));  % Eq (19.15)
    % Update the error estimate [Note that there
    % are several variants for this procedure;
    % here we use the simplest expression]
    s_aposteriori(i)=s_apriori(i)*(1-K(i));                  % Eq (19.20)
    % step 2: Project Estimates for the next Step
    x_apriori(i+1)=x_aposteriori(i);                         % Eq (19.21)
    s_apriori(i+1)=s_aposteriori(i)+sw;                      % Eq (19.22)
end

% plot the results
figure; hold;
plot(z,'r.');
plot(tru,'g');
plot(x_aposteriori)
axis([1 N -100 10]);
xlabel('Time (AU)')
ylabel('Membrane Potential (mV)')
title('Kalman Filter Application: true values (green); measurements (red .); Kalman Filter Output (blue)')
19.5 USE OF THE KALMAN FILTER TO ESTIMATE
MODEL PARAMETERS
The vector-matrix version of the Kalman filter can estimate a series of
variables but can also be used to estimate model parameters. If a
parameter of the model is a constant or changes very slowly so that it can
be considered a constant, one treats and estimates this parameter as a
constant with noise, just as we did with xk in Eq. (19.12). The noise
component allows the algorithm of the Kalman filter to find the best fit of
the model parameter at hand.
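As a sketch of this idea: treat the unknown parameter as the (nearly constant) state of Eq. (19.12) with a small process-noise variance, and apply the scalar filter equations directly. The Python example below uses invented values (a_true, sv, sw) purely for illustration:

```python
import random

random.seed(7)
a_true, sv, sw = 3.7, 1.0, 1e-6    # true parameter, meas. variance, tiny process noise

x, s = 0.0, 1e6                    # vague initial guess: huge a priori variance
for _ in range(2000):
    z = a_true + random.gauss(0.0, sv**0.5)   # noisy observation of the parameter
    K = s / (s + sv)               # Eq. (19.19)
    x = x + K * (z - x)            # Eq. (19.15)
    s = s * (1 - K)                # Eq. (19.20)
    s = s + sw                     # Eq. (19.22); Eq. (19.21) leaves x unchanged
# x is now close to a_true, and s is small
```

With the large initial variance, the first update jumps straight to the first measurement (K ≈ 1), so the arbitrary starting value of x introduces no bias.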
APPENDIX 19.1 DETAILS OF THE STEPS BETWEEN
EQS. (19.17) AND (19.18)
This appendix gives the details of the steps between Eqs. (19.17) and
(19.18). For convenience, we repeat the equation.
sₖ⁺ = E[((xₖ − xₖ⁻) − Kₖ(xₖ + v − xₖ⁻))²]  (19.17)
We can expand this expression and take the expectations. If we expand
the squared function, we get three terms, further evaluated in (1)–(3)
below.
1. The first term can be rewritten using Eq. (19.14), i.e.:

   E[(xₖ − xₖ⁻)²] = sₖ⁻
2. The second term is E[−2(xₖ − xₖ⁻)Kₖ(xₖ + v − xₖ⁻)] =

   E[−2Kₖ{(xₖ − xₖ⁻)xₖ + (xₖ − xₖ⁻)v − (xₖ − xₖ⁻)xₖ⁻}]

   The second term between the curly brackets in the above expression,
   (xₖ − xₖ⁻)v, will vanish when taking the expectation because xₖ − xₖ⁻
   is the a priori estimation error, which is uncorrelated with measurement
   noise v. Thus, using Eq. (19.14) again, we can simplify

   E[−2Kₖ{(xₖ − xₖ⁻)(xₖ − xₖ⁻)}] = −2Kₖsₖ⁻
3. The third term is E[Kₖ²{(xₖ + v − xₖ⁻)(xₖ + v − xₖ⁻)}]. First we
   focus on the expression between the curly brackets and we get

   xₖ² + xₖv − xₖxₖ⁻ + vxₖ + v² − vxₖ⁻ − xₖ⁻xₖ − xₖ⁻v + (xₖ⁻)² =
   2v(xₖ − xₖ⁻) + xₖ² − 2xₖxₖ⁻ + (xₖ⁻)² + v² = 2v(xₖ − xₖ⁻) + (xₖ − xₖ⁻)² + v²

   When we take expectations, the first term of the simplified expression
   will vanish, so we get the following result:

   E[Kₖ²(xₖ − xₖ⁻)² + Kₖ²v²] = Kₖ²sₖ⁻ + Kₖ²sv
If we now collect all terms from (1)–(3), we can rewrite Eq. (19.17) as

sₖ⁺ = sₖ⁻ − 2Kₖsₖ⁻ + Kₖ²sₖ⁻ + Kₖ²sv

This is usually written as:

sₖ⁺ = sₖ⁻(1 − Kₖ)² + Kₖ²sv,  (19.18)

giving us the result in Eq. (19.18).
EXERCISES

19.1 Give the expression for the estimate m̂ₙ at the nth measurement,
using a recursive approach as in Eqs. (19.3)–(19.5).
19.2 Why does (in the Kalman data fusion) adding a measurement
always result in a decrease of the uncertainty (variance) of
the estimate?
19.3 Write the following third-order ODE in the form of a state
vector and matrix operator:

d³x₁/dt³ + b·d²x₁/dt² + c·dx₁/dt + d·x₁ = 0.

Now repeat this for d³x₁/dt³ + b·d²x₁/dt² + c·dx₁/dt + d·x₁ = e.

Give the details of the procedure, i.e., define the state and matrix A
(see Eqs. (19.8)–(19.11)).
19.4 We have a series of measurements yₙ and system states xₙ. We
want to link both using a simple linear model: yₙ = a·xₙ. Use
the MMSE method to find an optimal estimate for a. Comment
on how this result relates to linear regression.
19.5 Show that Eqs. (19.6) and (19.7) are the same as the Kalman
approach to fusion of data.
Hint 1: Combine Eqs. (19.15) and (19.19); add
s₁/(s₁ + s₂)·y₁ − s₁/(s₁ + s₂)·y₁ to the result and rearrange
the terms as in Eq. (19.6).
Hint 2: Combine Eqs. (19.20) and (19.19); rearrange Eq. (19.7) as
s = s₁s₂/(s₁ + s₂) and compare the results.
Discuss the difference of combining variance (noise) between
Eq. (19.7) and the combined effect of noise sources as in
Eq. (3.15).
References
Brown, R.G., Hwang, P.Y.C., 1997. Introduction to Random Signals and Applied Kalman
Filtering. John Wiley & Sons, New York, NY.
Maybeck, P.S., 1982. Stochastic Models, Estimation, and Control, vol. 141. Academic Press,
New York.
Schiff, S.J., 2012. Neural Control Engineering: The Emerging Intersection Between Control
Theory and Neuroscience. MIT Press, Cambridge, MA.
Welch, G., Bishop, G. An Introduction to the Kalman Filter. University of North Carolina.
http://www.cs.unc.edu/~welch/media/pdf/kalman_intro.pdf.