ML Lecture17

This document discusses Kalman filtering and related techniques for time series modeling and state estimation. It provides an overview of the problem of estimating the most probable state at different times (prediction, filtering, smoothing) using noisy sensor measurements and a system model. Applications mentioned include robotics, tracking, econometrics, and navigation. The document then discusses the fundamentals of modeling time series as stochastic processes, hidden Markov models, dynamic systems, and recursive Bayesian filtering. It notes that in certain restrictive cases, including linear Gaussian systems, the optimal solution exists in closed form via the Kalman filter equations.

Kalman filtering and friends:

Inference in time series models


Herke van Hoof
slides mostly by Michael Rubinstein
Problem overview
• Goal
– Estimate the most probable state at time k using measurements up to time k′
k′ < k: prediction
k′ = k: filtering
k′ > k: smoothing
• Input
– (Noisy) sensor measurements
– Known or learned system model (see last lecture)

• Many problems require estimating the state of systems that change over time using noisy measurements on the system
Applications
• Ballistics
• Robotics
– Robot localization
• Tracking hands/cars/…
• Econometrics
– Stock prediction
• Navigation

• Many more…
Example: noisy localization
Position at t=0
(image: https://img.clipartfest.com)
Measurements at t=1, t=2, …, t=6
Smoothing: where was I in the past (e.g. t=3)?
Filtering: where am I now (t=6)?
Prediction: where will I be in the future?


Today’s lecture
• Fundamentals
• Formalizing time series models
• Recursive filtering
• Two cases with optimal solutions
• Linear Gaussian models
• Discrete systems
• Suboptimal solutions
Stochastic Processes
• Stochastic process
– Collection of random variables indexed by some set
– i.e., a random variable x_i for every element i of the index set

• Time series modeling
– Sequence of random states/variables
– Measurements available at discrete times
– Modeled as a stochastic process indexed by ℕ (a minimal sampling sketch follows below)

(figure: densities p(x_1), p(x_2), p(x_3) over the location at t = 1, 2, 3)
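A minimal sketch (Python/NumPy, not from the original slides) of such a process: a Gaussian random walk, one realization of the random variables x_1, x_2, … indexed by ℕ. The unit step size is an arbitrary assumption.

import numpy as np

rng = np.random.default_rng(0)
T = 6
x = np.zeros(T + 1)                         # x[k] holds one realization of the R.V. x_k
for k in range(1, T + 1):
    x[k] = x[k - 1] + rng.normal(0.0, 1.0)  # each step adds i.i.d. Gaussian noise
print(x)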
(First-order) Markov process
• The Markov property: the distribution of a future state depends on the present state only,
p(x_k | x_1, …, x_{k-1}) = p(x_k | x_{k-1})

• Markov chain: a stochastic process with the Markov property (sampling sketch below)

(graphical model: states x_{k-1} → x_k → x_{k+1} along the time axis k-1, k, k+1)
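A hedged sketch of the Markov property in code: sampling a two-state Markov chain whose next state depends only on the present state. The transition matrix is made up for illustration.

import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.9, 0.1],   # P[i, j] = p(x_k = j | x_{k-1} = i)
              [0.3, 0.7]])
state = 0
path = [state]
for k in range(10):
    state = rng.choice(2, p=P[state])  # depends on the present state only
    path.append(state)
print(path)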
Hidden Markov Model (HMM)
• The state is not directly visible, but an output that depends on the state is visible

(graphical model: hidden states x_{k-1} → x_k → x_{k+1}, each emitting an observed measurement z_{k-1}, z_k, z_{k+1})
State space
• The state vector contains all available information to describe the investigated system
– usually multidimensional
• The measurement vector represents observations related to the state vector
– generally (but not necessarily) of lower dimension than the state vector
State space
• Tracking: e.g. object position and velocity
• Econometrics:
– Monetary flow
– Interest rates
– Inflation
– …
Dynamic System

(graphical model: states x_{k-1} → x_k → x_{k+1} with observations z_{k-1}, z_k, z_{k+1}; the state sequence is a stochastic diffusion)

State equation: x_k = f_k(x_{k-1}, v_{k-1})
– x_k: state vector at time instant k
– f_k: state transition function
– v_{k-1}: i.i.d. process noise

Observation equation: z_k = h_k(x_k, w_k)
– z_k: observations at time instant k
– h_k: observation function
– w_k: i.i.d. measurement noise

(a code sketch of this interface follows below)
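As a sketch, the two equations map directly onto code. The particular f_k, h_k, and noise scales below are hypothetical, chosen only to show the interface:

import numpy as np

rng = np.random.default_rng(2)

def step(x_prev):
    """One step of x_k = f_k(x_{k-1}, v_{k-1}) and z_k = h_k(x_k, w_k)."""
    v = rng.normal(0.0, 0.1)               # i.i.d. process noise
    x = 0.5 * x_prev + np.sin(x_prev) + v  # hypothetical state transition f_k
    w = rng.normal(0.0, 0.2)               # i.i.d. measurement noise
    z = x ** 2 + w                         # hypothetical observation h_k
    return x, z

x, z = step(1.0)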
A simple dynamic system
• 4-dimensional state space: position and velocity, x_k = (p_x, p_y, v_x, v_y)ᵀ
• Constant velocity motion: x_k = F x_{k-1} + v_{k-1}, where F advances each position by its velocity
• Only position is observed: z_k = H x_k + w_k, where H selects the position components
• Both noise terms follow a Gaussian distribution (sketch below)

Yacov Hel-Or
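A sketch of this system in NumPy, assuming a unit time step and made-up noise levels:

import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0],   # position += velocity * dt
              [0, 1, 0, dt],
              [0, 0, 1, 0],    # velocity stays constant
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],    # only the position components are observed
              [0, 1, 0, 0]], dtype=float)

rng = np.random.default_rng(3)
x = np.array([0.0, 0.0, 1.0, 0.5])           # initial state (p_x, p_y, v_x, v_y)
for k in range(5):
    x = F @ x + rng.normal(0, 0.05, size=4)  # Gaussian process noise
    z = H @ x + rng.normal(0, 0.5, size=2)   # Gaussian measurement noise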
Today’s lecture
• Fundamentals
• Formalizing time series models
• Recursive filtering
• Two cases with optimal solutions
• Linear Gaussian models
• Discrete systems
• Suboptimal solutions
Recursive filters
• For many problems, estimate is required each time a
new measurement arrives

• Batch processing
– Requires all available data
• Sequential processing
– New data is processed upon arrival
– Need not store the complete dataset
– Need not reprocess all data for each new measurement
– Assume no out-of-sequence measurements (solutions for
this exist as well…)
Bayesian filter
• Construct the posterior probability density function p(x_k | z_{1:k}) of the state based on all available information

(figure: posterior density over the sample space; portrait of Thomas Bayes)

• By knowing the posterior, many kinds of estimates for x_k can be derived
– mean (expectation), mode, median, …
– Can also give an estimate of the accuracy (e.g. covariance)
Recursive Bayes filters
• Given:
– System models in probabilistic form:
p(x_k | x_{k-1}): Markovian process
p(z_k | x_k): measurements conditionally independent given the state
(known statistics of v_k, w_k)
– Initial state p(x_0), also known as the prior
– Measurements z_1, …, z_k
Recursive Bayes filters
• Prediction step (a priori): compute p(x_k | z_{1:k-1})
– Uses the system model to predict forward
– Deforms/translates/spreads the state pdf due to random noise
• Update step (a posteriori): compute p(x_k | z_{1:k})
– Updates the prediction in light of new data
– Tightens the state pdf
Prior vs posterior?

• It can seem odd to regard p(x_k | z_{1:k-1}) as a prior

• Compare (posterior = likelihood × prior / evidence):

P(x_k | z_k, z_{1:k-1}) = p(z_k | x_k, z_{1:k-1}) · P(x_k | z_{1:k-1}) / p(z_k | z_{1:k-1})

• In the update with z_k, p(x_k | z_{1:k-1}) acts as a working prior
General prediction-update framework

• Assume p(x_{k-1} | z_{1:k-1}) is given at time k-1

• Prediction:

p(x_k | z_{1:k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | z_{1:k-1}) dx_{k-1}   (1)
(system model × previous posterior)

• Uses the Chapman-Kolmogorov identity + the Markov property
General prediction-update framework

• Update step:

p(x_k | z_{1:k}) = p(z_k | x_k) p(x_k | z_{1:k-1}) / p(z_k | z_{1:k-1})   (2)
(measurement model × current prior / normalization constant)

where p(z_k | z_{1:k-1}) = ∫ p(z_k | x_k) p(x_k | z_{1:k-1}) dx_k
Generating estimates

• Knowledge of p(x_k | z_{1:k}) enables computing the optimal estimate with respect to any criterion, e.g. (see the sketch below)
– Minimum mean-square error (MMSE): x̂_k = E[x_k | z_{1:k}]
– Maximum a posteriori (MAP): x̂_k = arg max p(x_k | z_{1:k})
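A sketch of both estimates from a discretized (and entirely made-up) 1-D posterior:

import numpy as np

xs = np.linspace(-5, 5, 1001)          # discretized state values
post = np.exp(-0.5 * (xs - 1.2) ** 2)  # toy unnormalized posterior
post /= post.sum()

x_mmse = np.sum(xs * post)    # posterior mean: MMSE estimate
x_map = xs[np.argmax(post)]   # posterior mode: MAP estimate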
General prediction-update framework

➔ So (1) and (2) give the optimal solution to the recursive estimation problem!
• Unfortunately… this is only a conceptual solution
– The integrals are intractable in general…
– Cannot represent arbitrary pdfs!

• However, an optimal solution does exist for several restrictive cases
Today’s lecture
• Fundamentals
• Formalizing time series models
• Recursive filtering
• Two cases with optimal solutions
• Linear Gaussian models
• Discrete systems
• Suboptimal solutions
Restrictive case #1
• The posterior at each time step is Gaussian
– Completely described by its mean and covariance
• If p(x_{k-1} | z_{1:k-1}) is Gaussian, it can be shown that p(x_k | z_{1:k}) is also Gaussian, provided that:
– v_k, w_k are Gaussian
– f_k, h_k are linear
Restrictive case #1
• Why linear? A linear function of a Gaussian random variable is again Gaussian; a nonlinear function generally is not.

(figures by Yacov Hel-Or)
Restrictive case #1
• Linear system with additive noise:

x_k = F_k x_{k-1} + v_{k-1}
z_k = H_k x_k + w_k

• Simple example again: the constant-velocity model with observed position (F and H as above)
The Kalman filter
Rudolf E. Kalman

• Substituting the linear Gaussian model into (1) and (2) yields the predict and update equations
The Kalman filter
Predict:
x̂_{k|k-1} = F_k x̂_{k-1|k-1}
P_{k|k-1} = F_k P_{k-1|k-1} F_kᵀ + Q_k

Update:
K_k = P_{k|k-1} H_kᵀ (H_k P_{k|k-1} H_kᵀ + R_k)⁻¹
x̂_{k|k} = x̂_{k|k-1} + K_k (z_k − H_k x̂_{k|k-1})
P_{k|k} = (I − K_k H_k) P_{k|k-1}

(Q_k, R_k: process and measurement noise covariances; a NumPy sketch follows below)
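A minimal NumPy sketch of these equations (an illustration, not the lecture's code):

import numpy as np

def kf_predict(x, P, F, Q):
    """x_{k|k-1} = F x_{k-1|k-1};  P_{k|k-1} = F P F^T + Q."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x + K @ (z - H @ x)      # correct the prediction with the residual
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

These compose with the constant-velocity F and H defined earlier: one filter step is kf_update(*kf_predict(x, P, F, Q), z, H, R).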
Intuition via 1D example
• Lost at sea
– Night
– No idea of location
– For simplicity – let’s
assume 1D
– Not moving

* Example and plots by Maybeck, “Stochastic models, estimation and control, volume 1”
Example – cont’d
• Time t1: Star Sighting
– Denote z(t1)=z1
• Uncertainty (inaccuracies, human error, etc)
– Denote σ1 (normal)
• Can establish the conditional probability of
x(t1) given measurement z1
Example – cont’d

• Probability for any location, based on measurement


• For Gaussian density – 68.3% within ±σ1
• Best estimate of position: Mean/Mode/Median
Example – cont’d
• Time t2: a friend (better trained) takes a sighting
– x(t2) = z2, σ(t2) = σ2
– Since she has higher skill: σ2 < σ1
Example – cont’d
• f(x(t2)|z1,z2) also Gaussian
Example – cont’d

• Combined estimate: mean μ = (σ2² z1 + σ1² z2) / (σ1² + σ2²), variance σ² with 1/σ² = 1/σ1² + 1/σ2²
• σ is less than both σ1 and σ2
• σ1 = σ2: average
• σ1 > σ2: more weight to z2
• Rewrite: μ = z1 + K (z2 − z1), with K = σ1² / (σ1² + σ2²) (numeric check below)
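A quick numeric check of this rule (z1, z2, σ1, σ2 are made-up values):

z1, s1 = 10.0, 2.0   # first sighting and its standard deviation
z2, s2 = 11.0, 1.0   # the friend's more accurate sighting
K = s1**2 / (s1**2 + s2**2)              # weight on the second measurement
mu = z1 + K * (z2 - z1)                  # fused mean: 10.8, closer to z2
var = 1.0 / (1.0 / s1**2 + 1.0 / s2**2)  # fused variance: 0.8 < min(4, 1)
print(mu, var)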
Example – cont’d
• The Kalman update rule:

x̂(t2) = x̂(t2⁻) + K(t2) [z2 − x̂(t2⁻)]

best estimate given z2 (a posteriori) = best prediction prior to z2 (a priori) + optimal weighting (Kalman gain) × residual
The Kalman filter
Predict: generally increases the variance
Update: generally decreases the variance
Kalman gain

• Small measurement error (R_k → 0), H invertible: K_k → H_k⁻¹, so x̂_{k|k} → H_k⁻¹ z_k (trust the measurement)

• Small prediction error (P_{k|k-1} → 0): K_k → 0, so x̂_{k|k} → x̂_{k|k-1} (trust the prediction); a numeric check follows below
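A small numeric check of the two limits on a 1-D system (H = 2 and all numbers made up):

P_pred, H = 1.0, 2.0
for R in [1.0, 1e-6]:
    K = P_pred * H / (H * P_pred * H + R)  # scalar Kalman gain
    print(R, K)                            # R -> 0 drives K toward 1/H = 0.5
# Conversely, P_pred -> 0 drives K toward 0: the measurement is ignored.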


The Kalman filter
• Pros (compared to e.g. particle filter)
– Optimal closed-form solution to the tracking problem
(under the assumptions)
• No algorithm can do better in a linear-Gaussian environment!
– All ‘logical’ estimations collapse to a unique solution
– Simple to implement
– Fast to execute
• Cons
– If either the system or measurement model is nonlinear → the posterior will be non-Gaussian

Smoothing is possible with a backward message (cf. HMMs, lecture 10)
Restrictive case #2
• The state space (domain) is discrete and finite
• Assume the state space at time k-1 consists of states x_{k-1}^i, i = 1, …, N_s
• Let w_{k-1|k-1}^i = P(x_{k-1} = x_{k-1}^i | z_{1:k-1}) be the conditional probability of the state at time k-1, given measurements up to k-1
The Grid-based filter
• The posterior pdf at k-1 can be expressed as a sum of delta functions:

p(x_{k-1} | z_{1:k-1}) = Σ_i w_{k-1|k-1}^i δ(x_{k-1} − x_{k-1}^i)

• Again, substitution into (1) and (2) yields the predict and update equations

Equivalent to belief monitoring in HMMs (Lecture 10)
The Grid-based filter
• Prediction (1):

w_{k|k-1}^i = Σ_j w_{k-1|k-1}^j p(x_k^i | x_{k-1}^j)

• The new prior is also a weighted sum of delta functions
• The new prior weights are a reweighting of the old posterior weights using the state transition probabilities
The Grid-based filter
• Update (2):

w_{k|k}^i = w_{k|k-1}^i p(z_k | x_k^i) / Σ_j w_{k|k-1}^j p(z_k | x_k^j)

• Posterior weights are a reweighting of the prior weights using the likelihoods (+ normalization); a sketch of one full cycle follows below
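A sketch of one predict/update cycle over a three-state space (transition matrix, likelihoods, and weights all made up):

import numpy as np

trans = np.array([[0.8, 0.15, 0.1],
                  [0.15, 0.7, 0.2],
                  [0.05, 0.15, 0.7]])  # trans[i, j] = p(x_k = i | x_{k-1} = j); columns sum to 1
w = np.array([0.5, 0.3, 0.2])          # posterior weights at k-1
lik = np.array([0.1, 0.7, 0.2])        # p(z_k | x_k = i) for the new measurement

w_pred = trans @ w                     # prediction: reweight with transition probabilities
w_post = lik * w_pred                  # update: multiply by the likelihoods...
w_post /= w_post.sum()                 # ...and normalize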
The Grid-based filter
• Pros:
– p(x_k | x_{k-1}) and p(z_k | x_k) are assumed known, but there is no constraint on their (discrete) shapes
– Easy extension to a varying number of states
– Optimal solution for the discrete-finite environment!
• Cons:
– Curse of dimensionality
• Inefficient if the state space is large
– Statically considers all possible hypotheses

Smoothing is possible with a backward message (cf. HMMs, lecture 10)
Today’s lecture
• Fundamentals
• Formalizing time series models
• Recursive filtering
• Two cases with optimal solutions
• Linear Gaussian models
• Discrete systems
• Suboptimal solutions
Suboptimal solutions
• In many cases these assumptions do not hold
– Practical environments are nonlinear, non-Gaussian, continuous
➔ Approximations are necessary…
– Extended Kalman filter (EKF): analytic approximation
– Approximate grid-based methods: numerical methods
– Multiple-model estimators
– Unscented Kalman filter (UKF)
– Gaussian-sum filters
– Particle filters (PF): sampling approaches
– …
The extended Kalman filter
• The idea: a local linearization of the dynamic system might be a sufficient description of the nonlinearity
• The model: nonlinear system with additive noise:

x_k = f_k(x_{k-1}) + v_{k-1}
z_k = h_k(x_k) + w_k
The extended Kalman filter
• f, h are approximated using a first-order Taylor series expansion (evaluated at the current state estimates), giving Jacobians F̂_k = ∂f_k/∂x at x̂_{k-1|k-1} and Ĥ_k = ∂h_k/∂x at x̂_{k|k-1}

Predict:
x̂_{k|k-1} = f_k(x̂_{k-1|k-1})
P_{k|k-1} = F̂_k P_{k-1|k-1} F̂_kᵀ + Q_k

Update:
K_k = P_{k|k-1} Ĥ_kᵀ (Ĥ_k P_{k|k-1} Ĥ_kᵀ + R_k)⁻¹
x̂_{k|k} = x̂_{k|k-1} + K_k (z_k − h_k(x̂_{k|k-1}))
P_{k|k} = (I − K_k Ĥ_k) P_{k|k-1}

(a scalar sketch follows below)
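A sketch of one EKF cycle for a scalar nonlinear system (f, h, and all numbers are hypothetical; the Jacobians reduce to derivatives):

import numpy as np

def f(x):  return x + 0.1 * np.sin(x)    # hypothetical nonlinear dynamics
def df(x): return 1.0 + 0.1 * np.cos(x)  # its derivative (the "Jacobian")
def h(x):  return x ** 2                 # hypothetical nonlinear measurement
def dh(x): return 2.0 * x

Q, R = 0.01, 0.1       # process / measurement noise variances
x_est, P = 1.0, 1.0    # previous estimate and its variance
z = 1.3                # new (made-up) measurement

# Predict: propagate the estimate through f, the variance through the linearization
x_pred = f(x_est)
F = df(x_est)
P_pred = F * P * F + Q

# Update: linearize h around the prediction
Hj = dh(x_pred)
K = P_pred * Hj / (Hj * P_pred * Hj + R)  # scalar Kalman gain
x_est = x_pred + K * (z - h(x_pred))
P = (1.0 - K * Hj) * P_pred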
The extended Kalman filter
• Pros
– Good approximation when the models are near-linear
– Efficient to calculate
(de facto method for navigation systems and GPS)
• Cons
– Only an approximation (optimality not proven)
– Still a single-Gaussian approximation
• Nonlinearity → non-Gaussianity (e.g. bimodal)
– If the hypothesis is multimodal and we choose incorrectly, it can be difficult to recover
– Inapplicable when f, h are discontinuous
The unscented Kalman filter

(figure by Yacov Hel-Or)

• Instead of linearizing, propagates a small set of deterministically chosen sigma points through the nonlinearity (sketch below)
• Can approximate the posterior more accurately
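A sketch of the unscented transform at the core of the UKF: push 2n+1 sigma points through a nonlinearity g and recover mean and covariance from weighted sums. The scaling parameters are common choices assumed here, not taken from the slides.

import numpy as np

def unscented_transform(mean, cov, g, alpha=1.0, beta=2.0, kappa=0.0):
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)          # matrix square root
    sigma = [mean] + [mean + S[:, i] for i in range(n)] \
                   + [mean - S[:, i] for i in range(n)]
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))   # mean weights
    wc = wm.copy()                                   # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha ** 2 + beta)
    Y = np.array([np.atleast_1d(g(s)) for s in sigma])  # push points through g
    y_mean = wm @ Y
    diff = Y - y_mean
    y_cov = (wc[:, None] * diff).T @ diff
    return y_mean, y_cov

# e.g. a Gaussian pushed through a made-up nonlinearity:
m, C = np.array([1.0, 0.0]), 0.1 * np.eye(2)
ym, yc = unscented_transform(m, C, lambda x: np.array([np.sin(x[0]), x[0] * x[1]]))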


Challenges
• Detection specific
– Full/partial occlusions
– False positives/false negatives
– Entering/leaving the scene
• Efficiency
• Multiple models and switching dynamics
• Multiple targets
• …
Conclusion
• Inference in time series models:
• Past: smoothing
• Present: filtering
• Future: prediction
• The recursive Bayes filter is optimal
• Exactly computable in two cases
• Linear Gaussian systems: Kalman filter
• Discrete systems: Grid filter
• Approximate solutions for other systems
