MIT OpenCourseWare
http://ocw.mit.edu
12.540 Principles of Global Positioning Systems
Spring 2008
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
12.540 Principles of the Global Positioning System
Lecture 10
Prof. Thomas Herring
             Estimation: Introduction
• Overview
    – Basic concepts in estimation
    – Models: Mathematical and Statistical
    – Statistical concepts
               Basic concepts
• Basic problem: We measure range and phase data that are related to the positions of the ground receiver, the satellites, and other quantities. How do we determine the “best” position for the receiver and the other quantities?
• What do we mean by “best” estimate?
• Inferring parameters from measurements is
  estimation
                 Basic estimation
• Two styles of estimation (appropriate for geodetic-type measurements):
    – Parametric estimation where the quantities to be
      estimated are the unknown variables in equations
      that express the observables
    – Condition estimation, where conditions can be formulated among the observations. Rarely used; the most common application is leveling, where the sum of the height differences around a closed circuit must be zero
             Basics of parametric estimation
• All parametric estimation methods can be
  broken into a few main steps:
    – Observation equations: equations that relate the
      parameters to be estimated to the observed
      quantities (observables). Mathematical model.
            • Example: Relationship between pseudorange, receiver position, satellite position (implicit in ρ), clocks, and atmospheric and ionospheric delays
    – Stochastic model: Statistical description that
      describes the random fluctuations in the
      measurements and maybe the parameters
    – Inversion that determines the parameter values from the mathematical model, consistent with the statistical model.
                Observation model
• The observation model is a set of equations relating the observables to the parameters of the model:
    – Observable = function (parameters)
    – Observables should not appear on the right-hand side of the equation
• Often the function is non-linear, and the most common method is linearization of the function using a Taylor series expansion.
• Sometimes log linearization is used for f = a·b·c, i.e., products of parameters (see the example below)
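As an illustration (not from the original slide), taking logarithms converts a product of parameters into a linear form:
$$f = a\,b\,c \quad\Rightarrow\quad \ln f = \ln a + \ln b + \ln c,$$
which is linear in the new parameters $\ln a$, $\ln b$, $\ln c$.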
                   Taylor series expansion
• In the most common Taylor series approach:
$$y = f(x_1, x_2, x_3, x_4)$$
$$y_0 + \Delta y = f(\mathbf{x})\big|_{\mathbf{x}_0} + \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\bigg|_{\mathbf{x}_0}\,\Delta\mathbf{x}, \qquad \mathbf{x} = (x_1, x_2, x_3, x_4)$$
• The estimation is made using the difference between the observations and the expected values based on a priori values for the parameters.
• The estimation returns adjustments to the a priori parameter values
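As a sketch consistent with the pseudorange example mentioned earlier (higher-order and propagation terms omitted), linearizing the geometric range about a priori receiver coordinates $\mathbf{r}_0$ gives partial derivatives equal to the components of the satellite-to-receiver unit vector:
$$\rho = \|\mathbf{r} - \mathbf{r}^{s}\|, \qquad \frac{\partial \rho}{\partial \mathbf{r}}\bigg|_{\mathbf{r}_0} = \frac{(\mathbf{r}_0 - \mathbf{r}^{s})^{T}}{\|\mathbf{r}_0 - \mathbf{r}^{s}\|}, \qquad \Delta\rho \approx \frac{(\mathbf{r}_0 - \mathbf{r}^{s})^{T}}{\|\mathbf{r}_0 - \mathbf{r}^{s}\|}\,\Delta\mathbf{r},$$
so each row of the linearized observation equations contains a line-of-sight unit vector (plus a term for the receiver clock when clocks are estimated).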
                 Linearization
• Since the linearization is only an
  approximation, the estimation should be
  iterated until the adjustments to the parameter
  values are zero.
• For GPS estimation: the convergence rate is typically 100-1000:1 (i.e., a 1 meter error in the a priori coordinates could result in 1-10 mm of non-linearity error).
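A minimal sketch of this iterate-until-converged idea for a range-only positioning problem; the satellite and receiver coordinates and the convergence tolerance below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical satellite positions (m) and error-free ranges to an unknown receiver.
sats = np.array([[15600e3,  7540e3, 20140e3],
                 [18760e3,  2750e3, 18610e3],
                 [17610e3, 14630e3, 13480e3],
                 [19170e3,   610e3, 18390e3]])
x_true = np.array([1113e3, 6001e3, 1e3])
ranges = np.linalg.norm(sats - x_true, axis=1)

# Iterated linearized least squares (Gauss-Newton) starting from a priori coordinates.
x = np.array([1000e3, 6100e3, 0.0])              # a priori receiver position
for _ in range(10):
    rho0 = np.linalg.norm(sats - x, axis=1)      # modeled ranges at current estimate
    dy = ranges - rho0                           # observed minus computed
    A = (x - sats) / rho0[:, None]               # partials of range w.r.t. position
    dx, *_ = np.linalg.lstsq(A, dy, rcond=None)  # adjustment to current values
    x += dx
    if np.linalg.norm(dx) < 1e-4:                # stop when adjustments are ~zero
        break

print(x - x_true)   # remaining error should be negligible
```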
                    Estimation
• (Will return to statistical model shortly)
• Most common estimation method is “least-squares” in
  which the parameter estimates are the values that
  minimize the sum of the squares of the differences
  between the observations and modeled values based
  on parameter estimates.
• For linear estimation problems, direct matrix
  formulation for solution
• For non-linear problems: Linearization or search
  technique where parameter space is searched for
  minimum value
• Care is needed with search methods so that a local, rather than the global, minimum is not found (will not be treated in this course)
              Least squares estimation
• Originally formulated by Gauss.
• Basic equations: Δy is the vector of observations; A is the linear matrix relating the parameters to the observables; Δx is the vector of parameters; v is the vector of residuals:
$$\Delta y = A\,\Delta x + v$$
minimize $v^{T} v$ (superscript $T$ denotes the transpose)
$$\Delta x = (A^{T} A)^{-1} A^{T} \Delta y$$
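A minimal numerical sketch of this formula; the design matrix, parameter values, and noise level below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear problem: 20 observations of 3 parameters.
A = rng.normal(size=(20, 3))                     # design matrix
dx_true = np.array([1.0, -2.0, 0.5])
dy = A @ dx_true + 0.01 * rng.normal(size=20)    # observations = model + residuals

# Direct solution of the normal equations: dx = (A^T A)^-1 A^T dy
dx = np.linalg.solve(A.T @ A, A.T @ dy)

# Numerically safer equivalent using a least-squares solver.
dx_lstsq, *_ = np.linalg.lstsq(A, dy, rcond=None)
print(dx, dx_lstsq)                              # both close to dx_true
```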
           Weighted Least Squares
• In standard least squares, nothing is assumed
  about the residuals v except that they are zero
  mean.
• One often sees weighted least squares, in which a weight matrix W is assigned to the residuals. Residuals associated with larger elements of W are given more weight.
minimize $v^{T} W v$
$$\Delta x = (A^{T} W A)^{-1} A^{T} W \Delta y$$
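A short sketch of the weighted formula, with a diagonal W built from assumed per-observation standard deviations (illustrative values only):

```python
import numpy as np

rng = np.random.default_rng(1)

A = rng.normal(size=(20, 3))                     # design matrix
dx_true = np.array([1.0, -2.0, 0.5])
sigmas = rng.uniform(0.01, 0.1, size=20)         # assumed per-observation sigmas
dy = A @ dx_true + sigmas * rng.normal(size=20)  # noisier observations get larger errors

W = np.diag(1.0 / sigmas**2)                     # weight matrix (inverse variances)
dx = np.linalg.solve(A.T @ W @ A, A.T @ W @ dy)  # dx = (A^T W A)^-1 A^T W dy
print(dx)                                        # close to dx_true
```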
      Statistical approach to least squares
• If the weight matrix used in weighted least
  squares is the inverse of the covariance matrix
  of the residuals, then weighted least squares
  is a maximum likelihood estimator for
  Gaussian distributed random errors.
• This latter form of least squares is the most statistically rigorous version.
• Sometimes weights are chosen empirically
             Review of statistics
• Random errors in measurements are
  expressed with probability density functions
  that give the probability of values falling
  between x and x+dx.
• Integrating the probability density function gives the probability of the value falling within a finite interval
• Given a large enough sample of the random
  variable, the density function can be deduced
  from a histogram of residuals.
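A minimal sketch (with simulated residuals, illustrative only) of deducing the density function from a histogram of a large sample:

```python
import numpy as np

rng = np.random.default_rng(2)
residuals = rng.normal(loc=0.0, scale=1.0, size=100_000)   # simulated residuals

# A normalized histogram approximates the probability density function f(x).
density, edges = np.histogram(residuals, bins=80, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Compare against the unit-variance Gaussian density at the bin centers.
gauss = np.exp(-centers**2 / 2) / np.sqrt(2 * np.pi)
print(np.max(np.abs(density - gauss)))           # small for a large sample
```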
[Figure: Example of random variables — uniform and Gaussian random sequences plotted against sample number (y-axis: random variable; x-axis: sample).]
[Figure: Histograms of random variables — Gaussian and uniform samples, with the curve 490/sqrt(2π)·exp(−x²/2) overlaid (x-axis: random variable x; y-axis: number of samples).]
       Characterization of Random Variables
• When the probability distribution is known, the
  following statistical descriptions are used for
  random variable x with density function f(x):
Expected value: $\langle h(x)\rangle = \int h(x)\,f(x)\,dx$
Expectation: $\langle x\rangle = \int x\,f(x)\,dx = \mu$
Variance: $\langle (x-\mu)^2\rangle = \int (x-\mu)^2\,f(x)\,dx$
The square root of the variance is called the standard deviation.
              Theorems for expectations
• For linear operations, the following theorems
  are used:
    – For a constant <c> = c
    – Linear operator <cH(x)> = c<H(x)>
    – Summation <g+h> = <g>+<h>
• Covariance: the relationship between random variables x and y whose joint probability distribution is $f_{xy}(x, y)$:
$$\sigma_{xy} = \langle (x-\mu_x)(y-\mu_y)\rangle = \iint (x-\mu_x)(y-\mu_y)\,f_{xy}(x,y)\,dx\,dy$$
Correlation: $\rho_{xy} = \sigma_{xy}/(\sigma_x \sigma_y)$
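A short sketch of these two quantities computed from a simulated sample; the covariance matrix below is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two correlated Gaussian variables with an illustrative covariance matrix.
V = np.array([[1.0, 0.6],
              [0.6, 2.0]])
x, y = rng.multivariate_normal([0.0, 0.0], V, size=50_000).T

sigma_xy = np.mean((x - x.mean()) * (y - y.mean()))   # sample covariance
rho_xy = sigma_xy / (x.std() * y.std())               # sample correlation
print(sigma_xy, rho_xy)   # ~0.6 and ~0.6/sqrt(1*2) = 0.42
```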
                Estimation of moments
• Expectation and variance are the first and second
  moments of a probability distribution
$$\hat\mu_x \approx \sum_{n=1}^{N} x_n / N \approx \frac{1}{T}\int x(t)\,dt$$
$$\hat\sigma_x^2 \approx \sum_{n=1}^{N} (x_n - \mu_x)^2 / N \approx \sum_{n=1}^{N} (x_n - \hat\mu_x)^2 / (N-1)$$
• As N goes to infinity these expressions approach their expectations. (Note the N−1 in the form that uses the estimated mean.)
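A minimal sketch of these sample moments; the mean, standard deviation, and sample size below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)    # sample with mu=2, sigma=3

mu_hat = x.sum() / x.size                          # first moment (mean)
var_hat = ((x - mu_hat)**2).sum() / (x.size - 1)   # N-1 because the mean is estimated
print(mu_hat, np.sqrt(var_hat))                    # close to 2 and 3
# Equivalent built-in: np.var(x, ddof=1)
```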
                   Probability distributions
• While there are many probability distributions, there are only a couple that are commonly used:
Gaussian: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$
Multivariate: $f(\mathbf{x}) = \dfrac{1}{\sqrt{(2\pi)^n \,|V|}}\,e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol\mu)^T V^{-1} (\mathbf{x}-\boldsymbol\mu)}$
Chi-squared: $\chi_r^2(x) = \dfrac{x^{r/2-1}\,e^{-x/2}}{\Gamma(r/2)\,2^{r/2}}$
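A small numerical check of the Gaussian and chi-squared density formulas above (grid limits and degrees of freedom are arbitrary choices); each should integrate to approximately 1:

```python
import numpy as np
from math import gamma, pi

# Gaussian density on a grid.
xg = np.linspace(-8.0, 8.0, 100_001)
mu, sigma = 0.0, 1.0
gauss = np.exp(-(xg - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * pi))

# Chi-squared density with r degrees of freedom.
xc = np.linspace(1e-9, 60.0, 100_001)
r = 4
chi2 = xc**(r / 2 - 1) * np.exp(-xc / 2) / (gamma(r / 2) * 2**(r / 2))

print(np.trapz(gauss, xg), np.trapz(chi2, xc))   # both ~1
```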
            Probability distributions
• The chi-squared distribution is the sum of the squares
  of r Gaussian random variables with expectation 0
  and variance 1.
• With the probability density function known, the
  probability of events occurring can be determined.
  For Gaussian distribution in 1-D; P(|x|<1σ) = 0.68;
  P(|x|<2σ) = 0.955; P(|x|<3σ) = 0.9974.
• Conceptually, people think of standard deviations in terms of the probability of events occurring (i.e., 68% of values should be within 1-sigma).
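A one-line check of those 1-D Gaussian probabilities, using the identity P(|x| < kσ) = erf(k/√2):

```python
from math import erf, sqrt

# For a zero-mean Gaussian variable, P(|x| < k*sigma) = erf(k / sqrt(2)).
for k in (1, 2, 3):
    print(k, erf(k / sqrt(2)))   # ~0.683, 0.954, 0.997
```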
             Central Limit Theorem
• Why is Gaussian distribution so common?
• “The distribution of the sum of a large number of
  independent, identically distributed random variables
  is approximately Gaussian”
• When the random errors in measurements are made
  up of many small contributing random errors, their
  sum will be Gaussian.
• Any linear operation on Gaussian random variables generates another Gaussian. This is not the case for other distributions, for which the density of a sum must be derived by convolving the two density functions.
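A minimal demonstration of the theorem (the choice of uniform variables and the number summed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# Sum 12 independent uniform(-0.5, 0.5) variables; the sum has mean 0 and
# variance 12 * (1/12) = 1, and by the central limit theorem is ~Gaussian.
sums = rng.uniform(-0.5, 0.5, size=(100_000, 12)).sum(axis=1)
print(sums.mean(), sums.std())        # ~0 and ~1
print(np.mean(np.abs(sums) < 1.0))    # ~0.68, as expected for a unit Gaussian
```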
                        Summary
• Examined simple least squares and weighted least
  squares
• Examined probability distributions
• Next we pose estimation in a statistical framework
• Some web resources for reading:
http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd4.htm
http://www.weibull.com/LifeDataWeb/least_squares.htm