Fitting Lines - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Fitting Lines


1
Fitting Lines
  • Many applications in which straight lines are
    important features
  • From, for example, a set of edge points, we will
    often want to find the best straight line through
    those points
  • What "best" means will gradually become clear

2
Line Fitting with Least Squares
Start by assuming that we know which points are
on the line; now we want to determine the line's
parameters.
The equation of the line is y = ax + b. Minimize
the sum of squared vertical residuals,
Σᵢ (yᵢ − a·xᵢ − b)², and solve for a, b (easy in
MATLAB).
Assumes no error in x, only in y. Near-vertical
lines are problematic.
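A minimal sketch of this fit in Python/NumPy (rather
than the presenter's MATLAB); the function name
fit_line_ls and the example data are illustrative,
not from the slides:

    import numpy as np

    def fit_line_ls(x, y):
        """Ordinary least squares fit of y = a*x + b.

        Minimizes the squared vertical residuals, so it assumes error
        in y only and behaves badly for near-vertical lines.
        """
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        A = np.column_stack([x, np.ones_like(x)])   # design matrix [x, 1]
        (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
        return a, b

    # Example: noisy points along y = 2x + 1
    rng = np.random.default_rng(0)
    xs = np.linspace(0.0, 10.0, 50)
    ys = 2.0 * xs + 1.0 + rng.normal(scale=0.3, size=xs.size)
    print(fit_line_ls(xs, ys))   # roughly (2.0, 1.0)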
3
Total Least Squares
Represent line as
Parameterization scales
Distance from (u,v) to line
Minimize
Subject to
Lagrange multipliers lead to an eigenvalue
problem (see text)
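A sketch of the resulting eigenvalue solution in
Python/NumPy, using the ax + by + c = 0
parameterization above; the name fit_line_tls is an
illustrative choice:

    import numpy as np

    def fit_line_tls(x, y):
        """Total least squares: fit a*x + b*y + c = 0 with a^2 + b^2 = 1.

        Minimizing perpendicular distances leads to an eigenvalue
        problem: (a, b) is the eigenvector of the centered scatter
        matrix with the smallest eigenvalue, and the line passes
        through the centroid of the points.
        """
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        xm, ym = x.mean(), y.mean()
        pts = np.column_stack([x - xm, y - ym])
        S = pts.T @ pts                      # 2x2 scatter matrix
        eigvals, eigvecs = np.linalg.eigh(S)
        a, b = eigvecs[:, 0]                 # smallest-eigenvalue eigenvector
        c = -(a * xm + b * ym)               # line passes through the centroid
        return a, b, c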
4
TLS
5
Packing Points into Lines
  • Suppose we don't know that a set of points share
    a line (as we assumed before)
  • Now the problem can be much harder, because we
    don't want to
  • Grab some points at random
  • Do a trial fit
  • Optimize over all groupings of points to possible
    lines

6
What's My Line?
  • Let's assume that we have some additional clues
  • Connected edge contours (at least)
  • Local orientation, gradient direction
  • Other photometry: contrast, color
  • Then we can control the combinatorics
  • Incremental line fitting
  • K-means
  • Other grouping (as before, or tensor voting)
  • Probabilistic (expectation-maximization)

7
(No Transcript)
8
About Algorithm 15.1
  • Notice the refit step. That's important (see the
    sketch below).
  • Results could depend on starting point set.
  • Can be augmented with a robust estimator. (later)
  • Similar idea has been used in a robust sequential
    estimator for estimating surfaces in range data
    (Mirza-Boyer).
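A rough Python/NumPy sketch of the incremental idea
the slide refers to (not a transcription of
Algorithm 15.1 itself), reusing fit_line_tls from
above; the names incremental_fit, start_size, and
max_resid are illustrative assumptions:

    import numpy as np

    def incremental_fit(chain, start_size=4, max_resid=1.5):
        """Walk an ordered chain of edge points, growing a line segment
        and refitting after each new point; start a new segment when
        the next point no longer fits."""
        chain = np.asarray(chain, dtype=float)
        segments, i = [], 0
        while i + start_size <= len(chain):
            j = i + start_size
            line = fit_line_tls(chain[i:j, 0], chain[i:j, 1])
            while j < len(chain):
                a, b, c = line
                # Is the next point close enough to the current line?
                if abs(a * chain[j, 0] + b * chain[j, 1] + c) > max_resid:
                    break
                j += 1
                # The refit step: refit with the newly accepted point.
                line = fit_line_tls(chain[i:j, 0], chain[i:j, 1])
            segments.append((line, (i, j)))
            i = j
        return segments

Because the walk is sequential, the segments found can
depend on the starting point set, as the slide notes.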

9
Using K-Means
  • Suppose no auxiliary information, no contours
  • Model k lines, each generating some subset of
    points
  • Best solution minimizes
  • Over correspondences and line parameters
  • Combinatorially impossible
  • So, adapt k-means in a two phase implementation
  • Allocate each point to the closest line
  • Fit the best line to the points allocated to it
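A small sketch of the two-phase loop in Python/NumPy,
reusing fit_line_tls from above; the names
point_line_dist and fit_k_lines, the random
initialization, and the iteration cap are illustrative
assumptions (a real implementation would also guard
against lines that lose all their points):

    import numpy as np

    def point_line_dist(a, b, c, pts):
        # Perpendicular distance to a*x + b*y + c = 0 (with a^2 + b^2 = 1).
        return np.abs(pts @ np.array([a, b]) + c)

    def fit_k_lines(pts, k, n_iter=20, seed=0):
        """Alternate between allocating points to lines and refitting."""
        rng = np.random.default_rng(seed)
        labels = rng.integers(0, k, size=len(pts))   # random initial grouping
        for _ in range(n_iter):
            # Phase 2: fit the best line to the points allocated to it.
            lines = [fit_line_tls(pts[labels == j, 0], pts[labels == j, 1])
                     for j in range(k)]
            # Phase 1: allocate each point to the closest line.
            dists = np.stack([point_line_dist(a, b, c, pts)
                              for a, b, c in lines], axis=1)
            new_labels = dists.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break
            labels = new_labels
        return lines, labels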

10
(No Transcript)
11
Robustness
  • A single bad data point can ruin the LS fit
  • Bad data?
  • A perturbation caused by some mechanism other
    than the model (implicitly Gaussian)
  • Gross measurement error
  • Matching blunders
  • We call them outliers
  • Thick-tailed densities (Student t, for instance)
  • LS penalizes outliers too heavily (allows them to
    dominate the fit)
  • Need to limit the penalty, or find and discard
    the outliers

12
A good LS fit to a set of points.
One bogus point leads to disaster.
13
Another outlier, another disaster.
Closeup of the poor fit to the true data.
14
M-Estimators
Robust procedures can be thought of as variations
on LS. Although several classifications exist,
we will concentrate on M-estimators in linear
regression.
The general linear regression model on p
parameters is zᵢ = xᵢᵀθ + eᵢ.
Note: the function for z need not be linear in
the explanatory variables, and x can be a vector.
For instance:
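One standard instance (not necessarily the example on
the original slide) is a polynomial fit,
zᵢ = θ₁ + θ₂xᵢ + θ₃xᵢ² + eᵢ with xᵢᵀ = (1, xᵢ, xᵢ²),
which is nonlinear in x but still linear in the
parameters θ.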
15
We can package this as z = Xθ + e, where
  z : n-vector of observations on the dependent
      variable
  X : (n, p) matrix of observations on the
      explanatory variables, rank p
  θ : p-vector of regression coefficients to be
      estimated
  e : n-vector of disturbances
16
The Nature of M-Estimators
  • Key step: replace the quadratic loss function
    with another symmetric cost function on the
    residuals.
  • Robust estimators are more efficient (lower
    variance) when the disturbances are non-Gaussian.
  • Slightly less efficient when they are Gaussian.
  • Can model the noise by a heavy-tailed density and
    use maximum likelihood estimation. (Hence, the
    name.)
  • Direct evaluation can be a nightmare, but we can
    sneak up on it with (re)weighted least squares.

17
A robust M-estimate of θ minimizes

    Σᵢ ρ( (zᵢ − xᵢᵀθ) / s )

where s is a known or previously computed scale
parameter and ρ is a robust loss function,
meeting the Dirichlet conditions. This is more
general than minimizing the sum of squares or the
sum of absolute values; in fact, the mean and the
median are special cases of M-estimators.
OK, how do we come up with a loss function? One
way: assume a form for the error density function
and go from there.
18
Let our differentiable error density have the form
f(eᵢ; s), where
  s  : a scale parameter
  f  : a functional form, not Gaussian (else, plain
       old least squares)
  eᵢ : the ith actual error (residual),
       eᵢ = zᵢ − xᵢᵀθ
Given a sample z of n observations, the
log-likelihood for θ, s² is
L(θ, s²) = Σᵢ log f(eᵢ; s).
19
Differentiate with respect to θ, s², to get
(1)
20
Set the partials to zero to get the maximum
likelihood estimates θ̂, ŝ²:
(2a)
(2b)
These are nonlinear, so solve iteratively using
reweighted least squares. Rewriting (2a) in matrix
form, W is a diagonal weight matrix whose entries
wᵢ depend on the residuals, as in Eq. (1).
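In the usual notation, a reconstruction of these
equations (assuming the standard scale-family density
f(e; s) = (1/s) g(e/s), loss ρ(u) = −log g(u), and
ψ = ρ′; the original slides may differ in detail) is:

\[
w_i = \frac{\psi(e_i/\hat{s})}{e_i/\hat{s}},
\qquad e_i = z_i - x_i^{T}\hat{\theta} \tag{1}
\]
\[
\sum_i w_i\, e_i\, x_i = 0
\;\Longleftrightarrow\;
X^{T} W\,(z - X\hat{\theta}) = 0 \tag{2a}
\]
\[
\hat{s}^{2} = \frac{1}{n}\sum_i w_i\, e_i^{2} \tag{2b}
\]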
21
The resulting iterative scheme, after simplifying,
is the reweighted least-squares update
    θ̂ ← (Xᵀ W X)⁻¹ Xᵀ W z    (3)
Now we select an appropriate heavy-tailed
distribution to use as a model, plug it into
Eq. (1) to get an expression for the weights as a
function of the residuals, and let it rip.
Note: Don't forget to update the scale parameter
as you go!
22
One example: the t distribution with f degrees of
freedom. The weight for each observation is
    wᵢ = (f + 1) / (f + (eᵢ/ŝ)²)
where eᵢ is the residual and ŝ is the current
estimate of the scale s, as in Eq. (2b).
Put the weights on the diagonal of W and use (3)
to update the parameters until convergence.
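A compact sketch of the whole reweighted loop in
Python/NumPy, using the t-model weights above; the
function name irls_t, the default degrees of freedom,
and the starting values are illustrative assumptions,
not prescribed by the slides:

    import numpy as np

    def irls_t(X, z, f=3.0, n_iter=50, tol=1e-8):
        """M-estimate of theta in z = X @ theta + e by iteratively
        reweighted least squares, with t-model weights (f deg. of freedom)."""
        X = np.asarray(X, dtype=float)
        z = np.asarray(z, dtype=float)
        n = len(z)
        theta, *_ = np.linalg.lstsq(X, z, rcond=None)   # start from the LS fit
        e = z - X @ theta
        s = e.std() + 1e-12                             # initial scale estimate
        for _ in range(n_iter):
            w = (f + 1.0) / (f + (e / s) ** 2)          # weights, cf. Eq. (1)
            W = np.diag(w)
            theta_new = np.linalg.solve(X.T @ W @ X, X.T @ W @ z)   # Eq. (3)
            e = z - X @ theta_new
            s = np.sqrt(np.sum(w * e ** 2) / n)         # update the scale, Eq. (2b)
            if np.linalg.norm(theta_new - theta) < tol:
                theta = theta_new
                break
            theta = theta_new
        return theta, s

With a small f (say 2 to 4) the weights fall off
quickly for large residuals, so outliers lose their
grip on the fit.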
23
In the preceding, we used maximum likelihood
analysis to go directly to the weights. It is
also common to specify the robust loss function
directly, although such losses usually have their
roots in a similar calculation.
A popular choice appears in the text. Like most
of these, it looks quadratic near the origin and
then flattens out.
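One loss with exactly that shape (a common choice,
whether or not it is the one the slide showed) is

\[
\rho(u;\sigma) \;=\; \frac{u^{2}}{\sigma^{2} + u^{2}},
\]

which behaves like u²/σ² near the origin and saturates
at 1 for residuals much larger than σ, so a single
wild point contributes at most a bounded penalty.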
24
(No Transcript)
25
The impact of tuning s. (Figure: three fits, with s
just right, too large, and too small.)
26
Practical Robust Estimation
  • Do an LS fit
  • Examine the residuals. Discard some fraction of
    the data having the highest residuals (the
    highest 10%, for example).
  • Do another LS fit to the remaining data.
  • Loop to convergence, or until some set fraction
    of points remains (see the sketch below).
  • Not especially elegant, but it usually works!
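A minimal sketch of this trimming loop in
Python/NumPy; the name trimmed_ls_fit, the 10% drop
per pass, and the 50% stopping fraction are
illustrative choices, not prescribed by the slide:

    import numpy as np

    def trimmed_ls_fit(x, y, drop_frac=0.10, keep_frac=0.5, max_iter=20):
        """Repeat: LS fit, discard the worst-residual fraction of the
        remaining points, refit; stop when too few points are left."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        keep = np.ones(len(x), dtype=bool)
        a = b = 0.0
        for _ in range(max_iter):
            A = np.column_stack([x[keep], np.ones(keep.sum())])
            (a, b), *_ = np.linalg.lstsq(A, y[keep], rcond=None)
            if keep.sum() <= keep_frac * len(x):
                break                          # set fraction of points remains
            resid = np.abs(y - (a * x + b))
            n_drop = max(1, int(drop_frac * keep.sum()))
            # Indices of the kept points with the largest residuals.
            worst = np.argsort(np.where(keep, resid, -np.inf))[-n_drop:]
            keep[worst] = False
        return a, b, keep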