Robust statistics methods [Tukey 1977; Huber 1981] provide tools for statistical problems in which the underlying assumptions are inexact. A robust procedure should be insensitive to departures from the underlying assumptions caused by, for example, outliers. That is, it should perform well when the assumptions hold, and its performance should deteriorate gracefully as the situation departs from them. Applications of robust methods in vision are seen in image restoration, smoothing and segmentation [Kashyap and Eom 1988; Jolion et al. 1991; Meer et al. 1991], surface and shape fitting [Besl et al. 1988; Stein and Werman 1992] and pose estimation [Haralick et al. 1989], where outliers are an issue.
There are several types of robust estimators. Among them are the M-estimator (maximum likelihood type estimator), the L-estimator (linear combinations of order statistics) and the R-estimator (estimator based on rank transformation) [Huber 1981]; the RM estimator (repeated median) [Siegel 1982] and the LMS estimator (estimator using the least median of squares) [Rousseeuw 1984]. We are concerned with the M-estimator.
The essential form of the M-estimation problem is the following: Given a set of m data samples $d = \{d_i \mid 1 \le i \le m\}$ where $d_i = f + \eta_i$, the problem is to estimate the location parameter f under noise $\eta_i$. The distribution of $\eta_i$ is not assumed to be known exactly. The only underlying assumption is that $\eta_1, \ldots, \eta_m$ obey a symmetric, independent, identical distribution (symmetric i.i.d.). A robust estimator has to deal with departures from this assumption.
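Let the residual errors be $\eta_i = d_i - f$ ($i = 1, \ldots, m$) and the error penalty function be $\rho(\eta_i)$. The M-estimate $f^*$ is defined as the minimum of a global error function
$$f^* = \arg\min_f E(f), \qquad E(f) = \sum_{i=1}^{m} \rho(\eta_i).$$
To minimize $E(f)$ above, it is necessary to solve the following equation
$$\sum_{i=1}^{m} \rho'(\eta_i) = 0.$$
This is the zero-gradient condition on which gradient descent is based. When $\rho(\eta_i)$ can also be expressed as a function of $\eta_i^2$, its first derivative can take the following form
$$\rho'(\eta_i) = 2\,\eta_i\,h(\eta_i) = 2\,(d_i - f)\,h(d_i - f)$$
where $h(\eta)$ is an even function. In this case, the estimate $f^*$ can be expressed as the following weighted sum of the data samples
$$f^* = \frac{\sum_{i=1}^{m} h(\eta_i)\,d_i}{\sum_{i=1}^{m} h(\eta_i)}$$
where h acts as the weighting function.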
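The step from the derivative form to the weighted sum can be written out explicitly. The following is a sketch only; the symbols $d_i$ for the data samples and $\eta_i = d_i - f$ for the residuals are notation assumed here, and the last line requires $\sum_i h(d_i - f) \neq 0$.

```latex
\begin{align*}
\sum_{i=1}^{m} \rho'(\eta_i) = 0
  &\;\Longrightarrow\; \sum_{i=1}^{m} 2\,(d_i - f)\,h(d_i - f) = 0 \\
  &\;\Longrightarrow\; f \sum_{i=1}^{m} h(d_i - f) = \sum_{i=1}^{m} h(d_i - f)\,d_i \\
  &\;\Longrightarrow\; f = \frac{\sum_{i=1}^{m} h(d_i - f)\,d_i}{\sum_{i=1}^{m} h(d_i - f)}
\end{align*}
```

Because the weights $h(d_i - f)$ themselves depend on f, the last line is an implicit equation in f rather than a closed-form solution. It is typically solved iteratively, with the weights recomputed from the current estimate at each step.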
In LS regression, all data points are weighted equally with $h(\eta) = 1$ and the estimate is $f^* = \frac{1}{m} \sum_{i=1}^{m} d_i$. When outliers are weighted equally with inliers, they cause considerable bias and rapid deterioration in the quality of the estimate. In robust M-estimation, the function h provides adaptive weighting. The influence from $d_i$ is decreased when $|\eta_i| = |d_i - f|$ is very large and suppressed when it is infinitely large.
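The contrast between equal and adaptive weighting can be illustrated with a small numerical sketch. It makes several assumptions that are not taken from the text: a Huber-type weight (1 inside a threshold `gamma`, `gamma/|eta|` outside it), the sample median as the starting point, a plain fixed-point iteration on the weighted sum, and a made-up data set. The names `huber_weight` and `m_estimate_location` are hypothetical.

```python
from statistics import mean, median


def huber_weight(eta, gamma):
    """Assumed Huber-type weight: 1 for |eta| <= gamma, gamma/|eta| otherwise."""
    a = abs(eta)
    return 1.0 if a <= gamma else gamma / a


def m_estimate_location(data, gamma, max_iter=100, tol=1e-9):
    """Iterate f <- sum(h_i * d_i) / sum(h_i) with h_i = h(d_i - f)."""
    f = median(data)  # starting point chosen for this sketch
    for _ in range(max_iter):
        # Recompute the adaptive weights from the current residuals.
        weights = [huber_weight(d - f, gamma) for d in data]
        f_new = sum(w * d for w, d in zip(weights, data)) / sum(weights)
        if abs(f_new - f) < tol:
            return f_new
        f = f_new
    return f


# Made-up samples: five inliers near 10 and one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]

ls_estimate = mean(data)  # equal weights, h = 1
robust_estimate = m_estimate_location(data, gamma=0.5)

print(f"LS estimate:     {ls_estimate:.3f}")
print(f"Robust estimate: {robust_estimate:.3f}")
```

With these samples the equally weighted mean is pulled well away from the inliers, while the reweighted estimate stays close to them because the outlier receives a weight near zero. Other weighting functions could be substituted for `huber_weight` without changing the iteration.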
Table 4.1: Robust functions.
Table 4.1 lists some robust functions used in practice where $\rho(\eta) = g_\gamma(\eta)$. They are closely related to the adaptive interaction function and adaptive potential function defined in (3.27) and (3.28). Fig. 4.1 shows their qualitative shapes in comparison with the quadratic and the line process models (note that a trivial constant may be added to $g_\gamma(\eta)$). These robust functions are piecewise, as in the line process model. Moreover, the parameter $\gamma$ in $\rho$ is dependent on some scale estimate, such as the median of absolute deviation (MAD).
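As an illustration of a scale estimate of this kind, the following sketch computes the MAD of a small made-up sample and derives a threshold from it. The function name `mad`, the data, and the tuning constant `c` are hypothetical; the factor 1.4826 is the usual constant that makes the MAD consistent with the standard deviation under a Gaussian model, and its use here is an assumption of the sketch rather than something stated in the text.

```python
from statistics import median, pstdev


def mad(data):
    """Median of the absolute deviations from the sample median."""
    center = median(data)
    return median([abs(d - center) for d in data])


# Made-up samples: five inliers near 10 and one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]

scale = 1.4826 * mad(data)  # Gaussian-consistent robust scale (assumed convention)
c = 2.0                     # hypothetical tuning constant, for illustration only
gamma = c * scale           # threshold parameter derived from the scale estimate

print(f"MAD:                {mad(data):.3f}")
print(f"Robust scale:       {scale:.3f}")
print(f"Standard deviation: {pstdev(data):.3f}")
print(f"gamma:              {gamma:.3f}")
```

The MAD is barely affected by the single outlier, whereas the standard deviation of the same sample is dominated by it, which is why a scale estimate of this kind is a reasonable basis for setting the parameter.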
Figure 4.1: The qualitative shapes of potential functions in use. The quadratic prior (equivalent to LS) model in (a) is unable to deal with discontinuities (or outliers). The line process model (b) and Tukey's (c), Huber's (d), Andrews' (e) and Hampel's (f) robust models are able to, owing to their property of $\lim_{\eta \to \infty} h(\eta) = 0$. From (Li 1995a) with permission; © 1995 Elsevier.