Background
Before describing the mechanics of estimating robust regression models in EViews, it will be useful to review the basics of the three estimation methods and to outline alternative approaches for computing the covariance matrix of the coefficient estimates.
M-estimation
The traditional least squares estimator is computed by finding coefficient values that minimize the sum of the squared residuals:

$$\hat{\beta}_{LS} = \operatorname*{argmin}_{\beta}\ \sum_{i=1}^{N} e_i^2(\beta) \qquad (33.1)$$

where the residual function $e_i(\beta)$ is given by

$$e_i(\beta) = y_i - X_i'\beta \qquad (33.2)$$

Since the residuals $e_i(\beta)$ enter the objective function on the right-hand side of Equation (33.1) after squaring, the effects of outliers are magnified accordingly.
M-estimator definition
One obvious approach to robust regression replaces the squaring of residuals in Equation (33.1) with a function that assigns less weight to outliers. The Huber M-estimator (“M” for “maximum likelihood estimator-like”) computes the coefficient values that minimize the summed values of a function $\rho_c$ of the residuals:

$$\hat{\beta}_{M} = \operatorname*{argmin}_{\beta}\ \sum_{i=1}^{N} \rho_c\!\left(\frac{w_i\, e_i(\beta)}{\sigma}\right) \qquad (33.3)$$

where $\sigma$ is a measure of the scale of the residuals, $c$ is an arbitrary positive tuning constant associated with the function, and where the $w_i$ are individual weights that are generally set to 1, but may be set to:

$$w_i = \sqrt{1 - h_i} \qquad (33.4)$$

(with $h_i$ the i-th diagonal element of the Hat Matrix) to down-weight observations with high leverage.
The potential choices for the function $\rho_c$ (Andrews, Bisquare, Cauchy, Fair, Huber-Bisquare, Logistic, Median, Talworth, Welsch) each have an associated default value of the tuning constant $c$.
The default tuning constants for each function are taken from Holland and Welsch (1977), and are chosen so that the estimator achieves 95% asymptotic efficiency under residual normality.
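For concreteness, the sketch below implements two of these $\rho_c$ functions and their derivatives in Python. The Huber and Bisquare forms and the tuning constants shown (1.345 and 4.685) are the standard Holland–Welsch 95%-efficiency values from the robustness literature; the exact forms and defaults used by EViews should be taken from its documentation.

```python
import numpy as np

def huber_rho(u, c=1.345):
    """Huber rho: quadratic near zero, linear in the tails."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= c, 0.5 * u**2, c * np.abs(u) - 0.5 * c**2)

def huber_psi(u, c=1.345):
    """Derivative of huber_rho: clips residuals at +/- c."""
    return np.clip(u, -c, c)

def bisquare_rho(u, c=4.685):
    """Bisquare rho: bounded at c^2/6, so extreme outliers add no more."""
    u = np.asarray(u, dtype=float)
    z = np.minimum((u / c)**2, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - z)**3)

def bisquare_psi(u, c=4.685):
    """Derivative of bisquare_rho: redescends to zero beyond +/- c."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= c, u * (1.0 - (u / c)**2)**2, 0.0)
```

The key design contrast is that the Huber $\psi$ merely caps the influence of an outlier, while the redescending Bisquare $\psi$ removes it from the fit entirely once it is far enough out.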
M-estimator calculation
If the scale $\sigma$ is known, then the $k$-vector of coefficient estimates $\hat{\beta}$ may be found using standard iterative techniques for solving the $k$ nonlinear first-order equations:

$$\sum_{i=1}^{N} \psi_c\!\left(\frac{w_i\, e_i(\beta)}{\sigma}\right) w_i\, X_{ij} = 0 \qquad (33.5)$$

for $j = 1, \ldots, k$, where $\psi_c = \rho_c'$, the derivative of the $\rho_c$ function, and $X_{ij}$ is the value of the j-th regressor for observation $i$.
Since $\sigma$ is not known, a sequential procedure is used that alternates between: (1) computing an updated estimate of the scale $\hat{\sigma}$ given the current coefficient estimates $\hat{\beta}$, and (2) using iterative methods to find the $\hat{\beta}$ that solves Equation (33.5) for the given $\hat{\sigma}$. The initial coefficients $\hat{\beta}_0$ are obtained from ordinary least squares and are used to compute a scale estimate $\hat{\sigma}_0$; from that are formed new coefficient estimates $\hat{\beta}_1$, followed by a new scale estimate $\hat{\sigma}_1$, and so on until convergence is reached.
Given an estimate $\hat{\beta}_j$, the updated scale $\hat{\sigma}_j$ is estimated using one of three different methods:

Median Absolute Deviation – Zero Centered (MADZERO):
$$\hat{\sigma}_j = \operatorname{med}\left(|e_i|\right) / 0.6745$$

Median Absolute Deviation – Median Centered (MADMED):
$$\hat{\sigma}_j = \operatorname{med}\left(|e_i - \operatorname{med}(e_i)|\right) / 0.6745$$

Huber Scaling:
$$\hat{\sigma}_j^2 = \frac{1}{Nh}\sum_{i=1}^{N} \chi_c\!\left(\frac{e_i}{\hat{\sigma}_{j-1}}\right)\hat{\sigma}_{j-1}^2, \qquad \chi_c(u) = \min(u^2, c^2)$$

where the $e_i$ are the residuals associated with $\hat{\beta}_j$, $h$ is a constant chosen so that the estimator is consistent under normality, and the initial scale required for the Huber method is estimated by:

$$\hat{\sigma}_0 = \sqrt{\frac{1}{N}\sum_{i=1}^{N} e_i^2} \qquad (33.6)$$
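The alternating procedure can be illustrated with a short iteratively reweighted least squares (IRLS) sketch. This is not EViews' implementation; it is a minimal illustration assuming the Huber $\psi$ function with MADZERO scaling, and the helper names are our own.

```python
import numpy as np

def madzero(e):
    """MADZERO scale: median of absolute residuals, normalized for normality."""
    return np.median(np.abs(e)) / 0.6745

def huber_psi(u, c=1.345):
    return np.clip(u, -c, c)

def m_estimate(y, X, c=1.345, tol=1e-8, max_iter=100):
    """Minimal IRLS sketch for a Huber M-estimator with MADZERO scaling."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # step 0: OLS start
    for _ in range(max_iter):
        e = y - X @ beta
        sigma = madzero(e)                        # (1) update the scale
        u = e / sigma
        # IRLS weights w(u) = psi(u)/u, with w(0) = 1 for the Huber function
        u_safe = np.where(u == 0.0, 1.0, u)
        w = np.where(u == 0.0, 1.0, huber_psi(u, c) / u_safe)
        WX = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ WX, WX.T @ y)  # (2) weighted LS step
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new, sigma
        beta = beta_new
    return beta, sigma
```

Down-weighting of high-leverage observations via Equation (33.4) could be added by multiplying residuals and regressors by $w_i$; it is omitted here for brevity.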
M-estimator summary statistics
EViews automatically computes a variety of robust summary statistics for equations estimated using M-estimators.
R-squared
Maronna (1996, p. 171) defines the robust $R^2$ statistic as

$$R^2 = \frac{\displaystyle\sum_{i=1}^{N}\rho_c\!\left(\frac{y_i - \hat{\mu}}{\hat{\sigma}}\right) - \sum_{i=1}^{N}\rho_c\!\left(\frac{e_i}{\hat{\sigma}}\right)}{\displaystyle\sum_{i=1}^{N}\rho_c\!\left(\frac{y_i - \hat{\mu}}{\hat{\sigma}}\right)}$$

where $\hat{\mu}$ is the M-estimate from the constant-only specification. The adjusted $R^2$ is calculated as:

$$\bar{R}^2 = 1 - \frac{N-1}{N-k}\left(1 - R^2\right) \qquad (33.7)$$
Both of these statistics can be highly sensitive to the choice of the $\rho_c$ function, even when the coefficient estimates and standard errors are not. Studies have also found that these statistics may be upwardly biased (see, for example, Renaud and Victoria-Feser (2010)).
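A sketch of how these two statistics could be computed, assuming a Huber $\rho$ and that the constant-only M-estimate `mu_const` and the final scale `sigma` are already available (both names, and the function interface, are ours):

```python
import numpy as np

def huber_rho(u, c=1.345):
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= c, 0.5 * u**2, c * np.abs(u) - 0.5 * c**2)

def robust_r2(y, e, mu_const, sigma, k, c=1.345):
    """Maronna-style robust R^2 and its adjusted variant (Equation (33.7)).

    y        : dependent variable
    e        : residuals from the full M-estimated model
    mu_const : M-estimate of location from the constant-only model
    sigma    : final scale estimate; k : number of coefficients
    """
    n = len(y)
    total = np.sum(huber_rho((y - mu_const) / sigma, c))
    resid = np.sum(huber_rho(e / sigma, c))
    r2 = (total - resid) / total
    r2_adj = 1.0 - (n - 1) / (n - k) * (1.0 - r2)
    return r2, r2_adj
```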
Rw-squared
Renaud and Victoria-Feser (2010) propose the $R_w^2$ statistic, and provide simulation results showing $R_w^2$ to be a better measure of fit than the robust $R^2$ outlined above. The $R_w^2$ statistic is defined as

$$R_w^2 = \frac{\left(\sum_{i=1}^{N} w_i\,(y_i - \bar{y}_w)(\hat{y}_i - \bar{\hat{y}}_w)\right)^2}{\sum_{i=1}^{N} w_i\,(y_i - \bar{y}_w)^2 \cdot \sum_{i=1}^{N} w_i\,(\hat{y}_i - \bar{\hat{y}}_w)^2} \qquad (33.8)$$

where $w_i = \psi_c(u_i)/u_i$ is a function of the standardized residual $u_i = e_i/\hat{\sigma}$, and the weighted means are given by

$$\bar{y}_w = \frac{\sum_{i=1}^{N} w_i\, y_i}{\sum_{i=1}^{N} w_i}, \qquad \bar{\hat{y}}_w = \frac{\sum_{i=1}^{N} w_i\, \hat{y}_i}{\sum_{i=1}^{N} w_i} \qquad (33.9)$$

As with the robust $R^2$, an adjusted value of $R_w^2$ may be calculated from the unadjusted statistic:

$$\bar{R}_w^2 = 1 - \frac{N-1}{N-k}\left(1 - R_w^2\right) \qquad (33.10)$$
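A hypothetical computation along the lines of Equations (33.8)–(33.10), assuming Huber $\psi(u)/u$ weights; the function name and interface are illustrative only:

```python
import numpy as np

def rw_squared(y, yhat, sigma, k, c=1.345):
    """Weighted-correlation R_w^2 in the spirit of Equations (33.8)-(33.9)."""
    u = (y - yhat) / sigma                     # standardized residuals
    u_safe = np.where(u == 0.0, 1.0, u)
    w = np.where(u == 0.0, 1.0, np.clip(u, -c, c) / u_safe)  # Huber psi(u)/u
    yb = np.sum(w * y) / np.sum(w)             # weighted mean of y
    fb = np.sum(w * yhat) / np.sum(w)          # weighted mean of fitted values
    num = np.sum(w * (y - yb) * (yhat - fb)) ** 2
    den = np.sum(w * (y - yb) ** 2) * np.sum(w * (yhat - fb) ** 2)
    rw2 = num / den
    n = len(y)
    rw2_adj = 1.0 - (n - 1) / (n - k) * (1.0 - rw2)  # Equation (33.10)
    return rw2, rw2_adj
```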
Rn-squared Statistic
The $R_n^2$ statistic is a robust version of a Wald test of the hypothesis that all of the coefficients are equal to zero. It is calculated using the standard Wald test quadratic form:

$$R_n^2 = \tilde{\beta}'\,\hat{V}^{-1}\tilde{\beta} \qquad (33.11)$$

where $\tilde{\beta}$ are the $k-1$ non-intercept robust coefficient estimates and $\hat{V}$ is the corresponding estimated covariance matrix. Under the null hypothesis that all of the coefficients are equal to zero, the $R_n^2$ statistic is asymptotically distributed as a $\chi^2_{k-1}$.
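The quadratic form and its asymptotic p-value are straightforward to compute; a minimal sketch (function name ours):

```python
import numpy as np
from scipy import stats

def rn_squared(beta, V):
    """Wald quadratic form of Equation (33.11) for the non-intercept
    coefficients beta with estimated covariance V; returns the statistic
    and its asymptotic chi-square p-value."""
    beta = np.asarray(beta, dtype=float)
    stat = float(beta @ np.linalg.solve(V, beta))
    pval = stats.chi2.sf(stat, df=len(beta))
    return stat, pval
```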
Deviance
The deviance is the value of the objective function in Equation (33.3) evaluated at the final coefficient estimates and estimate of the scale:

$$D = 2\,\hat{\sigma}^2 \sum_{i=1}^{N}\rho_c\!\left(\frac{e_i}{\hat{\sigma}}\right) \qquad (33.12)$$
Information Criteria
EViews reports two information criteria for M-estimated equations: the robust equivalent of the Akaike Information Criterion ($AIC_R$), and a corresponding robust Schwarz Information Criterion ($SIC_R$):

$$AIC_R = 2\sum_{i=1}^{N}\rho_c(u_i) + 2k\,\hat{\alpha}, \qquad SIC_R = 2\sum_{i=1}^{N}\rho_c(u_i) + k\,\hat{\alpha}\,\log(N) \qquad (33.13)$$

where $u_i = e_i/\hat{\sigma}$, $\hat{\alpha} = \dfrac{N^{-1}\sum_i \psi_c^2(u_i)}{N^{-1}\sum_i \psi_c'(u_i)}$, and $\psi_c$ is the derivative of $\rho_c$ as outlined in Holland and Welsch (1977). See Ronchetti (1985) for details.
S-estimation
The S-estimator (“S” for “scale statistic”) is a member of the class of high-breakdown-value estimators introduced by Rousseeuw and Yohai (1984). The breakdown value of an estimator is a measure of its robustness to outliers. (A good description of breakdown values and high-breakdown-value estimators can be found in Hubert and Debruyne (2009).)
S-estimator definition
S-estimators find the set of coefficients $\hat{\beta}$ that provide the smallest estimate of the scale $\hat{\sigma}$ such that:

$$\frac{1}{N}\sum_{i=1}^{N}\rho_c\!\left(\frac{e_i(\beta)}{\hat{\sigma}}\right) = \kappa \qquad (33.14)$$

for the function $\rho_c$ with tuning constant $c$, where $\kappa$ is taken to be $E_{\Phi}\left[\rho_c\right]$ with $\Phi$ the standard normal distribution. The breakdown value $\lambda$ for this estimator is $\lambda = \kappa / \rho_c(c)$.
Following Rousseeuw and Yohai, we choose a $\rho_c$ function based on the integral of the Biweight function:

$$\rho_c(u) = \begin{cases} \dfrac{u^2}{2} - \dfrac{u^4}{2c^2} + \dfrac{u^6}{6c^4} & |u| \le c \\[6pt] \dfrac{c^2}{6} & |u| > c \end{cases} \qquad (33.15)$$

and estimate the scale $\hat{\sigma}$ using the Median Absolute Deviation – Zero Centered (MADZERO) method.

Note that $c$ affects the objective function through both $\rho_c$ and $\kappa$. $c$ is typically chosen to achieve a desired breakdown value. EViews defaults to a $c$ value of 1.5476, implying a breakdown value of 0.5. Other notable values of $c$ (with associated breakdown values $\lambda$) are:
| $c$ | Breakdown value $\lambda$ |
|---|---|
| 5.1824 | 0.10 |
| 4.0963 | 0.15 |
| 3.4207 | 0.20 |
| 2.9370 | 0.25 |
| 2.5608 | 0.30 |
| 1.9880 | 0.40 |
| 1.5476 | 0.50 |
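The mapping between $c$ and the breakdown value can be checked numerically from the relation $\lambda = E_{\Phi}[\rho_c]/\rho_c(c)$ implied by Equation (33.14); a sketch using numerical quadrature:

```python
import numpy as np
from scipy import integrate, stats

def biweight_rho(u, c):
    """Integral of the Biweight function, Equation (33.15)."""
    u = np.minimum(np.abs(u), c)
    return u**2 / 2 - u**4 / (2 * c**2) + u**6 / (6 * c**4)

def breakdown_value(c):
    """lambda = E_Phi[rho_c] / rho_c(c) for Biweight tuning constant c."""
    expectation, _ = integrate.quad(
        lambda z: biweight_rho(z, c) * stats.norm.pdf(z), -np.inf, np.inf)
    return expectation / (c**2 / 6.0)

for c in (5.1824, 4.0963, 3.4207, 2.9370, 2.5608, 1.9880, 1.5476):
    print(f"c = {c:.4f}  ->  breakdown ~ {breakdown_value(c):.2f}")
```

Running this should reproduce the table values to the displayed precision.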
S-estimator calculation
Calculation of S-estimates is computationally intensive, and a number of fast algorithms exist that provide accurate approximations. EViews uses the Fast-S algorithm of Salibian-Barrera and Yohai (2006):
1. Obtain a random subsample of size $m$ from the data and compute the least squares regression to obtain an initial candidate $\hat{\beta}$. By default $m$ is set equal to $k$, the number of regressors. (Note that with the default $m = k$, the regression will produce an exact fit for the subsample.)
2. Using the full sample, perform a set of $r$ refinements to the initial coefficient estimates using a variant of M-estimation which takes a single step toward the solution of Equation (33.5) at every update. These modified M-estimate refinements employ the Bisquare function with tuning parameter $c$ and the scale estimator

$$\hat{\sigma}_{j}^{2} = \frac{\hat{\sigma}_{j-1}^{2}}{N\,\lambda}\sum_{i=1}^{N}\frac{\rho_c\!\left(e_i/\hat{\sigma}_{j-1}\right)}{\rho_c(c)} \qquad (33.16)$$

where $\hat{\sigma}_{j-1}$ is the previous iteration's estimate of the scale and $\lambda$ is the breakdown value defined earlier. The initial scale estimator $\hat{\sigma}_{0}$ is obtained using MADZERO.
3. Compute a new set of residuals over the entire sample using the possibly refined initial coefficient estimates, compute an estimate of the scale $\hat{\sigma}$ using MADZERO, and produce a final estimate of the scale by iterating Equation (33.16) (with $\hat{\sigma}$ in place of $\hat{\sigma}_{0}$) to convergence, or until the maximum number of iterations is reached.
4. Steps 1–3 are repeated a number of times. The best (smallest) of the resulting scale estimates are then refined using M-estimation as in Step 2, iterating Equation (33.16) to convergence. The smallest scale from those refined scales is the final estimate of $\hat{\sigma}$, and the final coefficient estimates are the corresponding estimates of $\hat{\beta}$. A simplified sketch of the full procedure follows.
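The following is a compact, simplified sketch of this resampling scheme, assuming the Biweight $\rho$ with $c = 1.5476$ and MADZERO initial scales. It is not EViews' implementation: the candidate counts (`n_subsets`, `r` refinement steps, `q` finalists) are illustrative placeholders, not EViews' defaults.

```python
import numpy as np

C = 1.5476       # Biweight tuning constant for a 0.5 breakdown value
LAM = 0.5        # the corresponding breakdown value (lambda)

def rho(u, c=C):
    """Integral of the Biweight function, Equation (33.15)."""
    u = np.minimum(np.abs(u), c)
    return u**2 / 2 - u**4 / (2 * c**2) + u**6 / (6 * c**4)

def madzero(e):
    return np.median(np.abs(e)) / 0.6745

def scale_step(e, s_prev, c=C):
    """One update of the scale recursion in Equation (33.16)."""
    rho_max = c**2 / 6.0
    mean_rho = np.sum(rho(e / s_prev, c) / rho_max) / (len(e) * LAM)
    return s_prev * np.sqrt(mean_rho)

def refine(y, X, beta, n_steps, c=C):
    """Step 2: one-step M-refinements with Bisquare weights."""
    s = madzero(y - X @ beta)
    for _ in range(n_steps):
        e = y - X @ beta
        s = scale_step(e, s, c)
        u = e / s
        w = np.where(np.abs(u) <= c, (1 - (u / c)**2)**2, 0.0)  # psi(u)/u
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta, s

def fast_s(y, X, n_subsets=50, r=2, q=5, seed=None):
    """Simplified Fast-S: resample, briefly refine, keep the best, polish."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    candidates = []
    for _ in range(n_subsets):                      # Steps 1-3
        idx = rng.choice(n, size=k, replace=False)  # exact-fit subsample
        beta0, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        candidates.append(refine(y, X, beta0, n_steps=r))
    candidates.sort(key=lambda t: t[1])             # order by scale estimate
    polished = [refine(y, X, b, n_steps=50) for b, _ in candidates[:q]]
    return min(polished, key=lambda t: t[1])        # (beta, scale)
```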
S-estimator summary statistics
The following summary statistics are available for equations estimated by S-estimation:
R-squared
The robust version of $R^2$ for S-estimation is given by:

$$R^2 = 1 - \left(\frac{\hat{\sigma}_S}{\hat{\sigma}_{S0}}\right)^{2} \qquad (33.17)$$

where $\hat{\sigma}_S$ is the estimate of the scale from the final estimation, and $\hat{\sigma}_{S0}$ is an estimate of the scale from S-estimation with only a constant as a regressor.
Deviance
The S-estimator deviance value is given by:

$$D = 2\,\hat{\sigma}_S^2\sum_{i=1}^{N}\rho_c\!\left(\frac{e_i}{\hat{\sigma}_S}\right) = 2N\kappa\,\hat{\sigma}_S^2 \qquad (33.18)$$

where the second equality follows from the defining condition in Equation (33.14).
Rn-squared Statistic
The $R_n^2$ statistic is identical to the one computed for M-estimation. See “Rn-squared Statistic” for discussion.
MM Estimation
MM-estimation addresses outliers in both the dependent and the independent variables by combining S-estimation with M-estimation.
The MM-estimator first computes S-estimates of the coefficients and scale, then uses the estimate of the scale as a fixed value in iterating to find a solution to
Equation (33.5). The second-stage M-estimation in EViews uses the Bisquare function with a default tuning parameter value of 4.684, which gives 95% relative efficiency for normal errors (Yohai, 1987).
The summary statistics for MM-estimation are obtained from the second-stage M-estimation procedure.
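A minimal sketch of the second stage, assuming S-stage outputs `beta_s` and `sigma_s` are already available (for example, from the `fast_s` sketch above); the Bisquare tuning constant 4.684 is the default stated in the text, and the function name is ours:

```python
import numpy as np

def mm_second_stage(y, X, beta_s, sigma_s, c=4.684, tol=1e-8, max_iter=200):
    """MM second stage: M-iterations with Bisquare weights, holding the
    S-stage scale sigma_s fixed; beta_s is the S-stage coefficient vector."""
    beta = np.asarray(beta_s, dtype=float)
    for _ in range(max_iter):
        u = (y - X @ beta) / sigma_s
        w = np.where(np.abs(u) <= c, (1 - (u / c)**2)**2, 0.0)  # psi(u)/u
        WX = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ WX, WX.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

Holding the scale fixed at the high-breakdown S-estimate is what lets the second stage pursue efficiency without sacrificing the breakdown value of the first.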
Coefficient Covariance Methods
EViews offers three different methods for computing the coefficient covariance matrix, taken from Huber (1981, p. 173, equations 6.5, 6.6, and 6.7). All three methods provide consistent estimates of the covariance matrix, with none having uniformly better properties than the others:

$$\hat{V}_1 = \kappa^2\,\frac{\hat{\sigma}^2\sum_{i}\psi_c^2(u_i)\,/\,(N-k)}{\left[\sum_{i}\psi_c'(u_i)\,/\,N\right]^2}\,(X'X)^{-1}$$

$$\hat{V}_2 = \kappa\,\frac{\hat{\sigma}^2\sum_{i}\psi_c^2(u_i)\,/\,(N-k)}{\sum_{i}\psi_c'(u_i)\,/\,N}\,W^{-1}$$

$$\hat{V}_3 = \kappa^{-1}\,\frac{\hat{\sigma}^2}{N-k}\sum_{i}\psi_c^2(u_i)\;W^{-1}(X'X)\,W^{-1}$$

with

$$\kappa = 1 + \frac{k}{N}\cdot\frac{\operatorname{var}(\psi_c')}{\left[\operatorname{ave}(\psi_c')\right]^2}, \qquad W = \sum_{i=1}^{N}\psi_c'(u_i)\,X_i X_i' \qquad (33.19)$$

where ave and var denote the sample mean and variance across observations, and where, as before, $\psi_c = \rho_c'$, the $u_i$ are the scaled residuals, and $X_{ij}$ is the value of the j-th regressor for observation $i$.
The first method (which is the easiest computationally) is the default choice.
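A sketch of the first (default) method under the formulas above, assuming a Huber $\psi$ and standardized residuals $u_i = e_i/\hat{\sigma}$; the helper names are illustrative:

```python
import numpy as np

def huber_psi(u, c=1.345):
    return np.clip(u, -c, c)

def huber_psi_prime(u, c=1.345):
    return (np.abs(u) <= c).astype(float)

def covariance_method1(X, e, sigma, c=1.345):
    """Huber (1981, eq. 6.5)-style covariance: the kappa correction factor
    times a scaled (X'X)^{-1}, using standardized residuals u = e/sigma."""
    n, k = X.shape
    u = e / sigma
    psi, dpsi = huber_psi(u, c), huber_psi_prime(u, c)
    kappa = 1.0 + (k / n) * np.var(dpsi) / np.mean(dpsi)**2
    num = sigma**2 * np.sum(psi**2) / (n - k)
    den = np.mean(dpsi)**2
    return kappa**2 * (num / den) * np.linalg.inv(X.T @ X)
```

For ordinary least squares ($\psi(u) = u$), this expression collapses to the familiar $s^2 (X'X)^{-1}$, which is a useful sanity check on the formula.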