Examples

We provide some examples of elastic net estimation using the prostate cancer data from Stamey et al. (1989), which also serve as example data in Hastie, Tibshirani, and Friedman (2009, p. 3).

The “Prostate.WF1” workfile, which contains these data as obtained from the authors’ website, is provided with your EViews example data.

The dependent variable is the logarithm of prostate-specific antigen (LPSA). The regressors are log(cancer volume) (LCAVOL), log(prostate weight) (LWEIGHT), age (AGE), the logarithm of the amount of benign prostatic hyperplasia (LBPH), seminal vesicle invasion (SVI), log(capsular penetration) (LCP), Gleason score (GLEASON), and percentage Gleason score 4 or 5 (PGG45). Following Hastie, Tibshirani, and Friedman, we standardize the regressors over the full data, creating corresponding series with “_S” appended to the original names.
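EViews creates the standardized series directly, but the transformation itself is just a mean and standard deviation rescaling. As a rough illustration (the values below are made up, not the prostate data), a numpy sketch:

```python
import numpy as np

# Hypothetical regressor matrix (rows = observations, columns = variables);
# these numbers are illustrative only, not the prostate data.
X = np.array([[1.2, 65.0],
              [0.4, 72.0],
              [2.1, 58.0],
              [0.9, 61.0]])

# Standardize each column to mean 0 and standard deviation 1, mirroring
# the "_S" series created from the original regressors.
X_s = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(X_s.mean(axis=0), 0.0), np.allclose(X_s.std(axis=0), 1.0))  # True True
```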

Ridge Regression

We begin with a simple ridge regression example. Open the equation estimation dialog, select ENET - Elastic Net Regularization in the Method drop-down menu, and fill out the dialog as follows:

• We list the dependent variable followed by the standardized variables as regressors

lpsa c lcavol_s lweight_s age_s lbph_s svi_s lcp_s gleason_s pgg45_s

in the Equation specification edit field, and choose Ridge regression in the Type drop-down menu.

• We leave the Lambda edit field blank so that EViews will determine the lambda path of values to consider.

• We set the estimation sample to the observations where the TRAIN series takes the value “T”.

• On the Penalty tab we change the Regressor scaling drop-down to None since the data are pre-scaled.

• We change the Path min/max ratio to 0.0001.

We will leave the remaining settings at their defaults. Click on OK to estimate the equation. EViews will display the main estimation output.
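EViews computes the lambda path internally and its exact rule is not documented here, but a common (glmnet-style) convention is a log-spaced grid running from a data-determined maximum penalty down to that maximum times the Path min/max ratio. A hypothetical sketch, reusing the 0.0001 ratio set above with an arbitrary starting penalty:

```python
import numpy as np

def lambda_path(lambda_max, ratio=1e-4, n_lambda=100):
    # Log-spaced grid from lambda_max down to lambda_max * ratio
    # (a glmnet-style convention; EViews's exact rule may differ).
    return np.geomspace(lambda_max, lambda_max * ratio, num=n_lambda)

# Arbitrary starting penalty for illustration, with the 0.0001 ratio set above.
path = lambda_path(lambda_max=100.0, ratio=1e-4)
```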

The top portion of the output shows that we are estimating a ridge regression model with automatic path estimation. The optimal λ value of 14.23403 is computed from 5-fold cross-validation using a mean square error objective. Since the regressors are already standardized, we have specified no regressor scaling, and there is no dependent variable scaling.

The output shows the results for estimation using the optimal penalty of 61.62 and two larger penalty values (279.4198 and 1358.707) whose cross-validation mean square errors are 1 and 2 standard errors away from the minimum.

In all three cases, all of the coefficients are included in the specification, as expected for ridge regression, and the results clearly show the reduction in both the coefficient values and the goodness-of-fit measures as we increase the value of λ.
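The shrinkage behavior described above follows from the form of the ridge estimator. As a sketch on simulated data (intercept omitted for brevity, whereas EViews leaves the intercept unpenalized), the textbook penalized least squares solution is b(λ) = (X′X + λI)⁻¹X′y, and the norm of the coefficient vector falls as λ grows:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.standard_normal((n, k))    # stand-in for pre-standardized regressors
y = X @ np.array([1.5, -2.0, 0.5]) + rng.standard_normal(n)

def ridge(X, y, lam):
    # Textbook ridge solution: (X'X + lam*I)^(-1) X'y.
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

b_small = ridge(X, y, lam=1.0)
b_large = ridge(X, y, lam=1000.0)

# The coefficient vector shrinks toward zero as lambda increases,
# but no coefficient is set exactly to zero (unlike the Lasso).
print(np.linalg.norm(b_large) < np.linalg.norm(b_small))  # True
```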

Click on View/Coefficient Path/Graphs to show the coefficients plotted against various λ values in the Lambda graph:


The vertical line shows the optimal value of log(λ).

We may also see the relationship between the penalty and the fit measures. Select View/Lambda Path/Graphs and select the Fit Statistics node to see the relationship between λ and the squared deviation based measures of fit:

Note that both measures are monotonically decreasing in the penalty, and the adjusted R² is negative for larger λ values.

The cross-validation views provide more insight into the effect of increasing the penalty. Click on View/Model Selection/Cross-validation Error Graph to show the cross-validation objective plotted against the penalty:

The plot shows the average mean square error obtained from the K-fold cross-validation, again with the vertical line corresponding to the optimal log(λ).
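The K-fold objective underlying this graph can be sketched as follows: split the sample into folds, refit at each candidate penalty with one fold held out, and average the held-out mean square errors. A minimal numpy illustration on simulated data (not EViews's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 60, 4
X = rng.standard_normal((n, k))
y = X @ rng.standard_normal(k) + rng.standard_normal(n)

def ridge(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def cv_mse(X, y, lam, n_folds=5):
    # Average held-out mean square error over the K folds.
    folds = np.array_split(np.arange(len(y)), n_folds)
    errs = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        b = ridge(X[train_idx], y[train_idx], lam)
        resid = y[test_idx] - X[test_idx] @ b
        errs.append(np.mean(resid ** 2))
    return np.mean(errs)

# Evaluate the objective along a log-spaced penalty grid and pick the minimizer.
lambdas = np.geomspace(1000.0, 0.1, num=20)
scores = [cv_mse(X, y, lam) for lam in lambdas]
best = lambdas[int(np.argmin(scores))]
```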

Lasso

Next, we estimate a modified specification using the Lasso penalty function. We change the penalty specification Type to Lasso, and fill out the main estimation dialog with the same variables as before, but increase the relative penalty on PGG45_S by entering

lpsa c lcavol_s lweight_s age_s lbph_s svi_s lcp_s gleason_s @vw(pgg45_s, 1.2)

into the Equation specification edit field.

We use here the simple form of the @vw specification, in which the single argument is assumed to be the weight value. Note that we set the individual weight for PGG45_S to 1.2, so that the variable is more heavily penalized. EViews will apply the specified weight, and then normalize all of the non-zero weights so that they sum to the number of penalized variables.
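The normalization step can be illustrated with a little arithmetic. With eight penalized regressors, seven at the default weight of 1.0 and PGG45_S at 1.2:

```python
import numpy as np

# Eight penalized regressors: default weight 1.0 for all but PGG45_S at 1.2.
w = np.array([1.0] * 7 + [1.2])

# Rescale the non-zero weights so that they sum to the number of
# penalized variables, as described above.
w_norm = w * (np.count_nonzero(w) / w.sum())

print(round(w_norm.sum(), 10))            # 8.0
print(round(w_norm[-1] / w_norm[0], 10))  # 1.2 -- the relative penalty is preserved
```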

For added clarity, we may alter the specification to use the option form of the weight specification:

lpsa c lcavol_s lweight_s age_s lbph_s svi_s lcp_s gleason_s @vw(pgg45_s, w=1.2)

All of the remaining settings are as in the original ridge regression example in “Ridge Regression”.

Click on OK to estimate the equation.

The top of the output is mostly familiar, but has an added line noting the higher individual penalty weight for PGG45_S.

As expected, the Lasso penalty produces more 0-valued coefficients than does the ridge penalty. The main estimation output table drops AGE_S, LCP_S, and GLEASON_S from the display since the corresponding coefficients are 0 in all three models. As the penalty increases in the rightward columns, additional coefficients are set to 0. For a λ of 0.491487, the only remaining coefficients are for the unpenalized intercept, LCAVOL_S, and LWEIGHT_S.
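The exact zeros arise from the soft-thresholding operation at the heart of Lasso estimation. A minimal cyclic coordinate descent sketch on simulated standardized data (illustrative only, not EViews's solver):

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding: shrink z toward 0 and return exactly 0 when |z| <= t.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1,
    # assuming standardized columns.
    n, k = X.shape
    b = np.zeros(k)
    for _ in range(n_iter):
        for j in range(k):
            r = y - X @ b + X[:, j] * b[j]   # residual excluding variable j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return b

rng = np.random.default_rng(2)
X = rng.standard_normal((80, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize, as in the example
y = X @ np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.3]) + 0.5 * rng.standard_normal(80)

b = lasso_cd(X, y, lam=0.5)
print(b)  # some entries are exactly 0.0 -- coefficients dropped by the penalty
```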

Selecting View/Coefficient Path/Graphs and examining the Lambda graph shows the coefficients plotted against various values of the penalty parameter:

Of particular interest are the values of log(λ) where the lines intersect the horizontal axis. We see, for example, that GLEASON_S is the first variable to drop out as we increase the penalty, followed significantly later by LCP_S and then AGE_S. The remaining variables have coefficients which reach zero only at log(λ) values greater than the optimum.

We can see an overview of this behavior by looking at the graph of the number of non-zero coefficients against the values of log(λ) (View/Lambda Path/Graphs):

Looking at the cross-validation results, View/Model Selection/Cross-validation Criteria Graph shows the behavior of the top 20 models,

and View/Model Selection/Cross-validation Error Graph shows the behavior of the cross-validation objective along the path.

The cross-validation table results (View/Model Selection/Criteria Table) show the objective and various measures of fit for all of the values of the penalty:

Elastic Net

Lastly, we estimate an elastic net specification, which mixes the ridge and Lasso penalties, adding individual penalty weights and coefficient limits. To estimate this model, open the elastic net dialog and fill it out as depicted:

lpsa c @vw(lcavol_s, cmax=0.4) lweight_s age_s lbph_s svi_s lcp_s gleason_s @vw(pgg45_s, cmax=0.1, w=1.2)

In this specification, we restrict the LCAVOL_S variable to have a coefficient less than or equal to 0.4, and we assign an individual weight of 1.2 to PGG45_S, with a coefficient maximum value of 0.1.
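The @vw options used here can be read as two separate ingredients: an individual weight that scales a variable's contribution to the elastic net penalty, and a cmax limit on the coefficient itself. A hypothetical numpy sketch of those pieces (assumed forms for illustration; EViews's exact internals are not shown in this section):

```python
import numpy as np

def enet_penalty(b, lam, alpha, w):
    # Weighted elastic net penalty: alpha mixes the L1 (Lasso) and
    # L2 (ridge) components; w holds the individual @vw weights.
    return lam * np.sum(w * (alpha * np.abs(b) + 0.5 * (1.0 - alpha) * b ** 2))

def apply_cmax(b, cmax):
    # Enforce a per-coefficient cmax limit by clipping (assumed semantics).
    return np.clip(b, -cmax, cmax)

b = np.array([0.55, 0.18])      # hypothetical unconstrained coefficient values
cmax = np.array([0.4, 0.1])     # the two cmax limits from the dialog
print(apply_cmax(b, cmax))      # both limits bind, at 0.4 and 0.1
```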

Again, we leave the Lambda edit field blank so that EViews will determine the lambda path of values to consider, and we set the estimation sample to the observations where the TRAIN series takes the value “T”. The remaining settings are as in the original ridge regression example in “Ridge Regression”.

Click on OK to estimate the equation.

The output should be familiar. Note the line in the header displaying the individual weights and coefficient limits for LCAVOL_S and PGG45_S.

We see here that the upper coefficient limit is binding for LCAVOL_S at both the optimal and the “+1 SE” penalties, but that at the “+2 SE” penalty, the limit is no longer relevant. The PGG45_S coefficient limit is binding at the optimal, but not for the two higher penalties.

Click on View/Coefficient Path/Graphs and select the Lambda graph to show the coefficients along the path:

Notable are the paths of LCAVOL_S and PGG45_S, for which the restrictions are binding along the path until the penalty is large enough to shrink the coefficients within the bounds. As expected, as the penalization increases, the complexity of the model decreases and the coefficients gradually shrink to 0.

The cross-validation error graph (View/Model Selection/Cross-validation Error) shows the behavior of the mean square error along the path: