EViews Help: Background

The following discussion describes only the basic features of switching models. Switching models have a long history in economics that is detailed in numerous surveys (Goldfeld and Quandt, 1973, 1976; Maddala, 1986; Hamilton, 1994; Frühwirth-Schnatter, 2006), and we encourage you to explore these resources for additional discussion.

Suppose that the random variable of interest,

follows a process that depends on the value of an unobserved discrete state variable

. We assume there are

possible regimes, and we are said to be in state or regime

in period

when

, for

The switching model assumes that there is a different regression model associated with each regime. Given regressors

and

, the conditional mean of

in regime

is assumed to be the linear specification:

where

and

are

and

vectors of coefficients. Note that the

coefficients for

are indexed by regime and that the

coefficients associated with

are regime invariant.

when

, where

standard normally distributed. Note that the standard deviation

may be regime dependent,

The likelihood contribution for a given observation may be formed by weighting the density function in each of the regimes by the one-step ahead probability of being in that regime:

are parameters that determine the regime probabilities,

is the standard normal density function, and

is the information set in period

. In the simplest case, the

represent the regime probabilities themselves.

which may be maximized with respect to

To this point, we have treated the regime probabilities

in an abstract fashion. This section considers a simple switching model featuring independent regime probabilities. We begin by focusing on the specification of the regime probabilities, then describe likelihood evaluation and estimation of those probabilities.

It should be emphasized that the following discussion is valid only for specifications with uncorrelated errors. Models with correlated errors are described in “Serial Correlation”.

In the case where the probabilities are constant values, we could simply treat them as additional parameters in the likelihood in Equation (39.4). More generally, we may allow for varying probabilities by assuming that

is a function of vectors of exogenous observables

and coefficients

parameterized using a multinomial logit specification:

for

with the identifying normalization

. The special case of constant probabilities is handled by choosing

to be identically equal to 1.

This likelihood may be maximized with respect to the parameters

using iterative methods.

It is worth noting that the likelihood function for this normal mixture model is unbounded for certain parameter values. However, local optima have the usual consistency, asymptotic normality, and efficiency properties. See Maddala (1986) for discussion of this issue as well as a survey of different algorithms and approaches for estimating the parameters.

Given parameter point-estimates, coefficient covariances may be estimated using conventional methods, e.g., inverse negative Hessian, inverse outer-product of the scores, and robust sandwich.

The likelihood expression in Equation (39.6) depends on the one-step ahead probabilities of being in a regime:

. Note, however, that observing the value of the dependent variable in a given period provides additional information about which regime is in effect. We may use this contemporaneous information to obtain updated estimates of the regime probabilities

The process by which the probability estimates are updated is commonly termed filtering. By Bayes’ theorem and the laws of conditional probability, we have the filtering expressions:

The expressions on the right-hand side are obtained as a by-product of the densities obtained during likelihood evaluation. Substituting, we have:

The Markov switching regression model extends the simple exogenous probability framework by specifying a first-order Markov process for the regime probabilities. We begin by describing the regime probability specification, then discuss likelihood computation, filtering, and smoothing.

Typically, these probabilities are assumed to be time-invariant so that

for all

, but this restriction is not required.

where the

-th element represents the probability of transitioning from regime

in period

to regime

in period

. (Note that some authors use the transpose of

so that all of their indices are reversed from those used here.)

As in the simple switching model, we may parameterize the probabilities in terms of a multinomial logit. Note that since each row of the transition matrix specifies a full set of conditional probabilities, we define a separate multinomial specification for each row

of the matrix

for

and

with the normalizations

As noted earlier, Markov switching models are generally specified with constant probabilities so that

contains only a constant. Hamilton’s (1989) model of GDP is a notable example of a constant transition probability specification. Alternately, Diebold, Lee, and Weinbach (1994), and Filardo (1994) adopt two-state models that employ time-varying logistic parameterized probabilities.

The Markov property of the transition probabilities implies that the expressions on the right-hand side of Equation (39.4) must be evaluated recursively.

Briefly, each recursion step begins with filtered estimates of the regime probabilities for the previous period. Given filtered probabilities,

, the recursion may broken down into three steps:

1. We first form the one-step ahead predictions of the regime probabilities using basic rules of probability and the Markov transition matrix:

2. Next, we use these one-step ahead probabilities to form the one-step ahead joint densities of the data and regimes in period

3. The likelihood contribution for period

is obtained by summing the joint probabilities across unobserved states to obtain the marginal distribution of the observed data

4. The final step is to filter the probabilities by using the results in Equation (39.13) to update one-step ahead predictions of the probabilities:

These steps are repeated successively for each period,

. All that we require for implementation are the initial filtered probabilities,

, or alternately, the initial one-step ahead regime probabilities

. See “Initial Probabilities” for discussion.

The likelihood obtained by summing the terms in Equation (39.14) may be maximized with respect to the parameters

using iterative methods. Coefficient covariances may be estimated using standard approaches.

Estimates of the regime probabilities may be improved by using all of the information in the sample. The smoothed estimates for the regime probabilities in period

use the information set in the final period,

, in contrast to the filtered estimates which employ only contemporaneous information,

. Intuitively, using information about future realizations of the dependent variable

(

) improves our estimates of being in regime

in period

because the Markov transition probabilities link together the likelihood of the observed data in different periods.

Kim (2004) provides an efficient smoothing algorithm that requires only a single backward recursion through the data. Under the Markov assumption, Kim shows that the joint probability is given by

The key in moving from the first to the second line of Equation (39.16) is the fact that under appropriate assumptions, if

were known, there is no additional information about

in the future data

The smoothed probability in period

is then obtained by marginalizing the joint probability with respect to

Note that apart from the smoothed probability terms,

, all of the terms on the right-hand side of Equation (39.16) are obtained as part of the filtering computations. Given the set of filtered probabilities, we initialize the smoother using

, and iterate computation of Equation (39.16) and Equation (39.17) for

to obtain the smoothed values.

The Markov switching filter requires initialization of the filtered regime probabilities in period 0,

There are a few ways to proceed. Most commonly, the initial regime probabilities are set to the ergodic (steady state) values implied by the Markov transition matrix (see, for example Hamilton (1999, p. 192) or Kim and Nelson (1999, p. 70) for discussion and results). The values are thus treated as functions of the parameters that determine the transition matrix.

Alternately, we may use prior knowledge to specify regime probability values, or we can be agnostic and assign equal probabilities to regimes. Lastly, we may treat the initial probabilities as parameters to be estimated.

Note that the initialization to ergodic values using period 0 information is somewhat arbitrary in the case of time-varying transition probabilities.

We may extend the basic switching model to allow for dynamics in the form of lagged endogenous variables and serially correlated errors. The two methods require different assumptions about the dynamic response to changes in regime.

Our discussion is very brief. Frühwirth-Schnatter (2006) offers a nice overview of the differences between these two approaches, and provides further discussion and references.

The most straightforward method of adding dynamics to the switching model is to include lagged endogenous variables. For a model with

lagged endogenous regressors, and random state variable

taking the value

we have:

where

is again

standard normally distributed. The coefficients on the lagged endogenous variable are allowed to be regime-varying, but this generality is not required.

In the Markov switching context, this model has been termed the “Markov switching dynamic regression” (MSDR) model (Frühwirth-Schnatter, 2006). In the special case where the lagged endogenous coefficients are regime-invariant, the model may be viewed as a variant of the “Markov switching intercept” (MSI) specification (Krolzig, 1997).

Of central importance is the fact that the mean specification depends only on the contemporaneous state variable

so that lagged endogenous regressors may be treated as additional regime specific

or invariant

for purposes of likelihood evaluation, filtering, and smoothing. Thus, the discussions in “Simple Switching” and “Markov Switching” are directly applicable in MSDR settings.

An alternative dynamic approach assumes that the errors are serially correlated (Hamilton, 1989). With serial correlation of order

, we have the AR specification

In the Markov switching literature, this specification has been termed the “Markov switching autoregressive” (MSAR) (Frühwirth-Schnatter, 2006) or the “Markov switching mean” (MSM) model (Krolzig, 1997). The MSAR model is perhaps most commonly referred to as the “Hamilton model” of switching with dynamics.

Note that, in contrast to the MSDR specification, the mean equation in the MSAR model depends on lag states. The presence of the regime-specific lagged mean adjustments on the right-hand side of Equation (39.20) implies that probabilities for a

dimensional state vector representing the current and

previous regimes are required to obtain a representation of the likelihood.

which requires that we consider probabilities for the four potential regime outcomes for the state vector

More generally, since there is a

dimensional state vector and

regimes, the number of potential realizations is

. The description of the basic Markov switching model above ( “Markov Switching”) is no longer valid since it does not handle the filtering and smoothing for the full

vector of probabilities.

Hamilton (1989) derived the form of the MSAR specification and outlined an operational filtering procedure for evaluating the likelihood function. Hamilton (1989), Kim (1994), and Kim and Nelson (1999, Chapter 4) all offer excellent descriptions of the construction of this lagged-state filtering procedure.

Briefly, the Hamilton filter extends the analysis in “Markov Switching” to handle the larger

dimensional state vector. While the mechanics of the procedure are a bit more involved, the concepts follow directly from the simple filter described above ( “Likelihood Evaluation and Filtering”). The filtered probabilities for lagged values of the states,

conditional on the information set

are obtained from the previous iteration of the filter, and the one-step ahead joint probabilities for the state vector are obtained by applying the Markov updates to the filtered probabilities. These joint probabilities are used to evaluate a likelihood contribution and in obtaining updated filtered probabilities.

Hamilton also offers a modified lag-state smoothing algorithm that may be used with the MSAR model, but the approach is computationally unwieldy. Kim (1994) improves significantly on the Hamilton smoother with an efficient smoothing filter that handles the

probabilities using a single backward recursion pass through the data. This approach is a straightforward extension of the basic Kim smoother (see “Smoothing”).

The simple switching results outlined earlier ( “Simple Switching”) do not hold for the simple switching with autocorrelation (SSAR) model. As with the MSAR specification, the presence of lagged states in the specification complicates the dynamics and requires handling a

dimensional state variable representing current and lag states.

Conveniently, we may obtain results for the specification by treating it as a restricted Markov switching model with transition probabilities that do not depend on the origin regime:

We may then apply the Hamilton filter and Kim smoother to this restricted specification to obtain the one-step ahead, likelihood, filtered, and smoothed values.

In the serial correlation setting, the Markov switching filter requires initialization of the vector of probabilities associated with the

dimensional state vector. We may proceed as in the uncorrelated model by setting

initial probabilities in period

using one of the methods described in “Initial Probabilities” and recursively applying Markov transition updates to obtain the joint initial probabilities for the

dimensional initial probability vector in period

Again note that the initialization to steady state values using the period

information is somewhat arbitrary in the case of time-varying transition probabilities.