4  Model Specification

To get a sense of Mplus's model specification algebra, as well as some of its default model assumptions, we will continue to use the data in ex3.1.dat. For this, we can drop the analysis: command (we will be using the default analysis options) and add a model: command.

For example, a simple regression is specified like this:

title: Simple regression
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    x1 on x2 x3;

and a portion of our output, the parameter estimates, is:

MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
 X1       ON
    X2                 0.969      0.042     23.356      0.000
    X3                 0.649      0.044     14.626      0.000
 Intercepts
    X1                 0.511      0.043     11.765      0.000
 Residual Variances
    X1                 0.941      0.060     15.811      0.000

To use categorical variables as independent (exogenous) variables in any model, you have to create indicators (sets of binary variables). You can either include the indicators in your input data set or use the DEFINE: command to create them after the data are read.
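
As a rough sketch of the DEFINE: approach, suppose a data file (hypothetical, not ex3.1.dat) holds an outcome y, a predictor x, and a three-category variable region coded 1, 2, 3. The file name, variable names, and coding are all assumptions made for illustration. Category 1 serves as the reference, and the new indicators are placed at the end of the usevariables list:

title: Regression with indicators (sketch)
data:  file=mydata.dat;          ! hypothetical file
variable: names= y x region;     ! region assumed coded 1, 2, 3
    usevariables= y x reg2 reg3; ! created variables go last
define:
    reg2 = 0;
    reg3 = 0;
    if (region eq 2) then reg2 = 1;
    if (region eq 3) then reg3 = 1;
model:
    y on x reg2 reg3;

This sketch does not handle missing values of region; those cases would need separate treatment.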

Returning to the regression of x1 on x2 and x3, notice (by looking at the full output, not included here) that this is maximum likelihood estimation, not least squares. Notice also (by looking at the diagram) that this model includes a correlation between x2 and x3, but the Model Results do not report it. To see a fuller specification of the very same model, try:

title: Simple regression, explicit corr
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    x1 on x2 x3;
    x2 with x3;

One modification we can make to this model is to specify that x2 and x3 have no (that is, zero) correlation. Try:

title: Simple regression, no corr
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    x1 on x2 x3;
    x2 with x3@0;

4.1 Algebraic Elements and Operators

In the structural equation modeling world, our models have several basic elements: variances, covariances/correlations, regression relations, residual variances, and means. We will also want to be able to constrain or free these elements in a variety of ways.

As we've already seen, the on operator specifies a regression path or paths, and the with operator specifies a covariance (a correlation, in standardized terms). A third operator is by. This is like a regression path with the restriction that the independent variable must be a latent variable. In other words, it is used to specify a measurement model. As an example that fits poorly (but uses the same data), look at:

title: Simple CFA
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    L1 by x1 x2 x3;

Here, L1 is a latent variable, not part of the input file. By default, the regression path from L1 to x1 (the first variable encountered in the by list) is fixed at 1 (one). Also by default, the residuals of x1, x2, and x3 are uncorrelated.

(To see why this is such a poorly fitting model, look at the correlation matrix from the sample statistics, back in the Basics section. The very low correlation between x2 and x3 is not consistent with the other correlations.)

Constraints, which we have also already touched on, come in three fundamental varieties: fixing parameters, freeing parameters, and equality constraints. Parameters are fixed at a particular value with @ and freed with an asterisk, *.
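
We have not yet seen an equality constraint. In Mplus, these are written by following each parameter with the same number in parentheses. As a sketch (this particular constraint is for illustration only, not something the data suggest), we could hold the two slopes from our regression equal:

title: Simple regression, equal slopes (sketch)
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    x1 on x2 (1);    ! same label (1) on both slopes
    x1 on x3 (1);    ! so they are estimated as one value

The output for such a model should show identical estimates for the X2 and X3 slopes.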

As an example of fixing and freeing, we could change the scaling variable from x1 to x2 by reordering the variables in the model command (this is typical style), or we could explicitly free x1 and fix x2 at 1 with @1.

title: Simple CFA, scale by x2
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    L1 by x1* x2@1 x3;
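
The reordering alternative is shorter. As a sketch, listing x2 first makes it the scaling variable by default, so this input should describe the same model as the one above:

title: Simple CFA, scale by x2 (reordered)
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    L1 by x2 x1 x3;    ! first variable listed is fixed at 1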

So far, we have left means and variances to be freely estimated, but we can constrain these as well. In the modeling algebra here, variances are specified by simply naming each variable, while means are specified by putting a variable name in square brackets.

For example, to give our latent variable a mean of zero (which is the default) and a variance of 1 (one), we specify:

title: Simple CFA, standardized latent var
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    L1 by x1* x2 x3;
    L1@1;
    [L1@0];

(Note that in the usual version of this model we free x1; otherwise it is fixed to 1 by default. This model still fits poorly, for the same reason as before.)
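
The same notation applies to observed variables. As a sketch, assuming that for a dependent variable a bare name refers to its residual variance and a bracketed name to its intercept, the first regression in this section could spell out its defaults like this. It should reproduce the Intercepts and Residual Variances rows shown earlier:

title: Simple regression, explicit defaults (sketch)
data:  file=ex3.1.dat;
variable: names= x1 x2 x3;
model:
    x1 on x2 x3;
    [x1];    ! intercept of x1, freely estimated
    x1;      ! residual variance of x1, freely estimated

Adding @ and a value to either line (for instance [x1@0];) would fix that parameter instead of estimating it.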

As we work our way into more complicated types of models, we will encounter other model algebra operators.