Re: [S] model.frame.default() problem

Bill Venables
Thu, 25 Jun 1998 12:53:55 +0930

Robert Treder writes:
> Prof. Ripley pointed out a serious problem with the _new_
> default behavior for model.frame.default(). The new argument,
> `drop.unused.levels', was added to remove any factor levels
> not represented in the data so as to avoid singular
> (non-invertible) model matrices. In the context of modeling
> with lm() (and others), a singular model matrix produces an
> error by default.

The model matrix is rectangular and so always `non-invertible'.
For lm to work with the default setting, the model matrix must
have linearly independent columns, sometimes called having `full
column rank'. (It may be a bit pedantic of me here, but my
students would disown me if I made a slip like this... :-)
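
To make the distinction concrete, here is a small sketch in
Python (not S), assuming treatment-style dummy coding: an
intercept plus one 0/1 column per level after the first. The
helper names model_matrix and column_rank are made up for this
illustration and are not S functions. It suggests why a declared
but unobserved level costs the matrix its full column rank.

```python
from fractions import Fraction

def model_matrix(values, levels):
    """Intercept plus one 0/1 indicator column per level after the first."""
    return [[1] + [1 if v == lev else 0 for lev in levels[1:]] for v in values]

def column_rank(rows):
    """Rank of a rectangular matrix by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    ncol = len(m[0]) if m else 0
    for col in range(ncol):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on the earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

f = ["a", "b", "a", "b"]                   # level "c" is declared, never observed
X_all = model_matrix(f, ["a", "b", "c"])   # 3 columns, one of them all zero
X_used = model_matrix(f, ["a", "b"])       # 2 columns after dropping "c"

rank_all = column_rank(X_all)
rank_used = column_rank(X_used)
print(rank_all, len(X_all[0]))    # rank is below the column count
print(rank_used, len(X_used[0]))  # rank equals the column count
```

Both matrices are rectangular (4 rows), so neither is invertible
in any sense; only the second has full column rank.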

Note that aov() with the default setting will handle model
matrices of less than full column rank fairly gracefully.

More seriously, I applaud the promptness and openness with which
MathSoft have acted to correct this, but the correction does not
fix the biggest and most dangerous infelicity, which is the
following.

If you fit a model with a factor, f, and then try to predict from
it for a set of cases in which the new factor f has a levels
attribute containing only a proper subset of the original levels,
you usually get plausible-looking nonsense, produced silently,
with no error or warning.
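
A toy illustration of how this kind of failure can arise, written
in Python rather than S. It assumes one plausible mechanism: the
new factor is stored as integer codes relative to its own,
shorter levels attribute, and those codes are then read against
the levels used in the fit. This is an assumption made for the
sketch, not a description of the S internals, and the level names
and fitted means are invented.

```python
# Levels seen when the model was fitted, with made-up fitted cell means.
train_levels = ["low", "mid", "high"]
fitted_mean = {"low": 10.0, "mid": 20.0, "high": 30.0}

def encode(values, levels):
    """Store a factor as integer codes relative to its levels attribute."""
    return [levels.index(v) for v in values]

# New cases whose factor carries only a proper subset of the levels.
new_values = ["mid", "high"]
new_levels = ["mid", "high"]
codes = encode(new_values, new_levels)

# Naive prediction: codes are read positionally against train_levels,
# so "mid" is treated as "low" and "high" as "mid".
naive = [fitted_mean[train_levels[c]] for c in codes]

# Intended prediction: match on the level labels themselves.
intended = [fitted_mean[v] for v in new_values]

print(naive)     # looks reasonable, but every value is shifted one level
print(intended)
```

Nothing fails along the way, and the naive values are of a
believable size, which is what makes the problem dangerous.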

The ham-fisted way to fix this is to use predict.gam, but simpler
solutions should be possible and should be implemented. (If any
existing code actually uses the present misfeature, all I can say
is that it richly deserves to break.)
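
One simple form such a fix might take, sketched in Python rather
than S: before predicting, re-code the new factor by label
against the levels stored with the fit, and stop with an error if
a label was never seen. The function name relevel and the example
levels are hypothetical; this is not how predict.gam or any S
function is known to work.

```python
def relevel(values, train_levels):
    """Re-code values by label against the levels used in the fit.

    Raises ValueError for labels absent from train_levels rather than
    returning codes that would silently point at the wrong level.
    """
    index = {lev: i for i, lev in enumerate(train_levels)}
    unseen = sorted(set(values) - set(index))
    if unseen:
        raise ValueError("levels not present in the fit: %s" % ", ".join(unseen))
    return [index[v] for v in values]

train_levels = ["low", "mid", "high"]

# A new factor using only a proper subset of the original levels
# still gets codes that line up with the fitted model.
codes = relevel(["mid", "high", "mid"], train_levels)
print(codes)

# A label the fit never saw is reported instead of being guessed at.
try:
    relevel(["mid", "extreme"], train_levels)
except ValueError as err:
    print("refused:", err)
```

The point of the design is that a mismatch is either repaired by
label or reported loudly, never resolved by position.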

Bill Venables.

Bill Venables, Head, Dept of Statistics,    Tel.: +61 8 8303 5418
University of Adelaide,                     Fax.: +61 8 8303 3696
South AUSTRALIA.     5005.   Email:
