model.frame.default() problem

Robert Treder (bob@statsci.com)
Wed, 24 Jun 1998 10:03:48 -0700

Prof. Ripley pointed out a serious problem with the _new_ default behavior
for model.frame.default(). The new argument, `drop.unused.levels', was
added to remove any factor levels not represented in the data so as to
avoid singular (non-invertible) model matrices. In the context of modeling
with lm() (and others), a singular model matrix produces an error by default.
Without dropping unused levels, the burden is on the user to change the fitting
method (if one that handles singular matrices exists for that fitting function) or create
a new data frame with any offending factors "refactored" so that all levels are
represented in the data. The creation of such a new data frame can be
burdensome in the presence of an na.action() and subsetting or when multiple
factors have some levels unrepresented.
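
To make the singularity concrete, here is a minimal sketch in Python rather than S. It assumes treatment-style coding (an intercept plus one 0/1 indicator column per non-baseline level), and the helper names design_matrix, rank and drop_unused are hypothetical, not S-PLUS functions. A level that never occurs in the data contributes an all-zero column, so the matrix has fewer independent columns than it has columns.

```python
from fractions import Fraction

def design_matrix(values, levels):
    """Intercept plus one 0/1 indicator column per non-baseline level."""
    return [[1] + [1 if v == lev else 0 for lev in levels[1:]] for v in values]

def rank(matrix):
    """Matrix rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier columns
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def drop_unused(values, levels):
    """Keep only the levels that actually occur, preserving their order."""
    present = set(values)
    return [lev for lev in levels if lev in present]

levels = ["a", "b", "c"]
values = ["a", "b", "a", "b"]  # level "c" is declared but never observed

X_full = design_matrix(values, levels)
X_dropped = design_matrix(values, drop_unused(values, levels))

print(len(X_full[0]), rank(X_full))        # columns vs. rank with "c" kept
print(len(X_dropped[0]), rank(X_dropped))  # columns vs. rank with "c" dropped
```

With the unused level kept, the matrix has three columns but rank two; once the level is dropped, the column count and the rank agree.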

We and many users would prefer that extra (unused) levels be dropped when
fitting a model, to prevent singular model matrices, but that they not be
dropped for prediction (the problem encountered by Prof. Ripley and David
Nelson). The next release will default this argument to FALSE so that unused
levels are not dropped, which is backward compatible with all previous versions
beginning with 3.0. We will then proceed to modify existing model-fitting
functions to explicitly drop unused levels in model fitting to avoid the
problem of singular model matrices. Changing the code in this way will not
affect the way predictions are done.
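
One plausible way dropping levels can break prediction is sketched below, again in Python rather than S. The coefficients are made-up numbers and the helpers design_matrix and predict are hypothetical. If the new data happen to omit a level the model was fitted with, dropping that level yields a matrix with fewer columns than there are coefficients.

```python
def design_matrix(values, levels):
    """Intercept plus one 0/1 indicator column per non-baseline level."""
    return [[1] + [1 if v == lev else 0 for lev in levels[1:]] for v in values]

def predict(X, coef):
    """Linear predictor; refuses matrices that do not match the coefficients."""
    if any(len(row) != len(coef) for row in X):
        raise ValueError("model matrix columns do not match coefficients")
    return [sum(x * b for x, b in zip(row, coef)) for row in X]

fit_levels = ["a", "b", "c"]  # levels seen when the model was fitted
coef = [10.0, 2.0, 5.0]       # hypothetical: intercept, effect of b, effect of c

new_values = ["a", "c"]       # new data in which level "b" does not occur

# Keeping all fitted levels gives columns that line up with coef.
X_keep = design_matrix(new_values, fit_levels)

# Dropping the unused level "b" gives one column too few.
present = set(new_values)
X_drop = design_matrix(new_values, [lev for lev in fit_levels if lev in present])

print(predict(X_keep, coef))
try:
    predict(X_drop, coef)
except ValueError as err:
    print("dropped levels:", err)
```

Keeping the fitted levels for prediction, and dropping unused ones only at fitting time, avoids both failure modes in this sketch.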

In the meantime, the workaround suggested by Prof. Ripley will put everything
back to where it was before we made this change. Make a local copy of the
model.frame.default() function by doing


and replacing the `drop.unused.levels = T' argument with
`drop.unused.levels = F'.

We are sorry for any inconvenience this may have caused (or will cause).

Bob Treder
MathSoft, DAPD
