Re: [S] QUERY: missing value imputation and transcan/impute

Jens Oehlschlaegel (oehl@Psyres-Stuttgart.DE)
Wed, 4 Mar 1998 23:16:20 +0100 (MET)

On Tue, 3 Mar 1998, Frank E Harrell Jr wrote:

> Good luck - these are good questions -Frank

Well, these seem to be good answers, too. Especially

> I think that the main reason people impute individual realizations
> rather than expected values is that they are using multiple imputation
> to get covariance matrices. You need this kind of variation to make
> multiple imputation work. If using the bootstrap you can impute using
> estimates of expected values and still get the right variances.

is very interesting. I have to think about it. Is there a general
proof? I'll read the recommended document; perhaps then I will understand this.
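
To make the quoted idea concrete, here is a minimal sketch, not a proof
and not how transcan works. It assumes a toy setting: a variable x with
some values missing, an always-observed covariate z, and the mean of x as
the statistic of interest. All function names and the simulated data are
hypothetical. Missing values are filled with fitted expected values (no
random draws), and the imputation is repeated inside every bootstrap
resample so that its uncertainty shows up in the standard error.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept and slope for ys regressed on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def impute_expected(rows):
    """Fill each missing x (None) with its fitted expected value given z."""
    complete = [(z, x) for z, x in rows if x is not None]
    a, b = fit_line([z for z, _ in complete], [x for _, x in complete])
    return [(z, x if x is not None else a + b * z) for z, x in rows]


def mean_after_imputation(rows):
    """Statistic of interest: the mean of x after imputation."""
    return statistics.fmean([x for _, x in impute_expected(rows)])


def bootstrap_se(rows, n_boot=500, seed=1):
    """Bootstrap the whole procedure: resample cases, re-impute, re-estimate."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))  # cases with missing x included
        estimates.append(mean_after_imputation(sample))
    return statistics.stdev(estimates)


def make_demo_data(n=80, seed=0):
    """Hypothetical data: x = 1 + 2z + noise, x missing with probability 0.3."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = 1 + 2 * z + rng.gauss(0, 1)
        rows.append((z, None if rng.random() < 0.3 else x))
    return rows


rows = make_demo_data()
print(mean_after_imputation(rows), bootstrap_se(rows))
```

The point of the design is that the imputation model is refitted in each
resample; imputing once and then bootstrapping the completed data would
treat the filled-in values as if they had been observed.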

> I think
> there may be a slight advantage to imputing "best" in place of "random"
> estimates but don't have any formal justification yet.

I don't know; the conditional distribution is also "best". I remember
that in his book Efron describes that resampling from a parametric
distribution (if appropriate) is "better" than resampling from the original
data, thus, e.g., re-inventing the classical t-test. This issue could be
related, but I am not sure.
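
As a rough illustration of that distinction (a sketch under assumptions,
not taken from the book): the code below estimates the standard error of
a sample mean twice, once by resampling the observed data and once by
drawing new samples from a normal distribution fitted to the data. The
normal model and the function names are assumptions made for the example.
With the normal model, the parametric result should come out close to the
classical s/sqrt(n).

```python
import math
import random
import statistics


def nonparametric_se(data, n_boot, rng):
    """SE of the mean from resampling the observed values with replacement."""
    n = len(data)
    means = [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]
    return statistics.stdev(means)


def parametric_se(data, n_boot, rng):
    """SE of the mean from samples drawn out of a fitted normal distribution."""
    n = len(data)
    mu, sigma = statistics.fmean(data), statistics.stdev(data)
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_boot)
    ]
    return statistics.stdev(means)


def classical_se(data):
    """Textbook standard error of the mean, s / sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))


rng = random.Random(0)
data = [rng.gauss(10, 3) for _ in range(30)]  # hypothetical sample
print(classical_se(data), parametric_se(data, 1000, rng), nonparametric_se(data, 1000, rng))
```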

> I could add an option to predict.transcan to impute individual predicted
> values by taking draws from the residuals instead of using expected
> values, but I'm not convinced it's needed if you use the bootstrap to
> get covariances, as mentioned above.

This sounds interesting. There is an example in Efron's book, I think
on time series, where he resamples from the residuals, which would be much
faster. However, bootstrapping the whole procedure is probably more
stable than relying on a single fit.
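
The two options can be sketched for a simple linear regression (a toy
setting chosen for illustration; it is not the time-series example and
not predict.transcan, and all names are hypothetical). Resampling
residuals keeps x fixed and builds new responses from one fit of the
model, so it depends on that single fit being adequate. Resampling whole
cases repeats the entire fit on each resample.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept and slope for ys regressed on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def case_bootstrap_slopes(xs, ys, n_boot, rng):
    """Resample (x, y) pairs and refit the line on every resample."""
    pairs = list(zip(xs, ys))
    slopes = []
    for _ in range(n_boot):
        sample = rng.choices(pairs, k=len(pairs))
        slopes.append(fit_line([x for x, _ in sample], [y for _, y in sample])[1])
    return slopes


def residual_bootstrap_slopes(xs, ys, n_boot, rng):
    """Fit once, then rebuild y as fitted value plus a resampled residual."""
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    slopes = []
    for _ in range(n_boot):
        drawn = rng.choices(residuals, k=len(residuals))
        new_ys = [f + e for f, e in zip(fitted, drawn)]
        slopes.append(fit_line(xs, new_ys)[1])
    return slopes


rng = random.Random(0)
xs = [i / 10 for i in range(40)]                 # hypothetical design points
ys = [1 + 2 * x + rng.gauss(0, 1) for x in xs]   # hypothetical responses
print(statistics.stdev(case_bootstrap_slopes(xs, ys, 300, rng)),
      statistics.stdev(residual_bootstrap_slopes(xs, ys, 300, rng)))
```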

Thanks again for the quick and informative answer.

Jens Oehlschlaegel

