[S] Summary: How to count "NA" etc.

Marc Feldesman (feldesmanm@pdx.edu)
Wed, 21 Oct 1998 10:38:12 -0700

Thanks to all of you who responded to my query about how to approach the
problem of establishing the precise "missingness" pattern in what is (for
me anyway) a complexly sparse data set.

There were lots of common ideas expressed, although none really "solved" the
problem I was having. I did look at Schafer's package for multiple
imputation, which gives me part of the information I needed. I had also
looked at the functions in Frank Harrell's "Hmisc" library, which also
partly answered the question. Certainly once I get to the point of even
considering multiple imputation, I would use one or both of these packages.
Several users sent me ideas and code snippets that were also partly
useful. All seemed to be missing something, though, and a more
sophisticated approach would have required more work than I was prepared to
undertake at the moment. However, the question also provoked a response
from developers at MathSoft who have been working on routines for dealing
with the "missingness" problem and for multiple imputation.
They kindly agreed to send me dumps of three functions currently in "alpha"
testing that get at the problem I'm trying to solve.

I've received the alpha code, sourced it, and am testing it now. My first
reaction is that it *seems* to do precisely what I need done. I *think*
(and someone from MathSoft can correct me here) that part of this code (and
some of the multiple imputation code that I did not ask for or receive) is
being developed in conjunction with Joe Schafer.

Anyway, the three functions, miss(), print.miss(), and plot.miss(), seem to
give me all the information I need right now to identify all the different
missing value patterns, locate the specific cases that share them, identify
cases with very similar missing value patterns (and also point out
precisely how the patterns differ), and present it to me in a format that
will enable me to decide which variables are worth imputing and which ones
have an intractable pattern of missingness.
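
For anyone who wants a rough feel for the kind of tabulation described
above, here is a minimal sketch in Python. It is NOT the MathSoft alpha
code and makes no claim about how miss() works internally. The function
names missing_patterns() and pattern_differences(), the use of None as the
missing-value marker, and the toy data are all assumptions made up for
illustration.

```python
from collections import defaultdict


def missing_patterns(rows, columns, is_missing=lambda v: v is None):
    """Group case indices by missing-value pattern.

    A pattern is a tuple of booleans, one per column, True where the
    value is missing. Hypothetical helper, not part of any library.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        pattern = tuple(is_missing(row[c]) for c in columns)
        groups[pattern].append(i)
    return dict(groups)


def pattern_differences(p, q, columns):
    """Return the columns on which two missingness patterns disagree."""
    return [c for c, a, b in zip(columns, p, q) if a != b]


# Toy data, invented for this example; None marks a missing value.
columns = ["height", "weight", "age"]
rows = [
    {"height": 1.70, "weight": 65.0, "age": 30},
    {"height": None, "weight": 70.0, "age": 41},
    {"height": None, "weight": 72.5, "age": None},
    {"height": 1.82, "weight": 80.0, "age": 25},
    {"height": None, "weight": 58.0, "age": 37},
]

patterns = missing_patterns(rows, columns)

# Print each pattern ("m" = missing, "." = present) with the cases sharing it,
# most common patterns first.
for pattern, cases in sorted(patterns.items(), key=lambda kv: -len(kv[1])):
    label = "".join("m" if m else "." for m in pattern)
    print(label, cases)
```

Calling pattern_differences() on two of the patterns lists the variables
on which they disagree, which is one simple way to see how two "similar"
patterns differ.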

Before posting this message, I cleared its posting with MathSoft. I am not
violating any NDA in reporting this information.

Thanks again for everyone's thoughtful suggestions. It is clear that this
is a non-trivial problem, and one that many researchers seem to have. I am
very happy that MathSoft has seen this as a fruitful area for development.

Dr. Marc R. Feldesman
Professor and Chairman
Anthropology Department
Portland State University
P.O. Box 751
Portland, Oregon 97207
email: feldesmanm@pdx.edu
phone: 503-725-3081
fax: 503-725-3905