[S] Followup on "efficient" way to omit rows from file

Marc R. Feldesman (feldesmanm@pdx.edu)
Sun, 17 May 1998 20:12:25 -0700


I should have been more verbose in my message. My question about
efficiency concerned speed. I am aware that I can use the construct:

    newfile <- na.omit(oldfile)

This construct is markedly slower than SAS operating on the same data file.
As a test, I have a datafile that has ca. 50,000 cases and about 200
variables per case. About 25% of the cases have 1 or more missing values.

A SAS (6.12) data step that subsets the file to keep only cases with no
missing values takes about 4 seconds on my 300 MHz Pentium II w/128 MB
RAM. S-PLUS (4.5) takes about 20 seconds to do the same thing, roughly
five times as long. It struck me that there might be a faster way of
doing it than the simple way above.
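
For concreteness, here is a minimal sketch of the operation on a tiny made-up
data frame. The data and the names m, keep and newfile2 are hypothetical, not
taken from the real file. The second half shows one possible alternative to
na.omit: build a logical index of complete rows from a matrix row sum, since a
sum that includes an NA is itself NA. It assumes every column is numeric and
finite, and whether it is actually any faster on 50,000 x 200 data is untested.

```r
# Hypothetical stand-in for the real file: 3 cases, 3 numeric variables
oldfile <- data.frame(a = c(1, NA, 3),
                      b = c(4, 5, NA),
                      c = c(7, 8, 9))

# The construct in question: drop every case with a missing value
newfile <- na.omit(oldfile)

# Possible alternative (assumes all-numeric, finite columns):
# a row sum is NA exactly when the row contains at least one NA
m <- as.matrix(oldfile)
keep <- as.vector(!is.na(m %*% rep(1, ncol(m))))
newfile2 <- oldfile[keep, ]
```

Both newfile and newfile2 should hold the same cases. The index version does
not attach the na.action attribute that na.omit records.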

Sorry to have misled people into thinking I wasn't aware of the na.omit
function. BTW, I am also aware of the na.action parameter in many
modelling functions.

Thanks.

Dr. Marc R. Feldesman
email: feldesmanm@pdx.edu
email: feldesman@ibm.net
pager: 503-870-2515
fax: 503-725-3905

"If ignorance is bliss then why aren't there more happy people?" Lawrence
Peter

Powered by: Monstrochoerus - the 300 MHz Pentium II