Re: [S] rpart

Atkinson, Beth (atkinson@mayo.edu)
Wed, 24 Jun 1998 16:17:42 -0500


A few suggestions:

1) Turn off the cross-validation and surrogate options (maxsurrogate=0, xval=0).
That should speed things up.

2) Try using a smaller subset just to make sure you've got the code working
(say n=500).

3) There is a stand-alone version of rpart out on statlib in the 'general'
section. This may work best for extremely large datasets.
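
Suggestions 1 and 2 might look roughly like the sketch below. This is only
an illustration written against the current R version of rpart, not the code
from the original question: the simulated data frame `full`, the column
names, and the reading of the loss matrix as |i - j| between class indices
are all assumptions made for the example.

```r
library(rpart)

# Hypothetical stand-in for the real data: 10,000 rows, an 11-level
# response, 2 continuous and 2 categorical predictors (5 and 4 levels).
set.seed(1)
n_full <- 10000
full <- data.frame(
  y  = factor(sample(rep(1:11, length.out = n_full))),
  x1 = rnorm(n_full),
  x2 = runif(n_full),
  f1 = factor(sample(letters[1:5], n_full, replace = TRUE)),
  f2 = factor(sample(LETTERS[1:4], n_full, replace = TRUE))
)

# Suggestion 2: check the code on a random subset of 500 rows first.
small <- full[sample(nrow(full), 500), ]

# Assumed loss matrix: zero diagonal, off-diagonal cost |i - j|.
k <- nlevels(full$y)
loss <- abs(outer(1:k, 1:k, "-"))

# Suggestion 1: no surrogate splits and no cross-validation.
fit <- rpart(y ~ x1 + x2 + f1 + f2, data = small, method = "class",
             parms = list(loss = loss),
             control = rpart.control(maxsurrogate = 0, xval = 0))
```

If this finishes quickly on the subset, the same call can be retried with
data = full to see how the run time grows.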

~> I am trying to use rpart for a classification problem with 10,000
~> observations. The dependent variable has 11 levels and I have 4
~> predictors: 2 continuous and 2 categorical (with 5 and 4 levels
~> respectively). It runs fine if I take the defaults, but if I use a loss
~> matrix (with off-diagonal elements equal to the absolute value of the
~> misclassification error), it just runs forever and never finishes.
~>
~> Am I being too ambitious here and the problem is just too big? (10,000
~> is 25% of the full dataset).
~>
~> Carlos Alzola
~> calzola@apa.com