Re: [S] cv.tree

Prof Brian Ripley (ripley@stats.ox.ac.uk)
Wed, 3 Jun 1998 10:59:19 +0100 (BST)


> Comments: Authenticated sender is <sungg027@srv1.mail.uni-kiel.de>
> From: Carsten Stech <stech@geographie.uni-kiel.de>
> To: s-news@wubios.wustl.edu
> Date: Wed, 3 Jun 1998 11:53:45 +0000
> Subject: [S] cv.tree
>
>
> we are trying to start a 40-fold crossvalidation.
> Here's the wrong syntax that we used:
>
> our.40.tree<-cv.tree(our.max.tree,k=40,FUN=prune.tree)
>
> we also tried rand=40.
>
> Does anybody know the right syntax ?
>
our.40.tree<-cv.tree(our.max.tree,
    rand = sample(40, length(m[[1]]), replace = T),
    FUN=prune.tree)

Since you will know the size of your dataset, say, N, you could just use

rand40 <- sample(40, N, replace=T)
our.40.tree<-cv.tree(our.max.tree, rand40)
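
To make the role of the rand vector concrete, here is a rough sketch in R syntax. It assumes a made-up dataset size N = 200, and balanced_folds is a hypothetical helper name, not something supplied with the tree functions. Each case gets a fold label between 1 and K; drawing the labels with replacement leaves the fold sizes unequal, while permuting a repeated 1..K pattern keeps them nearly equal.

```r
set.seed(42)   # only to make the illustration repeatable
N <- 200       # assumed dataset size; substitute your own
K <- 40        # number of folds

# Fold labels drawn with replacement, as in the call above.
# Fold sizes vary, and for large K some folds can be tiny or even empty.
rand40 <- sample(K, N, replace = TRUE)
fold_sizes <- tabulate(rand40, nbins = K)

# Hypothetical helper: permute a repeated 1..K pattern so that
# fold sizes differ by at most one case.
balanced_folds <- function(K, N) sample(rep(seq_len(K), length.out = N))
rand_bal <- balanced_folds(K, N)
bal_sizes <- tabulate(rand_bal, nbins = K)

print(range(fold_sizes))  # smallest and largest fold, random assignment
print(range(bal_sizes))   # smallest and largest fold, balanced assignment
```

Either vector could be passed as the second argument of cv.tree; which one is preferable is a judgment call that the original advice does not address.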

I do think, though, that this is a waste of time. 40 folds are far too
many for trees (as you can see from Breiman et al., for example). Do 4
10-fold CVs and average, or 8 5-fold ones. See the examples in V&R2
chapter 14.
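
The "repeat and average" suggestion might be sketched as follows, in R syntax. This is an illustration under stated assumptions, not the V&R2 code: repeat_cv and average_dev are hypothetical helper names, the averaging assumes each cross-validation result carries size and dev components with the same sequence of sizes in every run, and the iris example is guarded so that it only runs where the tree package happens to be installed.

```r
# Hypothetical helper: run `cv_once` on n_rep independent K-fold assignments.
# `cv_once` takes a vector of fold labels and returns a list with $size and $dev.
repeat_cv <- function(cv_once, N, K = 10, n_rep = 4) {
  lapply(seq_len(n_rep), function(i) cv_once(sample(K, N, replace = TRUE)))
}

# Hypothetical helper: average the cross-validated deviance over runs,
# size by size. Stops if the runs do not report identical size sequences.
average_dev <- function(runs) {
  sizes <- runs[[1]]$size
  same <- vapply(runs, function(r) identical(r$size, sizes), logical(1))
  stopifnot(all(same))
  dev_mat <- do.call(cbind, lapply(runs, function(r) r$dev))
  data.frame(size = sizes, dev = rowMeans(dev_mat))
}

# Example use with the tree package, if it is available (assumption:
# the built-in iris data stand in for your own dataset).
if (requireNamespace("tree", quietly = TRUE)) {
  library(tree)
  fit <- tree(Species ~ ., data = iris)
  runs <- repeat_cv(function(rand) cv.tree(fit, rand, FUN = prune.tree),
                    N = nrow(iris), K = 10, n_rep = 4)
  print(average_dev(runs))
}
```

Calling repeat_cv with K = 5 and n_rep = 8 gives the other variant mentioned above.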

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
