Re: [S] summary, df in smooths

John Maindonald (john.maindonald@anu.edu.au)
Thu, 1 Oct 1998 16:03:58 +1000 (EST)


Jane Elith <j.elith@botany.unimelb.edu.au>, in her summary, quoted
> ####Henrik Aalborg-Nielsen:
>
> I think I would use a CV criterion and use smooth.spline() directly,
> e.g.
>
> CV <- rep(NA, 50) # or whatever maximum df you think is appropriate
> for(df in 2:50) CV[df] <- smooth.spline(x=x, y=y, df=df, cv=T)$cv.crit
> plot(1:50,CV,xlab="df")
>
> Then choose df to minimize CV. For "automatic" handling of future data
> sets "of the same kind" you may just want to use the same df.
>
> NOTE: Hastie & Tibshirani (1990) have a section on automatic selection
> of smoothing parameters (section 3.4) in their book 'Generalized
> Additive Models' (Chapman & Hall). I think one of the messages is
> that one should be careful about minimizing CV, in that the optimum is
> often very flat.

I'd be interested to get comments on whether an approach that seems to
work well in tree-based regression might have application here also.
I've been experimenting with the Atkinson & Therneau RPART library,
which one can get from Statlib.

They recommend marking any CV "error" ("risk") within one SE of the
achieved minimum CV error as equivalent to the minimum. The simplest
model is then chosen from among all those so tied. See section 4.2,
page 13, of the Therneau & Atkinson document (to be distinguished from
the Atkinson & Therneau document) which accompanies the library. This
approach, say Therneau and Atkinson, has proved reliable in screening
out pure noise variables that have been inserted into the data in
Monte Carlo trials. My experience has been that the method very
consistently chooses the same size of model. One conclusion is that
tree-based models do not work very well on the rather small data sets
that are used as examples in the S-PLUS 4 "Guide to Statistics"!

I wonder whether this approach has wider application. It relies,
of course, on getting an SE estimate for each CV error estimate.
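For concreteness, the one-SE rule is easy to sketch in any language.
Below is a hypothetical illustration in Python rather than S (the
function name and the per-fold error values are invented for the
example; this is not code from the RPART library). Given per-fold CV
errors for each candidate model, it estimates the SE of each mean CV
error from the spread over folds, then chooses the simplest model
whose mean error lies within one SE of the minimum:

```python
# Sketch of the "one standard error" rule: mark every model whose
# mean CV error is within one SE of the achieved minimum as tied
# with the minimum, then pick the simplest of the tied models.
import math

def one_se_choice(cv_errors_by_model):
    """cv_errors_by_model: list of (complexity, per_fold_errors),
    ordered from simplest model to most complex."""
    stats = []
    for complexity, errs in cv_errors_by_model:
        k = len(errs)
        mean = sum(errs) / k
        var = sum((e - mean) ** 2 for e in errs) / (k - 1)
        se = math.sqrt(var / k)  # SE of the mean over the k folds
        stats.append((complexity, mean, se))
    best = min(stats, key=lambda t: t[1])  # minimum mean CV error
    threshold = best[1] + best[2]          # minimum plus one SE
    for complexity, mean, _ in stats:      # scan from simplest upward
        if mean <= threshold:
            return complexity

# Invented 5-fold CV errors for three candidate degrees of freedom:
folds = [(2, [1.9, 2.1, 2.0, 2.2, 1.8]),
         (4, [1.45, 1.55, 1.50, 1.52, 1.48]),
         (8, [1.4, 1.5, 1.6, 1.5, 1.4])]
print(one_se_choice(folds))  # prints 4: df = 8 has the minimum mean
                             # error, but df = 4 is within one SE
```

The per-fold errors supply exactly the SE estimate mentioned above;
with a criterion such as smooth.spline()'s cv.crit, which returns a
single number per fit, one would need some other route to an SE.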

John Maindonald.
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news