Re: [S] comparing spline fits with different df?

Prof Brian Ripley
Thu, 9 Apr 1998 07:44:19 +0100 (BST)

Martin Maechler wrote:
> >>>>> "Bill" == Bill Shipley writes:
> Bill> Hello, I know that there are a number of different methods that
> Bill> have been suggested to choose the "best" span value, or smoother
> Bill> df, when using spline smoothers. However, I would like to know
> Bill> if there is any way of testing whether or not a spline fit with x
> Bill> df provides a significantly better fit than a spline fit with y
> Bill> df? In other words, assuming normality and homogeneity of
> Bill> variance for the residuals, what is the sampling distribution of
> Bill> the smoother cross-validation score?
> The easiest is to use the (Cp based ?) test provided by
> g1 <- gam(y ~ s(x), ...., df = df1)
> g2 <- gam(y ~ s(x), ...., df = df2)
> anova(g1,g2)

Yes, that is more or less the same as the AIC approach, except that
the S implementation is based on a known sigma. However, s() is only
one of a wide range of spline smoothers.
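
To make the known-sigma comparison concrete, here is a minimal sketch of
the arithmetic only, written in Python so that it stands alone. It is not
the S anova() code. It assumes Gaussian errors with a known variance
sigma2, and that each fit supplies a residual sum of squares and an
equivalent degrees of freedom. The function names and all the numbers are
hypothetical.

```python
def aic_known_sigma(rss, df, sigma2):
    """AIC up to an additive constant, for Gaussian errors with known variance.

    With sigma2 known, -2 * log-likelihood is rss / sigma2 plus a constant,
    so the criterion is rss / sigma2 + 2 * df. Multiplying by sigma2 gives
    the Cp-type criterion rss + 2 * sigma2 * df, which ranks fits identically.
    """
    return rss / sigma2 + 2.0 * df


def prefer_larger_fit(rss_small, df_small, rss_large, df_large, sigma2):
    """True if the fit with more degrees of freedom has the lower criterion."""
    small = aic_known_sigma(rss_small, df_small, sigma2)
    large = aic_known_sigma(rss_large, df_large, sigma2)
    return large < small


# Hypothetical fits: 3 df with RSS 120, and 6 df with RSS 110, sigma2 = 1.
# The extra 3 df cost 6 units but reduce RSS / sigma2 by 10.
print(prefer_larger_fit(120.0, 3.0, 110.0, 6.0, sigma2=1.0))
```

In this form the larger fit is preferred exactly when the drop in
RSS / sigma2 exceeds twice the increase in degrees of freedom.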

> Of course, it can be discussed how exactly the nonparametric
> degrees of freedom should be counted, and related questions..

No real problem: splines are parametric! Well, they are. There are
two related issues here. One is that smoothing splines are not fitted
by maximum likelihood, so the justification for Cp and AIC is missing;
the other is that the derivation of both assumes that the fitted
model is true. You can work around both by using Moody's equivalent
degrees of freedom, aka Murata et al.'s NIC. Details are in, e.g., my
Pattern Recognition and Neural Networks book.

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595