My favorite technique, at least at the entry level, is taking the
derivative of a spline smooth, and so I agree with Grace Wahba. But I
have not been impressed with the quality of the derivative of a cubic
smoothing spline, such as the one provided by smooth.spline. In general, what
seems to work best is to penalize the derivative two orders beyond the one
being estimated, which implies using a quintic spline for the first derivative.
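As an illustration of the idea (using scipy's UnivariateSpline as a stand-in,
not the Pspline module discussed below — scipy supports spline degrees up to
k = 5, so a quintic fit is available):

```python
# Sketch: estimating a first derivative with a quintic (k = 5) smoothing
# spline rather than a cubic one. All names and settings here are
# illustrative choices, not prescriptions.
import numpy as np
from scipy.interpolate import UnivariateSpline

x = np.linspace(0, 2 * np.pi, 200)
rng = np.random.default_rng(0)
y = np.sin(x) + rng.normal(scale=0.05, size=x.size)

# Quintic smoothing spline; s sets the target residual sum of squares,
# here roughly n * sigma^2.
fit = UnivariateSpline(x, y, k=5, s=x.size * 0.05**2)
dfit = fit.derivative(1)  # the first-derivative spline

# Away from the boundaries the derivative estimate should track cos(x).
err = np.max(np.abs(dfit(x[20:-20]) - np.cos(x[20:-20])))
```

Note that the boundary points are deliberately excluded from the error check
above; as discussed further on, derivative estimates near the ends of the
interval are much less trustworthy.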
However, a few years ago when I tried to search the literature, I wasn't
able to find anything but cubic spline smoothers in the usual places. So
I developed an S-Plus module called Pspline that is available on Statlib and
on my web site, www.psych.mcgill.ca/faculty/ramsay.html. This penalizes
the squared norm of the derivative of order p, thus controlling the curvature
of the derivative of order p - 2, where p is arbitrary. Like smooth.spline,
it is also O(n).
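A discrete analog of this kind of penalty — an order-p difference penalty on
the fitted values, in the spirit of Whittaker smoothing — can be sketched in a
few lines. This is only an illustration of penalizing at an arbitrary order p,
not the Pspline code itself:

```python
# Whittaker-style smoother: minimize ||y - z||^2 + lam * ||D_p z||^2,
# where D_p is the order-p difference matrix (a discrete p-th derivative).
# The sparse banded system keeps the solve cheap, echoing the O(n) cost
# mentioned above. Function name and defaults are illustrative.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def whittaker(y, lam, p=3):
    """Solve (I + lam * D_p' D_p) z = y for the smoothed values z."""
    n = y.size
    D = sp.eye(n, format="csr")
    # Build the order-p difference matrix by repeated first differencing.
    for _ in range(p):
        D = D[1:] - D[:-1]
    A = (sp.eye(n) + lam * (D.T @ D)).tocsc()
    return spsolve(A, y)

x = np.linspace(0, 1, 100)
z = whittaker(np.sin(2 * np.pi * x), lam=10.0, p=3)
```

With p = 3 the penalty leaves quadratics unshrunk, so controlling the p-th
difference tames the curvature of the (p - 2)nd, as in the continuous case.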
Choosing bandwidth is tricky, however. I'm no great lover of automatic
bandwidth selection methods, even though I use GCV all the time (see
Ramsay and Silverman (1997) Functional Data Analysis for lots of examples)
as a starting point. But for derivative estimation, CV, GCV, and the like
are generally poor guides, and some user intervention is nearly always
required.
None of the current methods works well at the boundaries, in spite of
claims to the contrary in the local polynomial smoothing literature. Methods
that control bias pay a savage price in variance. Typically one
sees derivatives go wild at the extremes, and the higher the derivative,
the wilder the behavior. This should be treated as a missing data problem,
in my opinion. That is, we need to consider ways of bringing in information
from elsewhere, such as prior knowledge of function behavior, borrowing
information from elsewhere on the curve, using multiple curves to at
least stabilize the variance of the estimates, and so on.
I've worked on two approaches that seem to pay off well for boundary
estimation, as well as derivative estimation in general. The first
is a project with Nancy Heckman at the University of British Columbia,
involving penalizing an arbitrary linear differential operator L, where
L is chosen to annihilate known components of variation. The more
you know about the function, the better this works. We have a paper nearly
completed on this, and much is also in the book mentioned above. A
module called Lspline is available on Statlib and at my web site.
The second involves applying spot or localized penalties at the boundaries.
Silverman and I have been developing a library of Splus and Matlab functions
for the analysis of functional data, and within the context of this
library, this is easy to do. We envisage a roughness penalty as a list
or structure containing the elements
-- a linear differential operator L (may be a derivative order, or may
involve functions as coefficients)
-- the usual multiplier \lambda
-- a range over which integration is to take place. This may be a single
point.
-- a positive weight function \omega
-- a target value d
What is penalized is \omega(Lh - d)^2 integrated over the range of integration.
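A plain-Python rendering of that structure might look like the following
dataclass. The field names follow the list above, but the class itself is
hypothetical — it is not part of the Ramsay-Silverman software:

```python
# Sketch of a roughness-penalty specification: \int omega * (L h - d)^2
# over a given interval. A degenerate interval (a == a) represents a
# spot penalty applied at a single point.
from dataclasses import dataclass
from typing import Callable, Tuple, Union

@dataclass
class RoughnessPenalty:
    L: Union[int, Callable]           # derivative order, or an operator with functional coefficients
    lam: float                        # the usual multiplier lambda
    interval: Tuple[float, float]     # range of integration; may collapse to a point
    omega: Callable[[float], float]   # positive weight function
    d: float = 0.0                    # target value

# Example: a spot penalty on the third derivative at the right boundary.
spot = RoughnessPenalty(L=3, lam=1e2, interval=(1.0, 1.0),
                        omega=lambda t: 1.0, d=0.0)
```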
This seems to work extremely well. A simple version of this, for example,
involves penalizing the square of the derivative of order m+1 evaluated
at the boundary, or integrated over a neighborhood of the boundary
using an exponentially decaying weight function \omega. The desired
derivative is of order m. Not only is this derivative stabilized, but so are
all the lower order derivatives. The functions defining L and the weight
function \omega are all defined as functional data objects in the software,
and are easy to define and manipulate. Check out my web site if you are
interested.
Jim Ramsay
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news