Re: [S] Gradient matrix in nls() with "plinear" option

Bill Venables (
Tue, 3 Mar 1998 10:15:20 +1030

Bert Gunter writes:
> The following is quoted from S-Plus 4.0's online documentation:
> "...If the "gradient" attribute is included, it should be an
> array of dimension the number of observations by number of
> linear parameters by number of nonlinear parameters. "
> I would like to use the "plinear" algorithm with explicit
> gradients for the nonlinear parameters. The above quote is all
> I have been able to find on how to do this (including all the
> usual suspect references). This would seem to indicate that,
> e.g., for n data points, 2 conditionally linear parameters, 3
> nonlinear parameters, the above array should be of dimension
> nx2x3. However, I do not understand what the (i,j,k)th element
> should be. I would be grateful if someone would enlighten me,
> as it is clear that there must be something fundamental here
> that I do not understand.

The number of conditionally linear parameters is the same as the
number of terms in the model. Suppose the regression is of the form

y = b1*f1(x, th) + b2*f2(x, th) + ... + bp*fp(x, th) + E

then the (i,j,k)th element of the gradient array has to be

d fj(x_i, th) / d th_k

that is, the partial derivative of the ith component of the jth
function with respect to the kth nonlinear parameter.
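
To make the indexing concrete, here is a small sketch. It is written in
Python rather than S and does not touch the nls() interface at all; it
only builds the n x p x q array described above and checks it against
finite differences. The two model functions, f1 = exp(-th1*x) and
f2 = exp(-th2*x)*cos(th3*x), are made up for illustration (p = 2
linear terms, q = 3 nonlinear parameters), as are all the names.

```python
import math

P, Q = 2, 3  # hypothetical model: 2 linear terms, 3 nonlinear parameters


def basis(x, th):
    """Return [f1(x, th), f2(x, th)] for a single observation x."""
    return [
        math.exp(-th[0] * x),
        math.exp(-th[1] * x) * math.cos(th[2] * x),
    ]


def gradient_array(xs, th):
    """Analytic array with G[i][j][k] = d f_j(x_i, th) / d th_k."""
    G = []
    for x in xs:
        e0 = math.exp(-th[0] * x)
        e1 = math.exp(-th[1] * x)
        G.append([
            # f1 depends only on th[0]
            [-x * e0, 0.0, 0.0],
            # f2 depends on th[1] and th[2]
            [0.0,
             -x * e1 * math.cos(th[2] * x),
             -x * e1 * math.sin(th[2] * x)],
        ])
    return G


def numeric_gradient_array(xs, th, h=1e-6):
    """Central-difference version of the same array, used as a check."""
    G = []
    for x in xs:
        rows = [[0.0] * Q for _ in range(P)]
        for k in range(Q):
            up, dn = list(th), list(th)
            up[k] += h
            dn[k] -= h
            f_up, f_dn = basis(x, up), basis(x, dn)
            for j in range(P):
                rows[j][k] = (f_up[j] - f_dn[j]) / (2 * h)
        G.append(rows)
    return G


xs = [0.5, 1.0, 1.5, 2.0]   # n = 4 observations
th = [0.7, 0.3, 1.2]        # arbitrary nonlinear parameter values

G = gradient_array(xs, th)
G_num = numeric_gradient_array(xs, th)
max_err = max(
    abs(G[i][j][k] - G_num[i][j][k])
    for i in range(len(xs)) for j in range(P) for k in range(Q)
)
print((len(G), len(G[0]), len(G[0][0])))  # dimensions n, p, q
print(max_err < 1e-6)
```

Each slice G[i] is a p x q matrix for observation i; entries are zero
wherever a function does not involve the parameter in question.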

> Second question: Is this (usually) a good thing to do (it
> would seem to take advantage of everything that one reasonably
> should)?

It can't hurt, I guess, but finding real cases where it is
actually necessary for convergence reasons is not easy (unless,
of course, the number of nonlinear parameters is very large
relative to the number of linear).

> I think a private reply should suffice for this, as it is
> probably not of general interest.

But it probably should be of some general interest, at least. It
took me a long time to find out what was going on (and I only
know now because Doug Bates told me about it).

Three small comments.

1. There is an open challenge to write a method function for
deriv() that produces a "plinear" model function equipped
with the gradient attribute.

2. The ms() algorithm can take and use first and second
derivatives, and very often both first and second explicit
derivatives are necessary for the function to work.

3. nlminb seems to be a pretty swish minimiser. There would be
some point in tarting it up a bit so that it took a formula
and data frame and returned an object with class, for which
methods could be written. In this case, too, supplying first
(and second) derivatives can be useful, but there are no
tools written to help. Another job for deriv?


Bill Venables, Head, Dept of Statistics,    Tel.: +61 8 8303 5418
University of Adelaide,                     Fax.: +61 8 8303 3696
South AUSTRALIA.     5005.   Email:
