RE: [S] minimizing multi-parameter functions?

Terry Elrod
Wed, 18 Mar 1998 08:44:26 -0700

This is a problem that frequently arises with quasi-Newton minimizers
and is not unique to nlminb. My own suggestions are these:

(1) You say that the values returned by nlminb are not close to what
you expect. Try using your expectations as your starting values.

(2) Often a model contains what I think of as "work horse" parameters
and "ambitious" parameters. The work horse parameters are often
critical to successful minimization of the function and are easy to
estimate. The ambitious parameters are less important and/or are more
nonlinear and hence harder to estimate. I often have a good idea which
are which in advance. For instance, when fitting a (generalized)
nonlinear model, it is often the nonlinear parameters on the right hand
side that create the problems. In such cases, it is often very useful
first to fix the ambitious parameters at plausible values to obtain
very good starting values for the work horse parameters, and then rerun
the model with all parameters free using these starting values.
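As a rough illustration of the staged approach in (2), here is a minimal Python sketch, not S code. It assumes the model E(y) = a * exp(b * x) with made-up noise-free data, treats b as the ambitious parameter, and uses a small hand-written damped Gauss-Newton loop as a stand-in for nlminb. All names (fit_a_given_b, fit_all, b_guess) are hypothetical.

```python
import math

# Hypothetical noise-free data from E(y) = a * exp(b * x) with a = 2, b = 0.5.
xs = [0.5 * i for i in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

def sse(a, b):
    """Sum of squared errors for the model a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

def fit_a_given_b(b_fixed):
    """Stage 1: with b held fixed the model is linear in a, so the
    least-squares estimate of a has a closed form."""
    e = [math.exp(b_fixed * x) for x in xs]
    return sum(y * ei for y, ei in zip(ys, e)) / sum(ei * ei for ei in e)

def fit_all(a, b, iters=100):
    """Stage 2: damped Gauss-Newton over both parameters
    (an assumed stand-in for a library minimizer such as nlminb)."""
    for _ in range(iters):
        e = [math.exp(b * x) for x in xs]           # d(model)/da
        jb = [a * x * ei for x, ei in zip(xs, e)]   # d(model)/db
        r = [y - a * ei for y, ei in zip(ys, e)]    # residuals
        saa = sum(v * v for v in e)
        sab = sum(u * v for u, v in zip(e, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * v for u, v in zip(e, r))
        gb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        da = (sbb * ga - sab * gb) / det            # solve the 2x2 normal equations
        db = (saa * gb - sab * ga) / det
        t = 1.0
        while t > 1e-8 and sse(a + t * da, b + t * db) > sse(a, b):
            t /= 2.0                                # halve the step if the fit got worse
        a, b = a + t * da, b + t * db
    return a, b

b_guess = 0.4                      # plausible value for the ambitious parameter
a_start = fit_a_given_b(b_guess)   # good start for the work horse parameter
a_hat, b_hat = fit_all(a_start, b_guess)
```

The point of the design is only that the second run starts from (a_start, b_guess) rather than from an arbitrary guess for a.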

(3) nlminb takes a scale argument that allows you to accommodate gross
differences across the parameters in the curvature of the function
being minimized. On-line help for nlminb states that the settings for
scale can have a great impact on performance of nlminb, but that how to
set scale in advance is hard to determine. I have no experience with
use of scale. A more transparent (and equivalent) alternative is to
rescale the parameters in the function definition, which I have done
many times. The ideal is to define the parameters so that the second
derivative at the minimum equals one for every parameter. In practice,
it is essential only that the second derivatives differ by no more than
a factor of about 1000. Note that rescaling a parameter in the
definition of the function to be minimized usually implies a change in
the starting value for that parameter.
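To make the rescaling idea concrete, here is a small Python sketch under invented assumptions: a two-parameter quadratic objective whose curvatures differ by a factor of 10^6, and a hypothetical constant SCALE applied to the second parameter inside the function definition. Dividing a parameter by 1000 in the definition changes its second derivative by a factor of 1000^2.

```python
def f_raw(p):
    """Hypothetical objective: curvature 2 in p[0], curvature 2e6 in p[1]."""
    return (p[0] - 3.0) ** 2 + 1.0e6 * (p[1] - 0.002) ** 2

SCALE = 1000.0  # assumed rescaling factor for the second parameter

def f_scaled(q):
    """Same objective, but the minimizer sees q[1] = SCALE * p[1]."""
    return f_raw([q[0], q[1] / SCALE])

def second_derivs(f, x, h=1e-3):
    """Central-difference estimate of each pure second derivative at x."""
    out = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        out.append((f(up) - 2.0 * f(x) + f(dn)) / h ** 2)
    return out

curv_raw = second_derivs(f_raw, [3.0, 0.002])       # at the minimum of f_raw
curv_scaled = second_derivs(f_scaled, [3.0, 2.0])   # same point in the new units
ratio_raw = max(curv_raw) / min(curv_raw)
ratio_scaled = max(curv_scaled) / min(curv_scaled)

# The starting value has to be converted to the new units as well.
start_raw = [1.0, 0.001]
start_scaled = [start_raw[0], start_raw[1] * SCALE]
```

Here ratio_raw is far above the factor of about 1000 mentioned above, while ratio_scaled is close to one.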

Once you have what appears to be convergence, you can use VR2's
vcov.nlminb function to obtain estimates of the standard errors of the
parameters. Here too, a ratio of the largest standard error to the
smallest of more than about sqrt(1000), or roughly 30, indicates that
some rescaling is in order.
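The standard-error check can be sketched in Python as follows. This does not reproduce vcov.nlminb; it assumes the objective is a negative loglikelihood, that the covariance matrix is approximated by the inverse of its Hessian at the minimum, and it uses a made-up 2x2 Hessian. The function name std_errors_2x2 is hypothetical.

```python
import math

def std_errors_2x2(h):
    """Standard errors from the inverse of a 2x2 Hessian h of the
    negative loglikelihood (assumed evaluated at the minimum)."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    var1 = h[1][1] / det    # first diagonal element of the inverse
    var2 = h[0][0] / det    # second diagonal element of the inverse
    return math.sqrt(var1), math.sqrt(var2)

hessian = [[2.0, 0.0], [0.0, 2.0e6]]   # made-up, badly scaled example
se = std_errors_2x2(hessian)
ratio = max(se) / min(se)
needs_rescaling = ratio > math.sqrt(1000.0)
```

With curvatures that differ by a factor of 10^6, the standard errors differ by a factor of 1000, so the check flags the problem.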

(4) There are an infinitude of different ways to include nonlinear
parameters in a model that are mathematically equivalent but are _not_
equivalent when it comes to performance of the estimation algorithm.
Consider, for example, the mathematically equivalent nonlinear
regression models E(y) = a * exp(b * x) and E(y) = exp(b * x + c). It
is known that, in general, the second model is better behaved in
estimation than the first. It is better behaved because the
loglikelihood is usually approximated more closely by a quadratic
function in the vicinity of the minimum, and quasi-Newton minimizers
such as nlminb are based upon the quadratic approximation.
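The equivalence of the two parameterizations rests on c = log(a), which requires a > 0. The short Python sketch below only checks that the two forms give the same predictions and that a can be recovered from c; it makes no attempt to demonstrate the difference in estimation behavior. The function names and the example values are invented for illustration.

```python
import math

def model_a(x, a, b):
    """First form: E(y) = a * exp(b * x)."""
    return a * math.exp(b * x)

def model_c(x, b, c):
    """Second form: E(y) = exp(b * x + c), with c = log(a)."""
    return math.exp(b * x + c)

a, b = 2.0, 0.5          # made-up parameter values
c = math.log(a)          # only defined for a > 0
xs = [0.0, 1.0, 2.5, 4.0]

# Largest disagreement between the two forms over the example points.
max_gap = max(abs(model_a(x, a, b) - model_c(x, b, c)) for x in xs)

# After fitting (b, c), the original parameter is recovered as exp(c).
a_back = math.exp(c)
```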

Nonlinear models are so sensitive to choice of parameterization that it
is very worthwhile having on hand some references that provide
guidance. Two that I use a lot are David A. Ratkowsky's Handbook of
Nonlinear Regression Models (New York: Marcel Dekker, 1990) and Gavin
J. S. Ross's Nonlinear Estimation (New York: Springer-Verlag, 1990).

(5) Some minimization routines are better than others. Most of my
experience has been with Aptech Systems' MAXLIK, written in their GAUSS
language. It is a much more fully featured routine than nlminb. For
instance, it allows the algorithm to start using steepest descent, which
is very robust to poor starting values. Postings on nlminb lead me to
believe it has more problems than some. MAXLIK also had problems
initially that were reduced only after several years of refinement. The
better algorithms work fine without analytical gradients, which is a
great convenience. nlminb appears to be fussier, and consequently one
sees more often the suggestion to supply analytical gradients to help
with convergence.

(6) Once you have convergence to at least a local minimum and you are
comfortable with your choices of parameterization and scale, then there
is no substitute for rerunning the model with different, but not too
implausible, starting values to see if a better optimum can be found.
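A minimal Python sketch of the restart strategy in (6), under stated assumptions: the objective f is a made-up two-parameter function with two local minima, and local_minimize is a plain fixed-step gradient descent standing in for nlminb. The starting values are arbitrary but not wildly implausible.

```python
def f(p):
    """Hypothetical objective with two local minima in u and one in v."""
    u, v = p
    return u ** 4 - 3.0 * u ** 2 + u + (v - 1.0) ** 2

def grad(p):
    """Analytical gradient of f."""
    u, v = p
    return [4.0 * u ** 3 - 6.0 * u + 1.0, 2.0 * (v - 1.0)]

def local_minimize(start, step=0.01, iters=5000):
    """Fixed-step gradient descent; finds only the nearest local minimum."""
    p = list(start)
    for _ in range(iters):
        g = grad(p)
        p = [pi - step * gi for pi, gi in zip(p, g)]
    return p

starts = [[-2.0, 0.0], [-0.5, 3.0], [0.5, -1.0], [2.0, 2.0]]
fits = [local_minimize(s) for s in starts]

# Keep the fit with the smallest objective value across all restarts.
best = min(fits, key=f)
```

Starts with u to the right of the hump in f settle in the shallower minimum, so a single run from there would miss the better optimum that the other starts find.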

Terry Elrod
Prof. Terry Elrod; 3-23 Fac. of Business; U. of Alberta; Edmonton AB; Canada T6G 2R6
email:; tel: (403) 492-5884; fax: (403) 492-3325
Web page:

-----Original Message-----
From: Bill Shipley []
Sent: Wednesday, March 18, 1998 7:33 AM
Subject: [S] minimizing multi-parameter functions?

I want to find values of a vector of parameters q=(a,b,c,d) that minimize a
function foo(q,X) where X is an additional argument. This is supposed to be
done using the nlminb() function; i.e.

However, I have found that this does not find the minimum very well for my
function. I suppose that this is because the algorithm is getting stuck in
a local minimum. Is there a better (safer) way to do this?
Bill Shipley
Departement de Biologie
Universite de Sherbrooke
Sherbrooke (Quebec)
voice: 819-821-8000 ext. 2079
fax: 819-821-8049
Visit our web site:

This message was distributed by To unsubscribe
send e-mail to with the BODY of the
message: unsubscribe s-news
