"It will be noticed that these conclusions apply to the vicintiy of the
hypothesis tested...[t]herefore it is possible that an unbiased
critical region...when it will most frequently detect the falsehood of
a hypothesis tested...when it is slightly erroneous, will fail to do so
when the true hypothesis is widely different from [the null]...
[Neyman, "'Smooth' Test for goodness of fit" Skandinavisk
Aktuarietiskrift, Vol. 20 1937 , p. 168]
F. Nelson and N.E. Savin have a paper that gives several examples of
this type of phenomenon in Wald and LM tests ["The danger of
extrapolating asymptotic local power", Econometrica 58 (1990), pp.
977-1070]. There are also examples in which the LR test statistic
exhibits this form of non-monotonicity.
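The non-monotonicity is easy to exhibit in the simplest binomial case.
For an intercept-only logistic model with n observations, the asymptotic
variance of \hat\beta is 1/(n p(1-p)) with p = 1/(1+e^{-\beta}), so the
Wald statistic for H0: \beta = 0 evaluated at an estimate b is
W(b) = n b^2 p(1-p). A quick sketch (n = 50 is an arbitrary illustrative
choice, not taken from any of the papers above):

```python
import numpy as np

def wald_stat(b, n=50):
    """Wald statistic for H0: beta = 0 in an intercept-only logistic
    model with n observations, evaluated at the estimate b."""
    p = 1.0 / (1.0 + np.exp(-b))
    return n * b**2 * p * (1.0 - p)

# W first rises, then falls back toward zero as b moves away from 0.
for b in [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]:
    print(f"b = {b:5.1f}   W = {wald_stat(b):8.3f}")
```

The statistic peaks and then decays to zero: it is smaller at b = 8 than
at b = 1, so the test rejects less readily when the null is grossly wrong
than when it is only mildly wrong -- exactly Neyman's point.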
This issue also relates to the non-invariance of the Wald test with
respect to re-parameterization, and to the possible inadequacy of the
asymptotic approximation to the finite-sample distribution of the test
statistic. In a 1992 working paper, Joel Horowitz and N.E. Savin at the
University of Iowa have shown that using the bootstrap to obtain
critical values can correct these size distortions.
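The non-invariance is easy to demonstrate with the delta method (the
numbers below are made up purely for illustration): H0: theta = 0 and
H0: theta^3 = 0 are the same hypothesis, yet their Wald statistics
differ.

```python
# Hypothetical estimate and standard error, chosen only for illustration.
theta_hat, se = 0.5, 0.2

# Wald z for H0: theta = 0.
w_theta = theta_hat / se

# Same hypothesis written as H0: theta^3 = 0; the delta method gives
# se(theta_hat^3) = 3 * theta_hat^2 * se(theta_hat).
w_cubed = theta_hat**3 / (3.0 * theta_hat**2 * se)

print(w_theta, w_cubed)  # 2.5 vs ~0.83
```

At the usual 5% level the first form rejects and the second does not,
even though they test the same restriction.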
Doug McManus
You wrote:
>
>> From <@uconnvm.uconn.edu:kent@darwin.eeb.uconn.edu> Wed Jan 7 12:51 GMT 1998
>> To: ripley@stats.ox.ac.uk (Prof Brian Ripley)
>> Cc: s-news@utstat.toronto.edu
>> Subject: Re: Summary of Robust Regression Algorithms
>> From: kent@darwin.eeb.uconn.edu (Kent E. Holsinger)
>>
>> >>>>> "Brian" == Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:
>>
>> Brian> My best example of this not knowing the literature is the
>> Brian> Hauck-Donner (1977) phenomenon: a small t-value in a
>> Brian> logistic regression indicates either an insignificant OR a
>> Brian> very significant effect, but step.glm assumes the first,
>> Brian> and I bet few users of glm() stop to think.
>>
>> All right, I confess. This is a new one for me. Could someone explain
>> the Hauck-Donner effect to me? I understand that the t-values from
>> glm() are a Wald approximation and may not be terribly reliable, but I
>> don't understand how a small t-value could indicate "either an
>> insignificant OR a very significant effect."
>>
>> Thanks for the help. It's finding gems like these that make this group
>> so extraordinarily valuable.
>
>There is a description in V&R2, pp. 237-8, given below. I guess I was
>teasing people to look up Hauck-Donner phenomenon in our index.
>(I seem to remember this was new to my co-author too, so you were in
>good company. This is why it is such a good example of a fact which
>would be useful to know but hardly anyone does. Don't ask me how I
>knew: I only know that I first saw this in about 1980.)
>
> There is a little-known phenomenon for binomial GLMs that was pointed
> out by Hauck & Donner (1977: JASA 72:851-3). The standard errors and
> t values derive from the Wald approximation to the log-likelihood,
> obtained by expanding the log-likelihood in a second-order Taylor
> expansion at the maximum likelihood estimates. If there are some
> \hat\beta_i which are large, the curvature of the log-likelihood at
> \hat{\vec{\beta}} can be much less than near \beta_i = 0, and so the
> Wald approximation underestimates the change in log-likelihood on
> setting \beta_i = 0. This happens in such a way that as |\hat\beta_i|
> \to \infty, the t statistic tends to zero. Thus highly significant
> coefficients according to the likelihood ratio test may have
> non-significant t ratios.
>
>To expand a little, if |t| is small it can EITHER mean that the Taylor
>expansion works and hence the likelihood ratio statistic is small OR
>that |\hat\beta_i| is very large, the approximation is poor and the
>likelihood ratio statistic is large. (I was using `significant' as
>meaning practically important.) But we can only tell if |\hat\beta_i|
>is large by looking at the curvature at \beta_i=0, not at
>|\hat\beta_i|. This really does happen: from later on in V&R2:
>
> There is one fairly common circumstance in which both convergence
> problems and the Hauck-Donner phenomenon (and trouble with
> \sfn{step}) can occur. This is when the fitted probabilities
> are extremely close to zero or one. Consider a medical diagnosis
> problem with thousands of cases and around fifty binary
> explanatory variables (which may arise from coding fewer
> categorical factors); one of these indicators is rarely true but
> always indicates that the disease is present. Then the
> fitted probabilities of cases with that indicator should be one,
> which can only be achieved by taking \hat\beta_i = \infty.
> The result from \sfn{glm} will be warnings and an estimated
> coefficient of around +/- 10 [and an insignificant t value].
>
>That was based on a real-life example, which prompted me to write what
>is now stepAIC. Once I had that to try, I found lots of examples.
>
>
>Brian Ripley
>
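To make the quoted description concrete, here is a small self-contained
sketch (in Python/numpy rather than S; the data are artificial, built to
mimic the rare-indicator situation Ripley describes, and irls() imitates
glm()'s Fisher-scoring iteration with a capped iteration count):

```python
import numpy as np

# Artificial data: 980 cases with x = 0 (294 of them with y = 1), plus a
# rare indicator x = 1 that is true for 20 cases and ALWAYS goes with
# y = 1, so the MLE of its coefficient diverges to +infinity.
x = np.concatenate([np.zeros(980), np.ones(20)])
y = np.concatenate([np.ones(294), np.zeros(686), np.ones(20)])
X = np.column_stack([np.ones(1000), x])          # intercept + indicator

def loglik(beta):
    eta = X @ beta
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def irls(Xm, maxit=25):
    """Fisher scoring (IRLS) with a glm()-style iteration cap."""
    beta = np.zeros(Xm.shape[1])
    for _ in range(maxit):
        p = 1.0 / (1.0 + np.exp(-(Xm @ beta)))
        info = Xm.T @ ((p * (1.0 - p))[:, None] * Xm)   # information matrix
        beta = beta + np.linalg.solve(info, Xm.T @ (y - p))
    return beta, info

beta_full, info = irls(X)
se = np.sqrt(np.linalg.inv(info)[1, 1])
z_wald = beta_full[1] / se                       # the Wald t ratio

# Likelihood ratio test of beta_1 = 0: refit with the intercept only.
beta_null, _ = irls(X[:, :1])
lr = 2.0 * (loglik(beta_full) - loglik(np.append(beta_null, 0.0)))

print(f"beta_1 = {beta_full[1]:.1f}  Wald z = {z_wald:.4f}  LR = {lr:.1f}")
```

The fitted coefficient wanders off towards +infinity, the curvature there
flattens out, and the Wald z collapses towards zero -- while the
likelihood ratio statistic is enormous, far beyond the 3.84 cutoff for
chi-squared on 1 d.f. A small t value masking a very significant effect.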