Here is an example.
Consider the dataset:
        y   x
  [1,]  0 -15
  [2,]  0 -15
  [3,]  0 -15
  [4,]  0 -15
  [5,]  0 -15
  [6,]  0 -15
    .
    .
    .
 [99,]  0  -1
[100,]  0   1
[101,]  1  -1
[102,]  1   1
    .
    .
    .
[199,]  1  15
[200,]  1  15
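For concreteness, here is a sketch of one way to construct such a
dataset. The elided rows are an assumption on my part (the first block
is taken to be all x = -15 and the last block all x = 15), but this
choice reproduces the output shown below:

    x <- c(rep(-15, 98), -1, 1, -1, 1, rep(15, 98))
    y <- rep(0:1, each = 100)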
Try fitting this with:

    summary(glm(y ~ x,
                family = binomial,
                control = glm.control(maxit = 25)))
Notice the coefficients and standard errors:
                Value Std. Error t value
    (Intercept)  0.00       1.02    0.00
              x  0.57       0.33    1.75
The t-value gives a p-value of .08 (two-tailed).
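As a quick check on that figure, using the normal approximation to the
Wald statistic:

    2 * pnorm(-1.75)   # two-tailed p-value, about 0.08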
However, intuition should suggest that the 'p-value' is much too large.
And the likelihood ratio test would support that intuition. The
chi-square statistic on 1 degree of freedom is the drop in deviance,
277.2589 - 5.941614 = 271.3:

        Null Deviance: 277.2589 on 199 degrees of freedom
    Residual Deviance: 5.941614 on 198 degrees of freedom
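To carry the likelihood ratio test out explicitly (a sketch; 'fit' is
just a name chosen here for the fitted model):

    fit <- glm(y ~ x, family = binomial, control = glm.control(maxit = 25))
    LR <- fit$null.deviance - fit$deviance   # drop in deviance = LR statistic
    pchisq(LR, df = 1, lower.tail = FALSE)   # p-value: essentially zero
    anova(fit, test = "Chisq")               # equivalent, in one step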
The Wald test uses the Fisher information, the curvature of the
log-likelihood at the MLE, and here that curvature isn't very big.
It may help your intuition to notice that p.i*(1-p.i) shows up in the
Fisher information (where p.i <- predict(..., type="resp") is the
predicted probability of success), that sum(p.i*(1-p.i)) == 0.959692,
and that 0.922601 of this is due to observations 99:102.
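To verify those figures (again using the name 'fit' for the fitted
model):

    p.i <- predict(fit, type = "response")   # predicted probability of success
    w <- p.i * (1 - p.i)                     # each observation's weight in the Fisher information
    sum(w)                                   # 0.959692
    sum(w[99:102])                           # 0.922601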
So the Fisher information hardly depends on observations whose
classification is nearly certain. And you expect most observations to
be classified with near certainty if the effects are really large.
--
Charles C. Berry                          (619) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry@tajo.ucsd.edu             UC San Diego
http://hacuna.ucsd.edu/members/ccb.html   La Jolla, San Diego 92093-0622