[S] Summary: predict.glm for binomial family with log link

ballr@rimu.fri.cri.nz
Wed, 8 Jul 1998 10:25:05 +1200


On 3/7/98 I was trying to fit some data using glm with binomial family and log
link.

However I noticed that
> predict(fit,type="response")
appeared to be using the logit link for predictions. Checking
> help(glm)
and
> help(glm.links)
there seemed to be no reason why I couldn't use a log link, and no error
message when I did. In fact, looking at the fitted model

> test.fit
Call:
glm(formula = y == 0 ~ x - 1, family = binomial(link = log), data = a)

Coefficients:
x
-0.716061

Degrees of Freedom: 40 Total; 39 Residual
Residual Deviance: 50.51698

it appears from the Call that a log link has indeed been used. Subsequently,
however, I discovered that binomial(link=log) actually quietly uses a logit
link, presumably because 'log' is a partial match for 'logit'.
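
The suspected mechanism can be illustrated with pmatch(), which performs
this kind of partial matching. This is only a sketch: the list of link names
below is an assumption made for illustration, not taken from the S-PLUS
source, and the code is written for present-day R rather than the S-PLUS
version discussed here.

```r
# Assumed set of link names accepted by the family (illustrative only).
links <- c("logit", "probit", "cloglog")

# "log" is not in the table, but it is a unique prefix of "logit",
# so partial matching silently resolves it to that entry.
idx <- pmatch("log", links)
chosen <- links[idx]
print(chosen)

# If "log" were itself a valid entry, the exact match would take precedence.
links.with.log <- c(links, "log")
chosen.exact <- links.with.log[pmatch("log", links.with.log)]
print(chosen.exact)
```

If the family's table of links really has no 'log' entry, a lookup of this
kind would explain why no error or warning is produced.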

My solution was to fit the model directly by numerical minimisation using
nlmin(). Another option is to build a custom family.
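
A minimal sketch of the direct approach follows. It is not the code I used:
it is written for present-day R, with optimize() standing in for nlmin(),
and the data, the search interval and the names negloglik, a.hat and p.hat
are all hypothetical. It fits the model in the Call above,
P(y == 0 | x) = exp(-a * x) with a > 0, by minimising the negative
log-likelihood.

```r
# Hypothetical data: x = number of defects per piece, y = 1 if a defect
# was seen on the log end, 0 otherwise. Not the data from this post.
x <- c(0, 0, 1, 1, 1, 2, 2, 3, 3, 4)
y <- c(0, 0, 0, 1, 0, 1, 1, 1, 0, 1)

# Negative log-likelihood for P(y == 0 | x) = exp(-a * x), with a > 0.
negloglik <- function(a) {
  q <- exp(-a * x)                      # probability of no defect on the end
  q <- pmin(pmax(q, 1e-12), 1 - 1e-12)  # keep log() finite at x = 0
  -sum(ifelse(y == 0, log(q), log(1 - q)))
}

# One-dimensional minimisation over an assumed search interval for a.
opt <- optimize(negloglik, interval = c(1e-6, 10))
a.hat <- opt$minimum

# Fitted probability of detecting a defect, as a function of x.
p.hat <- function(x) 1 - exp(-a.hat * x)
print(a.hat)
print(p.hat(0:4))
```

With a single parameter the log-likelihood is concave in a, so a
one-dimensional search is enough; a model with more terms would need a
general-purpose minimiser.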

John Wallace pointed out that
>?binomial

> gets one the help on <family>, which shows a belief in the table
> that a log link with a binomial error is not a suitable link.

and
>> binomial(link=log)
>Binomial family
> link function: Logit: log(mu/(1 - mu))
>variance function: Binomial: mu(1-mu)
>
>Yep, still Logit, it does seem like there should be a warning here.
>
>If that is truly what you want use quasi().

In case anyone was wondering why I would want to use a binomial model with
log link, here is the problem:

I was interested in the probability of finding a defect on log ends as a
function of the number n of defects per piece in the timber obtained from
sawing up the log (n is the variable x in the model above). The presence or
absence of defects on the log end is a binomial random variable. The
probability p of detecting a defect on the log end is given by

(1 - p) = (1 - p0)^(n*k)

where k and p0 are constants (k = number of pieces per log; p0 = probability
of detecting a defect on the log end resulting from a single defect in the
timber), so

log(1 - p) = -a * n

where a = -k*log(1 - p0) is a positive constant to be estimated, i.e. we have
a log link for 1 - p, the probability of detecting no defect (hence the
response y == 0 in the model above).

A desirable property of this model, as opposed to using a logit link, is
that p=0 when n=0 (when there are no defects in the log one doesn't expect
to detect them on the end). A logit link does fit the data I had about
equally well, but did not conform with prior expectations, giving
excessively high predictions when n=0.
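
The difference at n=0 can be checked numerically. The sketch below is for
present-day R and uses made-up values throughout: a, k, b0 and b1 are
hypothetical, not estimates from my data. It shows that the log-link model
gives p=0 at n=0 whatever a is, while a logit model gives
p = plogis(intercept) > 0 there, and it inverts a = -k*log(1 - p0) to
recover p0.

```r
# Hypothetical values for illustration only (not estimates from this post).
a <- 0.5    # rate constant in log(1 - p) = -a * n
k <- 4      # assumed number of pieces per log
b0 <- -1    # assumed logit intercept
b1 <- 0.9   # assumed logit slope

n <- 0:5

# Log link on 1 - p: probability of detecting a defect on the log end.
p.log <- 1 - exp(-a * n)

# Logit link with an intercept: at n = 0 this is plogis(b0), never 0.
p.logit <- plogis(b0 + b1 * n)

print(cbind(n, p.log, p.logit))

# Recover p0 from a = -k * log(1 - p0).
p0 <- 1 - exp(-a / k)
print(p0)
```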

Rod Ball

Dr Roderick D. Ball Ph 64-7-3475899
Statistician, Fx 64-7-3479380
New Zealand Forest Research Institute email: ballr@fri.cri.nz
P.B. 3020, Rotorua, New Zealand
