# Re: [S] Residual Deviance and log-likelihood in survreg

Sat, 14 Feb 1998 14:42:05 +1030

Terry M. Therneau writes:
> ... The deviance is defined as
>
> 2*[loglik(fitted model) - loglik(saturated model)] *scale,
>
> where the saturated model has one coef per subject.

I think you need at least to swap the models around, but a more
careful definition is needed anyway, in my view.

If there is an unknown scale parameter, even to define the
deviance you need temporarily to assume that it is known, and
hence not even involved in the likelihood maximisation. The
deviance is then

D(M) = 2 * [max log L(S) - max log L(M)] * scale

i.e., apart from factors, the maximised log-likelihood under the
saturated model less the maximised log-likelihood under the
present model. For models in the GLM family D(M) is free of
scale.

> For a Gaussian linear model with known variance sigma^2 this
> turns out to be ... (the residual sum of squares)

... and this is the deviance whether the variance is known or
not. The assumption that the scale is known is only a temporary
one in order, really, to have the maximised log-likelihood at the
saturated model well defined. (If it were not assumed known, the
saturated model would have n+1 parameters estimated with n
observations.)
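
A minimal Python sketch of this point, assuming made-up data and a
model that is nothing more than a common mean (the helper names are
hypothetical, not S functions). Holding sigma^2 fixed at two quite
different values gives the same deviance, namely the residual sum of
squares:

```python
import math

def gaussian_loglik(y, mu, sigma2):
    """Log-likelihood of independent N(mu_i, sigma2) observations."""
    n = len(y)
    rss = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

def gaussian_deviance(y, mu, sigma2):
    """D(M) = 2 * [log L(S) - log L(M)] * scale, with scale = sigma2 held fixed."""
    saturated = gaussian_loglik(y, y, sigma2)  # saturated model: mu_i = y_i
    fitted = gaussian_loglik(y, mu, sigma2)
    return 2 * (saturated - fitted) * sigma2

# Made-up data; the fitted model is a single common mean.
y = [1.0, 2.5, 2.0, 4.5, 5.0]
ybar = sum(y) / len(y)
mu = [ybar] * len(y)
rss = sum((yi - ybar) ** 2 for yi in y)

# The value assumed for sigma2 does not affect the deviance.
d_small = gaussian_deviance(y, mu, 0.5)
d_large = gaussian_deviance(y, mu, 7.0)
```

Both `d_small` and `d_large` agree with `rss` up to rounding, which is
the sense in which the assumed value of the scale is only a device.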

> ... Two nice properties are that differences in deviances are
> the same as 2 * differences in loglik, so the usual chisquared
> tests apply, and that for a good fit we have, roughly, that
> E(residual deviance) = residual df.

Both properties are only valid (approximately) if the scale
parameter is 1. The second nice property, on the whole, I tend
to agree with (although there are well-known cases where it can
be wildly out). The first needs some extra qualification because
if there is an unknown scale parameter, differences in deviances
are not even likelihood ratio tests.
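
A small Python sketch of the Gaussian case, assuming hypothetical
residual sums of squares for two nested models. When sigma^2 is
maximised over as well, the maximised log-likelihood is
-(n/2)[log(2 pi RSS/n) + 1], so the likelihood ratio statistic is
n log(RSS0/RSS1) rather than the difference in deviances:

```python
import math

def profiled_gaussian_loglik(rss, n):
    """Gaussian log-likelihood maximised over the mean parameters and
    over sigma^2, whose maximum likelihood estimate is rss / n."""
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)

# Hypothetical values: n observations, RSS under M0 and under M.
n = 20
rss0, rss1 = 50.0, 40.0

# Likelihood ratio statistic with the scale estimated under each model.
lr = 2 * (profiled_gaussian_loglik(rss1, n) - profiled_gaussian_loglik(rss0, n))

# Difference in deviances (for the Gaussian model the deviance is the RSS).
deviance_diff = rss0 - rss1
```

Here `lr` equals `n * log(rss0 / rss1)`, which is a different number
from `deviance_diff`.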

Suppose M and M0 are models for the mean with parameter space of
dimension p and p0 respectively. Further, assume that p0 < p and
that M0 is a special case of M, i.e. it is an allowable null
hypothesis model within M. There are two cases:

1. Scale parameter known (including the Binomial and Poisson
cases where it is, effectively, 1).

In this case X2 = {D(M0) - D(M)}/scale is the usual LR test
for M0 within M.

2. Scale parameter unknown.

The usual test statistic for testing M0 within M is

F = [{D(M0) - D(M)}/(p - p0)]/\hat{scale}

where \hat{scale} is an estimate of the scale parameter, often
(but not always) D(M)/(n-p). In the normal linear model case
this is the ordinary F-statistic (and happens to be equivalent
to the LR test). In other cases I believe even the approximate
distribution of this statistic still has some secrets to
yield.
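
The two statistics can be sketched in Python for the normal linear
model, assuming made-up data, M0 = common mean (p0 = 1) and M =
straight line (p = 2). The function names are hypothetical, and
\hat{scale} is taken to be D(M)/(n-p), which is only one of the
possible choices:

```python
def deviance_mean_only(y):
    """Gaussian deviance (residual sum of squares) for a common mean."""
    ybar = sum(y) / len(y)
    return sum((yi - ybar) ** 2 for yi in y)

def fit_line(x, y):
    """Least-squares straight line; returns slope, Sxx and the deviance."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sxx, rss

def chisq_stat(d0, d1, scale):
    """Case 1: X2 = {D(M0) - D(M)}/scale, with the scale known."""
    return (d0 - d1) / scale

def f_stat(d0, d1, p0, p1, scale_hat):
    """Case 2: F = [{D(M0) - D(M)}/(p - p0)]/scale_hat."""
    return ((d0 - d1) / (p1 - p0)) / scale_hat

# Made-up data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
n, p0, p = len(y), 1, 2

d0 = deviance_mean_only(y)
slope, sxx, d1 = fit_line(x, y)

scale_hat = d1 / (n - p)                 # one common estimate of the scale
F = f_stat(d0, d1, p0, p, scale_hat)

# For a single added term, F should equal the squared t-statistic of the slope.
t_squared = slope ** 2 * sxx / scale_hat
```

With one extra parameter, `F` agrees with `t_squared`, the square of
the usual t-statistic for the slope, as expected for the ordinary
F-statistic. `chisq_stat` applies only when the scale is known.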

> In censored data this doesn't work out -- the "nuisance" parts
> of the loglik don't neatly cancel, and the scale parameter is
> integral.

Essentially because with censored data you have gone beyond the
GLM family.

(I can appreciate that Terry did not want to get sidetracked with
this kind of detail, but I do believe this issue to be important.)

Bill Venables.

--
Bill Venables, Head, Dept of Statistics,    Tel.: +61 8 8303 5418
University of Adelaide,                     Fax.: +61 8 8303 3696