Re: [S] Residual Deviance and log-likelihood in survreg

Jens Oehlschlaegel (oehl@Psyres-Stuttgart.DE)
Wed, 18 Feb 1998 22:49:24 +0100 (MET)


Dear Prof. Ripley,

thank you for your extensive comments on those deviance and LL issues.
Please forgive me if I'm still struggling with three implications of your
recent writings, probably due to my ignorance.

You wrote:

> That was prescience! I have just discovered that the log-likelihood
> given by survreg is that viewing link(T) (by default, log T) as the
> data, whereas the natural formulation is to view T (time to event) as the
> data. This alters -2 log L by an additive constant that depends only
> on the non-censored observations and the `link'. So there is a
> problem in comparing -2 log L if the `link' is changed, as then the
> base measure is changed.

and recently you wrote

> The likelihoods are comparable in the technical sense that they are
> on the same probability space, and densities with respect to the same
> measure. So a larger log-likelihood does indicate a better fit.
> [snipped]
> My ultimate answer is that I do use AIC with parametric survival
> models, cautiously, and find it highly related to (much more expensive)
> cross-validated measures of the performance I am interested in.

So I assume the key to combining those two statements is "with
respect to the same measure", i.e. cautiously comparing a Weibull with a
Log-Normal is ok, but comparing a Log-Normal with an Identity-Normal is not.
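
The additive constant mentioned in the first quotation can be written out explicitly. This is only a sketch under stated assumptions: right-censoring only, Y = log T with density g and survivor function S_Y, and delta_i = 1 for an uncensored case (this notation is introduced here and does not appear in the quoted posts).

```latex
\begin{align*}
f_T(t) &= \frac{1}{t}\, g(\log t), \qquad S_T(t) = S_Y(\log t), \\
\log L_T &= \sum_i \Bigl[ \delta_i \log f_T(t_i) + (1-\delta_i) \log S_T(t_i) \Bigr]
          = \log L_Y - \sum_i \delta_i \log t_i, \\
-2 \log L_T &= -2 \log L_Y + 2 \sum_i \delta_i \log t_i .
\end{align*}
```

The correction involves only the uncensored times and the transform, not the fitted parameters, which is why it cancels when two models share the same link.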

However, I'm struggling with some implications:

(1)
From what you say it looks like

survreg(Surv(time,event)~x, link="log")

equals

survreg(Surv(log(time),event)~x, link="identity")

and trying that indeed gave identical results (WinS+3.3),
except that if some cases have time<1, then log(time) < 0
and the second version of survreg stops with an error.
(F. Harrell's psm() swallows both versions.)

If the second version is a legal version, could I go on doing a
hybrid version like

psm(Surv(ifelse(time<1, log(time), time-1), event)~x
,link="identity")

which treats time logarithmic close to zero and linear after time=1?
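
On the hybrid idea: a transform g applied to the response changes the base measure, so fits made on different transformed scales would only be comparable after adding the sum of log g'(t) over the uncensored cases to each reported log-likelihood. The following is a minimal Python sketch of that correction, assuming the software reports the likelihood of g(T) as the first quotation suggests; the function names and the data are made up for illustration.

```python
import math

def hybrid(t):
    """Hybrid transform from the post: log(t) below 1, t - 1 from 1 onwards."""
    return math.log(t) if t < 1 else t - 1.0

def hybrid_log_deriv(t):
    """log of d/dt hybrid(t): -log(t) below 1, 0 from 1 onwards."""
    return -math.log(t) if t < 1 else 0.0

def log_jacobian(times, events, log_deriv):
    """Sum of log g'(t) over uncensored observations (event == 1)."""
    return sum(log_deriv(t) for t, d in zip(times, events) if d)

# Made-up data: event == 1 means uncensored, 0 means right-censored.
times = [0.25, 0.5, 1.0, 2.0, 4.0]
events = [1, 0, 1, 1, 0]

# Corrections to add to a reported log-likelihood to express it on the T scale.
corr_log = log_jacobian(times, events, lambda t: -math.log(t))  # g(t) = log t
corr_identity = log_jacobian(times, events, lambda t: 0.0)      # g(t) = t
corr_hybrid = log_jacobian(times, events, hybrid_log_deriv)     # hybrid g

print(corr_log, corr_identity, corr_hybrid)
```

The hybrid transform is continuous and has a continuous derivative at time=1, so the correction is well defined there; whether the resulting model is sensible is a separate question.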

(2)
Furthermore, is there any way to use the following

-2LL0, -2LL1, LR = -2LL0 + 2LL1, $null.deviance, $deviance

for a goodness-of-fit comparison between a
log-normal and an identity-normal parametric survival model? Or do I have
to accept that there is no such goodness-of-fit measure?

(3) The Nagelkerke R-square seems not to be a goodness-of-fit measure; it
evaluates the association with the covariates rather than the fit of the error
distribution. But on which quantity should the Nagelkerke R-square be
computed? Should it be based on the -2LLs (as in Harrell's cph objects)

R2.nagelkerke = R2.LR / R2.max

where

R2.LR = 1 - exp(-LR/n)
R2.max= 1 - exp(2*LL0/n)

or is the LL0 wrong in the latter, and should it be based on the
deviance?

[F. Harrell replaced LL0 with $null.deviance for his psm()-nagelkerke
version of survreg(), if I read the code in validate.psm() correctly]

but the recent discussion cast some doubt on the deviance in survreg() and
psm().
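
To make the dependence on LL0 concrete, here is a minimal Python sketch of the usual Nagelkerke definition, in which the maximum attainable value is 1 - exp(2*LL0/n) with LL0 the null log-likelihood (not -2LL0). The function name and the numbers are hypothetical. It also shows that shifting both log-likelihoods by the same constant, as a change of base measure would, leaves R2.LR unchanged but alters R2.max and hence the rescaled value.

```python
import math

def nagelkerke_r2(ll0, ll1, n):
    """Nagelkerke R^2 from null and fitted log-likelihoods (not -2LL) and n cases."""
    lr = 2.0 * (ll1 - ll0)                   # LR = -2LL0 + 2LL1
    r2_lr = 1.0 - math.exp(-lr / n)          # likelihood-ratio R^2
    r2_max = 1.0 - math.exp(2.0 * ll0 / n)   # value of r2_lr when ll1 == 0
    return r2_lr / r2_max

# Hypothetical values, for illustration only.
ll0, ll1, n = -120.0, -105.0, 100
r2 = nagelkerke_r2(ll0, ll1, n)

# The same two models with an additive constant c in both log-likelihoods.
c = -10.0
r2_shifted = nagelkerke_r2(ll0 + c, ll1 + c, n)

print(r2, r2_shifted)
```

Because log-likelihoods of continuous data are not bounded above by zero, the rescaling by R2.max needs care for survival models; the sketch only illustrates the arithmetic.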

Please advise me/us through these difficult issues.

Best regards

Jens Oehlschlaegel

--
Jens Oehlschlaegel-Akiyoshi
Psychologist/Statistician
Project TR-EAT + COST Action B6
                                                 F.rankfurt
oehl@psyres-stuttgart.de                         A.ttention
+49 711 6781-408 (phone)                         I.nventory
+49 711 6876902  (fax)                           R .-----.
                                                  / ----- \
Center for Psychotherapy Research                | | 0 0 | |
Christian-Belser-Strasse 79a                     | |  ?  | |
D-70597 Stuttgart Germany                         \ ----- /
-------------------------------------------------- '-----' -
(general disclaimer)                             it's better
