Re: [S] "Sums of squares", the taxonomy issue again.

J. Philip Miller
Wed, 29 Apr 1998 20:03:57 -0500 (CDT)

> There've been some good points made in this debate. Bill
> Venables did a useful service by getting it started. So is
> the consensus that it is the perception that the FDA requires
> Type IV sums of squares (I wrongly called them Type III
> some time back) that propels package developers to include
> them?
The only difference between Type III and Type IV has to do with missing (empty) cells.

> We're perhaps off topic, but let us, with Saint Augustine I
> think, sin boldly. There's an initiative that the statistical
> community could take that could have a huge impact, and
> probably also cause an uproar. Individuals would take it upon
> themselves, or agree, to go through the last year's copies of
> one or other journal, looking at the presentation of
> statistical results. Attention would focus only on obvious,
> gross errors. I can name a few journals where such an
> exercise would yield a rich harvest. The results of these
> efforts would be written up and published.
As someone who was involved in the discussions of the Type I-IV sums of
squares in various forums in the late 70's and early 80's, it is not clear to
me that this discussion has "rights" and "wrongs" so much as different
perspectives. To me the problem is trying to do analyses by cookbook, or by
applying "standard" analyses, rather than thinking about what hypotheses are
being tested and which ones make sense.

Not necessarily to start a long discussion, but take the issue of using a
model where clinic is one of the factors and there is an uneven number of
subjects at each clinic. The first question I would ask is whether the
differing number of patients at each clinic was part of the design or not:
does the varying frequency represent varying prevalences of the condition at
the various clinics, or does it represent the zealousness of the investigators
in recruiting subjects? If the former, then I think a clear case can be made
for weighting by the frequency of patients. If the latter, I am not sure that
I want the hypothesis I am testing to change with the performance of the
investigators!
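
The two weighting choices can be made concrete with a small sketch. All
clinic names, counts, and means below are hypothetical numbers invented for
illustration; the point is only the contrast between frequency-weighted and
equally weighted clinic means.

```python
# Hypothetical per-clinic summaries: (clinic, n_patients, mean_response).
clinics = [("A", 40, 10.0), ("B", 10, 16.0), ("C", 50, 12.0)]

total_n = sum(n for _, n, _ in clinics)

# Weighted by clinic frequency -- appropriate if the unequal n reflect
# true prevalence, i.e. the frequencies are part of the design:
weighted = sum(n * m for _, n, m in clinics) / total_n

# Equal weight per clinic (the "Type III"-style unweighted-means view) --
# appropriate if the n merely reflect investigator zeal:
unweighted = sum(m for _, _, m in clinics) / len(clinics)

print(weighted)    # -> 11.6
print(unweighted)  # -> 12.666...
```

The two estimates differ because clinic B, with few patients but a high
mean, counts for little under frequency weighting but fully under equal
weighting -- which is exactly why the design question matters.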

There are other perspectives, e.g. testing the treatment effect with the
maximum power, independent of the details of the hypothesis being tested.

I am reluctant to label any of these as right or wrong for all studies. I do
think that reasonable statistical software should give the statistician the
widest range of options for how to do the analysis. I have long faulted SAS,
for example, for not being able to apply arbitrary, prespecified weights.

Personally, I would cast most of these into a mixed-model formulation, which
all but requires a good deal more understanding of the problem than just
running a canned analysis.
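
The mixed-model idea can be sketched minimally: treat clinic as a random
intercept and estimate the between-clinic and within-clinic variance
components. The sketch below uses simulated data, a balanced layout, and the
classical expected-mean-squares (method-of-moments) estimator -- all
simplifying assumptions for illustration, not a general mixed-model fit.

```python
import random

random.seed(1)

k, n = 200, 50               # clinics, patients per clinic (balanced)
sigma_b, sigma_w = 1.0, 2.0  # true between- and within-clinic SDs

# Simulate a random intercept per clinic plus within-clinic noise.
data = []
for _ in range(k):
    b = random.gauss(0, sigma_b)
    data.append([b + random.gauss(0, sigma_w) for _ in range(n)])

clinic_means = [sum(ys) / n for ys in data]
grand_mean = sum(clinic_means) / k

# Expected mean squares: E[MSB] = n*sigma_b^2 + sigma_w^2, E[MSW] = sigma_w^2
msb = n * sum((m - grand_mean) ** 2 for m in clinic_means) / (k - 1)
msw = sum((y - m) ** 2
          for ys, m in zip(data, clinic_means)
          for y in ys) / (k * (n - 1))

var_within = msw
var_between = (msb - msw) / n
print(var_within, var_between)  # roughly 4.0 and 1.0
```

With clinic as a random effect, the question "how should clinics be
weighted?" becomes a question about the variance components, which is why
the formulation forces more thought about the problem up front.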


     J. Philip Miller, Professor, Division of Biostatistics, Box 8067
       Washington University School of Medicine, St. Louis MO 63110 - (314) 362-3617 [362-2693(FAX)]
This message was distributed by the s-news mailing list. To unsubscribe,
send e-mail with the BODY of the message: unsubscribe s-news