Re: Statistica anova

James H Steiger (steiger@unixg.ubc.ca)
Fri, 2 Sep 94 14:10:32 EDT


*******************************************
To all forum members:

This message is posted at the request of
StatSoft, Inc. StatSoft does not have
direct access to the Internet.
*******************************************

RESPONSE OF STATSOFT, INC.
TO MS. PAIGE MILLER

This note is in response to Paige Miller's ANOVA
example with missing cells.

1. To summarize, SPSS and STATISTICA (and the other
programs) will produce identical results for the same,
user-specified ANOVA models. The default behavior of
different programs, however, will vary (e.g., Systat
will not produce any output for the default model, as
explained on this forum by Jerry Dallal).

Moreover, there is no room for ambiguity concerning
the specific models tested by STATISTICA because
STATISTICA will issue very explicit messages regarding
what precisely is being done, and then compute the
CORRECT RESULTS. Given those explicit messages, there
cannot be any way for the user to misinterpret the
results of an analysis with missing cells! Moreover,
the issue of incomplete designs (and actually of this
specific case!) is explicitly discussed in great
detail in the documentation (see STATISTICA/w manual,
Volume I, pp. 1550-1552, for a discussion of exactly
this 2x2 case; and pages 1621-1633 for a detailed
discussion of examples mostly from Milliken & Johnson,
1984; Volume I).

2. When the user requests the "Summary" for the 2-way
design described by Ms. Miller, STATISTICA/w issues
the message: "Design incomplete, test planned
comparisons or specific effects."

3. When the user requests the "A main
effect," the program will first issue a message,
LEAVING NO AMBIGUITY concerning what specific
hypothesis regarding a linear combination of means
will be tested. Specifically, the message reads:

   Because of missing cells, the following specific
   hypothesis will be tested:

      Factor   A    B    Contrast
               1    1      0.5
               1    2      0.5
               2    1     -1.0

This message is part of the default output and it
CANNOT BE SUPPRESSED; therefore, it CANNOT BE MISSED.

3.1. Only then will the program COMPUTE THE CORRECT
RESULTS for testing this specific hypothesis:

             Sum of             Mean
             Squares     df     Square      F          p-level
   Effect    77.23189     1    77.23189    147.4427    .000006
   Error      3.66667     7      .52381

In SPSS, identical results can be obtained if you set
up the problem to test the hypothesis (about the
means) stated above.
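
The internal arithmetic of the table in 3.1 is easy to
check. The short Python sketch below uses only the numbers
printed above (the variable names are mine, not
STATISTICA's). It does not recompute the p-level, which
would need an F-distribution routine.

```python
# Values copied from the STATISTICA output in item 3.1.
ss_effect, df_effect = 77.23189, 1
ss_error, df_error = 3.66667, 7

# A mean square is a sum of squares divided by its degrees of freedom.
ms_effect = ss_effect / df_effect
ms_error = ss_error / df_error

# The F ratio compares the effect mean square with the error mean square.
f_ratio = ms_effect / ms_error

print(f"MS error = {ms_error:.5f}")
print(f"F(1, 7)  = {f_ratio:.2f}")
```

Both values agree with the printed table to the precision
shown there.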

4. I fail to see how anyone can be misled by these
multiple, informative messages. At the same time,
consider what precisely is being tested in SPSS: the
default SS computed by SPSS (66.67) actually pertains
to the hypothesis that Mean(A1,B1)=Mean(A2,B1); is
this what the user really intended to test? Will the
naive user know that this is what is being tested? Of
course, you can produce the EXACT SAME RESULTS via
STATISTICA if you simply request a test of this specific
hypothesis. Note that this issue is explicitly
discussed (with an example of a 2x2 design with a
missing cell, just like the one described by Paige
Miller) on page 1550 of Volume I of the STATISTICA/w
documentation. If Ms. Miller does not have the
manual, she can also find the same explanation of
exactly her design in the on-line Help (just search
for Missing Cells Designs).

5. The issue of Type III and IV SS in incomplete
designs is of course also discussed in great detail in
the literature. For example, Milliken and Johnson,
Volume I, 1984 (Chapter 14) show how in incomplete
designs the standard Type III SS will actually test
complex compound hypotheses about means and Cell N's.

5.1. Here is another simple example; try analyzing
these data:

   A   B   Y
   1   1   1
   1   1   2
   1   2   3
   1   2   4
   2   1   5
   2   1   6

The default results computed by SPSS are:

                    Sum of            Mean               Sig
   Source           Squares    DF     Square       F     of F
   Main Effects      16.000     2      8.000   16.000    .025
      A              16.000     1     16.000   32.000    .011
      B               4.000     1      4.000    8.000    .066
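
One way to see where these numbers come from is to compare
residual sums of squares between nested models. The sketch
below is my own reading, not SPSS code: it assumes each main
effect is adjusted for the other in an additive model. The
post does not say which SPSS procedure or options were used,
so treat that as an assumption that happens to reproduce the
printed values. The function name rss_by_group is
illustrative.

```python
from collections import defaultdict

# The six observations from item 5.1 as (A, B, Y); cell A=2, B=2 is empty.
data = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (1, 2, 4), (2, 1, 5), (2, 1, 6)]


def rss_by_group(rows, key):
    """Residual SS after fitting a separate mean to each group given by key."""
    groups = defaultdict(list)
    for a, b, y in rows:
        groups[key(a, b)].append(y)
    rss = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        rss += sum((y - mean) ** 2 for y in ys)
    return rss


# With only three filled cells, the additive model mu + A + B has three free
# parameters, so it reproduces every cell mean exactly (it is saturated here).
rss_full = rss_by_group(data, lambda a, b: (a, b))
rss_without_a = rss_by_group(data, lambda a, b: b)  # model with B only
rss_without_b = rss_by_group(data, lambda a, b: a)  # model with A only
rss_null = rss_by_group(data, lambda a, b: 0)       # grand mean only

df_error = len(data) - 3
ms_error = rss_full / df_error

ss_a = rss_without_a - rss_full   # A adjusted for B
ss_b = rss_without_b - rss_full   # B adjusted for A
ss_main = rss_null - rss_full     # both main effects together, 2 df

print("A:           ", ss_a, ss_a / ms_error)
print("B:           ", ss_b, ss_b / ms_error)
print("Main Effects:", ss_main, (ss_main / 2) / ms_error)
```

Under that assumption the sketch gives SS = 16 and F = 32
for A, SS = 4 and F = 8 for B, and SS = 16 with F = 16 for
the combined main effects, matching the table.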

Is there really a marginally significant B main
effect? Here are the cell means and marginal means:

   Factor A       Factor B         Marg.Means
                1         2
              ---------------------
      1      |  1.5       3.5      |   2.5
      2      |  5.5       missing  |   5.5
              ---------------------
                3.5       3.5
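
The means in this table can be recomputed from the six
observations. The sketch below assumes the marginal means
are unweighted averages of the non-empty cell means, which
is what the printed values correspond to. The helper name
marginal_mean is mine.

```python
from collections import defaultdict

# The six observations from item 5.1 as (A, B, Y); cell A=2, B=2 is empty.
data = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (1, 2, 4), (2, 1, 5), (2, 1, 6)]

# Collect the observations in each non-empty cell and average them.
cells = defaultdict(list)
for a, b, y in data:
    cells[(a, b)].append(y)
cell_mean = {cell: sum(ys) / len(ys) for cell, ys in cells.items()}


def marginal_mean(factor_index, level):
    """Unweighted mean of the non-empty cell means at one level of a factor.

    factor_index 0 selects factor A, factor_index 1 selects factor B.
    """
    means = [m for cell, m in cell_mean.items() if cell[factor_index] == level]
    return sum(means) / len(means)


a_margins = {level: marginal_mean(0, level) for level in (1, 2)}
b_margins = {level: marginal_mean(1, level) for level in (1, 2)}

print("cell means:", cell_mean)
print("A margins: ", a_margins)
print("B margins: ", b_margins)
```

The two B margins come out equal, which is the point of the
question that follows.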

Why does SPSS report an almost significant
effect for the set of identical marginal means for
factor B? Of course, as also discussed in detail by
William Gould on this forum, there is no simple answer
to this problem. However, there can be no ambiguity
concerning what is being computed in STATISTICA,
because, unlike other packages, STATISTICA issues a
message (which cannot be suppressed) that explains
exactly what is being calculated, and it also
recommends that the user test explicitly specified
planned comparisons. When you
analyze the data with STATISTICA, you will get results
that will be entirely unambiguous regarding the nature
of the hypothesis that is being tested. We consider
this an important advantage over our competitors!
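
To make the idea of an explicitly specified planned
comparison concrete, here is a minimal Python sketch that
tests a linear combination of cell means on the six
observations from item 5.1. Two things are assumptions on my
part: that the contrast weights quoted in item 3 (0.5, 0.5,
-1.0) would also apply to this layout, which has the same
empty cell, and the standard contrast formula
SS = L^2 / sum(c^2 / n). The post shows no STATISTICA output
for these data, so the numbers below are not a reproduction
of it.

```python
# The six observations from item 5.1 as (A, B, Y); cell A=2, B=2 is empty.
data = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (1, 2, 4), (2, 1, 5), (2, 1, 6)]

cells = {}
for a, b, y in data:
    cells.setdefault((a, b), []).append(y)

n = {cell: len(ys) for cell, ys in cells.items()}
mean = {cell: sum(ys) / len(ys) for cell, ys in cells.items()}

# Pooled within-cell error: squared deviations from each cell's own mean.
ss_error = sum((y - mean[cell]) ** 2 for cell, ys in cells.items() for y in ys)
df_error = len(data) - len(cells)
ms_error = ss_error / df_error


def contrast_test(weights):
    """Return (estimate, SS, F) for a linear combination of cell means.

    weights maps a cell (A, B) to its contrast coefficient; cells that are
    not listed get weight zero.
    """
    estimate = sum(w * mean[cell] for cell, w in weights.items())
    ss = estimate ** 2 / sum(w ** 2 / n[cell] for cell, w in weights.items())
    return estimate, ss, ss / ms_error


# Weights from the message quoted in item 3: average of row A=1 versus A=2.
row_average_vs_a2 = {(1, 1): 0.5, (1, 2): 0.5, (2, 1): -1.0}
# The comparison named in item 4: Mean(A1,B1) = Mean(A2,B1).
a1b1_vs_a2b1 = {(1, 1): 1.0, (2, 1): -1.0}

print(contrast_test(row_average_vs_a2))
print(contrast_test(a1b1_vs_a2b1))
```

The first contrast gives SS = 12 and F = 24. The second
gives SS = 16 and F = 32, the same values SPSS printed for A
in item 5.1. The two hypotheses are different questions
about the same data, so their results differ.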

Maria Czyz, Ph.D.
StatSoft, Inc.