Re: Statistica anova

James H Steiger (steiger@unixg.ubc.ca)
Mon, 5 Sep 94 21:40:44 EDT


Maria Czyz, following a thread initiated by Paige Miller,
gave the following interesting example of a 2x2 ANOVA
with a missing cell.

Czyz wrote:

>try analyzing these data:
> A B Y
> 1 1 1
> 1 1 2
> 1 2 3
> 1 2 4
> 2 1 5
> 2 1 6
>The default results computed by SPSS are:
>                  Sum of          Mean              Sig
>Source           Squares    DF   Square        F   of F
>Main Effects      16.000     2    8.000   16.000   .025
>   A              16.000     1   16.000   32.000   .011
>   B               4.000     1    4.000    8.000   .066
>
>Is there really a marginally significant B main
>effect? Here are the cell means and marginal means:
>
>Factor A      Factor B       Marg.Means
>             1         2
>          ----------------
>1        |  1.5       3.5   |   2.5
>2        |  5.5     missing |   5.5
>          ----------------
>            3.5       3.5
>
>Why does SPSS report an almost significant
>effect for the set of identical marginal means for
>factor B? Of course, as also discussed in detail by
>William Gould on this forum, there is no simple answer
>to this problem. However, there can be no ambiguity
>concerning what is being computed in STATISTICA,
>because, unlike other packages, STATISTICA issues a
>message (which cannot be suppressed) which explains
>what exactly is being calculated, and it also
>recommends to the user that he/she should test
>explicitly specified planned comparisons.
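
For readers who want to check Czyz's table by hand, here is a minimal Python sketch that recomputes the cell means and marginal means from the six observations. The variable names are mine, and the marginal means are taken as unweighted averages of the filled cell means; with two observations in every filled cell, the weighted averages come out the same.

```python
from collections import defaultdict

# Czyz's data as (A, B, Y) triples; cell (A=2, B=2) is empty.
data = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (1, 2, 4), (2, 1, 5), (2, 1, 6)]

cells = defaultdict(list)
for a, b, y in data:
    cells[(a, b)].append(y)

cell_means = {key: sum(ys) / len(ys) for key, ys in cells.items()}

def marginal(index, level):
    """Unweighted mean of the filled cell means at one level of a factor."""
    means = [m for key, m in cell_means.items() if key[index] == level]
    return sum(means) / len(means)

a_margins = {level: marginal(0, level) for level in (1, 2)}
b_margins = {level: marginal(1, level) for level in (1, 2)}

# Between-cell and within-cell (error) sums of squares.
grand_mean = sum(y for _, _, y in data) / len(data)
ss_cells = sum(len(ys) * (cell_means[k] - grand_mean) ** 2
               for k, ys in cells.items())
ss_error = sum((y - cell_means[k]) ** 2 for k, ys in cells.items() for y in ys)
df_error = len(data) - len(cells)

print(cell_means)
print(a_margins, b_margins)
print(ss_cells, ss_error, df_error)
```

The between-cell sum of squares is 16 on 2 degrees of freedom, which is the "Main Effects" line of the SPSS table, and the error mean square is 1.5/3 = 0.5, which is the denominator implied by the F column.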

After clarifying his gender (several authors thought he was female),
in his latest post Paige Miller typed:

>Different statisticians will prefer to test different contrasts. I happen to
>feel that this contrast is meaningless when there is a missing cell. The only
>fair comparison for factor A is
>
> A B Contrast
> 1 1 1
> 1 2 0
> 2 1 -1

This latter contrast is a "subset" hypothesis. It essentially performs
an ANOVA on the subset of cells for which factor B has been measured.
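
Applied to Czyz's six observations, Miller's contrast reproduces the A line of the SPSS table quoted earlier, and the analogous contrast for B (cell 1,1 against cell 1,2) reproduces the B line. The Python sketch below shows the arithmetic. The function names are my own, the cell summaries are typed in from Czyz's table, and the p-value helper uses the closed-form t distribution with 3 degrees of freedom, so it applies only to F(1, 3).

```python
import math

# Filled-cell summaries from Czyz's data: (A, B) -> (mean, n).
cells = {(1, 1): (1.5, 2), (1, 2): (3.5, 2), (2, 1): (5.5, 2)}
ms_error = 1.5 / 3  # pooled within-cell variance, 3 error df

def contrast_f(weights):
    """Return (SS, F) for a 1-df contrast on the filled cell means."""
    estimate = sum(w * cells[k][0] for k, w in weights.items())
    scale = sum(w ** 2 / cells[k][1] for k, w in weights.items())
    ss = estimate ** 2 / scale
    return ss, ss / ms_error

def p_value_1_3(f):
    """Upper-tail p for F(1, 3), from the closed-form t CDF with 3 df."""
    u = math.sqrt(f) / math.sqrt(3)
    cdf = 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    return 2 * (1 - cdf)

# Miller's subset contrast for A, and the matching one for B.
ss_a, f_a = contrast_f({(1, 1): 1, (1, 2): 0, (2, 1): -1})
ss_b, f_b = contrast_f({(1, 1): 1, (1, 2): -1, (2, 1): 0})

print(ss_a, f_a, round(p_value_1_3(f_a), 3))
print(ss_b, f_b, round(p_value_1_3(f_b), 3))
```

The sums of squares come to 16 and 4, with F ratios of 32 and 8, as in the SPSS output.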

Statistica's default hypothesis simply ignores B (by treating all
cells equally and averaging their means) when computing a "main
effect" for A. We could call the Statistica type of hypothesis a "full
data hypothesis," just for the sake of simplicity in subsequent
discussion.
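
To make the "full data hypothesis" concrete, here is a small Python sketch on Czyz's six observations. It assumes the hypothesis compares equally weighted averages of the filled cell means at each level of a factor, which is my reading of the description above. It is a sketch under that assumption, not a reproduction of Statistica's printed output, and the names are hypothetical.

```python
# Filled-cell summaries from Czyz's data: (A, B) -> (mean, n).
cells = {(1, 1): (1.5, 2), (1, 2): (3.5, 2), (2, 1): (5.5, 2)}
ms_error = 1.5 / 3  # pooled within-cell variance, 3 error df

def contrast_f(weights):
    """Return (estimate, SS, F) for a 1-df contrast on filled cell means."""
    estimate = sum(w * cells[k][0] for k, w in weights.items())
    scale = sum(w ** 2 / cells[k][1] for k, w in weights.items())
    ss = estimate ** 2 / scale
    return estimate, ss, ss / ms_error

# A: average of both filled A=1 cells versus the single filled A=2 cell.
est_a, ss_a, f_a = contrast_f({(1, 1): 0.5, (1, 2): 0.5, (2, 1): -1})

# B: average of both filled B=1 cells versus the single filled B=2 cell.
est_b, ss_b, f_b = contrast_f({(1, 1): 0.5, (2, 1): 0.5, (1, 2): -1})

print(est_a, ss_a, f_a)
print(est_b, ss_b, f_b)
```

Under this reading the B contrast is exactly zero, which fits the identical marginal means of 3.5 in Czyz's table, while the A contrast gives a sum of squares of 12 (F = 24) rather than the 16 (F = 32) obtained when only the B=1 cells are compared.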

If we can move away from the more petty aspects of this discussion
(i.e., an attempt to score points for/against one's favorite
statistical package), I think we might discover an interesting
substantive issue here.

In many cases, I would find myself agreeing with Paige Miller that it
makes more sense to compare factor A only for levels of B that are
available.

In such situations, Statistica makes it easy to test the
desired hypothesis. Measured from the time you double click the
Statistica icon to the time you exit the program, it requires
approximately 90 seconds to obtain the following results...

+---------------------+---------------------------------+
| STAT.               |Summary of Contrasts (miller.sta)|
| GENERAL             |Between-Groups                   |
| MANOVA              |                                 |
+---------------------+--------------+------------------+
|                     |              |
|   A       B         |     c. 1     |
+---------------------+--------------+
| G_1:1   G_1:1       |       1      |
| G_1:1   G_2:2       |       0      |
| G_2:2   G_1:1       |      -1      |
| G_2:2   G_2:2       |       0      |
+---------------------+--------------+

+----------+------------------------------------------------------+
| STAT.    | Planned Comparison (miller.sta)                      |
| GENERAL  | 1-A, 2-B                                             |
| MANOVA   |                                                      |
+----------+----------+----------+----------+----------+----------+
| Univar.  |  Sum of  |          |   Mean   |          |          |
| Test     | Squares  |    df    |  Square  |    F     | p-level  |
+----------+----------+----------+----------+----------+----------+
| Effect   | 66.66666 |        1 | 66.66666 | 127.2727 |  .000010 |
| Error    |  3.66667 |        7 |   .52381 |          |          |
+----------+----------+----------+----------+----------+----------+

+---------------------+---------------------------------+
| STAT.               |Summary of Contrasts (miller.sta)|
| GENERAL             |Between-Groups                   |
| MANOVA              |                                 |
+---------------------+--------------+------------------+
|                     |              |
|   A       B         |     c. 1     |
+---------------------+--------------+
| G_1:1   G_1:1       |       1      |
| G_1:1   G_2:2       |      -1      |
| G_2:2   G_1:1       |       0      |
| G_2:2   G_2:2       |       0      |
+---------------------+--------------+

+----------+------------------------------------------------------+
| STAT.    | Planned Comparison (miller.sta)                      |
| GENERAL  | 1-A, 2-B                                             |
| MANOVA   |                                                      |
+----------+----------+----------+----------+----------+----------+
| Univar.  |  Sum of  |          |   Mean   |          |          |
| Test     | Squares  |    df    |  Square  |    F     | p-level  |
+----------+----------+----------+----------+----------+----------+
| Effect   | 2.333333 |        1 | 2.333333 | 4.454545 |  .072726 |
| Error    | 3.666667 |        7 |  .523810 |          |          |
+----------+----------+----------+----------+----------+----------+

So of course you have the option of testing the above hypothesis,
and Statistica makes it rather easy. Note that these results agree
with those of the other programs.
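
The internal arithmetic of the two Planned Comparison tables is easy to verify: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the effect mean square to the error mean square. The Python sketch below uses only the numbers printed in the tables (the raw data in miller.sta are not shown here, so nothing else can be recomputed), and the function name is my own.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)

# Values copied from the two Planned Comparison tables.
ms_error = 3.66667 / 7
f_first = f_ratio(66.66666, 1, 3.66667, 7)
f_second = f_ratio(2.333333, 1, 3.666667, 7)

print(ms_error, f_first, f_second)
```

Because the printed sums of squares are themselves rounded, the recomputed ratios match the printed F values only to about four decimal places.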

However, modern statistical packages like Statistica, which provides
very extensive, user-friendly capabilities for examining assumptions
in the analysis of variance, are moving away from the notion that
every situation has a fixed "only fair" remedy that should be fed to
the user without comment.

Statistica's "default" behavior differs from the other packages in
this case. I suspect one would have to search hard to find many
instances of consequence where this is true.

I doubt this particular instance is of great consequence. Hopefully,
not very many statisticians perform missing-cell ANOVAs without
thinking about what they are testing.

But returning to Paige Miller's assertion, is the "only fair"
comparison in a 2x2 design with a missing cell the one that compares
the "available cells" in a given row or column?

Certainly Bill Gould, in his erudite earlier posting, did not slam the
door on the "full data hypothesis" contrast.

Gould wrote:

>What do YOU mean by a main effect?
>
>The language of ANOVA, once we move from simple cases, is insufficiently
>precise. It leaves room for honest statisticians to disagree as to
>what the linear hypothesis ought to be corresponding to the vague words.
>
>Stata, SAS, and the rest have one definition.
>
>Statistica has another.

In other words, by playing the "prestige" game, we might be right more
often than not, but we also might be missing
something very interesting.

A timely quote might help put matters in perspective.

Searle, in the book "Linear Models for Unbalanced Data," cautions
against rigidity in a situation very similar to that dealt with here.

In the chapter on "The 2-Way Crossed Classification, Some Cells
Empty", Searle says the following:

"It is the investigator's knowledge of the study for which the data
were collected, of the data themselves, and of the source of the data
that produces what is of interest so far as linear functions of
filled-cell mu's are concerned....the question of what to estimate
rests squarely on the investigator....making such decisions is not
just the task of the statistician - far from it. Investigators
familiar with the data, their source, and the data gathering process
have to be involved. Not only do these aspects of data have to be
taken into account when deciding on what linear functions of cell
means (of filled cells) are of interest, but also the purpose for
which the linear function is intended must be considered." (p. 158)

Searle goes on (p. 159) to consider a case precisely analogous to the
one under consideration here, except that the ANOVA is 2x3 instead of
2x2. Searle considers BOTH contrasts, of the type preferred by
Miller, AND the type given as the default by Statistica. Searle states
that "generally speaking" the hypothesis of the type favored by
Statistica "does not seem very appealing."

However, Searle quickly adds that in the case where it is of interest,
such a hypothesis can be estimated and tested.

So S.R. Searle and William Gould seem to be in essential agreement.

1. In missing cell designs, you have to know what you are doing,
where your data came from, and what you want. Any program that offers
"default" behavior without comment might be dangerous. Stata, SAS, and
Statistica all warn the user quite openly about what they are testing.

2. Deciding whether a hypothesis is "reasonable" cannot be done
in the abstract. You have to take a host of factors into
consideration.

This second point would seem to call into question Mr. Miller's
assertion that the "subset" hypothesis is the "only fair comparison"
in such designs.

So after a number of postings, we have finally arrived at that point
where the intellectual exchange might just become fascinating.

Specifically,

1) Can we imagine substantive examples where the "subset hypothesis"
is the one we definitely wish to test? Why?

2) Can we imagine substantive examples where the "full data hypothesis"
is the one we definitely wish to test? Why?

Let's use our imagination, and post our examples here on this
forum. Preferably, each poster should try to come up with examples of
EACH case. My suspicion, in line with Searle's judgement, and Miller's
preference, is that situations of Type 1 are more obvious, and that
situations of Type 2 might require greater creativity and imagination.

However, if situations of Type 2 exist, we would certainly be doing
our students a disservice by pretending that the default behavior of
SPSS and SAS is the "only fair" behavior.

I've been puzzling over this, and I THINK I have a clearcut example of
each situation. Keep in mind that Statistica, SAS, and Data Desk can
all test both types of contrast. What I want is some really good,
really provocative examples to present to my students.

Again, thanks to William Gould for his very interesting contributions
to this discussion.

For those who prefer not to post to this newsgroup, I would still
appreciate seeing your examples, if you wish to send them to me directly.

James H. Steiger (steiger@unixg.ubc.ca)
University of British Columbia
Dept. of Psychology
Vancouver, B.C., V6T 1Z4