Re: ancova (fwd)

Joe H Ward (joeward@tenet.edu)
Sat, 25 May 1996 16:32:22 -0400


Below is a reply to a personal message from John Williams. John has said
it is OK to send it to appropriate lists.

***********************************************************************
* Joe Ward 167 East Arrowhead Dr. *
* Health Careers High School San Antonio, TX 78228-2402 *
* Univ. of Texas at San Antonio Phone: 210-433-6575 *
* joeward@tenet.edu *
***********************************************************************

On Thu, 23 May 1996, John E. Williams wrote:

> Date: Thu, 23 May 1996 18:59:36 -0400 (EDT)
> From: John E. Williams <rockport@digital.net>
> To: joeward@tenet.edu
> Subject: ancova
>
> Joe: Thanks for sending along your ideas about ANCOVA. If there is no
> correlation between variates, obviously, there is little need for such an
> analysis. The original questioner desired information regarding texts which
> gave examples of unequal n's in ANCOVA. I sent along at least two that I
> knew about. Old guys like me never forget <G>... a simplistic solution!
> Today, with SAS, SPSS, etc. we have advantages that the old slide ruler
> types could never have imagined. You should remember, however, that in
> two-way or factorial ANOVA the research objectives differ somewhat. ANCOVA
> enables the researcher to "handicap" the subjects where everyone has an
> equal (mathematical) chance at the objective. Two-way ANOVA allows the
> investigator to determine main effects, interaction effects, and secondary
> relationships. I see it as two different species. Maybe you disagree.
> That's what makes good horse races.
> john
>
=====================================================
Thanks for your thoughtful comments, John --

If they are available, you may wish to look at some of the references I
sent out to the list earlier -- particularly the "Synthesizing Regression
Models..." in American Statistician, 1969, 23, 14-20.

In the Ward-Jennings book, Introduction to Linear Models, there are
several flow charts that attempt to make several models seem "more alike
than different".

In "natural language", TWO important investigations involve:

(1) CONTROLLING THE UNCONTROLLABLE

"Are there differences in the expected RESPONSE VARIABLE (frequently
indicated as Y) for cases from DIFFERENT VALUES (LEVELS, AMOUNTS,
CATEGORIES) of ONE ATTRIBUTE, but which have the SAME VALUES (LEVELS,
AMOUNTS, CATEGORIES) on ONE or MORE OTHER ATTRIBUTES?" I refer to this as
"Controlling the Uncontrollable". Notice that at this "natural language
level", there is no mention of the form of the ASSUMED or STARTING
Prediction Model. In the beginning ALL models have predictor information
that consists of MUTUALLY EXCLUSIVE CATEGORIES (expressed as BINARY or
INDICATOR predictors, sometimes called DUMMY predictors, even though the
DUMMIES are the most BRILLIANT predictors). It is only when we decide --
through logical decision, model "exploration", or other approaches -- that
there is some RELATIONSHIP among the BINARIES that we are led to "neater"
models.

For example, when the ASSUMED MODEL is called "ANOVA" it usually implies
that ALL attributes are represented by BINARY or INDICATOR predictors
(vectors).

And when the ASSUMED MODEL is called "ANCOVA" it frequently
implies that one attribute is represented by BINARY or INDICATOR
predictors (vectors), and the others are "linearly" related (i.e., the
amount of change in Y per unit change in the attribute is CONSTANT).
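
To make the two descriptions above concrete, here is a minimal sketch in
Python with NumPy. The data, the names (group, x, X_anova, X_ancova) and the
helper indicators() are all made up for illustration; nothing here comes from
the references or from any package.

```python
import numpy as np

# Hypothetical data: a 3-category attribute and one numeric attribute x.
group = np.array(["a", "a", "b", "b", "c", "c"])
x = np.array([1.0, 2.0, 1.5, 3.0, 2.5, 4.0])

def indicators(labels):
    """Return (levels, matrix) with one 0/1 column per mutually exclusive category."""
    levels = np.unique(labels)
    return levels, (labels[:, None] == levels[None, :]).astype(float)

levels, G = indicators(group)

# "ANOVA"-style predictors: the attribute enters only through indicator columns.
X_anova = G

# "ANCOVA"-style predictors: indicators for one attribute plus a single column
# for x, i.e. a CONSTANT change in Y per unit change in x.
X_ancova = np.column_stack([G, x])

print(X_anova)
print(X_ancova)
```

Both matrices would go into the same least-squares routine; the only
difference is what we have decided to assume about the numeric attribute.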

(2) ARE THERE INTERACTIONS or IS IT REASONABLE TO CONSIDER A MODEL HAVING
NO INTERACTIONS?

If there ARE DIFFERENCES observed in (1) above, ARE THE DIFFERENCES
CONSTANT FOR ALL LEVELS OF ONE OR MORE OTHER ATTRIBUTES? (NO INTERACTION).

In the pre-computer days, it was computationally necessary to ASSUME that
the lines (or hyperplanes) are "parallel", i.e., NO INTERACTION. As a
result it was not possible to investigate whether or not there is INTERACTION.
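
With a computer the "parallel lines" assumption can be examined instead of
assumed. The sketch below compares a model with a separate slope for each
group against a restricted model with one common slope, using the usual F
statistic for a full versus restricted comparison. The data are invented,
the group sizes are deliberately unequal (4 and 6), and the function names
sse() and f_full_vs_restricted() are my own labels for this illustration.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_full_vs_restricted(X_full, X_restr, y):
    """F statistic and degrees of freedom for a restricted vs. full model."""
    n = len(y)
    p_full = np.linalg.matrix_rank(X_full)
    p_restr = np.linalg.matrix_rank(X_restr)
    sse_f, sse_r = sse(X_full, y), sse(X_restr, y)
    df1, df2 = p_full - p_restr, n - p_full
    return ((sse_r - sse_f) / df1) / (sse_f / df2), df1, df2

# Invented data: two groups with unequal n and clearly different slopes.
g = np.array([0] * 4 + [1] * 6)
x = np.array([1.0, 3.0, 5.0, 8.0, 2.0, 4.0, 6.0, 7.0, 9.0, 10.0])
noise = 0.1 * np.array([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])
y = np.where(g == 0, 1.0 + 0.5 * x, 2.0 + 1.5 * x) + noise

G = np.column_stack([g == 0, g == 1]).astype(float)

X_restr = np.column_stack([G, x])              # separate intercepts, common slope
X_full = np.column_stack([G, G * x[:, None]])  # separate intercepts and slopes

F, df1, df2 = f_full_vs_restricted(X_full, X_restr, y)
print(F, df1, df2)
```

A large F says the common-slope (NO INTERACTION) model fits noticeably worse
than the separate-slopes model, so the lines should not be treated as
parallel for these data.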

The interesting concerns are, in natural language, independent of
the form of the ASSUMED model. It is interesting that -- as you mention --
in ANOVA, information about INTERACTION is easily computed when
the cells contain equal or proportional n's. However, when the cell n's
are unequal, the computations are not easy -- but that's why we use a
computer!
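
As a small illustration of letting the computer handle unequal cell n's,
here is a sketch of an INTERACTION test in a 2 x 2 layout with cell sizes
2, 3, 4 and 1. The full model has one indicator per cell; the restricted
model allows only additive row and column differences. The data and all
variable names are invented for this example.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Invented 2 x 2 data with unequal cell sizes (2, 3, 4, 1).
a = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])   # first attribute
b = np.array([0, 0, 1, 1, 1, 0, 0, 0, 0, 1])   # second attribute
y = np.array([9.5, 10.5, 11.0, 12.0, 13.0, 13.0, 14.0, 14.0, 15.0, 20.0])

# Full model: one 0/1 indicator column per cell (mutually exclusive categories).
cell = 2 * a + b
X_full = (cell[:, None] == np.arange(4)[None, :]).astype(float)

# Restricted model: additive row and column differences, NO INTERACTION.
X_restr = np.column_stack([np.ones(len(y)), a, b]).astype(float)

n = len(y)
p_full = np.linalg.matrix_rank(X_full)
p_restr = np.linalg.matrix_rank(X_restr)
df1, df2 = p_full - p_restr, n - p_full

sse_f, sse_r = sse(X_full, y), sse(X_restr, y)
F = ((sse_r - sse_f) / df1) / (sse_f / df2)
print(F, df1, df2)
```

Nothing in the computation depends on the cell n's being equal or
proportional; the same two fits answer the question either way.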

Furthermore, SOME investigators are not sure what is meant by MAIN
EFFECTS, since they do not know what restrictions are imposed on the
ASSUMED MODEL. And in the "missing-cells-ANOVA" situation the packaged
programs can give answers to hypotheses of no interest to the
investigator. And unfortunately the investigator may never know!

However, in ANCOVA, the pre-computer algorithms had to ASSUME NO
INTERACTION. So, in the ancient past, it was fairly easy to investigate
INTERACTION in ANOVA, but more difficult in ANCOVA.

Whether we use ANOVA, ANCOVA or any analysis procedure depends on the
situation. The important idea is to KNOW WHAT WE'RE DOING.

So, as said on TV, BE CAREFUL OUT THERE!

-- Joe