I wrote:
>The DEFINITION of a main effect of B has to do with prediction IN BALANCED
>DATA, said more succinctly as the absence of knowledge about B.
It should say, "said more succinctly as the absence of knowledge about A".
I also wrote:
>Statistica answers the question as follows: Not knowing A, I note that
>the mean of Y for A=1 and B=1 are the same and therefore the point
>estimate is 0.
It should say "the mean of Y for B=1 and B=2".
Murdoch then continues with a substantive point of his own:
> [...]
> there's one more way to look at this problem. Perhaps the A=2, B=2
> combination is impossible, and that's why we have no observations there.
> (E.g. A is sex, and B is use of oral contraceptives.) In that case, it
> hardly makes sense to take account of the A=2 observations at all when doing
> the comparison. Why should you care about the male experience when you are
> trying to decide whether to take oral contraceptives or not? Here you
> should really compare the 11 cell to the 12 cell. I think this is what you
> end up doing, isn't it?
Answer: Yes, although there is a subtle issue having to do with the
variance of the estimate and hence significance levels.
Here is the computer output from estimating Y on A and B, and Y on B among
the A==1 observations. (I use Stata for obvious reasons of personal bias;
remember, my address is wgould@stata.com. Any of the other packages will
produce the same results and format them in roughly the same way):

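. anova y a b

                    Number of obs =       6     R-square     =  0.9143
                    Root MSE      = .707107     Adj R-square =  0.8571

           Source |  Partial SS    df        MS          F     Prob > F
       -----------+----------------------------------------------------
            Model |       16.00     2        8.00      16.00     0.0251
                  |
                a |       16.00     1       16.00      32.00     0.0109
                b |        4.00     1        4.00       8.00     0.0663
                  |
         Residual |        1.50     3         .50
       -----------+----------------------------------------------------
            Total |       17.50     5        3.50

. anova y b if a==1

                    Number of obs =       4     R-square     =  0.8000
                    Root MSE      = .707107     Adj R-square =  0.7000

           Source |  Partial SS    df        MS          F     Prob > F
       -----------+----------------------------------------------------
            Model |        4.00     1        4.00       8.00     0.1056
                  |
                b |        4.00     1        4.00       8.00     0.1056
                  |
         Residual |        1.00     2         .50
       -----------+----------------------------------------------------
            Total |        5.00     3  1.66666667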

In the first case, I run the model which we have been discussing. In the
second, "anova y b if a==1", I estimate the model of y on B on just the
A=1 observations. Note the b line in each:
                     Source |  Partial SS    df      MS        F     Prob > F
                 -----------+------------------------------------------------
(from Y on A B)           b |        4.00     1    4.00     8.00     0.0663
(from Y on B if A=1)      b |        4.00     1    4.00     8.00     0.1056
The sums of squares and the F are the same, but the significance levels are
different. As I will explain, the Partial SS and MS must be the same.
It is merely by chance in this example that the F's came out the same; they
did so because the sample variance of Y|A=1 was the same as for Y|A=2. The
same or not, the significance levels will differ because, mechanically, the
first is evaluated as F(1,3) and the second as F(1,2).
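The F(1,3) versus F(1,2) evaluation can be checked by hand. The sketch below is
not Stata; it is a small Python illustration that relies on two standard facts:
an F with 1 numerator degree of freedom is the square of a t, and the t
distribution has simple closed forms for 2 and 3 degrees of freedom. The
function names are made up for this example.

```python
import math

def p_f_1_2(f):
    """Upper-tail probability of F(1,2), via the t distribution with 2 df."""
    t = math.sqrt(f)
    # For 2 df, P(|T| > t) = 1 - t / sqrt(t^2 + 2)
    return 1.0 - t / math.sqrt(t * t + 2.0)

def p_f_1_3(f):
    """Upper-tail probability of F(1,3), via the t distribution with 3 df."""
    x = math.sqrt(f / 3.0)  # x = t / sqrt(3), with t = sqrt(f)
    # For 3 df, P(|T| > t) = 1 - (2/pi) * (x / (1 + x^2) + atan(x))
    return 1.0 - (2.0 / math.pi) * (x / (1.0 + x * x) + math.atan(x))

F = 8.00  # the F statistic on the b line of both anova tables

print(f"F(1,3): p = {p_f_1_3(F):.4f}")  # compare with 0.0663 (anova y a b)
print(f"F(1,2): p = {p_f_1_2(F):.4f}")  # compare with 0.1056 (anova y b if a==1)
```

The statistic is identical in both calls; only the reference distribution
changes.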
Why?
The point estimate of the B effect for "females" is 2 no matter how one thinks
about the problem. The significance of the difference 2 depends on the
variance of Y. If Y naturally varies hugely, even within B, then we would
not be surprised to see a difference of 2 even when the true effect is 0.
If Y does not vary much, that is, if the 2 is large relative to the
background variance, then we trust the measurement more.
We do not, however, know the residual variance, so we estimate it. In
"anova y b if a==1", we estimate the variance using the female observations.
In "anova y a b", using all the data, we also use the "male" (A=2)
observations to improve the variance estimate (under the assumption the
variances are the same). In this case, the male observations reinforce the
finding that the variance is "small"; in fact, the sample variance among
"males" is exactly what we observed among "females", so it reinforced that
the variance is exactly what we thought it was when we estimated
"anova y b if a==1".
Our variance estimate did not change, but now being measured over more
observations, we are more certain of it, and thus more certain of our
measurement of the B effect. The denominator degrees of freedom for
the F account for this fact (if we knew the variance a priori, we would
use a chi-square distribution to evaluate the significance level).
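To make the pooling concrete, here is a small Python sketch (again, not Stata).
The six y values are an assumption: they are not given in this post, but they
are one data set that reproduces the sums of squares in the output above.
Because only three cells are occupied and the model has two main effects plus
a constant, the fit reproduces each cell mean exactly, so the residual sum of
squares is simply the within-cell sum of squares.

```python
# Hypothetical data: keys are (a, b) cells, values are the y observations.
# Assumption: these numbers are not from the post; they were chosen because
# they reproduce the printed sums of squares. The (2, 2) cell is empty.
cells = {
    (1, 1): [1.0, 2.0],
    (1, 2): [3.0, 4.0],
    (2, 1): [5.0, 6.0],
}

def within_ss(ys):
    """Sum of squared deviations from the cell mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def pooled(cell_keys):
    """Residual SS, residual df, and variance estimate over the given cells."""
    ss = sum(within_ss(cells[k]) for k in cell_keys)
    df = sum(len(cells[k]) - 1 for k in cell_keys)
    return ss, df, ss / df

female_only = pooled([(1, 1), (1, 2)])       # as in: anova y b if a==1
all_data = pooled([(1, 1), (1, 2), (2, 1)])  # as in: anova y a b

print("A==1 only: SS=%.2f df=%d MS=%.2f" % female_only)  # 1.00, 2, 0.50
print("all data : SS=%.2f df=%d MS=%.2f" % all_data)     # 1.50, 3, 0.50
```

The variance estimate is .50 either way; what the extra cell buys is one more
degree of freedom behind that estimate.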
B did not have to become more significant. Had the "male" observations
exhibited substantial variation, the significance level of the B effect would
have fallen.
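One way to see this is to keep the "female" numbers from the output above
(partial SS for b of 4.00, residual SS of 1.00 on 2 df) and vary only the
within-cell sum of squares contributed by the "males". The Python sketch below
does that. The value 18.0 for a noisy male cell is invented for illustration
(for example, two observations 6 apart), and the p-value uses the closed form
for a t with 3 df.

```python
import math

def p_f_1_3(f):
    """Upper-tail probability of F(1,3), using F(1,3) = t(3)^2."""
    x = math.sqrt(f / 3.0)
    return 1.0 - (2.0 / math.pi) * (x / (1.0 + x * x) + math.atan(x))

SS_B = 4.00       # partial SS for b, from the output above
SS_FEMALE = 1.00  # residual SS among the A==1 observations (2 df)

def b_test(ss_male):
    """F and p for b when the male cell adds ss_male (1 df) to the residual."""
    ms_resid = (SS_FEMALE + ss_male) / 3.0
    f = SS_B / ms_resid
    return f, p_f_1_3(f)

f_obs, p_obs = b_test(0.5)       # male SS implied by the output: 1.50 - 1.00
f_noisy, p_noisy = b_test(18.0)  # hypothetical, much noisier male cell

print(f"observed males: F = {f_obs:.2f}, p = {p_obs:.4f}")
print(f"noisy males   : F = {f_noisy:.2f}, p = {p_noisy:.4f}")
```

With the made-up noisy cell, the p-value for b rises well above the 0.1056
from the female-only fit, even though the point estimate of the B effect has
not moved.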
Bill.
wgould@stata.com