I have included my initial question and two similar detailed answers
that I believe are similarly flawed. My thanks, of course, to the
authors for their effort.
When S-PLUS, or any program, calculates power it uses the normal
distribution, not the t distribution. This is reasonable because, for
the power calculation, you are specifying the population means and the
population standard deviations, not estimating them from data.
After you have actually run the study, you are forced to use the
observed means and the standard deviation calculated from the sample.
Gosset created the t-test to address the fact that, in small samples,
both significance tests and confidence intervals must reflect the
behavior of a t statistic, not a normal distribution.
Therefore, from my perspective, it is perfectly legitimate to use a
normal distribution for your power calculation, where the population
standard deviation is taken as given. But after you have run the study,
or run the simulation, you must test for statistical significance using
the t distribution.
I have repeated my simulation experiments with sample sizes of 3, 4,
and 5, and you can clearly see that the power calculations become
meaningful. It seems that sample size 2 is uniquely unstable in the
power program of S-PLUS 4.5.
Sincerely,
Eran Bellin, M.D.
Director Outcome Analysis and Decision Support
Montefiore Medical Center
Bronx, N.Y. 10467
Let us repeat the analysis, now requiring three members in each sample.
> for (i in 1:1000) z[i] <- t.test(sample(boys, 3, replace=T), sample(girls, 3, replace=T))$p.value
> length(z[z<.05])
[1] 881
If we require three members in each group, we achieve nominal
significance 881 of 1,000 times, a power of 88.1%.
Similar analysis shows:
> for (i in 1:1000) z[i] <- t.test(sample(boys, 4, replace=T), sample(girls, 4, replace=T))$p.value
> length(z[z<.05])
[1] 985
For samples of size four we have a power of 98.5%.
> for (i in 1:1000) z[i] <- t.test(sample(boys, 5, replace=T), sample(girls, 5, replace=T))$p.value
> length(z[z<.05])
[1] 999
For samples of size five, we have a power of 99.9%.
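As a cross-check on these Monte Carlo counts, the exact power of the equal-variance two-sample t test can be computed from the noncentral t distribution. A sketch in Python with scipy (these exact values assume sampling from true normal populations, rather than resampling from the finite boys/girls vectors, so they will differ slightly from the counts above):

```python
from math import sqrt
from scipy import stats

def t_power(n, delta=14, sd=4, alpha=0.05):
    # Exact power of the two-sided, equal-variance two-sample t test.
    df = 2 * n - 2
    ncp = delta / (sd * sqrt(2 / n))        # noncentrality parameter
    crit = stats.t.ppf(1 - alpha / 2, df)   # t critical value
    # P(|T'| > crit) when T' is noncentral t with noncentrality ncp
    return (1 - stats.nct.cdf(crit, df, ncp)) + stats.nct.cdf(-crit, df, ncp)

for n in (2, 3, 4, 5):
    print(n, round(t_power(n), 3))
```

The exact power climbs steeply from n = 2 to n = 5, in line with the simulated 88.1%, 98.5%, and 99.9% figures.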
_________________________________________________________________
Initial question:
I think that there is something wrong with the power program.
I asked for the normal mean power calculation of:
*** Power Table ***
  mean1 sd1 mean2 sd2 delta alpha power n1 n2
1    66   4    80   4    14  0.05   0.8  2  2
This implies that if you compare a group with a mean of 80 and a
standard deviation of 4 to a group with a mean of 66 and a standard
deviation of 4, you need only 2 members from each to find a
statistically significant difference 80% of the time at the .05 level.
Well, I then tested this by creating two vectors with these values,
running t.tests on samples of 2 members from each at a time, and
looking at the resulting p values.
> boys<-rnorm(1000,80,4)
> girls<-rnorm(1000,66,4)
> for (i in 1:1000) z[i] <- t.test(sample(boys, 2, replace=T), sample(girls, 2, replace=T))$p.value
I then asked how many of these z observations have a p value of .05 or
less. The result was:
> length(z[z<.05])
[1] 484
> length(z)
[1] 1000
Only 48.4% had a p value of .05 or less. I should have expected 80%.
Why the difference?
Thank you in advance.
Sincerely,
Eran Bellin, M.D.
Department Outcome Analysis and Decision Support
Montefiore Medical Center
Bronx, N.Y.
Answers:
Subject: Re: [S] Flaw in power program - normal means
Date: Wed, 22 Jul 1998 08:37:03 +0930
From: "Prof. Richard Jarrett" <rjarrett@stats.adelaide.edu.au>
To: belliney@idt.net
I presume the power program uses the approximation that the sd is
known. If so, it bases its calculations on a z rather than a t. In
your case you have a t on 2 df, which has VERY different properties
from the normal distribution assumed by the program.
For groups of size 10 or more, the difference between the t and the z
has almost no effect on the results.
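How different the t on 2 df really is can be seen directly from the two-sided 5% critical values. A quick check in Python with scipy (any quantile function would give the same numbers):

```python
from scipy import stats

z_crit = stats.norm.ppf(0.975)      # 1.960 for the known-sd z test
for df in (2, 4, 8, 18, 38):        # df = 2n - 2 for n = 2, 3, 5, 10, 20
    t_crit = stats.t.ppf(0.975, df)
    print(f"df={df:2d}  t={t_crit:.3f}  z={z_crit:.3f}")
# At df = 2 the t critical value is about 4.303, more than twice the
# z value; by df = 18 (n = 10 per group) the two nearly agree.
```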
Richard Jarrett
______________________________________________________
Prof Richard Jarrett | Phone: +61 8 8303 3034
Dept of Statistics | Fax: +61 8 8303 3696
Univ of Adelaide | email: rjarrett@stats.adelaide.edu.au
Adelaide 5005 Australia| Web: http://www.maths.adelaide.edu.au/Stats
Subject: [S] Re: Flaw in power program
Date: Tue, 21 Jul 1998 16:33:49 -0700
From: Steve Allan <sallan@statsci.com>
To: s-news@wubios.wustl.edu
Dr. Bellin,
The formula used to compute sample sizes is based on the quantiles of
the Normal distribution, so sample sizes of less than about 10 should
be looked at askance when using the t test.
If you run the simulation using the Z statistic, you get about 95%
power. The exact sample size (check the 'Options' page in the dialog
for exact N) returns 1.28, so with n=2 the power is actually higher
than 80% using a Z statistic.
ztest <- function()
{
        x <- rnorm(2, 66, 4)
        y <- rnorm(2, 80, 4)
        z <- (mean(x) - mean(y)) / 4   # known-sd z statistic: se = 4*sqrt(1/2 + 1/2) = 4
        1 - pnorm(abs(z))              # one-sided p value
}
> z <- numeric(1000)
> for(i in 1:1000) z[i] <- ztest()
> sum(z < 0.05)
[1] 969
Conversely, if you select 'Min. Difference' in the dialog and enter
sample sizes of 10, you get an alternative of 85.012.
> boys <- rnorm(1000, 80, 4)
> girls <- rnorm(1000, 85.012, 4)
> for(i in 1:1000) z[i] <- t.test(sample(girls, 10, rep=T), sample(boys, 10, rep=T))$p.value
> sum(z < 0.05)
[1] 801
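Both dialog outputs quoted here, the exact N of 1.28 and the minimum difference of 85.012, follow from the standard known-sd (normal) formulas. A sketch in Python with scipy (the function names are mine, not S-PLUS's):

```python
from math import sqrt
from scipy.stats import norm

def exact_n(delta, sd, alpha=0.05, power=0.8):
    # Per-group n for a two-sided, two-sample z test with known sd.
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * (sd * z / delta) ** 2

def min_difference(n, sd, alpha=0.05, power=0.8):
    # Smallest difference detectable with the given power at per-group n.
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return sd * sqrt(2 / n) * z

print(round(exact_n(14, 4), 2))         # 1.28, the dialog's exact N
print(round(min_difference(10, 4), 3))  # 5.012, i.e. 80 + 5.012 = 85.012
```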
At a minimum, we should print a warning when the calculated sample
size is less than 10. I'll file a report on this.
Thank you for raising this point.
Steve
**************************************
*
* Data Analysis Products Division
* MathSoft, Incorporated
*
* Email: sallan@statsci.com
* Phone: 206-283-8802
*
**************************************
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news
-----------------------------------------------------------------------