Re: [S] reasonable p-values for Fisher exact's test - WAS strange ...

Patrick Connolly (PConnolly@grunt.marc.cri.nz)
Wed, 25 Mar 1998 13:19:18 +1200 (NZST)


According to Charles C. Berry:
|>
|> Before this thread enters an infinite loop, a few observations:
|>
|> First, class(fisher.test(etc) ) == "htest"
|>
|> So, print.htest() will format the results of fisher.test(). This is done
|> as follows
|>
|> cat("p-value =", format(round(x$p.value, 4)), "\n")
|>
|> (on Version 3.4 Release 1 for Sun SPARC, SunOS 4.1.3_U1 : 1996)
|>
|> So the reports that fisher.test() *seemed* to work OK only imply that
|> the first 4 decimal places were OK, since that is all print.htest() shows.
|>
|> Also, note that fisher.test() uses an algorithm that handles general
|> R x C tables. That generality isn't needed for simple 2 x 2 tables (and
|> it wouldn't be too hard to put in a switch for such tables), but it is
|> what gets used.
|>
|> Getting to the point:
|>
|> This algorithm usually yields answers that differ numerically from the
|> exact hypergeometric probability, viz the result of:
|>
|> > fisher.test(matrix(c(0,2,2,2),nc=2))$p
|> [1] 0.4666666
|>
|> differs from
|>
|> > dhyper(0:2,2,4,2)
|> [1] 0.40000000 0.53333333 0.06666667
|>
|> by an amount
|>
|> > fisher.test(matrix(c(0,2,2,2),nc=2))$p-sum(dhyper(c(0,2),2,4,2))
|> [1] -2.78155e-08
|> >
|>
|> And this isn't an isolated case. The following summaries are of numbers
|> that all equal zero under exact (and obvious) arithmetic:
|>
|> > summary(sapply(1:20,function(x) fisher.test(matrix(c(1,1,x,x),nc=2))$p-1.0))
|>       Min.    1st Qu.     Median       Mean   3rd Qu.      Max.
|> -1.407e-05 -1.997e-06 -8.941e-08 -3.189e-07 1.192e-06 1.562e-05
|> > summary(sapply(1:20,function(x) fisher.test(matrix(c(2,2,x,x),nc=2))$p-1.0))
|>       Min.    1st Qu.    Median       Mean   3rd Qu.      Max.
|> -2.325e-05 -2.295e-06 2.384e-07 -7.927e-07 2.712e-06 7.868e-06
|>
|> Only 1 of the 40 tables, c(1,1,5,5), gives exactly 0.0 as the result.
|>
|> So, fisher.test() apparently uses an approximation which gives a correct
|> answer for the first 5 or 6 significant digits most of the time.
|>
|> Even though the table
|>
|> matrix(1,nr=2,nc=2)
|>
|> would obviously lead to a p-value of exactly 1.0, it seems of little
|> practical import that fisher.test() reports it as
|>
|> > print(fisher.test(matrix(1,nr=2,nc=2))$p,digits=10)
|> [1] 0.9999998808
|> >
|>
|> If this is a problem, then dhyper() can be used in 2 x 2 tables. It
|> seems to generate results that are close to machine accuracy.
|>
|> --
|>
|> Charles C. Berry                        (619) 534-2098
|>                                         Dept of Family/Preventive Medicine
|> E mailto:cberry@tajo.ucsd.edu           UC San Diego
|> http://hacuna.ucsd.edu/members/ccb.html La Jolla, San Diego 92093-0622
|> -----------------------------------------------------------------------
|> This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
|> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
|> message: unsubscribe s-news
|>
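
For anyone who wants to try the dhyper() suggestion quoted above on a
2 x 2 table, here is a minimal sketch. It is not the algorithm inside
fisher.test(). The function name exact.p.2x2 is made up for illustration,
the argument order of dhyper() is assumed to match the call
dhyper(0:2,2,4,2) shown in the quoted message, and the 1e-7 tolerance is
my own choice to stop rounding error from splitting tied probabilities.

```r
# Exact two-sided p-value for a 2 x 2 table, summed directly from dhyper().
# exact.p.2x2 is a hypothetical helper name, not part of any library.
exact.p.2x2 <- function(tab) {
  m <- sum(tab[1, ])    # first row total
  n <- sum(tab[2, ])    # second row total
  k <- sum(tab[, 1])    # first column total
  lo <- max(0, k - n)   # smallest feasible count in cell [1,1]
  hi <- min(k, m)       # largest feasible count in cell [1,1]
  probs <- dhyper(lo:hi, m, n, k)
  obs <- probs[tab[1, 1] - lo + 1]
  # Add up every table that is no more probable than the observed one.
  # The relative tolerance (an assumption) keeps ties together.
  sum(probs[probs <= obs * (1 + 1e-7)])
}

exact.p.2x2(matrix(c(0, 2, 2, 2), ncol = 2))   # 7/15, about 0.4667
exact.p.2x2(matrix(1, nrow = 2, ncol = 2))     # 1, up to rounding error
```

The first call reproduces sum(dhyper(c(0,2),2,4,2)) from the quoted
message; the second is the all-ones table whose p-value should be 1.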
