# Re: [S] reasonable p-values for Fisher exact's test - WAS strange ...

Patrick Connolly (*PConnolly@grunt.marc.cri.nz*)

*Wed, 25 Mar 1998 13:19:18 +1200 (NZST)*

According to Charles C. Berry:

*|> *

*|> Before this thread enters an infinite loop, a few observations:*

*|> *

*|> First, class(fisher.test(etc) ) == "htest"*

*|> *

*|> So, print.htest() will format the results of fisher.test(). This is done*

*|> as follows*

*|> *

*|> cat("p-value =", format(round(x$p.value, 4)), "\n")*

*|> *

*|> (on Version 3.4 Release 1 for Sun SPARC, SunOS 4.1.3_U1 : 1996)*

*|> *

*|> So the reports that fisher.test() *seemed* to work OK only imply that*

*|> the first 5 digits were OK.*

*|> *

*|> Also, note that fisher.test() uses an algorithm which allows R x C*

*|> tables. This isn't required in simple 2 x 2 tables (and it wouldn't be*

*|> too hard to put in a switch for such tables), but this is what gets*

*|> used. *

*|> *

*|> Getting to the point:*

*|> *

*|> This algorithm usually yields answers that differ numerically from the*

*|> exact hypergeometric probability, viz. the result of:*

*|> *

*|> > fisher.test(matrix(c(0,2,2,2),nc=2))$p*

*|> [1] 0.4666666*

*|> *

*|> differs from *

*|> *

*|> > dhyper(0:2,2,4,2)*

*|> [1] 0.40000000 0.53333333 0.06666667*

*|> *

*|> by an amount*

*|> *

*|> > fisher.test(matrix(c(0,2,2,2),nc=2))$p-sum(dhyper(c(0,2),2,4,2))*

*|> [1] -2.78155e-08*

*|> > *

*|> *

*|> And this isn't an isolated case. The following summaries are of numbers*

*|> that all equal zero under exact (and obvious) arithmetic:*

*|> *

*|> > summary(sapply(1:20,function(x) fisher.test(matrix(c(1,1,x,x),nc=2))$p-1.0))*

*|> Min. 1st Qu. Median Mean 3rd Qu. Max. *

*|> -1.407e-05 -1.997e-06 -8.941e-08 -3.189e-07 1.192e-06 1.562e-05*

*|> > summary(sapply(1:20,function(x) fisher.test(matrix(c(2,2,x,x),nc=2))$p-1.0))*

*|> Min. 1st Qu. Median Mean 3rd Qu. Max. *

*|> -2.325e-05 -2.295e-06 2.384e-07 -7.927e-07 2.712e-06 7.868e-06*

*|> *

*|> Only 1 of 40, c(1,1,5,5), gives exactly 0.0 as the result.*

*|> *

*|> So, fisher.test() apparently uses an approximation which gives a correct*

*|> answer for the first 5 or 6 significant digits most of the time.*

*|> *

*|> Even though the table*

*|> *

*|> matrix(1,nr=2,nc=2)*

*|> *

*|> would obviously lead to a p-value of exactly 1.0, it seems of little*

*|> practical import that fisher.test() reports it as *

*|> *

*|> > print(fisher.test(matrix(1,nr=2,nc=2))$p,digits=10)*

*|> [1] 0.9999998808*

*|> > *

*|> *

*|> If this is a problem, then dhyper() can be used in 2 x 2 tables. It*

*|> seems to generate results that are close to machine accuracy.*

*|> *

*|> -- *

*|> *

*|> Charles C. Berry (619) 534-2098*

*|> Dept of Family/Preventive Medicine*

*|> E-mail: cberry@tajo.ucsd.edu UC San Diego*

*|> http://hacuna.ucsd.edu/members/ccb.html La Jolla, San Diego 92093-0622*

*|> -----------------------------------------------------------------------*

*|> This message was distributed by s-news@wubios.wustl.edu. To unsubscribe*

*|> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the*

*|> message: unsubscribe s-news*

*|> *
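For anyone who wants the dhyper() route suggested above for 2 x 2 tables, here is a minimal sketch. It is not how fisher.test() works internally. The helper name fisher2x2 and the tie tolerance tol are my own assumptions, and the code is written in R-compatible syntax (positional dhyper(q, m, n, k) arguments, as in the quoted example), so it may need adjusting on a given S-PLUS release.

```r
# Two-sided Fisher exact p-value for a 2 x 2 table, computed directly
# from hypergeometric probabilities via dhyper().
# fisher2x2 is a hypothetical helper name, not part of S-PLUS or R.
fisher2x2 <- function(tab, tol = 1e-7) {
  m <- sum(tab[, 1])   # first column total
  n <- sum(tab[, 2])   # second column total
  k <- sum(tab[1, ])   # first row total
  lo <- max(0, k - n)  # smallest possible value of the [1,1] cell
  hi <- min(k, m)      # largest possible value of the [1,1] cell
  p <- dhyper(lo:hi, m, n, k)            # probability of every possible table
  p.obs <- dhyper(tab[1, 1], m, n, k)    # probability of the observed table
  # Sum the tables that are no more probable than the observed one.
  # tol is an assumed relative tolerance so that ties in exact arithmetic
  # are not lost to floating-point noise.
  sum(p[p <= p.obs * (1 + tol)])
}

p1 <- fisher2x2(matrix(c(0, 2, 2, 2), ncol = 2))
p2 <- fisher2x2(matrix(1, nrow = 2, ncol = 2))
```

For the first table this sums dhyper(c(0,2),2,4,2), the same quantity used in the comparison above. For the all-ones table every possible table is included in the sum, so the result should equal 1 up to rounding error.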
