[S] SUMMARY(?): algorithms & roundoff errors

Marc R. Feldesman (feldesmanm@pdx.edu)
Fri, 05 Jun 1998 10:36:58 -0700


Thank you for your suggestions on the issue of roundoff and algorithms in
calculating the pooled within groups covariance matrix and its log
determinant.

As my two previous messages suggested, I suspected some form of round-off
error resulting from the brute force algorithm I used to get the final
"answer".

I had used the det function from the Matrix library introduced by
Mathsoft(?) in 1995 and included in my versions 4.0 and 4.5. Two writers
suggested that I approach the problem differently, computing the
determinant using other primitives. Bill Venables suggested an approach
involving the qr-decomposition. Tim Hesterberg suggested three different
approaches involving:

(a) log(prod(eigen(pvcvm)$values)) <- works fine
(b) log(prod(svd(pvcvm)$d)) <- doesn't work for me
(c) log(prod(diag(chol(pvcvm)$d)) <- works fine

Earlier today, before Tim suggested the same thing, I discovered a way to
considerably reduce the number of multiplications that take place in the
algorithm I presented to the group in my original message. Instead of
taking each individual vcv matrix and multiplying it by its own n-1 (and
then summing the resulting SSCP matrices), I discovered the SumSquares=T
argument to var()

Thus,

var(x, SumSquares=T) directly produces the SSCP matrix.

By using these instead of doing step 1 in my algorithm, I get the correct
answer for the determinant using the det() function.

It appears that in this case (and in two other test cases I've tried), the
first step in my brute force algorithm introduces a significant amount of
error - more than I would have expected.

Tim asked whether the pooled covariance matrices were the same in each
program I compared. For those where I had the output available to inspect,
they SEEMED "the same". There were small differences, but I dismissed
these as negligible, particularly since the individual covariance matrices
had similar small differences yet yielded correct determinants out at the
6th decimal place.

There is a moral to this story, but I'm not sure what it is. I can't fault
det() as being inaccurate. The problem was in the first step of the
algorithm - going from VCV matrices back to the the SSCP matrix before
beginning the summation process. Apparently, this is not a trustworthy
approach and now that I have discovered the parameter SumSquares=T, this
shouldn't occur again.

Thank you Bill Venables, Tim Hesterberg, and Brian Ripley for helpful
suggestions.

Dr. Marc R. Feldesman
email: feldesmanm@pdx.edu
email: feldesman@ibm.net
pager: 503-870-2515
fax: 503-725-3905

"If ignorance is bliss then why aren't there more happy people?" Lawrence
Peter

Powered by: Monstrochoerus - the 300 MHz Pentium II
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news