# [S] Roundoff error and algorithms

Marc R. Feldesman (feldesmanm@pdx.edu)
Thu, 04 Jun 1998 19:11:19 -0700

I've been working through an exercise to create a function that will
calculate Box's M and both the chi square and F distribution tests of its
significance. My interest, of course, is in determining whether covariance
matrices are "equal". I realize that Box's M is fraught with problems, but
my question arises in the context of programming the procedure.

I have a fully working function. However, in comparing the results from my
SPlus 4.5 (windows 95) function with results from 3 other programs (SAS,
SPSS, and NCSS97), I notice one single, subtle, difference. Box's M
requires the computation of individual covariance matrices and their
log-determinants, and the pooled within groups covariance matrix, and its
log-determinant. This is followed by a bunch of other calculations, but
they are not the source of my problem.

The individual covariance matrices and their determinants are identical in
my SPlus function as their counterparts in SAS, SPSS, and NCSS97. Aside
from possible rounding at about the 6th decimal position, all four programs
give identical results. The problem lies with the pooled w/in group
covariance matrix and its determinant. SAS, SPSS, and NCSS97 all give
identical results for the covariance matrix and determinant; however, there
are slight, and subtle differences between their values and the values I
get using SPlus. I can see the problem clearly as one of cumulative
roundoff error. For example, the determinant in SAS et al for the pooled
within groups covariance matrix in one case is 27.664; in SPlus I get
27.684. This is not a big difference, but considering that all the others
are identical through the first 5 decimal places makes me wonder about why
this one differs at the 2nd decimal place. This isn't an isolated example.
This problem occurs with every test problem I've used.

Since I've never had to explicitly program the computation of a pooled
within covariance matrix, I simply used a sledge hammer approach to the
problem.

Since I have to compute the individual covariance matrices, I might as
well use them for something else, so:

1. Take the individual covariance matrices and scalar multiply each by its
own (n-1).
2. Do an element-by-element addition of the resulting matrices.
3. Divide the summed SSCP matrix by the quantity (N-k) [elementwise],
where N is total individuals across groups and k is the number of groups.
4. Take the determinant of this matrix.

As I see this, step 1 introduces a lot of multiplications, step two adds an
equal number of additions, and step 3 adds a division. All of these seem
to add up to some roundoff error.

I've done steps 1-4 programmatically, and I've done it manually and looked
at the intermediate results. The results are the same either way.

My question is elementary. Given the fact that the individual covariance
matrices have to be computed no matter what, is there a better way of
getting the pooled within groups covariance matrix that might result in
fewer additions, multiplications, and divisions and, hence, less roundoff
error?

Dr. Marc R. Feldesman
email: feldesmanm@pdx.edu
email: feldesman@ibm.net
pager: 503-870-2515
fax: 503-725-3905

"If ignorance is bliss then why aren't there more happy people?" Lawrence
Peter