Re: [S] S+ negative eigenvalue from eigen(), not from svd()

Prof Brian Ripley (ripley@stats.ox.ac.uk)
Fri, 1 May 1998 20:18:00 +0100 (BST)


Tom Burr wrote:
>
> Hello,
> Can someone tell me how to modify var() in S+ so that it always
> produces a positive definite matrix when applied to any
> n-by-p matrix with n>p. Or, maybe someone can tell me that var() already
> does that, but that eigen() can fail to produce the "true"
> eigenvalues. Below you'll see that I get a negative eigenvalue
> using eigen() but all postive eigenvalues using svd()--
> I suppose I'm asking something in the "numerical stability"
> area where I get uncomfortable pretty quickly.
>
>
> Here is an 8 by 7 matrix, say temp, that is
> very poorly conditioned.
>
> temp:
> 0.0 1.0 2.0 3.0 4.0 5.0 6.0
> 0.0 1.0 2.0 3.0 4.0 5.0 5.0
> 0.0 1.0 2.0 3.0 4.0 4.0 4.0
> 6.0 5.0 4.0 3.0 2.0 1.0 0.0
> 6.0 5.0 4.0 3.0 2.0 1.0 1.0
> 6.0 5.0 4.0 3.0 2.0 2.0 2.0
> 1.0 1.0 1.0 1.0 1.0 1.0 1.0
> 0.0 0.0 0.0 0.0 0.0 0.0 0.0
>
>
> > var(temp)
> [,1] [,2] [,3] [,4] [,5] [,6]
> [,7]
> [1,] 9.125000 6.5535714 3.98214286 1.410714 -1.1607143 -2.87500000
> -3.7321429
> [2,] 6.553571 4.8392857 3.12500000 1.410714 -0.3035714 -1.44642857
> -2.0178571
> [3,] 3.982143 3.1250000 2.26785714 1.410714 0.5535714 -0.01785714
> -0.3035714
> [4,] 1.410714 1.4107143 1.41071429 1.410714 1.4107143 1.41071429
> 1.4107143
> [5,] -1.160714 -0.3035714 0.55357143 1.410714 2.2678571 2.83928571
> 3.1250000
> [6,] -2.875000 -1.4464286 -0.01785714 1.410714 2.8392857 3.98214286
> 4.5535714
> [7,] -3.732143 -2.0178571 -0.30357143 1.410714 3.1250000 4.55357143
> 5.4107143
> > eigen(var(temp))
> $values:
> [1] 1.907736e+001 9.930884e+000 2.622073e-001 3.311722e-002
> 9.583994e-016 1.843151e-016 -2.367552e-016
>
> alternate using svd:
> svd(var(temp))
> $d:
> [1] 1.907736e+001 9.930884e+000 2.622073e-001 3.311722e-002 7.682346e-016
> 5.022829e-016 1.477306e-019

Colin Goodall suggested:

> To be sure of obtaining non-negative eigenvalues
> from a symmetric real matrix, use eigen(x,symmetric=T)

but the answer is the same, with a negative eigenvalue.

First, to sort out a confusion, svd does not return eigenvalues but
singular values, which are by definition non-negative.

Second, the problem here is the svd shows that numerical rank of
var(temp) is 4; it would be at most 7 (8 - 1 for the mean). Thus there are
really three zero eigenvalues, and it is too much to expect the numerical
calculations to give these as exactly zero.

Moral: there is really no way out but to assume that your data might be
rank-degenerate and take proper precautions. These can be quite cumbersome
(see the code in my lda.default). A second point is to think why you are
doing this in the first place. It is better practice to take the svd of
scale(temp,T,F) than form eigen(var(temp)): not only are you guaranteed
non-negative eigenvalues (the square of the singular values) but by
avoiding `squaring' you will have much higher numerical accuracy.
PCA, LDA, ... are all best computed this way.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message:  unsubscribe s-news