Re: [S] S+ negative eigenvalue from eigen(), not from svd()

Colin Goodall (cgoodall@monmouth.com)
Sat, 02 May 1998 17:50:18 -0400


1. eigen(var(temp),symmetric=T) is a useful response when eigen()
has difficulties, returning unreasonable eigenvalues
(typically complex) when the matrix is known to be
non-negative definite. That is not the problem here.
In fact, x=var(temp) passes the test all(x==t(x)) so symmetric=T
automatically.

BTW, eigen(var(temp),symmetric=F) gives (for me) real results
which do not seem quite right.

2. As Brian Ripley points out, the problem is that the matrix has
less than full rank. The last 3 eigenvalues are exact zeroes,
and, while small in absolute value, are returned by S as either negative
or positive. The last 3 singular values of the centered temp
are exact zeros, and are returned by S as very small positive numbers.

Thanks to Brian Ripley, James Stapleton, Duncan Murdoch, and
Dan Heitjan for some comments.

3. The small numbers should be set to zero. The following measures
the impact of doing this.

Recall that the function all.equal() uses, for values close to zero, an
default tolerance of sqrt( .Machine$double.eps ) which is
1.490116e-08 on this machine (SGI/IRIX 6.2). Below, the tolerance
is set to 1e-15 or 1e-16.

The conclusion (from pretty much all that follows from here down)
is that the imprecision in the first eigenvalue and in the first
singular value is larger than any imprecision associated with the
last few eigenvalues or singular values. Both types of error
are negligbly small ( order e-15 and order e-16 respectively ).

SMALL <- 1e-15
vr <- var(temp)
ev <- eigen(vr)
evnew <- ev
evnew$val[abs(evnew$val)<SMALL] <- 0 #### set EVs to zero
all.equal(ev$vec %*% diag(ev$val) %*% t(ev$vec), vr, tolerance=SMALL)
# Mean relative difference: 1.044267e-15
all.equal(evnew$vec %*% diag(evnew$val) %*% t(evnew$vec), vr, tolerance=SMALL)
# Mean relative difference: 1.050776e-15
all.equal(ev$vec %*% diag(ev$val) %*% t(ev$vec),
evnew$vec %*% diag(evnew$val) %*% t(evnew$vec), tolerance=SMALL/10)
# Mean relative difference: 1.629258e-16

This is splitting hairs, but the effect of rounding the small eigenvalues
is smaller than the imprecision of the eigen decomposition (recall that the
underlying eigen algorithm is iterative).

4. Repeating step 3. for the singular value decomposition,
with similar conclusions

tc <- scale(temp, T, F)
all.equal( t(tc) %*% tc /( nrow(temp)-1 ), vr, tolerance=SMALL/10 )
# Mean relative difference: 4.460929e-16
sv <- svd(tc)
svnew <- sv
svnew$d[abs(evnew$d)<SMALL] <- 0
all.equal( sv$u %*% diag(sv$d) %*% t(sv$v), tc, tolerance=SMALL/10 )
# Mean relative difference: 5.695927e-16
all.equal( svnew$u %*% diag(svnew$d) %*% t(svnew$v), tc, tolerance=SMALL/10 )
# Mean relative difference: 5.781662e-16
all.equal( sv$u %*% diag(sv$d) %*% t(sv$v),
svnew$u %*% diag(svnew$d) %*% t(svnew$v), tolerance=SMALL/10 )
# Mean relative difference: 1.885858e-16

4. The orthogonal matrices, from the two decompositions, agree in
the first 4 columns (the rank of vr), up to change of sign.
The eigenvalues and squares of singular values (/7) match.

vr = O E O^T
tc = U D V^T
7vr = tc^T tc
= V D U^T U D V^T
= V D^2 V^T

The signs of the columns of O, of U, and of V are arbitrary,
except that signs of corresponding columns of U and V
must be swapped in sync. The function normmat() is used to
make the first row non-negative.

normmat <- function(x,fac){
if(missing(fac)) fac <- ifelse( sign(x[1,])==0, 1, sign(x[1,]) )
x * matrix(fac,nrow(x),ncol(x),byrow=T)
}
e <- ev$val
E <- diag(e)
O <- normmat(ev$vec)

V <- normmat(sv$v)
U <- normmat(sv$u,V[1,]/sv$v[1,])
d <- sv$d
D <- diag(d)

all.equal( V %*% (D^2/7) %*% t(V), vr, tolerance=SMALL/10 )
# Mean relative difference: 7.532491e-16

apply(array(c(O,V),c(dim(O),2)),2,function(x)all.equal(x[,1],x[,2]))
# or use dapply(O,V,all.equal)
# [1] "TRUE" "TRUE"

# [3] "TRUE" "TRUE"

# [5] "Mean relative difference: 0.676486" "Mean rel diff: 0.4397201"
# [7] "Mean relative difference: 0.4201242"

# the largest eigenvalues and squares of the singular values differ most
e
# [1] 1.907736e+01 9.930884e+00 2.622073e-01 3.311722e-02
# [5] 9.583994e-16 1.843151e-16 -2.367552e-16
d^2/7
# [1] 1.907736e+01 9.930884e+00 2.622073e-01 3.311722e-02
# [5] 2.737870e-32 1.708552e-32 5.765620e-33
e - d^2/7
# [1] -3.552714e-15 3.552714e-15 1.387779e-15 1.179612e-16
# [5] 9.583994e-16 1.843151e-16 -2.367552e-16

5. For fun, we try the svd of the variance matrix, which is
symmetric. This works fine here, since the first four columns
of "u" and "v" are identical; the last three are different.
And the singular values are all positive.

svvr <- svd(vr)
svvr$d
[1] 1.907736e+01 9.930884e+00 2.622073e-01 3.311722e-02 7.642100e-16
[6] 5.061760e-16 2.281970e-18
svvr$d - e
# [1] -1.065814e-14 3.552714e-15 -1.665335e-15 8.326673e-17
# [6] -1.941893e-16 3.218609e-16 2.390371e-16
all.equal( svvr$u %*% diag(svvr$d) %*% t(svvr$v), vr, tolerance=SMALL/10 )
# Mean relative difference: 5.232023e-16
apply(array(c(svvr$u,svvr$v),c(7,7,2)),2,function(x)all.equal(x[,1],x[,2]))
[1] "TRUE" "TRUE"

[3] "TRUE" "TRUE"

[5] "Mean relative difference: 0.3880426" "Mean relative difference:
2.041992"
[7] "Mean relative difference: 1.793989"

6. Here is a look at the approximation to var(temp) by sums of terms
e_i o_i o_i^t (i=1,...,7).

EOtE <- array(
apply(rbind(e,O),2,function(x)x[1]*outer(x[-1],x[-1])),
c(dim(O),length(e)))
EOtEcs <- aperm( apply(EOtE,c(1,2),cumsum), c(2,3,1) )
Eres <- array(vr,dim(EOtEcs)) - EOtEcs
apply(abs(Eres),3,max)
# [1] 2.427279e+00 1.200158e-01 2.300308e-02 6.217249e-15
# [5] 7.105427e-15 7.105427e-15 7.105427e-15

Similar we can approximate the centered temp,
tc, by partial sums from the SVD.

UDtV <- array(
apply(rbind(d,U,V),2,function(x)x[1]*outer(x[2:9],x[10:16])),
c(8,7,7))
UDtVcs <- aperm( apply(UDtV,c(1,2),cumsum), c(2,3,1) )
Ures <- array(tc,dim(UDtVcs)) - UDtVcs
apply(abs(Ures),3,max)
# [1] 2.866969e+00 4.836946e-01 2.306124e-01 2.664535e-15
# [5] 2.664535e-15 2.664535e-15 2.664535e-15

The third step is to approximate var(temp) by V (D^2/7) V^T

VD2tV <- array(
apply(rbind(d^2/7,V),2,function(x)x[1]*outer(x[-1],x[-1])),
c(7,7,7))
VD2tVcs <- aperm( apply(VD2tV,c(1,2),cumsum), c(2,3,1) )
Vres <- array(vr,dim(VD2tVcs)) - VD2tVcs
apply(abs(Vres),3,max)
# [1] 2.427279e+00 1.200158e-01 2.300308e-02 7.105427e-15
# [5] 7.105427e-15 7.105427e-15 7.105427e-15

Next we see how terms of E O E^T and V (D^2/7) V^T differ

EVres <- EOtE - VD2tV
apply(abs(EVres),3,max)
# [1] 5.329071e-15 5.551115e-15 1.332268e-15 4.478839e-16
# [5] 6.613522e-16 6.366861e-17 1.218375e-16

The corresponding partial sums, below, are dominated by the
first term!! In fact by the (1,1) element of the first term.

EVrescs <- EOtEcs - VD2tVcs
apply(abs(EVrescs),3,max)
# [1] 5.329071e-15 5.329071e-15 5.329071e-15 5.329071e-15
# [5] 5.329071e-15 5.329071e-15 5.329071e-15
# EVrescs[1,1,1]

Brian Ripley wrote:
>
>Tom Burr wrote:
>>
>> Hello,
>> Can someone tell me how to modify var() in S+ so that it always
>> produces a positive definite matrix when applied to any
>> n-by-p matrix with n>p. Or, maybe someone can tell me that var() already
>> does that, but that eigen() can fail to produce the "true"
>> eigenvalues. Below you'll see that I get a negative eigenvalue
>> using eigen() but all postive eigenvalues using svd()--
>> I suppose I'm asking something in the "numerical stability"
>> area where I get uncomfortable pretty quickly.
>>
>>
>> Here is an 8 by 7 matrix, say temp, that is
>> very poorly conditioned.
>>
>> temp:
>> 0.0 1.0 2.0 3.0 4.0 5.0 6.0
>> 0.0 1.0 2.0 3.0 4.0 5.0 5.0
>> 0.0 1.0 2.0 3.0 4.0 4.0 4.0
>> 6.0 5.0 4.0 3.0 2.0 1.0 0.0
>> 6.0 5.0 4.0 3.0 2.0 1.0 1.0
>> 6.0 5.0 4.0 3.0 2.0 2.0 2.0
>> 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>> 0.0 0.0 0.0 0.0 0.0 0.0 0.0
>>
>>
>> > var(temp)
>> [,1] [,2] [,3] [,4] [,5] [,6]
>> [,7]
>> [1,] 9.125000 6.5535714 3.98214286 1.410714 -1.1607143 -2.87500000
>> -3.7321429
>> [2,] 6.553571 4.8392857 3.12500000 1.410714 -0.3035714 -1.44642857
>> -2.0178571
>> [3,] 3.982143 3.1250000 2.26785714 1.410714 0.5535714 -0.01785714
>> -0.3035714
>> [4,] 1.410714 1.4107143 1.41071429 1.410714 1.4107143 1.41071429
>> 1.4107143
>> [5,] -1.160714 -0.3035714 0.55357143 1.410714 2.2678571 2.83928571
>> 3.1250000
>> [6,] -2.875000 -1.4464286 -0.01785714 1.410714 2.8392857 3.98214286
>> 4.5535714
>> [7,] -3.732143 -2.0178571 -0.30357143 1.410714 3.1250000 4.55357143
>> 5.4107143
>> > eigen(var(temp))
>> $values:
>> [1] 1.907736e+001 9.930884e+000 2.622073e-001 3.311722e-002
>> 9.583994e-016 1.843151e-016 -2.367552e-016
>>
>> alternate using svd:
>> svd(var(temp))
>> $d:
>> [1] 1.907736e+001 9.930884e+000 2.622073e-001 3.311722e-002 7.682346e-016
>> 5.022829e-016 1.477306e-019
>
>Colin Goodall suggested:
>
>> To be sure of obtaining non-negative eigenvalues
>> from a symmetric real matrix, use eigen(x,symmetric=T)
>
>but the answer is the same, with a negative eigenvalue.
>
>First, to sort out a confusion, svd does not return eigenvalues but
>singular values, which are by definition non-negative.
>
>Second, the problem here is the svd shows that numerical rank of
>var(temp) is 4; it would be at most 7 (8 - 1 for the mean). Thus there are
>really three zero eigenvalues, and it is too much to expect the numerical
>calculations to give these as exactly zero.
>
>Moral: there is really no way out but to assume that your data might be
>rank-degenerate and take proper precautions. These can be quite cumbersome
>(see the code in my lda.default). A second point is to think why you are
>doing this in the first place. It is better practice to take the svd of
>scale(temp,T,F) than form eigen(var(temp)): not only are you guaranteed
>non-negative eigenvalues (the square of the singular values) but by
>avoiding `squaring' you will have much higher numerical accuracy.
>PCA, LDA, ... are all best computed this way.
>
>--
>Brian D. Ripley, ripley@stats.ox.ac.uk
>Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>University of Oxford, Tel: +44 1865 272861 (self)
>1 South Parks Road, +44 1865 272860 (secr)
>Oxford OX1 3TG, UK Fax: +44 1865 272595
>-----------------------------------------------------------------------
>This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
>send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
>message: unsubscribe s-news
>
>
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news