[S] svd() for PCA

Paul Gilbert (pgilbert@bank-banque-canada.ca)
Fri, 1 May 1998 16:09:58 -0400


> It is better practice to take the svd of
>scale(temp,T,F) than form eigen(var(temp)): not only are you guaranteed
>non-negative eigenvalues (the square of the singular values) but by
>avoiding `squaring' you will have much higher numerical accuracy.
>PCA, LDA, ... are all best computed this way.
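
The accuracy point in the quoted advice can be illustrated with a small toy
case. This is only a sketch, written in Python rather than S: the matrix and
the value of eps are hypothetical, and the matrix is left uncentered so that
only the effect of `squaring' is shown.

```python
# Toy 3 x 2 matrix with columns (1, eps, 0) and (1, 0, eps).
# Its exact singular values are sqrt(2 + eps^2) and eps.
eps = 1e-8  # hypothetical value, chosen so that eps^2 is lost next to 1.0

x = [[1.0, 1.0],
     [eps, 0.0],
     [0.0, eps]]


def crossprod(m):
    """Return the cross-product matrix t(m) %*% m as a list of lists."""
    p = len(m[0])
    return [[sum(row[i] * row[j] for row in m) for j in range(p)]
            for i in range(p)]


xtx = crossprod(x)

# For a symmetric 2 x 2 matrix [[a, b], [b, a]] the eigenvalues are a + b and a - b.
a, b = xtx[0][0], xtx[0][1]
small_eig_computed = a - b   # what the cross-product route yields
small_eig_exact = eps ** 2   # square of the exact small singular value

print("smallest eigenvalue from the cross-product:", small_eig_computed)
print("exact value (eps squared):                 ", small_eig_exact)
```

In double precision the diagonal entry 1 + eps^2 rounds to 1, so the
cross-product matrix is exactly singular and the small component is lost,
whereas eps itself is an ordinary floating-point number in the original
matrix.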

On the subject of PCA, I've noticed that prcomp as given in V&R uses svd(), but
princomp in Splus (which I thought was newer, and which returns some extra
information of interest) seems to use eigen(). It also seems to me that when the
data are centered and scaled, so that the correlation and covariance matrices
are the same, princomp(x, cor=T) and princomp(x, cor=F) should give the same
result. They do not, apparently because one uses divisor N and the other N-1.
V&R point out this divisor as a difference between prcomp and princomp for
unscaled data, but does it make sense to have this difference within one
program?

> z <- scale(iris[,,2], scale=T, center=T)
> prcomp(z)$sdev
[1] 1.7106550 0.7391040 0.6284883 0.3638504

> princomp(z, cor=T)$sdev
  Comp. 1   Comp. 2   Comp. 3   Comp. 4
 1.710655  0.739104 0.6284883 0.3638504

> princomp(z, cor=F)$sdev
  Comp. 1   Comp. 2   Comp. 3   Comp. 4
 1.693462 0.7316756 0.6221717 0.3601935

> max(abs(cor(z)-var(z)))
[1] 3.329191e-08
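
The divisor explanation can be checked by arithmetic on the numbers printed
above. This is a rough check in Python rather than S, and it assumes that
iris[,,2] has N = 50 rows (that figure is not shown in the session). If the
only difference is divisor N versus N-1 in the variance, the standard
deviations should differ by a factor of sqrt((N-1)/N).

```python
import math

n = 50  # assumed number of rows in iris[,,2]; not shown in the session above

# Values copied from the session output above.
prcomp_sdev = [1.7106550, 0.7391040, 0.6284883, 0.3638504]
princomp_cov_sdev = [1.693462, 0.7316756, 0.6221717, 0.3601935]

# Moving the variance from divisor N-1 to divisor N multiplies each
# standard deviation by sqrt((N-1)/N).
factor = math.sqrt((n - 1) / n)
rescaled = [s * factor for s in prcomp_sdev]

max_gap = max(abs(r - p) for r, p in zip(rescaled, princomp_cov_sdev))

for r, p in zip(rescaled, princomp_cov_sdev):
    print("rescaled prcomp: %.7f   princomp(cor=F): %.7f" % (r, p))
print("largest absolute gap:", max_gap)
```

Under that assumption the rescaled prcomp values agree with the
princomp(z, cor=F) values to within the printed precision, which is
consistent with the N versus N-1 explanation.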

Paul Gilbert
