Re: [S] Regarding S-PLUS 5.0 Performance

Prof Brian Ripley
Fri, 4 Dec 1998 10:59:58 +0000 (GMT)

> Date: Thu, 03 Dec 1998 06:59:51 -0600
> From: William Shannon

> I felt the major deficiency of 3.4 was its inability to analyze large
> datasets. From the comments above it appears that 5.0 has reversed the
> problem -- it is now difficult to handle small datasets. Here is a

> The 5.0 requires 26 minutes to invert 10,000 3x3 matrices versus 3
> minutes with 3.4. There 'IS' a problem here that needs to be solved to
> make this product acceptable to people doing applied data analysis!

More precisely, it needs 26 minutes to do it THIS WAY. I am not aware
of any `applied data analysis' problem on small datasets that requires
such a calculation, but if there is one, there are faster ways on
5.0r3, at least. (BTW, I believe the posting of results from beta
software to be irresponsible.)
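
As an illustration only: one generic way to avoid a general-purpose solver for each tiny matrix is the closed-form adjugate (cofactor) formula for a 3x3 inverse. The sketch below is in Python rather than S, the name invert_3x3 is hypothetical, and it is not necessarily the faster method meant above.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix (a list of 3 rows) with the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    # Cofactors of the first row, reused to form the determinant.
    A = e * i - f * h
    B = f * g - d * i
    C = d * h - e * g
    det = a * A + b * B + c * C
    if det == 0:
        raise ValueError("matrix is singular")
    # Inverse = adjugate / det, where the adjugate is the
    # transpose of the cofactor matrix.
    return [
        [A / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [B / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [C / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


# Apply the same formula to every matrix in a batch.
batch = [[[2, 0, 0], [0, 4, 0], [0, 0, 8]]] * 4
inverses = [invert_3x3(m) for m in batch]
```

In a vectorized language the nine expressions can be evaluated across all 10,000 matrices at once, so the work is a fixed number of whole-vector operations rather than one solver call per matrix.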

I ran a set of scripts for the V&R2 chapters last night, on my Sun
Ultra 1/170 with 64Mb RAM (sufficient) and 200Mb swap (ditto):

chapter     2     3     4     5     6     7    11    12    13    14
3.4r1     3.0   3.2   9.4   123  16.9    52   572   355   108    74
5.0r3     6.1   9.3   184  1201  77.2   266   840   522   141   322

I replaced step() by stepAIC() in all cases, as step() is seriously
broken in 5.0 and gives completely incorrect answers. The results from
chapters 5-14 are genuine applied data analysis of small datasets.

There appears to be a considerable speed penalty, especially when
running pure S code. Note though that almost all of the statistical
models code in 5.0 is still running in compatibility mode, and one
would expect a drop in performance (if not this much). Would one
really notice? Doing bootstrapping (ch05) and cross-validating
trees (ch14), yes, so we need to think about improving those. Otherwise
the results are already fast enough for me: remember these are lots
of analyses per chapter. (Indeed they are much faster than 3.2 on a Sun
IPC which is how we first did most of the examples in 1992/3.)
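
To put a number on the penalty, the per-chapter slowdown can be computed directly from the timings quoted above. This is a small sketch in Python; the variable names are illustrative, and the timings are taken as given, in whatever unit the original measurements used.

```python
# Timings per V&R2 chapter, copied from the table above.
chapters = [2, 3, 4, 5, 6, 7, 11, 12, 13, 14]
t_34r1 = [3.0, 3.2, 9.4, 123, 16.9, 52, 572, 355, 108, 74]
t_50r3 = [6.1, 9.3, 184, 1201, 77.2, 266, 840, 522, 141, 322]

# Slowdown factor of 5.0r3 relative to 3.4r1 for each chapter.
ratios = {ch: new / old for ch, old, new in zip(chapters, t_34r1, t_50r3)}

for ch, r in ratios.items():
    print(f"chapter {ch:2d}: {r:5.1f}x")
```

The factors range from about 1.3 (chapter 13) to about 19.6 (chapter 4); chapter 5 is close to 10 and chapter 14 a little over 4.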

I expect I will be using 3.4 for some time-consuming calculations for
a long time to come, but 5.0r3 is perfectly adequate for my routine
work (unlike 5.0r2, which used much more memory).

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
