[S] Regarding S-PLUS 5.0 Performance

William Shannon (shannon@osler.wustl.edu)
Thu, 03 Dec 1998 06:59:51 -0600


Andrew Bruce wrote:

> * Functions applied to large data objects are generally
>   faster and use less memory in 5.0R3 than in S-PLUS 3.4.
>   As an example, our benchmarks indicate that lm() can
>   handle data sets 2-3 times larger than in 3.4, and is
>   faster for data sets larger than about 5,000-10,000
>   observations.
>
> * Functions applied to small data objects are generally
>   slower in 5.0R3 than in S-PLUS 3.4. In addition,
>   looping is generally slower and uses more memory when
>   there are many expressions and small data objects
>   within the loop. This is due both to the overhead of
>   small functions, and to the fact that 5.0R3 is not
>   doing the same type of memory compaction within the
>   loop (this is something we are looking at in our
>   development effort). Much of the difference in memory
>   usage can be removed by encapsulating the contents of
>   the loop within a single function call.
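
The advice in the last bullet might look roughly like the sketch below. It is
only an illustration under assumptions: invert.one is a hypothetical helper
name, the data are made up, and the code is written in plain S that should
also run in R. The idea, presumably, is that temporaries created inside the
call are local to it and can be released when it returns.

```r
# Hypothetical helper: all per-iteration work happens inside one call,
# so intermediate objects are local to that call.
invert.one <- function(i, a)
{
    solve(a[, , i])
}

set.seed(1)
n <- 100                                   # small n, for illustration only
a <- array(rnorm(9 * n), dim = c(3, 3, n))
result <- array(0, dim = c(3, 3, n))

# The loop body is now a single function call.
for(i in 1:n) {
    result[, , i] <- invert.one(i, a)
}
```

Whether this actually recovers the memory in 5.0 would have to be checked on
the release in question.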

I felt the major deficiency of 3.4 was its inability to analyze large
datasets. From the comments above it appears that 5.0 has reversed the
problem -- it is now difficult to handle small datasets. Here is a
benchmark comparison (thanks to two colleagues) of the two S-PLUS
versions running the following function, which inverts 10,000 random
3x3 matrices:

fun <- function()
{
    apply(array(rnorm(9 * 10000), dim = c(3, 3, 10000)), 3, solve)
}
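
For anyone who wants to try this on a smaller scale first, here is a sketch of
the same benchmark with the number of matrices as an argument. The name fun.n
and its argument n are my own additions, not part of the original test.
system.time is the R spelling; I believe S-PLUS 3.4 provides unix.time for the
same purpose, so substitute as needed.

```r
# Same computation as fun(), with the matrix count as a parameter.
fun.n <- function(n = 10000)
{
    apply(array(rnorm(9 * n), dim = c(3, 3, n)), 3, solve)
}

# Time a reduced run; each 3x3 inverse comes back flattened
# into one column of length 9.
timing <- system.time(inv <- fun.n(1000))
dim(inv)
```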

On an ULTRA 5 (270MHz) with 512MB memory, S-PLUS 3.4 took:
9851 ray -25 0  13M 8880K run    1:37 74.55%  99.95% Sqpe
9851 ray  25 0  14M 9472K run    2:02 83.53%  99.95% Sqpe
9851 ray   5 0  15M   10M run    2:31 90.22%  99.95% Sqpe
9851 ray   5 0  20M   15M run    2:36 91.04%  99.95% Sqpe
9851 ray  28 0  17M   12M sleep  2:45 62.05%  18.18% Sqpe

and S-PLUS 5.0 Beta 2.2 (not Beta 2.1 as I earlier indicated) took:
9882 ray  34 0  46M   10M sleep  0:02  0.67%   0.06% Sqpe
9882 ray -25 0  52M   16M run    0:31 23.41%  77.59% Sqpe
9882 ray -25 0  55M   20M run    1:00 50.92%  95.54% Sqpe
9882 ray -25 0  63M   27M run    2:00 82.56% 100.00% Sqpe
9882 ray -25 0  77M   41M run    4:03 98.02% 100.00% Sqpe
9882 ray -25 0 100M   65M run    8:00 99.97% 100.00% Sqpe
9882 ray -25 0 140M  105M run   16:00 99.99% 100.00% Sqpe
9882 ray   5 0 166M  131M run   22:01 99.99% 100.00% Sqpe
9882 ray  34 0 190M  155M sleep 26:01 58.75%  17.75% Sqpe

Version 5.0 requires about 26 minutes to invert 10,000 3x3 matrices,
versus about 3 minutes with 3.4, and its process grows to 190MB where
the 3.4 process never exceeds 20MB. There IS a problem here that needs
to be solved to make this product acceptable to people doing applied
data analysis!
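
To put rough numbers on the gap, the sketch below reads the time column of the
final snapshot of each run as minutes:seconds (2:45 and 26:01) and the first
size column as megabytes. The helper to.seconds is my own; everything else is
taken from the listings above.

```r
# Convert a minutes:seconds reading to seconds.
to.seconds <- function(min, sec) 60 * min + sec

t34 <- to.seconds(2, 45)     # S-PLUS 3.4, final snapshot
t50 <- to.seconds(26, 1)     # S-PLUS 5.0 Beta 2.2, final snapshot
slowdown <- t50 / t34        # roughly 9.5 times slower

# Final process sizes in MB from the same snapshots.
size.ratio <- 190 / 17       # roughly 11 times larger
```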

I am running S-PLUS 5.0 release 2 and will wait for release 3.

Bill

-- 
William D. Shannon, Ph.D.

Assistant Professor of Biostatistics in Medicine
Division of General Medical Sciences

Assistant Professor of Biostatistics
Division of Biostatistics

Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO 63110
Phone: 314-454-8356
Fax: 314-454-5113
e-mail: shannon@osler.wustl.edu
web page: http://osler.wustl.edu/~shannon
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news