[S] summary on MAGIC #17

srosenfeld@nesdis.noaa.gov
Mon, 22 Jun 98 14:08:49 -0500


Many thanks to everyone who helped me better understand the problem I have
with memory-intensive computations.
I am especially grateful to
Doug Johnson, Jens Oehlschlaegel-Akiyoshi, Peter Malewski, and Rob Creecy
for their comments and suggestions.

Below is a selection of the suggestions that turned out to be most helpful
to me.

Doug Johnson:
______________
Try to put as few things as possible in frame 0. One way to
do this is to exit S-PLUS after creating a large dataset,
then start it again to run functions on the dataset.

Store data that was recorded in single precision as an S-PLUS
single precision data object. Look at the 'storage.mode()'
and 'scan()' on-line help files (note the "what=" argument for
the 'scan()' function).

Stay away from 'for()' loops. 'for()' (and 'while()') loops do not
free the memory they use until they finish. Note that the
"apply" functions ('apply()', 'sapply()', etc.) effectively use
'for()' loops in their definitions.

If you must use 'for()' loops, put them in separate functions of
their own. They use less space there than if typed in directly
or included directly in another larger function.

!!!! This last suggestion is what actually solved my problem.
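
The loop-in-a-function suggestion might be sketched roughly as follows. This
is only an illustration under stated assumptions: 'rte()' is a hypothetical
stand-in for the real RTE solver, the matrix it inverts is made up, and the
names 'fill.row()' and 'fill.matrix()' are invented for the example. It is
written in syntax common to S and R and has not been checked for its memory
behaviour in any particular S-PLUS version.

```r
# rte() is a hypothetical stand-in for the real RTE solver: it builds and
# inverts one scratch matrix, then returns a single number.
rte <- function(i, j, n = 50) {
    m <- diag(n) + (i + j) / (10 * n)   # scratch matrix, needed only here
    minv <- solve(m)                    # name the result before subscripting
    minv[1, 1]
}

# The inner loop lives in a function of its own, so its temporaries can be
# released each time the function returns.
fill.row <- function(i, nc) {
    out <- numeric(nc)
    for (j in 1:nc) out[j] <- rte(i, j)
    out
}

# The outer loop is also wrapped in a function rather than typed in directly.
fill.matrix <- function(nr, nc) {
    A <- matrix(0, nr, nc)
    for (i in 1:nr) A[i, ] <- fill.row(i, nc)
    A
}

A <- fill.matrix(20, 20)
```

Only the finished row leaves 'fill.row()', so nothing else from the inner
loop needs to survive from one row to the next.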

Jens Oehlschlaegel-Akiyoshi:
____________________________

Put the inner loop in a separate function, since memory used in the
inner loop will be freed when that function returns.

Make sure you don't make frequent or large assignments to frame=0 or
where=0, since there is a not very widely known garbage collection problem
(discovered and confirmed in WinS+3.3; it might also be in other versions).
Some S+ commands assign to frame 0 on their own, e.g. all graph and
trellis settings.

Avoid direct subscripting of function results, as in
    a <- f()[i]
and instead do
    temp <- f()
    a <- temp[i]
This recommendation collides with intuition and with the programming
handbook ("do not name results"), but in my experience direct subscripting
can cause garbage collection problems.

Peter Malewski:
_______________

I think that your assumption (...that none of the intermediate results is
stored...) is incorrect. S is funny about memory management.

Rob Creecy:
_______________
Try "R", the S clone available from
http://lib.stat.cmu.edu/R/CRAN/contents.html
if you are using UNIX and do not rely on a lot of special
S-PLUS functions.

Thanks again.

original request reproduced below:
>
> Dear S'ers:
> Please help me to resolve the following problem:
> I am computing a 20*20 matrix "A", each element of which is a single number
> resulting from the solution of the radiative transfer equation (RTE). In
> order to obtain a single solution of the RTE, I need to invert several
> 200*200 matrices.
> Those matrices are needed only within the subroutine RTE and are renewed for
> each element of "A". All the elements require an absolutely identical
> sequence of computations. The calculation is organized as a double loop that
> looks like this:
>
>
> for (i in 1:20) {
>     for (j in 1:20) {
>         A[i,j] <- RTE
>     }
> }
>
> I move along quite successfully, spending 30 seconds on each element of "A",
> until the magic number 17. To compute one element of the matrix in row #17
> I spend 2 minutes; in row 18 I spend 4 hours. I didn't try further; most
> probably I'll never get the result. The funny thing is that row #17 is still
> the threshold even if I reduce the RTE matrix to 100*100. I checked the
> amount of allocated memory. It grows steadily, reaching 60 MB by row #18.
> Although I don't understand why it grows (none of the intermediate results
> is stored), it should not present any problem on my Pent/400MHz with 128M
> RAM and object size 100 MB. Can anybody explain what is going on in S+ in
> this kind of situation and

Simon Rosenfeld
NOAA Science Center,
NESDIS/Satellite Research Lab
Camp Springs, MD
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news