[S] Re: scope rules: nested functions

Mark Bravington FSMG CEFAS (M.V.BRAVINGTON@cefas.co.uk)
Mon, 29 Jun 1998 11:23:53 +0100


Jens O-A mentioned that I wrote some routines to allow nested functions in Splus, i.e. "child" functions that have access to their parent's variables, like you get in easy-to-use programming languages. I attach the latest version of these functions ("local" and "local.return") and documentation below.

I use "local" on a daily basis, to simplify the coding of long algorithms, and I would find programming difficult without it. So far I have not hit any snags (at least with the latest version). Despite Jens' fears, I doubt that there is much copying between frames-- at least, I can't see why there should be _more_ copying than you get if you eschew nested functions and pass 2000 parameters in & out of "global" functions instead. When I first posted these functions, Jens asked whether I could guarantee them under all circumstances-- at the time, I thought it wise to suggest caution! Caution is still a good idea, but I can assure you that there won't be naming conflicts or anything like that.

"local" itself basically works by setting up a call to "eval" that deliberately breaks the standard scoping rules. The rest of the code handles calls to "on.exit" and the creation/deletion/restoration of parameters (if any) in the child function.

Mark Bravington
m.v.bravington@cefas.co.uk

DOCUMENTATION:
"local"-- support for nested functions in Splus

One of my main gripes as an Splus programmer has been its lack of support
for nested functions. If you have a complicated algorithm to program, all
the gurus tell you to break it up into smaller modular tasks. In many
programming languages, this can be accomplished by writing the smaller
tasks as nested (or 'child') functions. A child function knows about the
variables in its parent, and can change those values, but the parent can
also pass parameters to the child, and the child can have its own local
variables that disappear after the child function has executed (i.e. don't
affect the parent).

But in Splus, when one function (the child) is called from another (the
parent), the child doesn't know about the values of objects in its
parent-- let alone how to update them, or create multiple new objects in
its parent's frame that will persist after it finishes. I presume that
this rather prissy behaviour is all thanks to Splus's functional language
setup-- functions shouldn't have side-effects. It's the same rationale
that has deprived us of pointers! Of course, Splus knows there's more to
life than functional programming, and does provide some ways round the
problem. But they are horrible.

Here's a nicer alternative. Write your nested functions just as you would
in a normal language, then just wrap the function body in a call to my
procedure 'local', as shown below.

EXAMPLE

nest.outer_ function( a, b) {
  d_ a*b
  d_ nest.inner( a+d)
  b+d+e
}

nest.inner_ function( x, nlocal=sys.parent())
  local( {
    b_ x*x # changes value of b in parent frame
    e_ b+b # creates e in parent frame
    x
  })

The call to 'local' ensures that the evaluation of the child's code takes
place in the frame of its parent (unless overridden by 'nlocal'). The
child function must include 'nlocal=sys.parent()' as a parameter, but
there is usually no need to set 'nlocal' explicitly when calling the child
from the parent, unless you are using something like 'lapply'; in that
case, pass 'nlocal=sys.nframe()' as the last parameter to 'lapply', so
that the child won't become an orphan!
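
As a rough illustration of what "evaluation in the frame of its parent"
means (a minimal sketch, not the actual 'local' code given at the end):
the hypothetical functions 'peek.outer' and 'peek.inner' below use only
'sys.parent' and the 'local=' argument of 'eval', the same calls that
'local' itself relies on, to run an assignment in the caller's frame. The
names are made up for this note, and the sketch assumes the Splus frame
behaviour described above.

```s
# Hypothetical names, for illustration only; assumes Splus frame semantics
peek.inner_ function() {
  sp_ sys.parent()      # frame number of the calling function
  # evaluate the assignment in the caller's frame rather than in this one
  eval( expression( total_ total+1)[[1]], local=sp)
}

peek.outer_ function() {
  total_ 10
  peek.inner()          # intended to update 'total' here, in peek.outer's frame
  total
}
```

'local' does the same kind of thing for the whole body of the child, and
adds the parameter and 'on.exit' handling on top.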

For each named parameter in the child's definition, 'local' will (1) save
any object of the same name that already exists in the parent's frame,
before executing the 'nest.inner' code; (2) evaluate the parameter in the
parent's frame before the child starts (see LIMITATIONS); (3) delete it
when the child has finished; (4) restore the saved version, if any. This
means that temporary variables specific to the child can be set up by
means of dummy parameters:

nest.inner_ function( x, nlocal=sys.parent(), temp1, temp2) local({
# temp1 & temp2 will be deleted (if they exist) when nest.inner finishes.
# No need to set temp1 or temp2 in the call to nest.inner.

  temp1_ x*x
  temp2_ temp1+sqrt( x)
  x_ temp1 * temp2
  b_ x+1
  a_ b/x
# Note no need to always return a function value from nested functions;
# the point of this nest.inner is its side-effects on a & b
})

"return" STATEMENTS

"local" doesn't like "return" statements in the code-- they will cause the
parent function to exit, not the child! However, the function
"local.return" has exactly the same syntax, and provides the same
functionality. If you want to use "return" in the middle of a function,
you can mimic the effect with a call to "local.return" followed by a
"break" statement. Example:

my.nested.function_ function( x, nlocal=sys.parent()) local({
  <<...some code>>
  if( a<threshold) {
#   return( current.estimate) # this is the line you'd LIKE to write...
#   but you can do this instead:
    local.return( current.estimate)
    break
  }

# otherwise carry on with the rest of the function and return something else
})

In this example, the call to "local.return" wasn't really necessary; you
could replace the line "local.return( current.estimate)" by
"current.estimate". The last statement evaluated before the break will be
returned as the function value.

If you are embedded in a loop, you will need to break out of the loop
first, before calling break at the top level. Sadly, Splus lacks a
multi-level break statement.

HANDLING OF 'on.exit'

'local' is meant to sensibly handle calls to 'on.exit' in both parent &
child functions. When local is invoked, it saves any pre-existing
'on.exit' that was set in the parent function. After 'child' has finished
(either normally or by crashing), its exit code if any will be executed;
then the pre-existing 'on.exit' of the parent will be reset as part of
'local's exit code (which will still delete temporaries and restore
previous values, even if the child crashed). BUT this does mean that there
is no way to set the parent's 'on.exit' from within the child.

LIMITATIONS

Unfortunately, I can't quite seem to make parameter handling work in
exactly the same way as Splus's usual 'lazy evaluation'. To get something
working, I have implemented the following: any expression passed as a
parameter in the call to the child will be evaluated BEFORE the child's
own code is entered, rather than "on demand"; the same goes for any
default parameter value specified in the child's definition. In other
words, the following won't work:

nest.inner_ function( param1, nlocal=sys.parent()) local( {
  d_ a*b
  param1+3
})

nest.outer_ function( a, b) {
  e_ nest.inner( d)
# previous line WON'T work, because "d" doesn't exist at time of call
  return( a, b, d, e)
}


However, if the call to nest.inner were replaced by 'nest.inner( a*b+sqrt(
b+a*a))', all would be well. This limitation won't be serious for most (of
my!) applications.
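
One possible workaround, sketched here on the assumption that the only
problem is that the argument must already exist when the call is made: set
up 'd' in the parent before calling the child. The functions are the same
'nest.inner' and 'nest.outer' as in the example above; only the extra
assignment in 'nest.outer' is new, and the sketch has not been checked
beyond what the description above implies.

```s
nest.inner_ function( param1, nlocal=sys.parent()) local( {
  d_ a*b
  param1+3
})

nest.outer_ function( a, b) {
  d_ a*b              # 'd' now exists before the call, so the argument
  e_ nest.inner( d)   # can be evaluated up front in this frame
  return( a, b, d, e)
}
```

The cost is that the parent has to know which objects the child's
arguments depend on, which is exactly what lazy evaluation would normally
spare you.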

Also, a call to 'missing(x)' inside the child will only work properly if
there is NO default value for parameter 'x' in the definition of the
child; if there is such a default, 'missing( x)' will always return FALSE.
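
If a child needs both a fallback value and a working 'missing' test, one
way round this, going only by the behaviour just described, is to leave
the default out of the definition and supply it by hand inside the child.
The parameter name 'step' below is hypothetical, and this is a sketch
rather than something guaranteed to behave identically in every Splus
version.

```s
# 'step' has no default in the definition, so that 'missing( step)'
# can still detect an omitted argument
nest.inner_ function( x, step, nlocal=sys.parent()) local({
  if( missing( step))
    step_ 1           # supply the fallback value by hand
  x+step
})
```

Because 'step' is a named parameter of the child, it is treated like any
other temporary: deleted from the parent's frame when the child finishes.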

The root cause of these limitations seems to be the way Splus handles
'.Argument' objects. If Splus is looking for an object and finds that the
object is a .Argument, Splus normally evaluates the object (either what
was passed, or the default), and makes a second copy of the object in the
current frame, holding the result of the evaluation. Normally, .Argument
objects are only created when a function is called. If you try to create
them manually, Splus doesn't seem to do the evaluation or make that second
copy when it comes across the object. Presumably Splus is relying on calls
to 'amatch' to decide what to do, rather than just looking at the objects
themselves. In addition, Splus won't allow you to "remove" .Argument
objects.

POSSIBLE IMPROVEMENTS

The main quirk is the non-standard parameter handling, though again I
don't think it will often matter. Fixing it would require changes to Splus'
own code for handling .Argument objects, I think. It would be lovely if this
could be 'fixed', but some might not view it as a bug. Also, rewriting
'local' as an .Internal would save one useless frame every time a child
function is called.

THE CODE:

local.return_ function( ...) {
# Returns its arguments; unnamed arguments are named using deparse & substitute
  mc_ match.call()
  mode( mc)_ 'list'
  mc_ mc[-1]
  if( !length( mc))
    return()
  else if( length( mc)==1)
    return( eval( mc[[1]], local=sys.parent()))

# else multiple arguments, so return as named list
  if( is.null( names( mc)))
    which_ 1:length( mc)
  else
    which_ names( mc)==''

  subs_ lapply( mc[ which],
      function( x, frame) do.call( 'substitute', list( x, frame=frame)),
      frame=sys.nframe())
  names( mc)[ which]_ unlist( lapply( subs, deparse), F)
  lapply( mc, eval, local=sys.parent())
}

local_ function( expr) {
  sp_ sys.parent()
  nlocal_ eval( as.name( 'nlocal'), local=sp)

# Save old exit code, clear exit code, arrange tidy-up and re-installation of old exit code
# when 'local' is done
  old.on.exit_ sys.on.exit()[[ nlocal]]
  if( missing( old.on.exit))
    old.on.exit_ substitute( on.exit())
  else
    old.on.exit_ substitute( on.exit( old.on.exit), list( old.on.exit=old.on.exit))

  on.exit( {
    eval( sys.on.exit()[[nlocal]], local=nlocal)

#   Get rid of params & temporaries
    lapply( names( params),
        function( x, frame) if( exists( x, frame=frame)) remove( x, frame=frame),
        frame=nlocal)

#   Restore things hidden by params
    for( i in names( savers))
      assign( i, savers[[ i]], frame=nlocal)

    eval( old.on.exit, local=nlocal) # so old code will execute on return to 'nlocal'
  })

  eval( expression( on.exit())[[1]], local=nlocal)

  params_ amatch( sys.function( sp), sys.call( sp))
  params_ params[ names(params)!='nlocal']
  savers_ names( params)

  if( length( params)) {
    names( savers)_ savers
    savers_ sapply( savers, exists, frame=nlocal)
    savers_ names( savers)[ savers]
    if( length( savers)) {
      names( savers)_ savers
      savers_ lapply( savers, get, frame=nlocal) }

#   Parameters and temporary working variables:
    for( i in names( params))
      if( mode( params[[i]])!='argument') # sometimes when S routine is called from C
        assign( i, params[[i]], frame=nlocal) # the obvious solution!
      else # THE NORMAL CASE: .Arguments are handled oddly and best I can do is...
        assign( i,
            if( mode( params[[i]][[1]])=='missing') {
              if( mode( params[[i]][[2]])!='missing')
                params[[i]]_ eval( params[[i]][[2]], local=nlocal)
              else
                params[[i]][[1]] }
            else
              eval( params[[i]][[1]], local=nlocal),
            frame=nlocal) }

# Embed 'expr' in a spurious loop so that calls to "break" will work!
  expr_ substitute( {repeat{ assign( 'answer', expr, frame=f); break}; get( 'answer', frame=f)},
      list( expr=substitute( expr), f=sys.nframe()))

# The business end!
  eval( expr, nlocal)
}
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news