Re: [S] Is this really good programming practice?

Bill Venables (wvenable@attunga.stats.adelaide.edu.au)
Fri, 27 Mar 1998 23:28:54 +1030


Brett Presnell writes:
>

[...]

> Here's something that is currently bothering me. In the
> argument list to glm and gam one finds "start=eta". It seems
> reasonable to ask how this works, given that I can run glm
> without having any eta defined anywhere, and moreover, even if
> I do have an eta defined it is ignored. This is explained by
> the fact that "start" is only accessed indirectly through the
> value of "match.call(expand=F)" which doesn't have "start" in
> it if the user doesn't specify start in the call.
>
> Isn't this a pretty strange thing to do?

No.

> Wouldn't it be perfectly reasonable for a user who looks at
> the output of "args(glm)" to think that simply creating a
> vector "eta" would suffice to pass starting values to the
> function.

No.

> Why not have "start=NULL" as the default? When I first
> started looking at this I thought I saw some reason for doing
> things this way as a matter of convenience, but at the moment
> I don't see any reason at all.

Stick around. (I'm enjoying this... :-)

> Even with "start=NULL" the current code would still rely on
> the fact that start is only accessed through match.call, since
> an error results if we actually specify "start=NULL" in the
> call. This seems like sloppy programming practice, especially
> since it again would seem to be perfectly reasonable to
> specify "start=NULL" when calling glm. Why not have
> "start=NULL" as the default, and then a line like
>
> if (is.null(start)) m$start <- NULL
>
> before "eval(m,sys.parent())"?
>
> Ok, now someone can tell me how it is that I'm missing the
> point entirely. :-}

Well, you said it.

The entire explanation lies in two words: Lazy Evaluation. I
expect this does not quite fill you in, though...

The default value of an argument, such as start = eta, is rather
like an assignment that takes place in the frame of the function

* if no value has been supplied for the argument on the call, and

* _if_ and _when_ the argument is first needed inside the function.

So if the argument has a default setting start = eta and the
first reference to start inside the function is something like

    yval <- start

You can think of this as being effectively equivalent to

    yval <- if(missing(start)) eta else start

This implies that the default value of any argument can be any
expression with a well defined value _in_the_frame_of_the_
function at the point where the argument is first needed.

If at the point where start is first needed the function has
already gone to the trouble of calculating some local variable,
eta, that serves as a natural default value, why not make it so?
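To make this concrete, here is a minimal sketch, written in R on the
assumption that it behaves like S in this respect. The function
fit_sketch and its body are made up for illustration; this is not the
code of glm.

```r
# Hypothetical function: `fit_sketch` is invented for illustration only.
fit_sketch <- function(y, start = eta) {
  eta <- log(y + 1)   # local variable computed before `start` is needed
  yval <- start       # first use of `start`: the default is evaluated here,
                      # in the function's own frame, so it sees the local eta
  yval
}

eta <- c(99, 99, 99)  # a global eta; the default does not pick this one up
y <- c(1, 2, 3)

default_result  <- fit_sketch(y)               # default: uses the local eta
supplied_result <- fit_sketch(y, start = eta)  # supplied: uses the caller's eta
```

With no start supplied, the result is the locally computed eta. The
global eta only matters when the caller passes it explicitly.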

It is a powerful if puzzling and occasionally infuriating feature
of the language, but understanding what is going on usually marks
the turning point in your understanding of how S really works.
(At least you feel as if it does, and the floor is covered with
eye scales...)

A handy little rule to remember is that arguments specified on a
call are expressions evaluated in the parent frame, whereas default
values are like assignments to be enacted within the local frame
(if need be). Both are only evaluated if and when they are first
needed. (That's why they are "lazy".)
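The rule can be checked with a small sketch, again in R and again
with made-up names (h, caller and ignores_arg are hypothetical, chosen
only to show the two frames and the laziness).

```r
# `h` has a default that refers to a variable created inside `h` itself.
h <- function(start = eta) {
  eta <- "local eta"
  start
}

# `caller` supplies the argument, so `eta` is looked up in caller()'s frame.
caller <- function() {
  eta <- "caller eta"
  h(start = eta)
}

# The argument is never used, so its expression is never evaluated.
ignores_arg <- function(x) "x was never needed"

from_default <- h()                                   # default, local frame
from_caller  <- caller()                              # supplied, parent frame
lazy_result  <- ignores_arg(stop("never evaluated"))  # no error is raised
```

The same name, eta, resolves to different objects depending on
whether it arrived as a default or as a supplied argument, and the
call to stop() never fires because x is never needed.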

You may not agree with the style, but it *is* the S style. The
reason you are puzzled and somewhat irritated by it is simply
that this is one more under-used feature of the language, one
that few people realise is even available, so it looks odd
when someone uses it.

Furthermore, expecting args() to give anything more than the most
vague clue to what is going on is expecting far too much. It
always was a simple hack meant to give a quick reminder, nothing
more. Here is how it is defined:

    > args
    function(x)
    {
        if(mode(x) != "function") {
            if(mode(x) == "character")
                x <- get(x, mode = "function")
            else stop("need the name of a function")
        }
        x[length(x)] <- function()
        NULL
        x
    }

so all it does is check that the argument is, or can be
interpreted as, a function, get it, and then toss the body away.
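A short sketch of why the argument list alone cannot tell you where a
default comes from. This uses R's formals() and body(), which are R
functions and an assumption here (the S definition in this post
manipulates the function object directly); f is a made-up function.

```r
# Hypothetical function whose default refers to a local variable.
f <- function(x, start = eta) {
  eta <- 2 * x
  start
}

# The default is stored as an unevaluated expression: just the name `eta`.
default_expr <- formals(f)$start
default_is_name <- is.name(default_expr)
default_text <- deparse(default_expr)

# args() keeps the argument list and discards the body, so nothing in its
# result says that `eta` is created inside the function.
arg_names <- names(formals(args(f)))
body_kept <- !is.null(body(args(f)))
```

The default is nothing more than the symbol eta, and what that symbol
means is settled only by the body, which args() has thrown away.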

The key point to note is that you cannot expect the argument list
with defaults to be self explanatory in isolation from the body.
Sometimes it is, of course, and often enough for args() to be
useful, but there are plenty of functions nowadays for which
args() is totally useless, for example

> args(plot)
function(x, ...)
NULL
>

Now that's informative, isn't it? :-)

Bill V.

-- 
Bill Venables, Head, Dept of Statistics,    Tel.: +61 8 8303 5418
University of Adelaide,                     Fax.: +61 8 8303 3696
South AUSTRALIA.     5005.   Email: Bill.Venables@adelaide.edu.au

-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news