Re: [S] Re: scope rules

Charles Roosen
Mon, 29 Jun 1998 16:53:45 -0700

I'd like to offer a personal comment on the scoping rules discussion, i.e.
I'm not representing MathSoft.

I don't think the scoping rules are particularly horrible, although I
agree that there may be cleverer ways to do things in some places.

The basic situation where this arises is that the user has created a model
object, and then later applies a method to the object which needs to get
the original data used to fit the model. Conceptually, there are two ways
to get the data:

* include a copy of the data in the model object

* include a reference to the data in the model object

The problem with the first approach is that it requires making an extra copy
of the data, which is often wasteful. We've actually used this strategy in
the 4.5 revisions to the cluster library, so time will tell whether people
find this objectionable. (The data is needed by clusplot.)

Another way to pass the data would be to add an argument to the
various methods, so that the user could explicitly pass the original data
used in fitting the model. If this were omitted then
the reference could be used. I think this would be my favorite approach.

If one is going to pass the name of the data, I think the current scoping
rules are perfectly reasonable. S looks for a local variable, then for a
global variable, then for a built-in object.

It is currently possible to use the strategy of looking in all parent
frames. Just use

    get("x", inherit=T)

Looking through parent frames will usually cause problems, because the
"x" you find is likely to be a different "x" than the one you used to fit
the model.

I think the concern about the scoping rules is something of a red herring
caused by the modeling functions doing a form of pass-by-reference. I think
the best solution would be for the convention to be:

a) Get the data by looking at the call as is currently done.

b) Add an option to the fitting function to stash the data in the model
object. If the data is there the call will be ignored.

c) Add an argument to the method function to supply the original data. If
that is present then the call will be ignored.
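
As a rough illustration of the {a, b, c} convention, here is a minimal
sketch in R-style S syntax. The names fitModel, getModelData, keep.data and
orig.data are hypothetical and not part of any existing library, and the
lm() call merely stands in for a real fitting routine. Checking the explicit
argument before the stashed copy is also an assumption; the convention above
only says that either one overrides the call.

```r
# Hypothetical fitting function: always records the call (a) and,
# on request, stashes a copy of the data in the model object (b).
fitModel <- function(formula, data, keep.data = FALSE) {
    cl <- match.call()
    fit <- list(call = cl, coefficients = coef(lm(formula, data = data)))
    if (keep.data) fit$data <- data
    fit
}

# Hypothetical helper that a method would use to recover the data.
getModelData <- function(fit, orig.data = NULL) {
    if (!is.null(orig.data)) return(orig.data)  # (c) explicit argument wins
    if (!is.null(fit$data)) return(fit$data)    # (b) stashed copy
    eval(fit$call$data)                         # (a) look up the name in the call
}

d <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))
fit.ref  <- fitModel(y ~ x, d)                    # keeps only the call
fit.copy <- fitModel(y ~ x, d, keep.data = TRUE)  # keeps a copy as well
```

With this layout, getModelData(fit.copy) and
getModelData(fit.ref, orig.data = d) never touch the call, while
getModelData(fit.ref) depends on "d" still being findable by name.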

I guess I see the value of just getting the data name from the call in order
to save space. The problem is that this doesn't support finding the data
properly when the model object is used later within a function.

My perspective is that the current strategy is fine; it just needs extension
to cover the cases which aren't currently supported elegantly.
(That is, you currently have to use the assign to frame 1 trick.)

Another approach, which I like somewhat less but which I used in the
bootstrap() code, is to pass an argument saying which frame holds the data.
So maybe I should add:

d) In the fitting function, stash the number of the frame containing the
data in the model object. In the method functions, evaluate the name of the
data in this frame.
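
Option (d) might be sketched as follows. This is only an approximation:
R-style syntax has no numbered frames in the S sense, so the sketch swaps
in the caller's environment (via parent.frame()) for the frame number.
The names fitModelEnv, getModelDataEnv and analyse are hypothetical.

```r
# Hypothetical fitting function: records the call and where it was called from.
fitModelEnv <- function(formula, data) {
    cl  <- match.call()
    env <- parent.frame()   # stands in for "the frame containing the data"
    list(call = cl, env = env)
}

# Hypothetical helper: evaluate the data name in the recorded location.
getModelDataEnv <- function(fit) {
    eval(fit$call$data, envir = fit$env)
}

# The model is fitted and used inside a function, on data local to it.
analyse <- function() {
    local.d <- data.frame(x = 1:3, y = c(1, 4, 9))
    fit <- fitModelEnv(y ~ x, local.d)
    getModelDataEnv(fit)
}

res <- analyse()
```

Here the lookup succeeds even though "local.d" exists only inside
analyse(), which is the case that looking up the name from the call alone
does not handle.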

As a user, I'd find the {a, b, c} combination preferable, as passing frame
numbers about is pretty techie. I'm interested in other opinions.

The main point I'd like to emphasize is that I don't think the scoping rules
are bad. I think there are lots of advantages in them that people take for
granted.

The issues with model objects in functions arise from using
pass-by-reference in a language where that isn't encouraged. I think it
would be best to solve these within the current scoping rules.

Again, I'm speaking only for myself. Opinions vary.

Charlie Roosen

Charles Roosen, PhD 1700 Westlake Ave N, Suite 500
Senior Statistician Seattle, WA 98109
Data Analysis Products Division (206) 283-8802 x254
MathSoft
