RE: [S] Re: scope rules

Siem Heisterkamp (S.H.HEISTERKAMP@amc.uva.nl)
Tue, 30 Jun 1998 09:42:10 +0200


Dear Charles Roosen,
Although I do a lot of programming in S-Plus, my programming style is
not very sophisticated.
The proposed scoping rules, as used e.g. in R, look interesting, but I
could live perfectly well with options a, b or c. In fact, when using
nested functions I always put the name of the original data frame in the
output object (plus the date the program was run) in a component called
info. The name can then always be passed to lm() or glm() in the
somewhat clumsy way of using the <<- assignment (afterwards I remove it).
Thus options a, b or c would suit my use of S-Plus, making the assignment
superfluous.
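
A minimal sketch of the bookkeeping described above, written in
R-compatible syntax (S-Plus details may differ). The helper names
fit.with.info and original.data, and the component names data.name and
run.date, are hypothetical; only the info component itself comes from
the description.

```r
# Hypothetical helper: fit a model and record where the data came from.
fit.with.info <- function(formula, data) {
    # Name of the data frame exactly as the caller typed it.
    data.name <- deparse(substitute(data))
    fit <- lm(formula, data = data)
    # Store the name and the run date in a component called info.
    fit$info <- list(data.name = data.name, run.date = date())
    fit
}

# Hypothetical helper: look the data up again later by its stored name.
# This assumes the data frame still exists under that name in the
# global workspace and has not been changed since the fit.
original.data <- function(fit) {
    get(fit$info$data.name)
}
```

With a data frame d in the workspace, fit <- fit.with.info(y ~ x, d)
stores "d" in fit$info$data.name, and original.data(fit) returns d.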
Siem

Dr. S.H. Heisterkamp
University of Amsterdam
Department of Clinical Epidemiology and Biostatistics
room J2-220
PO Box 22700 1100 DE Amsterdam
tel: +31-(0)-20-5668520
fax:+31-(0)-20-6912683
s.h.heisterkamp@amc.uva.nl

> -----Original Message-----
> From: Charles Roosen [SMTP:roosen@statsci.com]
> Sent: Tuesday, June 30, 1998 1:54 AM
> To: 'S-NEWS'
> Subject: Re: [S] Re: scope rules
>
> I'd like to offer a personal comment on the scoping rules discussion,
> i.e. I'm not representing MathSoft.
>
> I don't think the scoping rules are particularly horrible, although I
> agree that there may be more clever ways to do things in some places.
>
>
> The basic situation where this arises is that the user has created a
> model object, and then later applies a method to the object which needs
> to get the original data used to fit the model. Conceptually, there are
> two ways to get the data:
>
> * include a copy of the data in the model object
>
> * include a reference to the data in the model object
>
> The problem with the first approach is that it requires making an extra
> copy of the data, which is often wasteful. We've actually used this
> strategy in the 4.5 revisions to the cluster library, so time will tell
> whether people find this objectionable. (The data is needed by
> clusplot.)
>
> Another way to pass the data would be to add an orig.data argument to
> the various methods, so that the user could explicitly pass the
> original data used in fitting the model to the various methods. If this
> were omitted then the reference could be used. I think this would be my
> favorite approach.
>
> If one is going to pass the name of the data, I think the current
> scoping rules are perfectly reasonable. S looks for a local variable,
> then for a global variable, then for a built-in object.
>
> It is currently possible to use the strategy of looking in all parent
> frames. Just use
>
> get("x", inherit=T)
>
> Looking through parent frames will usually cause problems, because the
> "x" you find is likely to be a different "x" than the one you used to
> fit the model.
>
> I think the concern about the scoping rules is something of a red
> herring caused by the modeling functions doing a form of
> pass-by-reference. I think the best solution would be for the
> convention to be:
>
> a) Get the data by looking at the call as is currently done.
>
> b) Add an option to the fitting function to stash the data in the model
> object. If the data is there the call will be ignored.
>
> c) Add an argument to the method function to supply the original data.
> If that is present then the call will be ignored.
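
The {a, b, c} convention above could be sketched roughly as follows, in
R-compatible syntax (S-Plus details may differ). The names fit.model,
model.data, keep.data and stashed.data are hypothetical; orig.data is
the argument name suggested in the message. This only illustrates the
lookup order, it is not how any existing modeling function works.

```r
# Hypothetical fitting wrapper. Option (b): stash the data on request.
fit.model <- function(formula, data, keep.data = FALSE) {
    fit <- lm(formula, data = data)
    # Record the call as the user typed it, for option (a).
    fit$call <- match.call()
    if (keep.data) fit$stashed.data <- data
    fit
}

# Hypothetical accessor a method function could use to find the data.
model.data <- function(object, orig.data = NULL) {
    # (c) Data supplied explicitly to the method: the call is ignored.
    if (!is.null(orig.data)) return(orig.data)
    # (b) Data stashed in the model object: the call is ignored.
    if (!is.null(object$stashed.data)) return(object$stashed.data)
    # (a) Fall back on the data name recorded in the call. This only
    # works when that name is still visible from here, e.g. as a
    # global object.
    eval(object$call$data)
}
```

Only the fallback (a) depends on the scoping rules; (b) and (c) return
the data without looking up any name.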
>
> I guess I see the value of just getting the data name from the call in
> order to save space. The problem is that this doesn't support finding
> the data properly when the model object is used later within a
> function.
>
> My perspective is that the current strategy is fine; it just needs
> extension to cover the cases which aren't currently supported elegantly.
> (That is, you currently have to use the assign-to-frame-1 trick.)
>
> Another approach, which I like somewhat less but which I used in the
> bootstrap() code, is to pass an argument saying which frame holds the
> data. So maybe I should add:
>
> d) In the fitting function, stash the number of the frame containing
> the data in the model object. In the method functions, evaluate the
> name of the data in this frame.
>
> As a user, I'd find the {a, b, c} combination preferable, as passing
> frame numbers about is pretty techie. I'm interested in other opinions.
>
> The main point I'd like to emphasize is that I don't think the scoping
> rules are bad. I think there are lots of advantages in them that people
> take for granted.
>
> The issues with model objects in functions arise from using
> pass-by-reference in a language where that isn't encouraged. I think it
> would be best to solve these within the current scoping rules.
>
> Again, I'm speaking only for myself. Opinions vary.
>
> Charlie Roosen
>
>
> **********************************************************************
> Charles Roosen, PhD 1700 Westlake Ave N, Suite 500
> Senior Statistician Seattle, WA 98109
> Data Analysis Products Division (206) 283-8802 x254
> MathSoft email: roosen@statsci.com
> **********************************************************************
>
> -----------------------------------------------------------------------
> This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
> message: unsubscribe s-news