This is a very common source of confusion. In a nutshell, lm.influence()
needs to do further computations using yn, nn and xn but these variables
are *local* to the function hn.plot(). Since this frame is not on the
search list for the lm.influence() call, they are no longer accessible.
The solution is to make them accessible through the search list. There are
several ways of doing this, none entirely elegant (within this particular
programming paradigm; see below), but some better than others. Here are two:
1. Adam Kleczkowski's solution. Make yn, nn and xn global variables using
assignments such as
    assign("yn", yn, 0)
    assign("nn", nn, 0)
    assign("xn", xn, 0)
within the function. This puts them in frame 0, the frame of the
session, which means they persist until the session is finished, so
they could cause problems later in the session if you have other
variables called yn, nn or xn. I prefer to work with data frames and to
put the variables in frame 1 so that they die with the expression:
    hn.plot <- function(yn, nn, xn)
    {
        assign(".df", data.frame(yn = yn, nn = nn, xn = xn), frame = 1)
        fitit.glm <- glm(cbind(yn, nn - yn) ~ xn, family = binomial, data = .df)
        residuals(fitit.glm)/sqrt(1 - lm.influence(fitit.glm)$hat)
    }
This way the data frame dies silently when the expression is complete.
You could be in trouble, though, if the expression itself involved
another variable called ".df".
2. Put the variables in a data frame temporarily sitting at position 1 of
the search list itself:
    hn.plot <- function(yn, nn, xn)
    {
        attach(data.frame(yn = yn, nn = nn, xn = xn), 1)
        on.exit(detach(1))
        fitit.glm <- glm(cbind(yn, nn - yn) ~ xn, family = binomial)
        residuals(fitit.glm)/sqrt(1 - lm.influence(fitit.glm)$hat)
    }
__________________________________________________
Obligatory paternalistic advice:
Whether the local frame of the function should be on the search list or not
is a topic of some debate, but the fact is that it is *not*, and changing
this would break so many things in S that it seems unlikely to happen. Rather
than wait for that day, which may never come, it seems better for now to
embrace the object-oriented style of programming more fully. Work with variables
in *data frames* and always use the data= argument of the fitting function
to make them part of the fitted model object. In this way the variables go
wherever the object goes and the problem simply does not arise.
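
As a rough sketch of that style (not a definitive recipe): keep the variables in a data frame at the top level, fit with data=, and hand the fitted object, rather than the raw vectors, to a small helper. The data frame name dose.df, its made-up values, and the helper name hn.resid are my own assumptions for illustration; only glm(), residuals() and lm.influence() come from the discussion above.

```r
# Hypothetical data, invented purely for illustration:
# xn = covariate, nn = number of trials, yn = number of successes.
dose.df <- data.frame(xn = c(0, 1, 2, 3, 4),
                      nn = c(20, 20, 20, 20, 20),
                      yn = c(2, 5, 9, 14, 18))

# Fit at the top level with data=, so the call recorded in the fitted
# object refers to a data frame that can be found later.
fit <- glm(cbind(yn, nn - yn) ~ xn, family = binomial, data = dose.df)

# Hypothetical helper: takes the fitted object, not yn, nn and xn,
# and returns the residuals scaled by sqrt(1 - leverage).
hn.resid <- function(fit)
{
    residuals(fit)/sqrt(1 - lm.influence(fit)$hat)
}

r <- hn.resid(fit)
```

The helper has no local copies of yn, nn or xn at all, so there is nothing for lm.influence() to lose track of.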
If you started your programming career using something like Fortran, then
you have some serious re-orientation ahead of you. Sorry about that, but
you will gain in the end...
Bill
--
___________________________________________________________________________
Bill Venables, Department of Statistics,  Telephone:  +61 8 303 3026
The University of Adelaide,               Facsimile:  +61 8 232 5670
South AUSTRALIA.     5005.                Email: venables@stats.adelaide.edu.au