Re: [S] Sampling in bootstrap()

Marc Feldesman (
Wed, 25 Feb 1998 12:16:02 -0800

Actually I know this. We used the bootstrap algorithm to formulate an
approximate randomization test (= permutation test). The issue here isn't
whether we did the right thing, but what S-PLUS does by default when it
draws the samples. There are 55 cases in the data set. The sample statistic
being computed is the F-statistic from the one-way ANOVA. The aov object
carries the parametric estimate of the F-statistic, and taking a summary of
the aov object lets you extract that F-statistic as the statistic to
bootstrap.

When I do this, the parametric estimate is 7.49... and the bootstrapped
estimate is about 7.46 (10000 replicates). It is not obvious, however,
whether the 7.46 comes from resampling each of the three groups
independently, or from resampling the entire 55 cases and then placing
the observations into three groups of the appropriate sizes.
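
To make the two candidate schemes concrete, here is a minimal sketch in
Python (not S-PLUS, and not a claim about what bootstrap() does internally).
The data are made up; only the group sizes 13, 15 and 27 come from the
problem. The function names, the seeds and the 2000 replicates are
illustrative assumptions.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of numbers)."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Assumes ss_within > 0, which holds unless every group is constant.
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def resample_within_groups(groups, rng):
    """Scheme A: resample each group from itself, with replacement."""
    return [rng.choices(g, k=len(g)) for g in groups]


def resample_pooled(groups, rng):
    """Scheme B: resample all cases from the pooled data, with replacement,
    then cut the draw into consecutive blocks of the original group sizes."""
    pooled = [x for g in groups for x in g]
    draw = rng.choices(pooled, k=len(pooled))
    out, start = [], 0
    for g in groups:
        out.append(draw[start:start + len(g)])
        start += len(g)
    return out


def replicate(groups, scheme, n_rep, seed):
    """Return n_rep bootstrap replicates of F under the given scheme."""
    rng = random.Random(seed)
    return [f_statistic(scheme(groups, rng)) for _ in range(n_rep)]


# Made-up data: three groups of sizes 13, 15, 27 with different means.
data_rng = random.Random(0)
sizes_and_means = [(13, 10.0), (15, 11.5), (27, 13.0)]
groups = [[data_rng.gauss(m, 1.5) for _ in range(n)] for n, m in sizes_and_means]

observed = f_statistic(groups)
f_within = replicate(groups, resample_within_groups, 2000, seed=1)
f_pooled = replicate(groups, resample_pooled, 2000, seed=2)

print("observed F:", observed)
print("mean F, within-group resampling:", sum(f_within) / len(f_within))
print("mean F, pooled resampling:", sum(f_pooled) / len(f_pooled))
```

Scheme A keeps every case attached to its group, so its replicates
typically stay in the neighbourhood of the observed F. Scheme B breaks the
link between case and group, so its replicates behave like draws under "no
group differences" and typically scatter around a value near 1. Comparing
the replicate mean with the observed value is therefore one way to tell
which kind of sampling produced a given set of results.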

At 08:59 PM 2/25/98 +0100, you wrote:
>a quick and dirty answer (it's late here):
>randomly putting cases into groups sounds like a permutation test. As I
>understand Bootstrapping, the connection between a case and its
>predictor-values (belonging to a group) never changes. The resampling just
>changes the sample composition. I think Efron discusses the relation
>between bootstrapping and permutation tests in his book:
>Efron, B. and R. J. Tibshirani (1993). An Introduction to the Bootstrap.
>New York, Chapman & Hall.
>Hope this helps
>Best regards
>Jens Oehlschlaegel
>On Wed, 25 Feb 1998, Marc Feldesman wrote:
>> I'm trying to understand the default sampling scheme used in the
>> bootstrap() function. I can best explain my question with a problem.
>> In an earlier message, I explained that I was trying to replicate an
>> analysis a statistician colleague and I published two years ago in which we
>> used bootstrapping to put empirical confidence limits on an F-statistic
>> derived from a one-way ANOVA involving three groups. Numerous users
>> generously helped me figure out the appropriate syntax for replicating the
>> analysis. I've now done this and the results look more or less like what
>> they did when we published the paper two years ago.
>> Now, for my question. The original sample had a total of 55 cases
>> distributed as 13 in one group, 15 in the second group, and 27 in the
>> third group. The original (published) bootstrap analysis simply took
>> random samples of 55 (with replacement of course) and assigned the first 13
>> randomly selected cases to group 1, the next 15 cases to group 2, and the
>> last 27 cases to group 3.
>> Is this an accurate interpretation of what bootstrap() does using the
>> default sampling scheme?
>> Thanks.
