Re: counting duplicates

Douglas Bates (bates@stat.wisc.edu)
15 Jan 1998 09:29:36 -0600


Bruce McCullough <BMCCULLO@fcc.gov> writes:

> I have a character vector of 9000 names.

> Some names appear more than once, as
> many as thirty times.

In that case, you should probably convert the character vector to a
factor. It is easier to manipulate like that and probably will occupy
less storage.

So start with your character vector, called Names, and form

    Names <- as.factor(Names)
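
As a small illustration of what the conversion produces, here is a sketch in R-style S. The toy vector is made up for the example and is not the poster's data. A factor stores each distinct name once as a level and keeps one integer code per element, which is why it tends to be more compact than 9000 separate strings.

```r
# Toy data, assumed for illustration only
Names <- c("John Smith", "Ann Lee", "John Smith", "Bob Ray", "John Smith")
Names <- as.factor(Names)

levels(Names)       # the distinct names, each stored once
as.integer(Names)   # one integer code per element, indexing into levels(Names)
```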

> I wish to create an associated numeric vector, also
> of length 9000 which indicates, for each name, the
> number of times the name appears in the list.
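
One possible way to build such a vector is sketched below in R-style S. This is an assumption about how it could be done, not necessarily the approach this reply goes on to recommend. The toy data and the variable names counts and Count are hypothetical. The idea is to tabulate the factor once and then look up each element's count through its integer code.

```r
# Toy data, assumed for illustration only
Names <- as.factor(c("John Smith", "Ann Lee", "John Smith", "Bob Ray", "John Smith"))

# Number of occurrences of each level, in the same order as levels(Names)
counts <- table(Names)

# For every element, pick the count belonging to its level;
# as.vector() drops the table attributes and leaves a plain numeric vector
Count <- as.vector(counts[as.integer(Names)])
```

Count has the same length as Names, and every occurrence of a given name carries that name's total count.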

> If "John Smith" appears 30 times,
> then for each occurrence of "John Smith

From s-sender@utstat.toronto.edu Thu Jan 15 15:23:20 1998
Date: Thu, 15 Jan 98 10:53:08 EST
From: "Shang P. Lin" <SLIN@rfmh.org>
Subject: sample size for longitudinal studies
To: s-news@utstat.toronto.edu

Dear Colleagues,

If you know of any papers on, or programs for, sample size calculations
for longitudinal studies, I would greatly appreciate hearing from you.

Best,

Shang P. Lin
------------------------------------------
Stat. Sci and Epi. Div.
Nathan S. Kline Inst. for Psychiatric Res.
Orangeburg, NY