Re: counting duplicates

Douglas Bates
15 Jan 1998 09:29:36 -0600

Bruce McCullough writes:

> I have a character vector of 9000 names.

> Some names appear more than once, as
> many as thirty times.

In that case, you should probably convert the character vector to a
factor. It is easier to manipulate like that and probably will occupy
less storage.

So start with your character vector, called Names, and form

    Names <- as.factor(Names)
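
A minimal sketch of how the factor can then yield one count per element of
the original vector. The sample data and the variable `counts` are
hypothetical, made up for illustration, and this is one possible approach
using table(), not necessarily the one the rest of the reply went on to give.

```r
# Hypothetical sample data standing in for the 9000-name vector
Names <- c("John Smith", "Ann Lee", "John Smith",
           "Bo Chan", "Ann Lee", "John Smith")

# Convert to a factor: each distinct name becomes one level
Names <- as.factor(Names)

# table() returns one count per level. Indexing that table by the
# integer level codes expands the counts back to one value per element.
# as.vector() drops the table attributes and names.
counts <- as.vector(table(Names)[as.integer(Names)])
```

Here `counts` has the same length as `Names`, and every occurrence of a
given name carries that name's total frequency.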

> I wish to create an associated numeric vector, also
> of length 9000 which indicates, for each name, the
> number of times the name appears in the list.

> If "John Smith" appears 30 times,
> then for each occurrence of "John Smith

Date: Thu, 15 Jan 98 10:53:08 EST
From: "Shang P. Lin"
Subject: sample size for longitudinal studies

Dear Colleagues,

If you know of any papers on or programs for sample size calculations
for longitudinal studies, I would greatly appreciate hearing from you.


Shang P. Lin
Stat. Sci and Epi. Div.
Nathan S. Kline Inst. for Psychiatric Res.
Orangeburg, NY