A colleague has shown me some data for which her manuscript reviewer(s)/
editor have requested an interrater reliability coefficient. I recommended
one of the intraclass correlation coefficients discussed by Bartko (1966)
and Shrout and Fleiss (1979) based on a repeated measures analysis of
variance for which the data layout looks like this:
                      Judge
              1   2   3   4  ...  K
Target   1    X   X   X   X  ...  X
         2    X   X   X   X  ...  X
         3    X   X   X   X  ...  X
         4    X   X   X   X  ...  X
         .    .   .   .   .  ...  .
         .    .   .   .   .  ...  .
         .    .   .   .   .  ...  .
         N    X   X   X   X  ...  X
where X is a rating of a target by a judge.
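
For the complete layout above, here is a minimal sketch in plain Python of how such a coefficient comes out of the two-way ANOVA mean squares. It assumes the two-way random-effects forms that Shrout and Fleiss label ICC(2,1) and ICC(2,k); that choice is only an assumption for illustration, since which form is appropriate is part of the question. The function names and the small ratings table are hypothetical and are not my colleague's data.

```python
def two_way_mean_squares(x):
    """Mean squares for a complete targets-by-judges table.

    x is a list of N rows (targets), each a list of K ratings (judges),
    with no missing cells.
    """
    n, k = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n * k)
    row_means = [sum(row) / k for row in x]
    col_means = [sum(row[j] for row in x) / n for j in range(k)]

    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_judges = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in x for v in row)
    ss_error = ss_total - ss_targets - ss_judges

    bms = ss_targets / (n - 1)            # between-targets mean square
    jms = ss_judges / (k - 1)             # between-judges mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return bms, jms, ems


def icc_two_way_random(x):
    """Return (ICC(2,1), ICC(2,k)) in the Shrout and Fleiss notation."""
    n, k = len(x), len(x[0])
    bms, jms, ems = two_way_mean_squares(x)
    single = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    average = (bms - ems) / (bms + (jms - ems) / n)
    return single, average


# Hypothetical illustration: 6 targets rated by 4 judges.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_single, icc_average = icc_two_way_random(ratings)
print(round(icc_single, 2), round(icc_average, 2))  # 0.29 0.62
```

The average-measure form is the one that matters here, since the quantity to be reported is a mean across judges.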
All well and fine, but now here's the quirk. The targets and the judges
are the same persons. That is, each person in the group rated every person
in the group, including him/herself, so N = K and each rating on the
diagonal is a self rating.
The intent is to compute a mean rating across judges for each target.
For theoretical reasons, it is desired to omit the target's self rating from
this mean. Thus, the data layout is:
                      Judge
              1   2   3   4   5  ...
Target   1    0   X   X   X   X
         2    X   0   X   X   X
         3    X   X   0   X   X
         4    X   X   X   0   X
         5    X   X   X   X   0
         .
         .
         .
where 0 is a missing score, missing because it is a self rating
that needs to be omitted. Is it possible to do the repeated measures
analysis of variance on this data set? I know about the recommendations
of some ANOVA experts regarding constructing substitute scores for
missing data -- is doing that appropriate in this case? The missing
scores aren't random, obviously. Moreover, it seems to me that any
interrater reliability coefficient should be based on the same data
that are going to be aggregated into the means, and that won't be
the case if substitute scores are used.
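
To make the intended aggregation concrete, here is a minimal sketch in plain Python of the per-target mean with the self rating left out. The function name and the numbers are hypothetical; `None` on the diagonal stands in for the omitted self rating (the 0 in the layout above). This covers only the means, not the reliability coefficient, which is the open question.

```python
def mean_excluding_self(x):
    """Mean rating per target across judges, skipping the diagonal.

    x is a square list of lists: row i is target i, column j is judge j,
    and cell (i, i) is the self rating, which is never read.
    """
    n = len(x)
    if any(len(row) != n for row in x):
        raise ValueError("targets and judges must be the same persons (square table)")
    means = []
    for i, row in enumerate(x):
        others = [v for j, v in enumerate(row) if j != i]
        means.append(sum(others) / len(others))
    return means


# Hypothetical illustration: 4 persons rating one another.
peer_ratings = [
    [None, 2, 5, 8],
    [6, None, 3, 2],
    [8, 4, None, 8],
    [7, 1, 2, None],
]
peer_means = mean_excluding_self(peer_ratings)
print(peer_means)
```

Each mean is based on N - 1 ratings, and a different judge is left out for each target, which is exactly why the usual complete-table analysis does not apply directly.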
Any ideas?
Many thanks,
Carol Nickerson
caroln@stat.berkeley.edu