> I have no idea how to test a hypothesis of the form:
> given a choice of the numbers from 1 to 10, subjects are more
> likely to choose 7 than any other number.
I received the usual large number of interesting responses. I am enclosing
two of them at the end of this post.
Most of those writing began their suggestions with the assumption that
all of the responses are equally likely and thus proposed a test based
on the null hypothesis that p(7) = 0.1. The second attached message,
from Jens Oehlschlaegel, is a good example of this approach. However,
as convenient as this approach is, I was in fact looking for something
more like an alternative that Oehlschlaegel considers briefly at the
end of his post. In essence, I think the assumption that the numbers
are equally likely is too much of a straw man. One looks at these data
because of the belief that there are preferences, thus it would seem
odd to suspend this belief when testing the hypothesis that 7 is the
most preferred alternative.
The only person who addressed this issue was David Krantz. His message is
the first attached at the end of this summary. He proposes a likelihood
ratio test comparing a model in which 7 is the most preferred response
with a model in which 7 and the next most preferred choice are
equally preferred. I am still not certain that I fully understand the
logic that justifies this comparison, but it seems reasonable to me.
For those curious, in my daughter's data, the distribution of responses was:
Response    1   2   3   4   5   6   7   8   9  10 | Total
Frequency   9  18  24  17  38  30  70  12  11   8 |   237
Using the binomial test proposed by Oehlschlaegel, the probability of
obtaining at least 70 seven responses given the null hypothesis that
p(7) = 0.1 is 0 to S-Plus's limits of precision. However, for Krantz's
test, the p-value for the hypothesis that the seven response and the next
most frequent response (five) are equally likely is ~.028.
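As a rough check on the binomial figure, here is a minimal sketch in Python (not the S-Plus used in the thread). The function name binom_upper_tail is hypothetical; the inputs are the 70 sevens out of 237 responses from the table above. It sums the upper tail term by term rather than subtracting a cumulative probability from 1, which is presumably why a tiny but nonzero value can show up as 0.

```python
import math

def binom_upper_tail(k_obs, n, p):
    """P(X >= k_obs) for X ~ Binomial(n, p), summed directly over the tail."""
    terms = (math.comb(n, k) * p**k * (1 - p)**(n - k)
             for k in range(k_obs, n + 1))
    return math.fsum(terms)

# Data from the table above: 70 sevens among 237 responses, null p(7) = 0.1.
p_tail = binom_upper_tail(70, 237, 0.1)
print(p_tail)  # a very small positive number
```

The result is far below any conventional significance level, which agrees with the "0 to the limits of precision" reading.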
=============== Enclosed message from David H. Krantz
From: Dave Krantz <firstname.lastname@example.org>
I've worked on similar ordinal issues in the context of measurement
theory from time to time, though not recently. The approach was
to use a likelihood-ratio statistic. Suppose, for example, that
one has data like this:
resp   1   2   3   4   5   6   7   8   9  10 | TOTAL
freq   7   9  24  12  20   5  44   3   6  13 |   143
Then under the hypothesis that 7 is the highest, the maximized log
likelihood is (up to an additive constant)
7*log(7) + 9*log(9) + ... + 13*log(13) - 143*log(143)
but under the alternative, the maximum log likelihood will be found
by assuming that 7 and 3 are equally preferred (since 3 has the next
highest frequency) and the maximum will be the same as above, except that
24*log(24) + 44*log(44)
will be replaced by
(24 + 44)*log(34),
where 34 = (24 + 44)/2 is the pooled count for each of the two cells.
The LR statistic is twice the difference between these--I make
it to be 5.97 for the above example.
Thus, I end up just picking the second highest observation and
testing the null hypothesis of a 50-50 split between the highest
and next highest versus the saturated hypothesis.
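The arithmetic above can be reproduced with a minimal sketch in Python (not S-Plus). The function name top_two_lr is hypothetical, and the sketch assumes the two largest counts are positive. Only the two largest cells enter the statistic, because every other term is the same under both models and cancels.

```python
import math

def top_two_lr(freqs):
    """LR statistic: saturated model vs. the two largest cells equally likely."""
    a, b = sorted(freqs, reverse=True)[:2]   # highest and next highest counts
    pooled = (a + b) / 2                      # common fitted count under the null
    return 2 * (a * math.log(a) + b * math.log(b) - (a + b) * math.log(pooled))

# Example data from the message above.
freqs = [7, 9, 24, 12, 20, 5, 44, 3, 6, 13]
lr = top_two_lr(freqs)
print(round(lr, 2))  # 5.97
```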
By the usual theorems, the LR statistic has an asymptotic chisq
distribution with 1 df. Naturally it would pay to take this
with a grain of salt for any particular dataset; one might in
fact look at the distribution of this statistic by simulation,
under a variety of hypotheses.
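One way such a simulation might look is sketched below in Python. This is only an illustration under stated assumptions, not a prescribed procedure: it simulates under a single null in which the two largest cells share their pooled probability and the other cells keep their observed proportions, and in each simulated sample it recomputes the statistic from whichever two cells come out highest. The names top_two_lr, null_weights and n_sim, the seed, and the number of replications are all hypothetical choices.

```python
import math
import random

def xlogx(x):
    """x*log(x), with the usual convention 0*log(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0

def top_two_lr(freqs):
    """LR statistic: saturated model vs. the two largest cells equally likely."""
    a, b = sorted(freqs, reverse=True)[:2]
    return 2 * (xlogx(a) + xlogx(b) - (a + b) * math.log((a + b) / 2))

observed = [7, 9, 24, 12, 20, 5, 44, 3, 6, 13]
n = sum(observed)
lr_obs = top_two_lr(observed)

# Assumed null: pool the two largest cells, keep the rest as observed.
order = sorted(range(len(observed)), key=lambda i: observed[i], reverse=True)
i1, i2 = order[0], order[1]
null_weights = [float(f) for f in observed]
null_weights[i1] = null_weights[i2] = (observed[i1] + observed[i2]) / 2

rng = random.Random(1)            # fixed seed so the run is repeatable
n_sim = 2000
cells = list(range(len(observed)))
exceed = 0
for _ in range(n_sim):
    counts = [0] * len(observed)
    for c in rng.choices(cells, weights=null_weights, k=n):
        counts[c] += 1
    if top_two_lr(counts) >= lr_obs:
        exceed += 1

p_sim = exceed / n_sim                          # simulated p-value
p_asym = math.erfc(math.sqrt(lr_obs / 2))       # chi-square(1 df) upper tail
print(p_sim, p_asym)
```

Comparing p_sim with p_asym gives a feel for how far the 1 df chi-square approximation can be trusted for a table of this size; other null configurations could be substituted for null_weights.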
============== Enclosed message from Jens Oehlschlaegel
From: Jens Oehlschlaegel <oehl@Psyres-Stuttgart.DE>
A non-Bayesian answer would be: you could
assume the sharp null hypothesis that
p(1) = p(2) = ... = p(10)
which according to the rules of a Laplacian probability space gives
p(7) = 0.1
and, under the assumption of independent observations, the number of
sevens is binomially distributed. Now you could state your hypothesis
one-sided as
p(7) > 0.1
If you observe n pupils *independently* choosing numbers, under the
sharp null hypothesis you would expect k.exp=n*0.1 times a seven. However
you probably observed k.obs times a seven, where
k.obs > k.exp
p.obs(7) > 0.1
asking yourself whether this is an unusual observation under the
assumption p(7)=0.1. The binomial distribution helps you determine the
probability of your observation or even more extreme observations under
the sharp null hypothesis, which is obtained by adding
p(k=k.obs)+p(k=k.obs+1)+...+p(k=n) and can be calculated in S+ by
p.reject = 1 - pbinom(k.obs - 1 , n, 0.1)
If your resulting "rejection probability" is "rare" (say p.reject<0.05)
you might doubt that such a rare event has happened and might be willing
rather to believe that the sharp null hypothesis is false. Rejecting the
sharp null hypothesis also rejects a class of other alternatives, namely
p(7) < 0.1
because under those your observation would be even more unlikely.
However note that this approach did not reject the competing hypothesis
i.e. rejecting the sharp null hypothesis does not automatically imply
anything meaningful, especially if your n is large.
More interesting is the case where you do not limit the range of numbers
which can be chosen: because there are infinitely many numbers, the
assumption of a Laplacian probability leads to assuming
that p(7) approaches *zero*, which is easily "rejected" by empirical
[ Bayesian laughter at this point? ]
However, I realize that perhaps you meant
p(7) more likely than any other in the sense that
p(7) > p(i) for all i != 7
and thus the class of alternatives to be rejected would be
p(7) <= p(i) for at least one i != 7
of course the above conclusion p(7)>0.1 did not rule out p(1)>0.2 i.e. it
I've no idea how to specify a sharp null hypothesis in this case
[ Bayesian laughter again? ]
I have an idea how to follow this up, but no more time today.
Was the first thing what you were looking for?
--
Jens Oehlschlaegel-Akiyoshi
Psychologist/Statistician
Project TR-EAT + COST Action B6
Center for Psychotherapy Research
Christian-Belser-Strasse 79a
D-70597 Stuttgart, Germany