Wilk and Shapiro wrote a paper in 1968 (Technometrics, Vol.10, No.4,
pp.825-839) suggesting a way to test simultaneously for normality for
several, say k, groups with possibly different population means and
standard deviations, based on random samples from each of the groups.
Wilk and Shapiro actually propose two similar procedures, both based on
combining the k p-values from the k individual goodness-of-fit tests,
one for each group. (One of the two procedures is the method Fisher
proposed for combining the results of several independent tests.)
Wilk and Shapiro's procedures are not really specific to the
Shapiro-Wilk test, since they involve simply combining p-values from
several independent tests.
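
To make the p-value combination concrete, here is a rough sketch of
Fisher's method only (not the other Wilk-Shapiro procedure). The
function name fisher_combined_pvalue is my own, and the k p-values are
assumed to come from whatever per-group normality test you ran. Under
the null hypothesis, -2 * sum(ln p_i) is chi-square with 2k degrees of
freedom, and for even degrees of freedom the upper tail has a closed
form, so no statistics library is needed.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where the statistic is
    -2 * sum(ln p_i), referred to a chi-square with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # Upper tail of a chi-square with 2k df (even df has a closed form):
    #   P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values from k = 3 per-group normality tests.
stat, p_combined = fisher_combined_pvalue([0.20, 0.45, 0.08])
print(stat, p_combined)
```

With a single group the combined p-value is just the original p-value,
which is a quick sanity check on the formula.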
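Here's my question: I've seen it suggested that you simply standardize
the observations within each group (that is, compute z-scores using
that group's own sample mean and standard deviation), pool the z-scores
from all k groups into one sample, and perform a single goodness-of-fit
test on the pooled z-scores. Has anyone seen this proposed test in the
published literature?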
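
For concreteness, this is roughly what I understand the pooled z-score
suggestion to mean. It is only a sketch: the function name
pooled_z_scores is mine, and I am assuming each group is standardized
with its own sample mean and its own sample standard deviation (n - 1
divisor).

```python
import statistics


def pooled_z_scores(groups):
    """Standardize each group by its own mean and SD, then pool the results."""
    pooled = []
    for sample in groups:
        if len(sample) < 2:
            raise ValueError("each group needs at least two observations")
        mean = statistics.fmean(sample)
        sd = statistics.stdev(sample)  # sample SD, n - 1 divisor
        if sd == 0:
            raise ValueError("a group with zero variance cannot be standardized")
        pooled.extend((x - mean) / sd for x in sample)
    return pooled


# Made-up data: two groups with different means and spreads.
groups = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0, 40.0]]
z = pooled_z_scores(groups)
print(len(z), z)
```

The pooled list would then go into one normality test (for example
scipy.stats.shapiro, if SciPy is available). One thing that makes me
cautious: the z-scores within each group sum to zero by construction,
so the pooled values are not independent, and I don't know whether the
usual null distribution of the test statistic still applies.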
