8.4.3.6.2.8 Q test
When to use the Cochran Q test
The Cochran Q test is used to test for differences among more than two related samples. It may be used on data at any scale of measurement, provided the combined sample size exceeds 20. In practice, however, its use is restricted to nominal-scale data, as it is less powerful than analysis of variance or the Friedman test.
Hypotheses
The null hypothesis (H0) is that the samples are not significantly different.
The alternative hypothesis (HA) is that the samples are significantly different.
Only the two-tail test is considered.
What we need to know
The individual values of the related sample data, which can be divided meaningfully into two categories.
Sampling distribution
The sampling distribution associated with the Cochran Q test is the chi-square distribution (χ²).
The rationale of the test is as follows. The data in the samples is assumed to come from the same population (which accords with H0). If the combined data is divided into two categories, where the categorisation is carried out independently for each group of related variables, then each sample should contain equal proportions of the binary categories if the null hypothesis is indeed true.
The method of categorisation is as follows:
The test can only be carried out if the nominal data can meaningfully be divided into two categories. If that applies then the data is categorised as either 0 or 1.
The Cochran Q test compares the distribution of the 1s and 0s to see if the samples differ significantly.
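The coding step described above can be sketched in a few lines of Python. This is only an illustration: the responses are hypothetical, and the helper name `code_binary` is an assumption made for this sketch rather than anything defined in the text.

```python
# Hypothetical responses: three related units (rows) observed on three occasions (columns).
responses = [
    ["V", "V", "N"],
    ["V", "N", "N"],
    ["V", "V", "V"],
]

def code_binary(rows, one_label):
    """Code each response as 1 if it equals one_label, otherwise 0."""
    return [[1 if value == one_label else 0 for value in row] for row in rows]

coded = code_binary(responses, "V")
column_totals = [sum(col) for col in zip(*coded)]  # number of 1s in each sample
row_totals = [sum(row) for row in coded]           # number of 1s in each related unit

print(coded)          # [[1, 1, 0], [1, 0, 0], [1, 1, 1]]
print(column_totals)  # [3, 2, 1]
print(row_totals)     # [2, 1, 3]
```

The column totals and row totals are the two sets of counts the test goes on to compare.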
Assumptions
There are no assumptions associated with the Q test so long as the data can be divided into two categories.
Testing statistic
The Cochran Q testing statistic is χ² for j-1 degrees of freedom, thus
$$Q = \chi^2 = \frac{(j-1)\left[j\sum a^2 - \left(\sum a\right)^2\right]}{j\sum a - \sum x^2}$$
where
j is the number of samples;
a is the number of 1s in any given sample;
x is the number of 1s in any given set of related units.
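As a minimal sketch, the statistic can be computed directly from a table of 0/1 scores in which each row is a set of related units and each column is a sample. The function name `cochran_q` and the small data set are assumptions made for illustration only.

```python
def cochran_q(coded):
    """Return (Q, degrees of freedom) for rows of 0/1 scores.

    Each row is one set of related units; each column is one sample.
    """
    j = len(coded[0])                      # number of samples
    a = [sum(col) for col in zip(*coded)]  # number of 1s in each sample
    x = [sum(row) for row in coded]        # number of 1s in each related unit
    sum_a = sum(a)
    numerator = (j - 1) * (j * sum(v * v for v in a) - sum_a ** 2)
    denominator = j * sum_a - sum(v * v for v in x)
    if denominator == 0:
        # Happens only when every row is all 0s or all 1s.
        raise ValueError("Q is undefined when no related unit varies across samples")
    return numerator / denominator, j - 1

# Hypothetical data: four related units, three samples.
example = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
q_value, df = cochran_q(example)
print(round(q_value, 3), df)  # 4.667 2
```

Rows that are all 0s or all 1s contribute nothing to the denominator, which is why the function guards against the case where no row varies at all.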
Critical values
The critical value of χ² can be found in tables of critical χ² values for j-1 degrees of freedom.
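Where printed tables are not to hand, the critical value can be approximated numerically. The sketch below is not the method used in this text, which relies on tables. It evaluates the chi-square cumulative distribution with the standard series for the regularised incomplete gamma function and then finds the critical value by bisection. The function names `chi2_cdf` and `chi2_critical` are hypothetical.

```python
import math

def chi2_cdf(x, df):
    """Cumulative chi-square probability, via the series for the lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    s, z = df / 2.0, x / 2.0
    term = 1.0 / s
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (s + n)  # next term: z^n / (s (s+1) ... (s+n))
        total += term
        n += 1
    return total * math.exp(s * math.log(z) - z - math.lgamma(s))

def chi2_critical(df, alpha=0.05):
    """Value of chi-square exceeded with probability alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < 1 - alpha:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(chi2_critical(3), 3))  # 7.815
```

For three degrees of freedom at the 5% level this agrees with the tabulated value of 7.815.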
Worked examples
1. In an attempt to see if the current government's popularity was declining, a sample of 20 pro-government voters was questioned at monthly intervals. In the first month all 20 indicated they would vote for the government. The table below shows the voting intention each month (V: for the government; N: not for the government). Has there been a significant change in support for the government?
Respondent | Month 1 | Month 2 | Month 3 | Month 4
1 | V | V | V | V
2 | V | V | V | V
3 | V | V | N | V
4 | V | N | N | N
5 | V | N | V | V
6 | V | N | N | N
7 | V | V | V | N
8 | V | V | N | N
9 | V | N | V | V
10 | V | N | N | V
11 | V | N | N | V
12 | V | V | N | N
13 | V | V | N | V
14 | V | V | N | V
15 | V | V | N | N
16 | V | V | V | N
17 | V | V | N | V
18 | V | N | V | N
19 | V | N | N | V
20 | V | N | N | N
Note that the question posed cannot be answered directly, as the Q test simply shows whether or not the samples come from the same population and does not indicate a direction of change. The direction would need to be inferred from an inspection of the data.
Hypotheses:
H0: The four samples come from identical distributions.
HA: The four samples come from different distributions.
Significance level: 5%
Testing statistic: $Q = \chi^2 = \frac{(j-1)\left[j\sum a^2 - \left(\sum a\right)^2\right]}{j\sum a - \sum x^2}$
Critical value: From tables of critical values, critical χ² for j-1 degrees of freedom (i.e. 4-1 = 3 degrees of freedom) at the 5% level is 7.815.
Decision rule: Reject H0 if calculated χ² > 7.815
Computation: V=1, N=0
Respondent | Month 1 | Month 2 | Month 3 | Month 4 | x score | x²
1 | 1 | 1 | 1 | 1 | 4 | 16
2 | 1 | 1 | 1 | 1 | 4 | 16
3 | 1 | 1 | 0 | 1 | 3 | 9
4 | 1 | 0 | 0 | 0 | 1 | 1
5 | 1 | 0 | 1 | 1 | 3 | 9
6 | 1 | 0 | 0 | 0 | 1 | 1
7 | 1 | 1 | 1 | 0 | 3 | 9
8 | 1 | 1 | 0 | 0 | 2 | 4
9 | 1 | 0 | 1 | 1 | 3 | 9
10 | 1 | 0 | 0 | 1 | 2 | 4
11 | 1 | 0 | 0 | 1 | 2 | 4
12 | 1 | 1 | 0 | 0 | 2 | 4
13 | 1 | 1 | 0 | 1 | 3 | 9
14 | 1 | 1 | 0 | 1 | 3 | 9
15 | 1 | 1 | 0 | 0 | 2 | 4
16 | 1 | 1 | 1 | 0 | 3 | 9
17 | 1 | 1 | 0 | 1 | 3 | 9
18 | 1 | 0 | 1 | 0 | 2 | 4
19 | 1 | 0 | 0 | 1 | 2 | 4
20 | 1 | 0 | 0 | 0 | 1 | 1
Totals | a1 = 20 | a2 = 11 | a3 = 7 | a4 = 11 | | ∑x² = 135
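$$j = 4$$
$$\sum x^2 = 135$$
$$\sum a = a_1 + a_2 + a_3 + a_4 = 20 + 11 + 7 + 11 = 49$$
$$\sum a^2 = a_1^2 + a_2^2 + a_3^2 + a_4^2 = 400 + 121 + 49 + 121 = 691$$
$$Q = \chi^2 = \frac{(j-1)\left[j\sum a^2 - \left(\sum a\right)^2\right]}{j\sum a - \sum x^2} = \frac{(4-1)\left[(4 \times 691) - 49^2\right]}{(4 \times 49) - 135}$$
$$Q = \chi^2 = \frac{3\,(2764 - 2401)}{196 - 135} = \frac{3 \times 363}{61} = \frac{1089}{61} = 17.85$$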
Decision: Reject the null hypothesis (H0), as the calculated χ² of 17.85 is greater than the critical value of 7.815. Hence the samples are significantly different.
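The arithmetic of this worked example can be checked with a short script. The sketch below takes the coded scores (V=1, N=0) from the computation table and repeats the same steps. The variable names are chosen for illustration, and the critical value is the tabulated figure quoted above.

```python
# Coded voting intentions (V=1, N=0): one row per respondent, one column per month.
coded = [
    [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 0, 1], [1, 0, 0, 0], [1, 0, 1, 1],
    [1, 0, 0, 0], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 1, 1], [1, 0, 0, 1],
    [1, 0, 0, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 0, 1], [1, 1, 0, 0],
    [1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0],
]

j = len(coded[0])                          # number of samples (months)
a = [sum(month) for month in zip(*coded)]  # number of 1s in each month
x = [sum(row) for row in coded]            # number of 1s for each respondent
sum_a = sum(a)
sum_a2 = sum(v * v for v in a)
sum_x2 = sum(v * v for v in x)

q = (j - 1) * (j * sum_a2 - sum_a ** 2) / (j * sum_a - sum_x2)
critical = 7.815  # tabulated chi-square value for 3 degrees of freedom at the 5% level

print(a, sum_a, sum_a2, sum_x2)    # [20, 11, 7, 11] 49 691 135
print(round(q, 2), q > critical)   # 17.85 True
```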
8.4.3.6.2.8.1 A note on the Median Test for Related Data
Section 8.4.3.6.2.1, on the chi-square test, explained a method of categorising data using the median. This is sometimes known as the Median Test for Independent Data. Similarly, the Median Test for Related Data is a variant of the Cochran Q test where the median is used to divide the data in related units into two categories. Using the median implies at least ordinal scale data and a more powerful test in such circumstances is the Friedman Two-Way Analysis of Ranks (Section 8.4.3.6.2.7).
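One possible reading of this categorisation can be sketched as follows. The sketch assumes that the median is taken separately within each set of related units and that scores above that median are coded 1 and all others 0. The text does not specify how ties with the median are handled, so that choice is an assumption, as are the function name and the hypothetical scores.

```python
from statistics import median

def dichotomise_by_row_median(rows):
    """Code each score 1 if it lies above the median of its own row, otherwise 0.

    Assumption: scores equal to the median are coded 0.
    """
    coded = []
    for row in rows:
        m = median(row)
        coded.append([1 if value > m else 0 for value in row])
    return coded

# Hypothetical ordinal scores: three related units (rows), four samples (columns).
scores = [
    [10, 12, 15, 11],
    [7, 3, 9, 5],
    [14, 14, 18, 16],
]
coded_scores = dichotomise_by_row_median(scores)
print(coded_scores)  # [[0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 1, 1]]
```

The resulting 0/1 table can then be analysed with the Q statistic in the usual way.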
Unworked examples
1. A group of 10 people who regularly listened to a BBC popular music radio station were asked to listen to a competitor commercial station as well. The respondents were asked their preferences after four weeks, ten weeks and twenty weeks. The results below show preferences for the BBC station (R) and the commercial station (C). Is there any significant difference in preference between the three choice periods?
Respondent | 4 weeks | 10 weeks | 20 weeks
1 | C | C | C
2 | C | C | R
3 | C | R | R
4 | C | R | R
5 | R | C | R
6 | R | C | R
7 | R | C | R
8 | R | C | C
9 | R | C | R
10 | R | R | R
2. Forty students from Birmingham were split into 5 groups based on their academic subject, and the groups were matched based on where they lived in the city. The students were given a questionnaire about noise pollution. A score ranging from 0 to 20 was computed from the answers (where a high score indicated the student lived in an area with high levels of noise pollution). Is there a significant difference in perception of noise pollution based on academic discipline?
Area | Sociology | English | Music | Science | Art
Erdington | 10 | 11 | 12 | 11 | 11
Edgbaston | 12 | 3 | 6 | 9 | 4
Moseley | 15 | 11 | 10 | 19 | 13
Handsworth | 10 | 12 | 8 | 18 | 12
Balsall Heath | 13 | 10 | 15 | 12 | 13
Alum Rock | 14 | 12 | 15 | 14 | 14
Harborne | 5 | 7 | 15 | 9 | 15
Selly Oak | 15 | 18 | 13 | 18 | 9