8.3.12.13 Summary significance testing and association: an example
To sum up and review the above brief outline of significance testing and association, we will consider a short extract from John Leggett's (1963) survey of working-class consciousness and explain what the statistical references mean (but will not demonstrate how to calculate them).
Leggett undertook research in the United States to see if there was any difference in class consciousness between industrial workers who grew up in industrial regions and industrial workers who had recently moved to industrial regions from the countryside. He called the first group 'the prepared' and the second group 'the uprooted'. He used several questions to allocate his sample into five categories of class consciousness, from most class conscious to least class conscious. The results are shown in Table 8.2.12.17.
Table 8.2.12.17 Level of class consciousness by place of origin (prepared or uprooted)
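| Level of class consciousness | Uprooted (n) | Uprooted (column %) | Prepared (n) | Prepared (column %) |
|---|---|---|---|---|
| 5 | 23 | 17 | 3 | 3 |
| 4 | 49 | 35 | 18 | 19 |
| 3 | 49 | 35 | 29 | 31 |
| 2 | 15 | 11 | 35 | 37 |
| 1 | 3 | 2 | 10 | 10 |
| Total | 139 | 100 | 95 | 100 |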
It was anticipated that the uprooted would express a higher degree of class consciousness than the prepared. The data support this expectation: 52% of the uprooted fall in the two most class-conscious categories compared to only 22% of the prepared. At the other extreme, only 13% of the uprooted, compared to 47% of the prepared, fall into the least class conscious categories. The Kendall's Tau measure of correlation is 0.41. The difference between the uprooted and the prepared respondents is significant at the <.001 level. Other relevant variables do not upset our findings. When one controls for ethnicity, the relationship still obtains. (Leggett, 1963).
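The collapsed percentages quoted in the extract can be checked against the counts in Table 8.2.12.17. The following is a minimal sketch; the variable and function names are illustrative and are not taken from Leggett's study.

```python
# Counts transcribed from Table 8.2.12.17, ordered from level 5 (most
# class conscious) down to level 1 (least class conscious).
uprooted = [23, 49, 49, 15, 3]
prepared = [3, 18, 29, 35, 10]


def share(counts, positions):
    """Rounded percentage of the column total found in the given rows."""
    return round(100 * sum(counts[i] for i in positions) / sum(counts))


# Rows 0 and 1 are levels 5 and 4; rows 3 and 4 are levels 2 and 1.
top_two = (share(uprooted, [0, 1]), share(prepared, [0, 1]))
bottom_two = (share(uprooted, [3, 4]), share(prepared, [3, 4]))

print("two most class-conscious levels (uprooted, prepared):", top_two)
print("two least class-conscious levels (uprooted, prepared):", bottom_two)
```

The sketch gives 52% and 22% for the two highest levels and 13% and 47% for the two lowest, matching the figures in the extract.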
The percentages in Table 8.2.12.17 are self-explanatory and show a difference in class consciousness between the two groups of workers. What does the reference to Kendall's Tau measure of correlation tell us? As explained in Section 8.3.12.12, correlation is a statistical process that attempts to measure the strength of the relationship between two variables, in this case between background and class consciousness. Correlation is thus a measure of association.
There are many different measures of association. Kendall's Tau is one of them and is appropriate for the data in the example, where the levels of class consciousness are ranked categories. The percentages show a difference between the two groups, which suggests that class consciousness and background are associated. In this case the correlation is 0.41, which shows a reasonable degree of association: it is well above zero (no association) but well short of 1 (a perfect relationship).
In social science, very high correlations between two variables are unlikely because the social world is complex and, for example, class consciousness is likely to be the result of several things besides the background of the respondent.
A correlation of 0.4 between two variables (a bivariate correlation) would be regarded as a reasonably good association for social science data. In general, the larger the measure of association, the stronger the relationship between the variables.
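For readers curious about how a rank-based measure of this kind is built, the sketch below counts concordant and discordant pairs in Table 8.2.12.17 and computes two common variants, tau-b and tau-c. It rests on assumptions: the source does not say which variant Leggett used, the coding of background (prepared lower, uprooted higher) is a choice made here for illustration, and the table as reproduced may not be the exact data behind the published figure. The results therefore need not equal 0.41.

```python
from math import sqrt

# Counts from Table 8.2.12.17, ordered from level 1 (least class
# conscious) up to level 5, so that a higher index means a higher level.
prepared = [10, 35, 29, 18, 3]   # assumed coding: lower value of background
uprooted = [3, 15, 49, 49, 23]   # assumed coding: higher value of background


def concordant_discordant(low_group, high_group):
    """Count pairs (one respondent from each group) ordered the same way
    on both variables (concordant) or in opposite ways (discordant)."""
    concordant = discordant = 0
    for i, n_high in enumerate(high_group):
        concordant += n_high * sum(low_group[:i])      # partner is at a lower level
        discordant += n_high * sum(low_group[i + 1:])  # partner is at a higher level
    return concordant, discordant


def tied_pairs(counts):
    """Number of pairs of respondents that share the same category."""
    return sum(k * (k - 1) // 2 for k in counts)


n = sum(prepared) + sum(uprooted)
all_pairs = n * (n - 1) // 2
c, d = concordant_discordant(prepared, uprooted)

ties_background = tied_pairs([sum(prepared), sum(uprooted)])
ties_level = tied_pairs([p + u for p, u in zip(prepared, uprooted)])

tau_b = (c - d) / sqrt((all_pairs - ties_background) * (all_pairs - ties_level))
m = 2  # the smaller of the number of rows and columns in the table
tau_c = 2 * m * (c - d) / (n ** 2 * (m - 1))

print(f"concordant={c}, discordant={d}")
print(f"tau-b={tau_b:.2f}, tau-c={tau_c:.2f}")
```

Under these assumptions the sketch gives about 0.36 for tau-b and 0.43 for tau-c, which lie either side of the reported 0.41. Both tell the same story as the text: a moderate positive association.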
What does it mean to say that the difference between the two groups is 'significant at the <.001 level'? Significance testing, as discussed in Section 8.3.12.11, attempts to take account of the fact that research involves a sample and not the whole population. It assumes that the sample is not biased (that is, it is a random sample).
However, even if it is unbiased, the sample will not match the population exactly. An unbiased sample will be a good approximation to the population but it will not be perfect; that is, there will be some variation between the sample and the population. The sample might slightly overrepresent the class consciousness of the population of uprooted workers or it might slightly underrepresent it. There is no way of knowing which is the case. However, it is possible to estimate the likely degree of variation of an unbiased random sample from the population that it represents. This variation is known as sampling error.
The point of a statistical survey is to be able to make some generalisations about the population from which the samples were taken, and not just about the sample itself. While, in Leggett's study, there is some association between class consciousness and background for the sample, does this apply to the population as well? To answer this, it is necessary to take account of the sampling error. In other words, is the difference in the sample large enough to be significant for the population; that is, is the difference bigger than the likely sampling error?
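As an illustration of how the likely size of this variation can be estimated, the sketch below computes the usual approximate standard error of a sample proportion, using the 72 of 139 uprooted workers who fall in the two most class-conscious levels of Table 8.2.12.17. It assumes a simple random sample, which the source does not confirm for Leggett's study, and the function name is illustrative.

```python
from math import sqrt


def standard_error(successes, n):
    """Approximate standard error of a sample proportion, assuming a
    simple random sample."""
    p = successes / n
    return sqrt(p * (1 - p) / n)


# Uprooted workers in the two most class-conscious levels: 23 + 49 of 139.
p_uprooted = 72 / 139
se_uprooted = standard_error(72, 139)

# A rough 95% range: the sample proportion plus or minus 1.96 standard errors.
low = p_uprooted - 1.96 * se_uprooted
high = p_uprooted + 1.96 * se_uprooted

print(f"sample proportion {p_uprooted:.2f}, standard error {se_uprooted:.3f}")
print(f"rough 95% range {low:.2f} to {high:.2f}")
```

On these assumptions the sample figure of 52% could plausibly correspond to a population figure anywhere from roughly 43% to 60%, which gives a sense of the scale of sampling error for a group of this size.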
This is precisely what Leggett (1963) has examined. The statement that 'the difference between the uprooted and the prepared respondents is significant at the <.001 level' means that, if there were no difference in class consciousness between the populations from which the samples were drawn, a difference as large as the one found in the samples would arise through sampling error alone less than 1 time in 1000. Or, put another way, sampling error is a very unlikely explanation for the difference, so it is reasonable to conclude that class consciousness does differ between the 'uprooted' and 'prepared' populations.
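One way to see what such a significance statement refers to is to simulate a situation in which background and class consciousness are unrelated, and to count how often chance alone produces a gap as large as the one observed. The sketch below does this with a permutation test on the difference in the share of each group in the two most class-conscious levels. This is not the test Leggett used (the source does not say which he applied); the choice of statistic, the number of shuffles and the seed are all assumptions made for illustration.

```python
import random

# One class-consciousness level per respondent, rebuilt from Table 8.2.12.17.
uprooted = [5] * 23 + [4] * 49 + [3] * 49 + [2] * 15 + [1] * 3
prepared = [5] * 3 + [4] * 18 + [3] * 29 + [2] * 35 + [1] * 10


def top_share(levels):
    """Proportion of respondents in the two most class-conscious levels."""
    return sum(1 for level in levels if level >= 4) / len(levels)


observed = top_share(uprooted) - top_share(prepared)

rng = random.Random(0)      # fixed seed so the run is repeatable
pooled = uprooted + prepared
n_uprooted = len(uprooted)
n_shuffles = 2000
as_large = 0
for _ in range(n_shuffles):
    rng.shuffle(pooled)     # break any link between group and level
    diff = top_share(pooled[:n_uprooted]) - top_share(pooled[n_uprooted:])
    if abs(diff) >= abs(observed):
        as_large += 1

# Add 1 to numerator and denominator so the estimate is never exactly zero.
p_value = (as_large + 1) / (n_shuffles + 1)

print(f"observed difference: {observed:.2f}")
print(f"estimated p-value: {p_value:.4f}")
```

The observed gap is about 0.30 (52% against 22%). If shuffled data rarely or never produce a gap that large, sampling error alone is an implausible explanation. Note that 2,000 shuffles cannot resolve probabilities much below 1 in 2,000, so more shuffles would be needed to estimate very small values precisely.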