When undertaking a research project, one normally wants to draw conclusions that go beyond the specific group or area studied. Making statements or inferences that apply more widely than the study boundaries necessitates addressing to what extent the results can be generalised or transferred to another setting. This requires addressing issues about the representativeness of a sample and the problems of error.
This section addresses:
generalisability: to what extent a statement made about a specific study can be generalised to a wider milieu;
transferability: whether the results in one setting can be transferred to another setting;
representativeness: to what extent a sample is representative of a wider population;
error: whether errors are the result of mistakes, are due to systematic bias or are the result of taking a sample.
1.10.1 Generalisability
Some studies examine a unique event or set of circumstances with perhaps a view to documenting the event as special. For example, a study of audience reaction to the Opening Ceremony of the 2012 London Olympics may not be generalisable.
In most cases, though, investigators suggest that the results of their study have some resonance or implications for a wider group of people or for other similar settings.
For positivists (see Section 2.2) generalisability is the extent to which research findings and conclusions from a study conducted on a sample can be applied to the population. The key concern in this case is that the sample is representative of the population to which the conclusions are generalised. Representative means that the sample is, in effect, a small-scale replica of the wider population.
For positivists, the ability to generalise findings to wider groups and circumstances is indicative of 'external validity'.
For phenomenologists and critical social researchers (see Sections 2.3 and 2.4), generalisability is not so important because critical and phenomenological research is more concerned with understanding or interpreting events rather than making sweeping general statements or inferring causal connections (see section 2.2 on causality).
Maxwell (1992, pp. 293–95) observed that generalisation is one of the most common tests of 'validity' for quantitative research (by which he implied positivist approaches) and 'yet is considered to be of little, or even no, importance for many qualitative researchers' (by which he implied phenomenological researchers). He added that 'sampling, a vital consideration in establishing the 'validity' of a statistical test, is usually purposeful in qualitative research as opposed to random'.
Maxwell also states that: 'Qualitative research almost exclusively limits itself to 'internal' generalisations, if indeed it seeks to claim any form of generalisability at all. Quantitative research, on the other hand, attempts to deal with both 'internal' and 'external' generalisations, referring to these as 'internal validity' and 'external validity' respectively' (Maxwell, 1992, p. 294).
Williams (2000, p. 209) stated that: 'it is often those who define themselves as interpretivists (as opposed to more generic qualitative researchers) who deny the possibility of generalisation'.
Williams suggested three broad approaches to generalisation:
Total generalisations: where situation S´ is identical to S in every detail. Thus S´ is not a copy of S but an instance of a general deterministic law that governs S also. Such generalisations are in fact axioms and do no more than express instances of particular laws. Thus the rate of cooling of an electric element is an instance of (and calculable through) the second law of thermodynamics.
Statistical generalisations: where the probability of situation S occurring more widely can be estimated from instances of s. This is simply the relationship between sample and population and is the basis upon which most generalisations (other than some in physics and chemistry) in the natural sciences are made. It is, of course, the same basis upon which survey researchers in sociology generalise. The sociological survey usually depends upon some form of probability sampling, whereby within each stratum every case will have the same probability of selection. … Importantly, however, the researcher is able to express statistically the level of confidence [of the sample representing] the population.
Moderatum generalisations: where aspects of S can be seen to be instances of a broader recognisable set of features. This is the form of generalisation made in interpretive research, either knowingly or unknowingly. Geertz's claim that 'Every people . . . loves its own form of violence' is an example of such a general feature.
Williams argued that total generalisations are impossible in the social world. Statistical generalisation, whilst possible in the social world, cannot usually be made from the kind of data generated by interpretive research. He concluded: 'On the first two counts then, the generalisation of interpretive findings is impossible, but what interpretivists do—and I would suggest they are right to do—is make moderatum generalisations'.
For example, in their study of sparsely populated rural areas, Payne and Grew (2005, p. 895) argued that their research could be generalised in moderatum:
...Payne and Williams' (2005) recent injunctions against drawing extensive conclusions from small qualitative samples calls for clarity about claims for sample designs. We can think of no substantial reason why our findings should not be generalized in moderatum (Williams, 2000) to similar groups living in other sparsely populated regions, and possibly also to rural areas with limited levels of commuting.... We are not making precise statistical generalizations from the data because our mode of analysis is qualitative rather than quantitative. We are not making statements about the whole of British society on the basis of our selected areas alone, nor about the geographically-defined populations from which our samples were drawn…. we drew not a random sample of areas, but a purposive one (i.e. rural, non-northwest), designed to test one dimension of the range of Savage et al.'s generalization. By random sampling from the membership of the associations at the second stage, we eliminated interviewer subjectivity in respondent selection, thus reducing the prospect of having selected an atypical sample, although not, of course, producing a standard probability sample. Taken together, the two stages have produced a relatively internally-unbiased sample of people living in one type of non-urban, non-northwest England social circumstances, to compare with the Savage et al. findings. The key point is that rather than claiming to have a representative sample of an area, we can make a reasoned case that those who we have interviewed are a group who do not differ substantially from others to whom we might wish to generalize.
In qualitative research, generalizing claims are less explicit. Indeed, some interpretivist sociologists (e.g. Denzin, 1983; Denzin and Lincoln, 1995; Marshall and Rossman, 1989) minimize the relevance of generalization or even deny any intention toward generalization in qualitative research.
A belief that one must choose between an 'interpretive sociology', which rejects all generalization, and a sociology dependent on total or axiomatic generalizations (represented by statistical generalizations or physical laws) is too simplistic (Williams, 2000a, 2000b, 2001). Qualitative research methods can produce an intermediate type of limited generalization, 'moderatum generalizations'. These resemble the modest, pragmatic generalizations drawn from personal experience which, by bringing a semblance of order and consistency to social interaction, make everyday life possible. Indeed, a strong claim can be made that in qualitative research (even in the interpretivist sociology loudest in its rejection of generalization) such moderatum generalizations are unavoidable.
Payne and Williams (2005) go further and examine what sociologists actually do by way of generalisation in their publications. They looked at 17 non-quantitative articles with empirical data published in Sociology in 2003, and they conclude that generalisation should be planned in.
There was almost no explicit discussion of the grounds on which findings might be generalized beyond the research setting. Despite this, all the 17 articles made generalizations, albeit of different kinds. Kelly (2003: 37) defended her generalizations on the grounds of later feedback from a conference with a wider range of key informants, while Hislop and Arber (2003: 710) cautiously linked their conclusions with a call for more studies. Punch (2003: 289–90) made generalizing claims but also denied making them. The most explicit comment on generalization came from Gladney et al. (2003: 311), who claimed moderatum status for their position. However, all four exceptions consisted of only very brief comments. The generalizations in some of the remaining qualitative articles could be captured as moderatum generalizations represented in the taxonomy which follows, but others were vague, sweeping and essentially immoderate. The vague generalizations were often juxtaposed with theoretical statements by other sociologists, so that it was hard to tell who was making the generalization. Words like 'suggest', 'tend', 'illustrate', and 'some of' were also used in a way that rendered claims unclear. This could be interpreted as further evidence of sociology's collective failure to achieve clarity in generalization. (Payne and Williams, 2005, pp. 300–1)
Positivist research, especially quantitative studies based on random samples, claims 'external' validity. However, such research is restricted to measuring those elements that, by definition, are common to all. This raises the question of how much accuracy is being given up in exchange for generalisability. Winter (2000) argued that although an account may be judged valid, replicable and stable, and thus generalisable, one could argue that generalisation in itself is neither valid nor accurate. 'A generalisable statement, whilst relating to all those to whom it is applied, may not actually describe the phenomena of any single case with any accuracy, in the same way that a mean average score need not be the same value as any of the numbers of which it is an average.'
Guba and Lincoln (1982, p. 238) had gone further and maintained that it is not possible to generalise social research:
The aim of inquiry is to develop an idiographic body of knowledge. This knowledge is best encapsulated in a series of 'working hypotheses' that describe the individual case. Generalizations are impossible since phenomena are neither time- nor context-free.
Activity 1.10.1 Outline how you would undertake a generalisable study of the alcohol consumption of weekend club goers. This activity needs to be done in pairs. Spend 15 minutes writing down an outline and then make a brief presentation to try to convince a colleague of its generalisability (your colleague, in return, tries to convince you of the generalisability of his or her proposed study).
1.10.2 Transferability
Whether or not the results of a study are generalisable to the whole population, one might ask if the outcomes are transferable to another context. For positivists the difference between generalisation and transfer is as follows. The outcomes of a study of a representative sample can be generalised to the population from which the sample was taken, allowing for sampling error (see section 1.10.4.3). There is no need of 'transferability' in such circumstances. Transferability would occur if the outcomes of a study in one setting could be shown to apply to another, different setting. For example, could the outcomes of a survey of football supporters at a football match be used to draw conclusions about rugby supporters at a rugby game?
'External validity' is also important to phenomenological researchers but, rather than focus on generalisability, external validity is re-presented as transferability. Transferability is the ability of research results to transfer to situations with similar parameters, populations and characteristics. Winter (2000), for example, argues that 'qualitative' findings can be used to develop theories (by taking outcomes from different settings) rather than generalising from a sample to a population.
1.10.3 Representativeness
One question asked of social research is whether the data used in a study is representative of the broader group the research was intended to cover. For example, if you are studying the television watching habits of young people, are those in your study representative of young people in general? Similarly, if you are looking at the sexism in fashion magazines, are the magazines you have analysed representative of fashion magazines in general?
Why does the data you collect need to be representative? It needs to be representative if you intend to extend the analysis of your data to the wider group. If, for example, you conclude that fashion magazines are sexist because they continue to make women objects of men's gaze, then you would need to be sure that you have examined a sufficient range of different types of fashion magazines to be able to make that claim. It might be that, in fact, you have only looked at magazines that are aimed at teenage females, in which case your data is at best only representative of that subgroup of fashion magazines.
Extending the analysis to a wider constituency is, as discussed above (1.10.1), known as generalisation. In short, the more you generalise your findings the more you need to be assured that your data is representative of the realm into which you are generalising your results.
Representativeness is usually construed as a sampling issue (see Section 8 for details of sampling). The data you are exploring can be seen as a sample of all the data you could have looked at. For example, the fashion magazines examined are a sample of all the fashion magazines currently available (or, indeed, ever published), which is called the 'population'.
If you are making claims about the nature of all currently-available fashion magazines, is the selection you have looked at (the sample) representative of all the fashion magazines currently available (the population)? If it isn't then claims about the population based on your sample would be biased (see below).
The accepted view (albeit basically a positivist view) is that if you want to generalise from your sample to the population, the sample has to be representative of the population.
How can you claim that a sample represents a population? There are four broad ways of doing this.
First, a representative sample can be selected using random sampling. It is representative because the selection process gives everyone in the population an equal chance of being in the sample (see Section 8). For example, you may be undertaking a study of first-year students' satisfaction with enrolment procedures at a university. A sample can be obtained by selecting every tenth student on the list of all enrolled first-year students, starting from a randomly chosen position (strictly a systematic sample, which approximates a random one provided the list has no periodic ordering).
Second, representativeness can be established by demonstrating, retrospectively, that the proportions of the different categories of people in the sample are similar to the proportions of these categories in the population that is being explored. For example, one might know from census data the gender, age, ethnicity and socio-economic composition of a local council borough and be able to show that, in a study of council services, the sample reflects those population characteristics.
Third, where the characteristics of the population are not known with any precision, generalisation requires a demonstration that the sample appears to reflect what is known about the population, especially in regard to the key factors pertinent to the research question. For example, a study of the leisure activity of Type II diabetics would necessitate having a sample that fitted the key aspects of the known population profile of diabetics, such as age (over 40), weight, ethnicity and family history.
Fourth, by appealing to the likelihood that your sample is 'acceptable' for practical purposes and persuading the reader that the data is indicative of a wider setting, albeit that no one can know for sure. For example, in exploring the extent and impact of needle-sharing among intravenous drug users, there is no way of constructing a random sample as there is no known population data on intravenous drug users. Gathering information depends on the ingenuity of the researcher: making contacts, being vouched for, observing what happens, asking questions and building up a picture of an opaque world. Add to this, for example, what is known through medical or other records about communicable diseases and the outcome is an appeal to the reader of the research that says that although this study relates to a loose network of intravenous drug users, their activities are likely to be typical of intravenous drug users in general.
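The first two of these approaches lend themselves to a short illustration. The following is a minimal sketch in Python, not a prescribed procedure: the enrolment list, the census shares and the sample counts are all invented, and the function names (`simple_random_sample`, `systematic_sample`, `largest_proportion_gap`) are hypothetical helpers written for this example.

```python
import random
from collections import Counter


def simple_random_sample(population, n, seed=None):
    """Draw n members so that every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, step, seed=None):
    """Take every step-th member, starting from a randomly chosen position."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]


def proportions(categories):
    """Return the share of each category in a list of category labels."""
    counts = Counter(categories)
    total = len(categories)
    return {label: count / total for label, count in counts.items()}


def largest_proportion_gap(sample_categories, population_shares):
    """Largest absolute difference between sample and population shares."""
    sample_shares = proportions(sample_categories)
    return max(
        abs(sample_shares.get(label, 0.0) - share)
        for label, share in population_shares.items()
    )


# Hypothetical list of 500 enrolled first-year students.
students = [f"student_{i:03d}" for i in range(500)]
random_fifty = simple_random_sample(students, 50, seed=1)
every_tenth = systematic_sample(students, 10, seed=1)

# Hypothetical census shares for a borough, and a hypothetical sample of 200.
census_shares = {"female": 0.51, "male": 0.49}
sample_genders = ["female"] * 108 + ["male"] * 92
gap = largest_proportion_gap(sample_genders, census_shares)
print(f"Largest gap between sample and census shares: {gap:.2f}")
```

How small the gap needs to be before a sample can reasonably be called representative is a matter of judgement (or of a formal statistical test); the sketch only shows how the comparison might be set up.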
Not all research is concerned with representativeness. An American study (Kamiya and Loewen, 2013) had a sample of one! An experienced English-as-a-foreign-language teacher was interviewed before and after being given three research articles relating to his teaching. He read two of them but 'the articles did not appear to influence the nature of [his] stated beliefs', although 'they did succeed in raising [his] awareness'. On the basis of this the authors suggested that one way to expose experienced teachers to new pedagogical ideas 'might be to include research articles in a teacher development program' (Kamiya and Loewen, 2013, p. 11). It is staggering that a study with a single respondent should be used to make broad recommendations (albeit rather trivial and obvious ones) and even more bewildering that such a study was actually published in a journal.
1.10.4.1 Mistakes
'Mistakes' include poor design of questionnaires, including the use of 'leading questions', miscoding of responses, inconsistent observation, misunderstanding of the investigative questions by the respondents, unconscious non-verbal prompts by interviewers, inappropriate statistical analysis, or errors in the write-up of the report.
Mistakes can vary from minor to fundamental. Minor mistakes may cancel themselves out or make little impact on the overall results, thus not creating any systematic bias in the results. Fundamental mistakes may mean that the research has very limited value.
1.10.4.2 Bias
Bias occurs when a study claims to be about a particular population or area of study but the sample only represents a sub-section of the population that it claims to represent. Bias, thus, occurs when the sample is systematically skewed or distorted.
For example, a student survey that collects data by handing a questionnaire out in a lecture is not representative of all students. Those who attend lectures may not be a representative sample of all students, as perhaps the excellent students who know the subject may decline to attend lectures.
In the fashion magazine study mentioned above, if the selection of magazines is mostly teen-fashion magazines then the sample would be a biased sample of all published fashion magazines.
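The lecture example can be made concrete with a small sketch. The figures below are invented purely for illustration: a hypothetical cohort of 1,000 students in which lecture attendance is related to the thing being measured (weekly study hours). The sketch compares the true cohort average with the average obtained if only lecture attenders can be reached.

```python
from statistics import mean

# Hypothetical cohort: (attends_lectures, weekly_study_hours). Figures are invented.
cohort = (
    [(True, 12)] * 600      # regular attenders
    + [(False, 4)] * 300    # non-attenders who study little
    + [(False, 20)] * 100   # strong students who skip lectures
)

# Average over everyone the study claims to describe.
population_mean = mean(hours for _, hours in cohort)

# Average over the only people a questionnaire handed out in a lecture can reach.
lecture_frame_mean = mean(hours for attends, hours in cohort if attends)

bias = lecture_frame_mean - population_mean
print(f"All students: {population_mean:.1f} hours")
print(f"Lecture attenders only: {lecture_frame_mean:.1f} hours")
print(f"Bias from surveying only attenders: {bias:+.1f} hours")
```

In this invented case, surveying every single lecture attender would still miss the cohort average, which is why bias cannot be cured by collecting more responses from the same skewed source.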
1.10.4.3 Sampling error
Sampling error occurs whenever you take a sample. It is different from bias. Even an unbiased sample will result in sampling error.
If a population consists of 100 Labour supporters and 100 Conservative supporters and you take a sample of 20 people, it would be most likely that your sample would consist of 10 Labour and 10 Conservative. However, it could be 15 Labour and 5 Conservative, although this is less likely. At the extreme it may be all Labour and no Conservative but the chance of that is much lower.
However, taking a random sample could lead to that result. So when taking a random sample, there is some margin of error. You may slightly overestimate or slightly underestimate the characteristics of the population from which the sample is taken. The problem is that you will not know whether you are exactly replicating, underestimating or overestimating them. The variation of the sample from the population is called sampling error. For random samples, the extent of sampling error can be calculated statistically (see Section 8 for more details).
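The Labour and Conservative example can be worked through numerically. The sketch below assumes the 20 people are drawn as a simple random sample without replacement, in which case the number of Labour supporters in the sample follows a hypergeometric distribution; the function name `prob_labour_in_sample` is a hypothetical helper for this illustration.

```python
from math import comb


def prob_labour_in_sample(k, labour=100, conservative=100, sample_size=20):
    """Probability that a simple random sample drawn without replacement
    contains exactly k Labour supporters (hypergeometric distribution)."""
    total = labour + conservative
    ways_for_k = comb(labour, k) * comb(conservative, sample_size - k)
    return ways_for_k / comb(total, sample_size)


# The three outcomes mentioned in the text: an even split, 15 to 5, and all Labour.
for k in (10, 15, 20):
    print(f"P({k} Labour, {20 - k} Conservative) = {prob_labour_in_sample(k):.3g}")

# Chance of landing within two supporters of an even split (8 to 12 Labour).
near_even = sum(prob_labour_in_sample(k) for k in range(8, 13))
print(f"P(8 to 12 Labour) = {near_even:.3f}")
```

Running the sketch shows the pattern described above: the even split is the single most likely outcome, lopsided samples become progressively rarer, and an all-Labour sample is possible but extremely unlikely.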