© Lee Harvey 2012–2017

Page updated 13 January, 2017

Citation reference: Harvey, L., 2012–2017, Researching the Real World, available at
All rights belong to author.


A Guide to Methodology

8. Surveys

8.3.9 Sampling
It is not possible for social researchers to study everybody in the population due to the constraints of time and cost. Thus researchers need to select a sample to represent the population. Population in this sense does not necessarily mean everyone in a country but can refer to subgroups, for example, women, students, shop stewards and so on. Thus we can talk about the population of women in Great Britain or the population of shop stewards working for British Leyland.

Researchers may not, therefore, talk to all women but take a representative sample from the population of women. Using a representative sample allows us, in theory, to make generalisations about the population from the information gleaned from the sample.

For example, in their study of poverty, Townsend et al. (1987) took a sample of 2,700 adults from the population of London. The addresses were chosen at random from 30 wards selected from the total of 755 wards. They thus had a representative sample of households and were able to generalise about poverty in London from their findings.

Sampling is essential to make social scientific analysis possible. A small representative sample can provide ‘accurate’ data on much larger populations. Fifteen hundred people in an opinion poll can give a good indicator of how twenty million will vote at a general election.
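The claim about opinion polls can be checked with a rough calculation. The sketch below uses a standard textbook approximation rather than anything taken from the studies cited here: it assumes a simple random sample, a 95 per cent confidence level (z = 1.96) and a population very much larger than the sample. The function name and its defaults are illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from n respondents.

    Assumes simple random sampling; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(1500)
print(f"Margin of error for 1,500 respondents: about {moe:.1%}")
```

Under these assumptions a poll of 1,500 people is accurate to within roughly two and a half percentage points either way, and the size of the population does not enter the calculation at all.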

The following sections outline how to select a sample and the different kinds of samples that can be used. It is important to be aware of different sampling procedures and to understand the notion of a representative sample.

In practice, however, it is difficult and sometimes impossible to collect a truly representative sample. Selecting a representative sample can be a complex and time-consuming process and the resources to do so may not be available.

It is important to bear in mind that a non-representative sample may lead to biased results. Therefore, care needs to be taken in making generalisations about the population from which the sample was taken. However, that does not mean that research should cease if it is not possible to obtain a representative sample. Awareness of the nature of the sample is crucial.

Sampling frame
To draw a representative sample it is necessary to find a sampling frame that provides a list of everyone in the population from which the sample is to be drawn.

Which sampling frame is used depends upon the type of study being carried out. For example, the electoral roll, which contains the names and addresses of all people eligible to vote in Great Britain, would be an appropriate frame from which to select a sample for a study of voting patterns in Great Britain. However, if the study were about the political attitudes of underground train drivers in London, a much better sampling frame would be the personnel lists of London Transport.

It is important that the frame selected includes every member of the survey population because if it does not you will get sampling bias. This can lead to some groups of the population being overrepresented and others underrepresented and some not represented at all.

For example, if you wanted a sample of children under ten years old it would be no use using school registers as children too young to go to school would be excluded.

How would you compile a sampling frame to undertake the following research:
1. An analysis of church attendance in Great Britain.
2. Child-care problems of single parents with children under five years of age?

There are two broad types of sampling procedure, random and non-random.

Random samples
Random samples are designed to give everybody in the population an equal (or known) probability of being in the sample. In this way the sample will not be biased towards any particular group within the population. This is important if you want a representative sample.

To get a random sample it is necessary to have a complete and up-to-date sampling frame. Random sampling, despite its name, is therefore a systematic form of sampling and is sometimes known as probability or scientific sampling. There are several ways of drawing a random sample.

Simple random sample
Simple random sampling involves selecting individuals from a list of the population at random. It is much like drawing a lottery: all the names or numbers are placed in a container and the required sample is taken out in such a way that everyone in the population has an equal chance of being in the sample.

In practice, simple random sampling is usually done by using a computer to select names at random from a database.
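A computer-based draw of this kind might look like the following minimal sketch. The sampling frame here is a made-up list of 10,000 enrolment records, and the function name and seed are illustrative; a real study would read the frame from its own database.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for checking
    return rng.sample(frame, n)

# Hypothetical sampling frame: 10,000 numbered enrolment records.
frame = [f"student_{i:05d}" for i in range(1, 10_001)]
sample = simple_random_sample(frame, 1000, seed=42)
print(len(sample), "records selected")
```

Because the draw is made without replacement, nobody can appear in the sample twice.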

Systematic random sample
Systematic random sampling involves selecting a starting point from a list at random and then selecting every nth item from the list.

For example, if a sample of 1,000 students is needed from a university enrolment list containing 10,000 names, then a starting point between 1 and 10 would be chosen at random and then every 10th person on the list would be selected for the sample.
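The enrolment-list example can be sketched in a few lines. The list below simply holds the positions 1 to 10,000 as a stand-in for real names, and the function name is illustrative.

```python
import random

def systematic_random_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then take every k-th item."""
    k = len(frame) // n            # sampling interval: 10 in the enrolment example
    rng = random.Random(seed)
    start = rng.randrange(k)       # index 0..k-1, i.e. list positions 1..k
    return frame[start::k][:n]

enrolment = list(range(1, 10_001))  # hypothetical list positions 1..10,000
sample = systematic_random_sample(enrolment, 1000, seed=1)
print("first five positions selected:", sample[:5])
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the order of the list.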

Although everyone has an equal chance of being in the sample it is possible that this method will still generate an unrepresentative sample.

For example, if the list were an address list in house-number order, then taking every tenth house would generate only even numbers or only odd numbers, depending on the starting point. In Britain this would lead to the houses selected all being on one side of the street (for most streets). It is possible that this might generate a sample with an unrepresentative housing class.

How might you select a systematic random sample of 200 businesses listed in your local area telephone book?

Random cluster sample
In random cluster sampling the population is viewed as divided into groups (for example, school children into school classes, bank workers into bank branches) and these groups are then sampled.

For example, to get a sample of first-year secondary school children in Brighton you could select a random sample from all the first-year class registers in the area. This is a time-consuming job. Alternatively, you could select one first-year class at random from each school in the area and survey all the pupils in the class. This is a much quicker way to compile the sample.

Cluster sampling may often save time and money but it can lead to a biased sample. In the example above, some schools will be a lot larger than others, and so some schools will have five or six first-year classes while other much smaller schools might have just one or two. If one class is taken from each school, irrespective of school size, then some students will have a much greater chance of being in the sample than others. Where there is only one first-year class the students in that school will have a hundred per cent chance of being in the sample but where there are five first-year classes they will only have a twenty per cent chance.
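The unequal chances described above follow directly from the number of classes in each school. A minimal sketch of the arithmetic, assuming exactly one class is drawn at random from each school (the function name is illustrative):

```python
from fractions import Fraction

def chance_of_selection(classes_in_school):
    """Chance that a pupil's own class is the one class drawn from their school."""
    return Fraction(1, classes_in_school)

for classes in (1, 2, 5, 6):
    p = chance_of_selection(classes)
    print(f"{classes} first-year classes: {float(p):.0%} chance of being sampled")
```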

Cluster sampling might be single-stage, where everyone in the cluster is in the sample (that is, everybody in the class selected is part of the sample). Or it might be multi-stage, where further subsamples may be selected from within the cluster. For example, one may only want one in three students from the randomly selected first-year class.

Multi-stage random sample
Multi-stage random sampling is a means of selecting a sample covering a large and dispersed population without having to interview one or two people in a lot of widely scattered locations. This is important when you have limited resources. It is much less costly and much quicker to interview people in a small area than it is to interview people spread thinly over a large area. This is not a consideration if you are using a mailed or on-line questionnaire.

Multi-stage random sampling works by selecting in stages.

For example, a sample of 1,000 voters in Britain might be chosen by first selecting 20 constituencies at random from the list of all British constituencies, then choosing 2 wards at random within each constituency, and then selecting 25 voters from the electoral register of each ward. The sample will be selected at random but the location of the interviewees would be more concentrated than if a sample of 1,000 people had been selected by simple random sampling techniques from the whole electoral register for Great Britain.
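The three stages in this example can be sketched as follows. The register is invented for illustration (100 constituencies, 5 wards each, 200 voters per ward), and the function and variable names are hypothetical; only the stage sizes of 20, 2 and 25 come from the example.

```python
import random

def multi_stage_sample(frame, n_constituencies=20, n_wards=2, n_voters=25, seed=None):
    """frame maps constituency -> ward -> list of voters on the register."""
    rng = random.Random(seed)
    sample = []
    for constituency in rng.sample(sorted(frame), n_constituencies):   # stage 1
        wards = frame[constituency]
        for ward in rng.sample(sorted(wards), n_wards):                # stage 2
            sample.extend(rng.sample(wards[ward], n_voters))           # stage 3
    return sample

# Hypothetical register: 100 constituencies, 5 wards each, 200 voters per ward.
frame = {
    f"C{c}": {f"C{c}-W{w}": [f"C{c}-W{w}-V{v}" for v in range(200)] for w in range(5)}
    for c in range(100)
}
sample = multi_stage_sample(frame, seed=7)
constituencies_covered = {voter.split("-")[0] for voter in sample}
print(len(sample), "voters drawn from", len(constituencies_covered), "constituencies")
```

The interviewers would need to visit only 40 wards rather than addresses scattered across the whole register.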

In practice, multi-stage random sampling tends to bring in an element of bias as it is rare that the different probabilities of selection for individuals are taken into account. Usually, it is assumed that each member has an equal chance of being in the sample. In the above example this would not be the case if, for example, a constituency of 100,000 voters and one of 50,000 were selected at stage one. The probability of any individual from the first constituency being in the sample would be half that of a voter in the second constituency.
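The difference in probabilities can be worked through explicitly. The sketch below makes two simplifying assumptions that are not in the example: a total of 650 constituencies (an illustrative figure) and wards of equal size within a constituency, so that each voter in a selected constituency has a chance of 50 divided by the electorate of ending up among the 2 x 25 voters chosen there.

```python
from fractions import Fraction

def inclusion_probability(electorate, total_constituencies=650,
                          n_constituencies=20, voters_per_constituency=50):
    """Chance that one particular voter ends up in the multi-stage sample."""
    p_constituency = Fraction(n_constituencies, total_constituencies)
    p_voter_if_selected = Fraction(voters_per_constituency, electorate)
    return p_constituency * p_voter_if_selected

p_large = inclusion_probability(100_000)
p_small = inclusion_probability(50_000)
print("ratio of large to small constituency:", p_large / p_small)
```

Whatever total number of constituencies is assumed, the ratio between the two probabilities stays the same, because the first-stage chance is identical for both constituencies.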

Stratified random sample
Stratified random sampling involves splitting the population into different sections, or strata, prior to drawing the sample. The nature of the strata depends on what the research is trying to discover.

For example, if the research is attempting to find out what political party members think about the leader of their party, the ‘population’ of members would be split into the members of the different parties and the sample then selected at random, either as a fixed number from each party member list or proportionate to the number of members (say, 10% from each list).
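Both options, a fixed number per stratum or a proportionate share, can be sketched with one small function. The three party membership lists are invented for illustration, as are the function and parameter names.

```python
import random

def stratified_sample(strata, fraction=None, fixed=None, seed=None):
    """strata maps stratum name -> list of members; give either fraction or fixed."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        n = fixed if fixed is not None else round(len(members) * fraction)
        sample[name] = rng.sample(members, n)   # random draw within each stratum
    return sample

# Hypothetical membership lists for three parties.
strata = {
    "Party A": [f"A{i}" for i in range(3000)],
    "Party B": [f"B{i}" for i in range(2000)],
    "Party C": [f"C{i}" for i in range(500)],
}
proportionate = stratified_sample(strata, fraction=0.10, seed=3)
fixed_size = stratified_sample(strata, fixed=100, seed=3)
print({name: len(members) for name, members in proportionate.items()})
```

A proportionate sample mirrors the relative size of each party, whereas a fixed number gives the smallest party as many respondents as the largest.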

In practice, stratified random sampling can be combined with multi-stage sampling.

For example, when selecting a sample of voters, opinion pollsters rank-order British constituencies according to the percentage Conservative vote at the last general election. They thus stratify the population on the basis of political opinion at the constituency level. Then they use systematic random sampling to choose the constituencies and, finally, random sampling to select voters.

Area sampling
Area sampling attempts to construct a random sample when a sampling frame is not available. Instead of selecting names from a list, a town, suburb or street might be selected at random and the houses in that street are then selected in some way, either randomly, or more likely, for practical reasons, using systematic random sampling with a random starting point and then every nth dwelling.

The residents from the houses may comprise the sample, or there may be a further selection process based on whether or not the dwelling has residents that fit specified criteria, such as having children under the age of 18 living at home.

Non-random samples
A non-random (or non-probability) sample is any sample where it is not possible to say that all the members of the relevant population had a known probability of being selected at the outset. Convenience samples, volunteer samples, snowball samples and quota samples are all types of non-random samples.

Non-random sampling does away with the need for a sampling frame but also usually results in biased samples.

Convenience sample
Convenience sampling is a form of sampling in which anyone who is convenient becomes part of the sample. Samples of this sort might be used in the pre-pilot stage of a research project to test out some initial wording of questions but are of little use otherwise. They can lead to enormous distortions.

For example, an opinion poll survey carried out by a local newspaper in the Midlands in 1974 showed that the Conservatives had a 20% lead in the area. Days later, Labour narrowly won the election. The inaccuracy supposedly arose because the paper’s interviewer stood at the main entrance to one of the most prestigious and expensive department stores in the region and asked the opinions of people who went to shop there. This highly biased convenience sample led to a serious distortion in the results.

Volunteer sample
A volunteer sample is one that uses volunteers. They are often recruited through advertisements. Ien Ang’s (1985) study of viewers’ attitudes to the television programme Dallas was based on a volunteer sample of people who responded to her advertisement in a Dutch magazine.

Volunteer samples are thus self-selecting and usually biased as they are a subgroup of a population who are prepared to be involved in the research. Sometimes researchers have to resort to volunteer samples as there is no other way of reaching sufficient numbers of people to build up a sample.

A researcher may want to locate a sample of people who have given up smoking for the whole of the last month. It would be very time-consuming to locate potential sample members through random sampling and so advertising and inviting people to participate in the research may be the only feasible way forward. The results, however, would have to be treated with care as the sample will inevitably be biased.

Snowball sample
Snowball samples occur when the researcher makes contact with a suitable subject and is then directed to, or makes contact with, other members in a network of contacts. Participant observers often build up their sample in this way (see Sections 3.4.2 and

It is a useful technique for locating sample subjects when you are looking for a narrow range of people (such as career criminals) where no sampling frame exists. However, this does tend to mean that the research is focused on a set of interlinking networks. For participant observers who are trying to interpret or understand the nature of a particular social phenomenon without necessarily attempting to explain and generalise the results, this is an acceptable approach to broadening their investigation.

For example, for their in-depth interview study of gay-straight friendships, Koji Ueno and Haley Gentile (2014):

obtained a convenience sample at a state university in the southeastern US. Between 2010 and 2012, email invitations were sent to students enrolled in sociology courses. The invitations also asked students to forward the information to others who might be interested. To recruit more participants, we also employed a chain referral method by asking existing participants to forward the invitation email. While recruiting participants, we conducted initial data coding and evaluated the extent to which new participants confirmed, refuted, or elaborated existing themes.… Recruitment continued until we determined that the data were saturated and that additional respondents did not provide additional insights….. In all, 16 GLB and 17 straight students participated. These students were not friends with each other.

Quota sample
Non-random sampling is therefore almost always biased. One possible exception, in practice, is the use of quota sampling. This method does not give every member of the population an equal (or known) chance of being in the sample. However, it attempts representativeness by selecting respondents, from those available, in proportion to a predetermined quota that reflects data already known about the population (usually demographic data such as age, gender and so on).

Quota sampling works by dividing the population into subgroups and then giving interviewers a quota of people from each subgroup that they have to locate. For example, a quota of 40 interviewees might be made up of 20 females and 20 males, of whom 10 of each must be under forty and 10 over forty. When one quota is full (for example, the interviewer has questioned 10 women under forty), she or he must not ask any more women under forty but must continue until the remaining quotas are completed.
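The interviewer's procedure in this example can be sketched as a simple loop. The stream of passers-by is randomly generated purely for illustration, the names are hypothetical, and the sketch assumes that someone aged exactly forty counts towards the older group, which the example leaves open.

```python
import random

QUOTAS = {
    ("female", "under 40"): 10,
    ("female", "40 and over"): 10,
    ("male", "under 40"): 10,
    ("male", "40 and over"): 10,
}

def quota_cell(person):
    """Work out which quota a person counts towards (age 40 treated as older: an assumption)."""
    age_band = "under 40" if person["age"] < 40 else "40 and over"
    return (person["sex"], age_band)

def fill_quotas(passers_by, quotas):
    """Accept people in the order they are met until every quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person in passers_by:
        if not any(remaining.values()):
            break                                # all quotas complete
        cell = quota_cell(person)
        if remaining.get(cell, 0) > 0:
            accepted.append(person)
            remaining[cell] -= 1                 # one fewer to find in this cell
    return accepted

rng = random.Random(0)
passers_by = [{"sex": rng.choice(["female", "male"]), "age": rng.randint(16, 80)}
              for _ in range(500)]
interviewed = fill_quotas(passers_by, QUOTAS)
print(len(interviewed), "people interviewed")
```

Nothing in the loop is random on the interviewer's side: whoever happens to come along first and fits an open quota is accepted, which is why the method gives no known probability of selection.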

Quotas are designed, as far as possible, so that the subgroups are in the same proportion in the sample as they are in the population being investigated. Sometimes, however, quotas are decided arbitrarily without a full knowledge of the population demographic data.

The difference between quota and stratified random sampling (Section is that the interviewer is free to ask anybody, within the geographic area they have been allocated, who falls into their quotas. With stratified random sampling, although the population is divided into strata, the individuals are still selected at random by name from a sampling frame.

Design a quota for a sample of 50 students, aged over 16 from a school or college. The quota should be selected so that it provides a good cross-section of students assuming that you are doing some research to assess student attitudes to the amount of coursework they have.

For other forms of qualitative sampling including theoretical and purposive sampling see Section

In summary, random samples are more representative, but achieving a truly random sample often requires considerable resources.

Survey researchers usually try to obtain samples that are as representative as possible and are aware of their limitations. However, it is quite possible to encounter claims of representativeness that have no credible foundation.

The CASE STUDY survey, Attitudes Towards Homosexuality, stated that its sample consisted of 151 students from 3 further education colleges in Birmingham. The subjects taken by the students included printing, design, art, hairdressing, catering, hotel management, tourism, language, sociology and government. The questionnaires were handed out to the respondents by lecturers teaching selected classes and the completed forms were returned to the researchers.
Does this description provide enough information to be able to identify what sampling procedure was adopted in the survey? If so, what kind of sampling would you say was used? What else would you need to know to be able to judge the adequacy of the sampling procedure used?
