8.1 Introduction to surveys
8.2 Methodological approaches
8.3 Doing survey research
8.4 Statistical Analysis
8.4.1 Descriptive statistics
8.4.2 Exploring relationships
8.4.3 Analysing samples
8.4.3.1 Generalising from samples
8.4.3.2 Dealing with sampling error
8.4.3.3 Confidence limits
8.4.3.4 Statistical significance
8.4.3.5 Hypothesis testing
8.4.3.6 Significance tests
8.4.3.6.1 Parametric tests of significance
8.4.3.6.1.1 z tests of means and proportions
8.4.3.6.1.2 t tests of independent means
8.4.3.6.1.3 F test of variances
8.4.3.6.1.4 Parametric tests of differences in means of matched pairs
8.4.3.6.1.5 Analysis of variance
8.4.3.6.1.5.1 One-way analysis of variance
8.4.3.6.1.5.2 Two-way analysis of variance
8.4.3.6.2 Non-parametric tests of significance
8.4.3.7 Summary of significance testing and association: an example
8.4.4 Report writing
8.5 Summary and conclusion
Introduction
The significance tests considered so far (z, t and F) have been restricted to comparisons of one or two samples. When more than two samples are to be compared, another approach is required.
When to use analysis of variance
Analysis of variance (ANOVA) is used to compare more than two samples of interval scale data. It can be extended to compare any number of samples across many different variables at once and, as such, is a more complex test than the one- or two-sample tests.
For example, comparing the scores of students on a scale of conservatism in four different disciplines (Arts, Science, Social Science and Humanities) could be done by comparing each pair using a t test, but that would involve six separate tests. Analysis of variance compares all four at once.
Using a technique known as one-way analysis of variance (one-way ANOVA) it is possible to compare three or more independent samples (of interval scale data).
Section 8.4.3.6.1.4 explained how the t test may be adapted to two related samples. Two-way analysis of variance (two-way ANOVA) compares three or more related samples.
Analysis of variance could be used for two samples but is not as efficient as either the z or t tests and is therefore used only in the multi-sample case.
Besides one- and two-way ANOVA, it is also possible to carry out three-, four-, indeed n-way ANOVA. Three-way ANOVA compares samples with not one but two related variables; n-way ANOVA thus compares samples with n-1 related variables. This section considers only the simple cases of one- and two-way ANOVA.
Note that the calculations for analysis of variance can be extensive, as the following sections demonstrate. You would normally use a computer program when undertaking analysis of variance. This section intends to explain what analysis of variance does and what is involved, thus aiding understanding of the process and its limitations.
8.4.3.6.1.5.1 One-way analysis of variance
Hypotheses
What we need to know
Sampling distribution
Assumptions
Testing Statistic
Critical values
Worked examples
Unworked examples
Hypotheses
H0: µ1=µ2=µ3...=µj
where µ is the population mean and j is the number of samples
The alternative hypothesis is that the means are not all equal.
HA: µ1≠µ2≠µ3...≠µj
What we need to know
Sample mean for each sample: SM1, SM2, ... SMj
Variance for each sample: var1, var2, ... varj
Size of each sample: n1, n2, ... nj
Sampling distribution
The sampling distribution employed in analysis of variance is the F distribution.
The null hypothesis assumes that the means of the samples being compared are all the same: that is, each sample is from the same distribution. If the null hypothesis (H0) is true, then there is a single population, and therefore a single population variance, consequently each sample should provide an estimate of the common population variance.
The procedure is to compute two different estimates of the population variance (var=sigma squared). The first estimate is the unbiased estimate of the population variance derived from the sample variances.
The estimate of the population variance derived from the sample variances is denoted varS.
Hence varS = (varm1 + varm2 + ... + varmj)/j = ∑varm/j
where varm is the modified variance of a sample and j is the number of samples.
The second estimate of the population variance is derived from the equality:
Variance of the sample means (varSM) = population variance divided by sample size = var/n
So:
varSM = var population/n
or
n(varSM) = var population
varSM is estimated by calculating the deviations of the sample means from the mean of the sample means (SMSM).
SMSM is found by dividing the sum of the sample means by the number of samples (j)
SMSM = ∑SM/j
Hence the estimate of the variance of the means (varSM) is given by
varSM = ∑(SM - SMSM)2/(j-1)
Note that the sum of squared deviations is divided by j-1 as this gives an unbiased estimate of varSM from the sample data.
So, as population variance = n(varSM),
the unbiased estimate of the population variance derived from the variance of the means, denoted varestSM, is
varestSM = n(∑(SM - SMSM)2/(j-1))
Consequently, there are two estimates of the population variance.
The first (varS) is found by averaging the variances within the samples.
The second (varestSM) is derived from the variance between the samples.
The first estimate is derived irrespective of the criteria dividing the samples; the second depends upon the differences between the samples.
If the data being analysed is from the same population and has been arbitrarily assigned into separate samples, then the two estimates of variance should be approximately the same.
If the samples come from homoscedastic populations with different means, then the estimate varS will still provide an unbiased estimate of the population variance, but the variation between the sample means (reflected in varSM) will lead to an estimate that is significantly larger than varS.
That is, as the sample means diverge, the variance of the means will increase until the difference between the two estimates can no longer be accounted for by sampling variation.
varestSM will only provide an unbiased estimate of the population variance as long as the sample means do not differ significantly. When varestSM is significantly larger than varS, the null hypothesis of equal population means is rejected.
Assumptions
In the section above it was argued that the null hypothesis assumes that the samples are from the same population, thereby implying that there is a single population variance. This implication is only valid on the assumption that all samples come from populations with identical variances. In addition, when using the F distribution, it is assumed that the populations are approximately normally distributed. Hence a one-way analysis of variance has the same assumptions as the two-sample t test, namely that all samples are from normal populations with identical variances (the condition of homoscedasticity).
Testing statistic
One-way ANOVA tests two variances for significant difference using the F distribution.
The F testing statistic is denoted Fd1,d2, with degrees of freedom d1 for the larger variance and d2 for the smaller. (See NOTE on degrees of freedom)
Fd1,d2 =varestSM / varS
Where
varestSM = n(∑(SM - SMSM)2/(j-1))
and
SMSM = ∑SM/j
varS = ∑varm/j
and
varm = ∑(X - SM)2/(n-1)
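To make the formulas concrete, here is a minimal pure-Python sketch (the function name and data are hypothetical; note that the statistics module's variance() already uses the n-1 divisor, i.e. the modified variance):

```python
from statistics import mean, variance  # variance() uses the n-1 divisor (the "modified" variance)

def one_way_f(samples):
    """One-way ANOVA F statistic for equal-sized samples, per the formulas above."""
    j = len(samples)                      # number of samples
    n = len(samples[0])                   # size of each sample
    sm = [mean(s) for s in samples]       # sample means SM
    smsm = mean(sm)                       # mean of the sample means SMSM
    # Between-samples estimate: varestSM = n * sum((SM - SMSM)^2) / (j - 1)
    var_est_sm = n * sum((m - smsm) ** 2 for m in sm) / (j - 1)
    # Within-samples estimate: varS = sum(varm) / j
    var_s = sum(variance(s) for s in samples) / j
    return var_est_sm / var_s             # F with d1 = j-1 and d2 = j(n-1) degrees of freedom

# Hypothetical data: three samples of five values each
f = one_way_f([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]])
print(round(f, 2))  # prints 2.0, to be compared with critical F for 2 and 12 degrees of freedom
```

With these data F = 2.0 on 2 and 12 degrees of freedom, below the tabulated critical value, so the null hypothesis of equal means would stand.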
Critical values
The critical value of F is found from Tables of F critical values for the following degrees of freedom.
varestSM has d1 degrees of freedom = j-1
varS has d2 degrees of freedom = j(n-1)
where j is number of samples and n is sample size.
H0 is rejected if the calculated value of F is greater than critical F.
Examples of F distribution tables are as follows:
University of Baltimore (accessed 8 October 2020)
University of Texas at Austin, Department of Mathematics (accessed 8 October 2020)
Note that some small degree of variation can be found between tables of F critical values.
Worked example
The table below shows the mean and variance of the number of cars owned by 20 households selected at random from six regions of England and Wales. Is there a significant difference in number of cars owned between regions?
Region | Sample Mean | Sample Variance | Modified variance
South East | 1.5 | 1.05 | 1.1052
Wales & SW | 1.3 | 0.71 | 0.7474
Midlands | 1.4 | 0.84 | 0.8842
East Anglia | 1.6 | 0.84 | 0.8842
North West | 1.2 | 1.06 | 1.1158
North East | 1.4 | 1.23 | 1.2947
∑ | 8.4 | |
Note that this is a one-way ANOVA because there is one independent variable (the region) and one dependent variable (the number of cars). As there are six samples, ANOVA is used (otherwise this would involve 15 separate t tests of independent samples).
n=20 for all regions
Modified variance = (sample variance) x n/(n-1)
Hypotheses:
H0: µ1=µ2=µ3...=µj
HA: µ1≠µ2≠µ3...≠µj
Confidence level: 95%
Testing statistic: Fd1,d2 =varestSM / varS
Critical value:
Fd1,d2 = F5,114 = 2.3
d1 = 6-1 = 5
d2 = 6(20-1) = 6x19 = 114
Decision rule: reject H0 if calculated F > 2.3
Computations:
n= 20
SMSM = ∑SM/j = (1.5 + 1.3 + 1.4 + 1.6 + 1.2 + 1.4)/6
SMSM = 8.4/6 = 1.4
∑(SM - SMSM)2 = (1.5 -1.4)2 + (1.3 -1.4)2 + (1.4 -1.4)2 + (1.6 -1.4)2 + (1.2 -1.4)2 + (1.4 -1.4)2
=(0.1)2 + (-0.1)2 + (0)2 + (0.2)2 + (-0.2)2 + (0)2
=0.01 + 0.01 + 0 + 0.04 + 0.04 + 0
So ∑(SM - SMSM)2 = 0.1
varestSM= n(∑(SM - SMSM)2 /j-1)
varestSM= 20(0.1)/(6-1) = 2/5 = 0.4
Compute the modified sample variance by multiplying the sample variance by n/(n-1), i.e. 20/19; see column 3 of the table above.
varS = ∑varm/j
= (1.1052+0.7474+0.8842+0.8842+1.1158+1.2947)/6
= 6.0315/6 = 1.0053
Thus Fd1,d2 = F5,114 = varestSM/varS = 0.4/1.0053 = 0.4 (to one decimal place)
Decision: H0 cannot be rejected as the calculated F is less than the critical value. There is no significant difference in car ownership between regions.
Computing the F statistic by hand can be time consuming and there are various programs that will compute it for you. The key thing is to understand the result: what it is that you have actually calculated and when to use which test.
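As a check on the arithmetic, the worked example above can be reproduced from the summary statistics in the table in a few lines of Python (a sketch; variable names are illustrative):

```python
from statistics import mean

# Sample means and (unmodified) sample variances from the table, n = 20 per region
sample_means = [1.5, 1.3, 1.4, 1.6, 1.2, 1.4]
sample_vars = [1.05, 0.71, 0.84, 0.84, 1.06, 1.23]
n, j = 20, len(sample_means)

smsm = mean(sample_means)                                     # mean of the sample means = 1.4
var_est_sm = n * sum((m - smsm) ** 2 for m in sample_means) / (j - 1)
var_s = mean(v * n / (n - 1) for v in sample_vars)            # average modified variance
f = var_est_sm / var_s
print(round(f, 2))  # prints 0.4, well below the critical F(5,114) of 2.3, so H0 stands
```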
8.4.3.6.1.5.2 Two-way analysis of variance
Hypotheses
What we need to know
Sampling distribution
Assumptions
Testing Statistic
Critical values
Worked examples
Unworked examples
Two-way ANOVA compares the mean differences between groups that have been split on two independent variables (called factors). The primary purpose of a two-way ANOVA is to see to what extent the two independent variables have an effect on the dependent variable.
Hypotheses
Main hypothesis: H0: µ1=µ2=µ3...=µj
i.e. the means of the samples are the same
where µ is the population mean and j is the number of samples
The alternative hypothesis is that the means are not all equal.
HA: µ1≠µ2≠µ3...≠µj
Supplementary hypothesis: H0: µ1R=µ2R=µ3R...=µjR
i.e. the means of the related categories are the same
where µR is the mean of the related category
What we need to know
To carry out a two-way analysis of variance it is necessary to know all values in the samples, clearly broken down into their related categories.
(Strictly speaking, it would be possible to carry out the analysis provided all the sums of squares were known, but it is unreasonable to expect sample data to be summarised in such a fashion.)
Sampling distribution
The sampling distribution employed in analysis of variance is the F distribution (as in one-way analysis of variance).
The F distribution is used to compare two different estimates of the common population variance. The estimates are derived differently from those of one-way analysis of variance, as the variation due to the related factor is isolated.
It is a property of variance that total variance is the sum of the component variances. The variation due to different factors will add up to the total variation of a distribution. When data is analysed using two-way analysis of variance, the total variance consists of three components:
the variance between the samples,
the variance between the related categories, and
the variation due to the sampling procedure (known as experimental error).
The estimate of the population variance derived from the variance between samples is denoted varestSM and is the same as in one-way ANOVA, see 8.4.3.6.1.5.1.
The estimate derived from the variance between related categories is denoted varR
The estimate derived from experimental error is varE
Population variance may also be estimated from the combined sample data.
As in 8.4.3.6.1.5.1, varm is an unbiased estimate of the population variance, here derived by combining samples that are assumed to be from the same population.
varm is derived by calculating the total variation of all the X's from the mean of the combined sample, thus:
varm = ∑(X-SMSM)2/((∑n)-1)
where
SMSM is the mean of the sample means
n is the sample size
∑n is the sum of the sample sizes.
It was stated in 8.4.3.6.1.5.1 that
varestSM = n(∑(SM - SMSM)2/(j-1))
Similarly varR is the deviation of the mean of each related category from SMsm, thus:
varR = j((∑(SMR - SMSM)2)/(n-1))
where
SMR is the mean of a related category (as opposed to SM which is the mean of a sample).
As the population variance, estimated by varm, is the sum of variances between samples, between related categories and the variance due to sampling procedure (experimental error), if two of the three estimates of total variance are known, then the third may be derived by algebraic manipulation based upon the summation property of variance.
The estimate of the experimental error, then, is
varE = varm - varestSM - varR
(i.e variance due to experimental error = total variance minus variance between samples minus variance between related categories)
varE =
(∑(X-SMSM)2/((∑n)-1)) - (n(∑(SM - SMSM)2/(j-1))) - (j((∑(SMR - SMSM)2)/(n-1)))
This expression is not as complex as it looks, as it only involves calculating deviations from SMSM (a) for all the X values, (b) for the sample means, and (c) for the means of the related categories. See the worked example below.
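The additive property of variance that this derivation relies on can be checked numerically. The sketch below takes a small illustrative grid (the first three population categories of the worked example that follows), computes the total, between-samples and between-categories sums of squared deviations from SMSM, and confirms that the remainder equals the experimental error computed directly from the cell residuals:

```python
from statistics import mean

# Rows = related categories, columns = samples
# (the first three population categories of the worked example below)
grid = [[2, 2, 2], [4, 2, 3], [4, 4, 4]]
n, j = len(grid), len(grid[0])                # n related categories, j samples

sm = [mean(col) for col in zip(*grid)]        # sample means SM
smr = [mean(row) for row in grid]             # related-category means SMR
smsm = mean(x for row in grid for x in row)   # grand mean SMSM

ss_total = sum((x - smsm) ** 2 for row in grid for x in row)
ss_between = n * sum((m - smsm) ** 2 for m in sm)     # between samples
ss_related = j * sum((m - smsm) ** 2 for m in smr)    # between related categories
ss_error = ss_total - ss_between - ss_related         # experimental error by subtraction

# Direct computation of the experimental error from the cell residuals
ss_error_direct = sum((grid[r][c] - smr[r] - sm[c] + smsm) ** 2
                      for r in range(n) for c in range(j))
print(round(ss_error, 4), round(ss_error_direct, 4))  # the two agree
```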
The null hypothesis states that the samples are from the same population and thus the three estimates of the population variance should all be equal if the sample variables had been assigned at random from a single population. Given that the samples were first broken down into related categories, the estimate of population variance (sigma squared) derived from the variation between samples should be the same as that derived from the experimental error. Once the effect of the related factor has been isolated then the variation between samples should not differ significantly from the sampling variation if, in fact, all samples come from identical populations.
Therefore, having isolated the effect of the related category, if the ratio
Fd1,d2 = varestSM/varE
is significant, then the null hypothesis is rejected and the sample means are significantly different.
Corollary. Having isolated an estimate of varR derived from the related categories, it would be possible to see if the means of the related categories are significantly different. (This analysis may not necessarily be required: it may be self-evident that the related category means are different, i.e. that the related categorisation is a factor causing variation, but this may not always be the case.)
If the ratio
Fd1,d2 =varR/varE
is significant then the supplementary hypothesis
H0: µ1R=µ2R=µ3R...=µjR
is rejected and the means of the related categories are not identical.
Assumptions
The same assumption of homoscedasticity as for one-way ANOVA holds (see 8.4.3.6.1.5.1). In addition, the additive property of variance only holds if the component variances act independently. Hence the assumption that the different causes of variation are independent.
Testing statistic
The testing statistic for the basic hypothesis, H0: µ1=µ2=µ3...=µj is
Fd1,d2 = varestSM/varE
which simplifies to:
(n(n-1)(∑(SM-SMSM)2))/(∑(X-SMSM)2-n∑(SM-SMSM)2-j(∑(SMR - SMSM)2))
The testing statistic for the supplementary hypothesis, H0: µ1R=µ2R=µ3R...=µjR is
Fd1,d2 =varR/varE
which simplifies to
(j(j-1)(∑(SMR-SMSM)2))/(∑(X-SMSM)2-n∑(SM-SMSM)2-j(∑(SMR-SMSM)2))
Critical values
The critical value of F is found from Tables of F critical values for the following degrees of freedom.
varestSM has d1 degrees of freedom = j-1
varE has d2 degrees of freedom = (n-1)(j-1)
varR has d3 degrees of freedom = (n-1)
where j is number of samples and n is sample size.
H0 is rejected if the calculated value of F is greater than critical F.
Worked example
A survey of England divided the country into three areas: South, Midlands and North. In each area a community was chosen at random within each of the population categories shown in Table 8.4.3.6.1.5.2.1. Each selected community was inspected and the number of public houses (including free access licensed hotels) was noted; the results are summarised in the table. Is there a significant difference in the average number of public houses between areas?
Table 8.4.3.6.1.5.2.1 Number of public houses by population size and region
Population | South | Midlands | North
below 250 | 2 | 2 | 2
250 - 499 | 4 | 2 | 3
500 - 999 | 4 | 4 | 4
1000 - 1999 | 5 | 3 | 7
2000 - 2999 | 3 | 4 | 2
3000 - 4999 | 5 | 8 | 8
5000 - 9999 | 7 | 6 | 8
10000 - 24999 | 14 | 15 | 13
25000 - 49000 | 20 | 16 | 24
50000 - 100000 | 26 | 20 | 29
Note that this is a two-way ANOVA because there are two independent variables (the region and the population group) and one dependent variable, the number of public houses. As there are more than two samples, two-way ANOVA is used (rather than t tests of two related samples).
Hypotheses:
H0: µ1=µ2=µ3...=µj
HA: µ1≠µ2≠µ3...≠µj
Confidence level: 95%
Testing statistic: Fd1,d2 =varestSM / varE
(estimated variance between samples divided by estimate derived from experimental error)
Critical value:
Fd1,d2 = F2,18 = 3.55
d1 = (j-1) = 3-1 = 2 where j is the number of samples
d2= (n-1)(j-1) = (9)(2) = 18
where n is number in each sample
Decision rule: reject H0 if calculated F >3.55
Computations:
Rewrite Table 8.4.3.6.1.5.2.1 and calculate deviations from SMSM (SMSM is the mean of the sample means)
Population | South | Mid | North | ∑XR | SMR | SMR-SMSM (i.e. SMR-9) | (SMR-SMSM)2
below 250 | 2 | 2 | 2 | 6 | 2 | -7 | 49
250 - 499 | 4 | 2 | 3 | 9 | 3 | -6 | 36
500 - 999 | 4 | 4 | 4 | 12 | 4 | -5 | 25
1000 - 1999 | 5 | 3 | 7 | 15 | 5 | -4 | 16
2000 - 2999 | 3 | 4 | 2 | 9 | 3 | -6 | 36
3000 - 4999 | 5 | 8 | 8 | 21 | 7 | -2 | 4
5000 - 9999 | 7 | 6 | 8 | 21 | 7 | -2 | 4
10000 - 24999 | 14 | 15 | 13 | 42 | 14 | 5 | 25
25000 - 49000 | 20 | 16 | 24 | 60 | 20 | 11 | 121
50000 - 100000 | 26 | 20 | 29 | 75 | 25 | 16 | 256
∑ | 90 | 80 | 100 | | | | 572
Mean | 9 | 8 | 10 | | | |
SM-SMSM | 0 | -1 | 1 | | | |
(SM-SMSM)2 | 0 | 1 | 1 | | | |
∑(SM-SMSM)2 = 0+1+1 = 2
SMSM = ∑X/(nj) = (90+80+100)/(10x3) = 270/30 = 9
This is the same as dividing the sum of the separate sample means (SM) by the number of samples (j): i.e. (SMsouth+SMmidlands+SMnorth)/3 = (9+8+10)/3 = 27/3 = 9
∑(X-SMSM)2 = ∑(X-9)2, which means finding the difference of each of the 30 values of X from SMSM, squaring the differences and then summing the squares. To do this, rewrite Table 8.4.3.6.1.5.2.1 as a frequency table combining the three samples and calculate deviations from SMSM (the mean of the sample means, which is 9 in this case):
X | f | d | fd | fd2
2 | 5 | -7 | -35 | 245
3 | 3 | -6 | -18 | 108
4 | 5 | -5 | -25 | 125
5 | 2 | -4 | -8 | 32
6 | 1 | -3 | -3 | 9
7 | 2 | -2 | -4 | 8
8 | 3 | -1 | -3 | 3
13 | 1 | 4 | 4 | 16
14 | 1 | 5 | 5 | 25
15 | 1 | 6 | 6 | 36
16 | 1 | 7 | 7 | 49
20 | 2 | 11 | 22 | 242
24 | 1 | 15 | 15 | 225
26 | 1 | 17 | 17 | 289
29 | 1 | 20 | 20 | 400
∑ | 30 | | 0 | 1812
∑(X-SMSM)2 = ∑fd2 - ((∑fd)2/∑f) = 1812 - (02/30) = 1812
If SMSM is not a whole number it is easier to use the assumed mean method of calculation if computing by hand. Otherwise use a calculator or a computer program.
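The assumed mean method can be sketched in a few lines of Python (the frequency table here is hypothetical): pick any convenient assumed mean A, work with the deviations d = X - A, and correct the sum of squares by (∑fd)2/∑f.

```python
# Hypothetical frequency table: value -> frequency
freq = {2: 3, 5: 4, 9: 3}
A = 5                                        # assumed mean (any convenient value)

n = sum(freq.values())
sum_fd = sum(f * (x - A) for x, f in freq.items())
sum_fd2 = sum(f * (x - A) ** 2 for x, f in freq.items())

# Corrected sum of squared deviations about the true mean
ss = sum_fd2 - sum_fd ** 2 / n

# Check against the direct computation from the true mean
true_mean = sum(f * x for x, f in freq.items()) / n
ss_direct = sum(f * (x - true_mean) ** 2 for x, f in freq.items())
print(round(ss, 10), round(ss_direct, 10))  # both 74.1
```

The correction term vanishes when A happens to equal the true mean, as in the worked example above where A = SMSM = 9 and ∑fd = 0.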
Therefore
Fd1,d2 = (n(n-1)(∑(SM-SMSM)2)) / ((∑(X-SMSM)2) - (n∑(SM-SMSM)2) - (j(∑(SMR-SMSM)2)))
Fd1,d2 = 10(9)(2)/((1812) - 10(2) - 3(572))
Fd1,d2 = 180 /(1812 - 20 -1716)
Fd1,d2 = 180/76 = 2.37
Decision: H0 cannot be rejected as the calculated F is less than the critical value (of 3.55). The three means are not significantly different: there is no significant difference in the number of public houses between regions.
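The whole two-way calculation can be verified with a short pure-Python sketch working directly from Table 8.4.3.6.1.5.2.1 (variable names are illustrative):

```python
from statistics import mean

# Columns: South, Midlands, North; rows: the ten population categories
grid = [[2, 2, 2], [4, 2, 3], [4, 4, 4], [5, 3, 7], [3, 4, 2],
        [5, 8, 8], [7, 6, 8], [14, 15, 13], [20, 16, 24], [26, 20, 29]]
n, j = len(grid), len(grid[0])               # n = 10 categories, j = 3 samples

sm = [mean(col) for col in zip(*grid)]       # sample means: 9, 8, 10
smr = [mean(row) for row in grid]            # related-category means
smsm = mean(sm)                              # grand mean = 9

ss_total = sum((x - smsm) ** 2 for row in grid for x in row)   # 1812
ss_between = n * sum((m - smsm) ** 2 for m in sm)              # 20
ss_related = j * sum((m - smsm) ** 2 for m in smr)             # 1716
ss_error = ss_total - ss_between - ss_related                  # 76

f_main = (ss_between / (j - 1)) / (ss_error / ((n - 1) * (j - 1)))
f_supp = (ss_related / (n - 1)) / (ss_error / ((n - 1) * (j - 1)))
print(round(f_main, 2), round(f_supp, 2))   # 2.37 and 45.16
```

The supplementary ratio varR/varE comes out at about 45, far above any conventional critical value of F for 9 and 18 degrees of freedom, confirming the fairly self-evident point that the number of public houses rises with population size.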
Unworked examples
1. A group of 60 students were lectured together but split into five groups (I - V) for seminars. The examination results at the end of the year are shown in the table below. Is there a significant difference in marks between the seminar groups?
I | II | III | IV | V
23 | 26 | 23 | 20 | 22
24 | 40 | 32 | 24 | 27
35 | 40 | 43 | 27 | 28
40 | 42 | 49 | 39 | 29
45 | 47 | 49 | 44 | 33
48 | 48 | 50 | 46 | 41
49 | 50 | 57 | 50 | 42
50 | 56 | 58 | 53 | 44
53 | 74 | 59 | 57 | 53
55 | 77 | 80 | 60 | 55
58 | | | | 60
59 | | | | 66
60 | | | | 70
62 | | | | 82
74 | | | | 83
Answer available at ANOVAQ1Answer, computed using the One-Way ANOVA Calculator at Social Statistics.com (accessed 13 January 2020).
2. Investigating the effectiveness of four different fertilisers (a, b, c and d), 8 areas were selected (I to VIII) and each was divided into four plots of equal size. Each of the different fertilisers was applied to one of the plots in each area at random. Wheat was sown in each plot and the yields in bushels are shown in the table below. Is there a significant difference in the yield for the four fertilisers?
Fertiliser
Area | a | b | c | d
I | 17 | 10 | 14 | 9
II | 13 | 12 | 8 | 13
III | 13 | 14 | 10 | 16
IV | 15 | 13 | 12 | 10
V | 10 | 11 | 11 | 8
VI | 12 | 12 | 14 | 13
VII | 15 | 16 | 15 | 18
VIII | 20 | 17 | 18 | 13
Note that this is a two-way ANOVA because there are two independent variables (the area and the fertiliser) and one dependent variable, the yield of wheat in bushels.
3. A first-year biology course at a university wanted to see if students' entry grades affected their results at the end of year one. The table below shows the test scores (out of 20) for 60 students (30 female and 30 male) divided into two groups, high entry grade and low entry grade. Is there a significant difference between the high- and low-entry-grade groups, or between the genders?
Male high entry | Male low entry | Female high entry | Female low entry
8 | 7 | 8 | 9
8 | 8 | 8 | 9
10 | 9 | 8 | 10
11 | 11 | 11 | 11
11 | 12 | 11 | 11
12 | 12 | 12 | 11
13 | 13 | 13 | 13
14 | 13 | 14 | 14
14 | 14 | 14 | 14
14 | 15 | 15 | 14
14 | 15 | 16 | 15
15 | 15 | 16 | 16
15 | 16 | 16 | 17
15 | 17 | 16 | 18
16 | 18 | 17 | 19