8.1 Introduction to surveys
8.2 Methodological approaches
8.3 Doing survey research
8.4 Statistical Analysis
8.4.1 Descriptive statistics
8.4.2 Exploring relationships
8.4.3 Analysing samples
8.4.3.1 Generalising from samples
8.4.3.2 Dealing with sampling error
8.4.3.3 Confidence limits
8.4.3.4 Statistical significance
8.4.3.5 Hypothesis testing
8.4.3.6 Significance tests
8.4.3.6.1 Parametric tests of significance
8.4.3.6.1.1 z tests of means and proportions
8.4.3.6.1.1.1 One sample z test of means
8.4.3.6.1.1.2 Two sample z test of means
8.4.3.6.1.1.3 One sample z test of proportions
8.4.3.6.1.1.4 Two sample z test of proportions
8.4.3.6.1.2 t tests of means
8.4.3.6.1.3 F test of variances
8.4.3.6.1.4 Parametric tests of differences in means of matched pairs
8.4.3.6.1.5 Analysis of variance
8.4.3.6.2 Non-parametric tests of significance
8.4.3.7 Summary of significance testing and association: an example
8.4.4 Report writing
8.5 Summary and conclusion
When to use the z Test
z tests are applicable in general when the sample size exceeds thirty. The z test is a test of population means or proportions.
It tests either a hypothetical value against an actual value of a population mean or proportion, or tests two population means or proportions for significant differences.
There are, therefore, four cases of the z test. Each case will be considered separately below.
Unworked examples (for all four cases).
When to use the one sample z test of means
The one sample z test of means is used to compare a sample mean with a claimed or hypothesised population mean. (This may be a particular assertion made about the population or comparison with past population statistics.)
The claim is made about the population and the sample mean is an estimate of the population mean.
The test considers whether any difference between the claimed population mean and the sample mean is indicative of any difference between the claimed population mean and the true population mean, or whether such a difference may be accounted for by sampling error.
Hypotheses
H0: µ0=µ
i.e. a null hypothesis that the claimed value of the population mean is equal to the true population mean. The alternative hypothesis is that it is not, which can be in one of three ways (not equal, greater or smaller):
HA: µ0≠µ not equal; two-tail test
HA: µ>µ0 true mean greater than the claimed value; upper-tail test
HA: µ<µ0 true mean less than the claimed value; lower-tail test
What we need to know
Claimed value of population mean: µ0
Sample mean (the conventional notation for the sample mean is an upper case X with a line across, called X-bar. However, it does not work well on the Internet and so in equations the full words "sample mean" will be used or, where there is no ambiguity, abbreviated to "SM").
Sample (or population) standard deviation: s (or σ)
Sample size: n
Sampling distribution
The sampling distribution is the distribution of sample means. The mean of the distribution of sample means is equal to the mean of the population from which the samples are drawn.
The standard deviation of the distribution of sample means is called the standard error. (The conventional notation for the standard error is σ with X-bar as a subscript. Again, this does not work well on the Internet and so the full words "standard error" will be used or, where there is no ambiguity, abbreviated to "SE".)
The standard error equals the population standard deviation (σ) divided by the square root of the sample size, i.e. σ/√n.
But it is unlikely that the population standard deviation is known and so it has to be estimated on the basis of the sample.
So the estimated standard error (est SE) is the sample standard deviation divided by the square root of the sample size minus 1:
est SE = s/√(n-1)
See Case Study: Illustration that the sample variance is a biased estimate of the population variance for an explanation of why the sample size is reduced by 1 in estimating the standard error.
For an explanation of the concept of the distribution of sample means see Section 8.3.13
See NOTE on degrees of freedom
Assumptions
The only assumption made is that the distribution of sample means closely approximates a normal distribution when the sample size exceeds thirty (i.e. n > 30).
Testing statistic
The formula for the testing statistic, z, is the difference between the sample mean and the hypothesised population mean divided by the standard error:
z = (SM - µ0)/SE
The resultant value of z is compared to critical values to determine whether there is any significant difference.
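The two formulas above (est SE = s/√(n-1) and z = (SM - µ0)/SE) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the figures passed to them are hypothetical, and it assumes that s was calculated with n as the divisor, as in the worked example in this section.

```python
import math


def estimated_standard_error(s: float, n: int) -> float:
    """Estimated standard error of the mean: s / sqrt(n - 1).

    Assumption: s is the sample standard deviation calculated with
    divisor n, so the sample size is reduced by 1 here.
    """
    return s / math.sqrt(n - 1)


def one_sample_z(sample_mean: float, mu0: float, se: float) -> float:
    """Testing statistic z = (SM - mu0) / SE."""
    return (sample_mean - mu0) / se


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
se = estimated_standard_error(s=12.0, n=65)            # 12 / sqrt(64) = 1.5
z = one_sample_z(sample_mean=103.0, mu0=100.0, se=se)  # 3 / 1.5 = 2.0
print(se, z)
```

If the population standard deviation σ is known, the standard error would instead be σ/√n and only `one_sample_z` would be needed.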
Critical values
The critical values for z are derived from tables of areas under the normal curve: z is the number of standard units from the mean that contain the required area as defined by the confidence interval.
See Section 8.3.13.13 for some typical critical values for the z test.
The decision rule is that the null hypothesis is rejected if the calculated value is:
greater than the critical +z value for an upper-tail test;
less than the critical -z value for a lower-tail test;
greater than +z or less than -z for a two-tail test. (See Section 8.3.13.13).
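As an alternative to printed tables, the critical values and the decision rule can be sketched with Python's standard library. The function names below are hypothetical, and the sketch assumes the standard normal distribution is appropriate (i.e. the sample is large enough). Note that the computed one-tail value at 5% is about 1.645, which this section rounds to 1.64.

```python
from statistics import NormalDist


def critical_z(alpha: float, tail: str) -> float:
    """Positive critical z for significance level alpha.

    tail must be "upper", "lower" or "two".
    """
    if tail == "two":
        # alpha is split equally between the two tails
        return NormalDist().inv_cdf(1 - alpha / 2)
    if tail in ("upper", "lower"):
        return NormalDist().inv_cdf(1 - alpha)
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


def reject_h0(z: float, alpha: float, tail: str) -> bool:
    """Apply the decision rule to a calculated z value."""
    crit = critical_z(alpha, tail)
    if tail == "upper":
        return z > crit
    if tail == "lower":
        return z < -crit
    return z > crit or z < -crit


print(round(critical_z(0.05, "upper"), 3))  # one-tail critical value at 5%
print(round(critical_z(0.05, "two"), 3))    # two-tail critical value at 5%
print(reject_h0(2.5, 0.05, "two"))          # hypothetical calculated z
```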
Worked example
The average monthly disposable income for residents of a large conurbation, excluding the Northern sector, was found to be £2,200. It was claimed that the residents in the Northern sector had a higher income on average than the conurbation average. The disposable income of a sample of 50 residents in the Northern sector was as follows (income brackets and frequency):
Income £ | f (frequency) | x (mid-point) | fx | fx² |
1000-1500 | 2 | 1250 | 2500 | 3125000 |
1501-2000 | 12 | 1750 | 21000 | 36750000 |
2001-2500 | 20 | 2250 | 45000 | 101250000 |
2501-3000 | 8 | 2750 | 22000 | 60500000 |
3001-3500 | 1 | 3250 | 3250 | 10562500 |
3501-4000 | 4 | 3750 | 15000 | 56250000 |
4001-5000 | 1 | 4500 | 4500 | 20250000 |
5000-10000 | 1 | 7500 | 7500 | 56250000 |
10000-20000 | 1 | 15000 | 15000 | 225000000 |
Totals | 50 | | 135750 | 569937500 |
Hypotheses:
H0: µ0=µ
HA: µ>µ0
One-tail test to see if the Northern sector had more disposable income than the rest of the conurbation.
Significance level: 5%
Testing statistic: z = (SM - µ0)/SE
Critical value: z = 1.64 (one-tail test at 5%)
Decision rule: reject H0 if z>1.64
Computation:
µ0 = £2200
Using the mid-point of each income bracket as indicative of the income within each bracket,
Sample Mean (SM) =135750/50 = £2715
Sample Standard Deviation (s) = √((569937500/50) - (2715 x 2715))
= √(11398750 - 7371225) = √4027525 = £2006.869
Est Standard Error (est SE) = 2006.869/√(50-1) = £286.696
z = (2715 - 2200)/286.696 = 515/286.696 = 1.796
Decision: Reject the null hypothesis (H0: µ0=µ) as 1.796 > 1.64 (the critical value of z in a one-tail test at 5% significance level). The test was to see if the Northern sector had more disposable income than the rest of the conurbation.
Hence, the average disposable income in the Northern sector is significantly greater (at the 5% level) than in the rest of the conurbation.
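The computation in this worked example can be checked with a short script. The sketch below takes the bracket mid-points and frequencies from the table above and follows the same steps: mean from Σfx/n, standard deviation from √(Σfx²/n - mean²), estimated standard error from s/√(n-1). The variable names are illustrative only.

```python
import math

# (mid-point x, frequency f) for each income bracket in the table
groups = [
    (1250, 2), (1750, 12), (2250, 20), (2750, 8), (3250, 1),
    (3750, 4), (4500, 1), (7500, 1), (15000, 1),
]
mu0 = 2200  # claimed population mean (rest of the conurbation)

n = sum(f for _, f in groups)                # sample size
sum_fx = sum(f * x for x, f in groups)       # total of the fx column
sum_fx2 = sum(f * x * x for x, f in groups)  # total of the fx² column

sample_mean = sum_fx / n
s = math.sqrt(sum_fx2 / n - sample_mean ** 2)  # divisor n, as in the text
est_se = s / math.sqrt(n - 1)
z = (sample_mean - mu0) / est_se

print(f"n = {n}, mean = {sample_mean:.0f}, s = {s:.2f}, "
      f"est SE = {est_se:.2f}, z = {z:.3f}")
```

The resulting z of roughly 1.80 exceeds the one-tail critical value at 5%, in line with the decision above.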
When to use the two sample z test of means
This test compares two sample means to see if the difference between them is indicative of any difference between the populations from which the samples are taken.
Two samples taken from the same population are likely to have different means, therefore it is not sufficient to conclude that any difference between sample means represents a difference between the means of the populations from which the samples were taken.
The difference between sample means must be shown to be significant before differences in population means may be asserted. The test provides the procedure for establishing significant difference.
Hypotheses
H0: µ1=µ2
i.e. the mean of the population from which sample one was drawn is equal to the mean of the population from which sample two was drawn.
The alternative hypothesis is that the two populations have different means, which can be in one of three ways (not equal, greater or smaller):
HA: µ1≠µ2 not equal; a two-tailed test
HA: µ1>µ2 greater than; upper-tail test
HA: µ1<µ2 less than; lower-tail test
What we need to know
Mean of sample 1: SM1
Mean of sample 2: SM2
Standard deviation of sample 1 (or the standard deviation of the population from which sample 1 was taken): s1 (or σ1)
Standard deviation of sample 2 (or the standard deviation of the population from which sample 2 was taken): s2 (or σ2)
Size of sample 1: n1
Size of sample 2: n2
Sampling distribution
The sampling distribution is the distribution of sample mean differences. Again, this is a theoretical distribution. Consider a single population. Any two samples taken from it are likely to have different means. Both samples are prone to sampling error.
The distribution of sample mean differences is the distribution of all the differences between all possible pairs of sample means that could be taken from a population. The difference between the two observed sample means is then compared with the distribution of all differences to see whether or not it is likely to be a difference that could be the result of sampling error (i.e. the two samples are from the same population) or whether the difference is too great to be accounted for by sampling error (i.e. the two samples are from different populations).
The distribution of sample mean differences has a mean equal to zero. On average, the differences between sample means, taken from the same population, will be zero.
The standard deviation of the distribution of sample mean differences is derived from the two sample standard errors: as set out in the following paragraphs, it is the square root of the sum of their squares.
Both sample means deviate from the mean of the population but both are from the same population. The possible difference between two sample means is greater than the difference between either sample mean and the mean of the population (i.e. two samples could be at either extreme of the distribution whereas a single sample can only deviate from the population mean, at the centre of the distribution, by at most half the width of the distribution). This suggests that the standard deviation of the distribution of sample mean differences will be bigger than the standard deviation of the distribution of sample means. As both samples are prone to sampling error, the distribution of differences combines the sampling errors of both samples: their variances add (the standard errors themselves do not simply add).
Thus, the variance (or standard deviation squared) of the distribution of sample mean differences, denoted varianceSM1-SM2, equals the sum of the variances of the two distributions of sample means from which the two samples were taken.
Thus:
varianceSM1-SM2 = varianceSM1 + varianceSM2
Therefore the standard deviation of the distribution of sample mean differences equals
σSM1-SM2 = √(varianceSM1 + varianceSM2)
Replacing varianceSM1 and varianceSM2 by their best estimates (which are the sample variances divided by sample size minus 1), the best estimate of the standard error of the distribution of mean differences is
Estimated standard error of mean differences = √((variance of sample 1/(size of sample 1 minus 1)) + (variance of sample 2/(size of sample 2 minus 1)))
est σSM1-SM2 = √((var1/(n1-1)) + (var2/(n2-1)))
See Case Study: Illustration that the sample variance is a biased estimate of the population variance for an explanation of why the sample size is reduced by 1 in estimating the standard error.
For an explanation of the concept of the sampling distribution see Section 8.3.13
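The statement that varianceSM1-SM2 = varianceSM1 + varianceSM2 can be illustrated by simulation. The sketch below is only a rough demonstration under stated assumptions: the population is taken to be normal with mean 0 and standard deviation 1, the sample sizes and number of repetitions are hypothetical choices, and the random seed is fixed so the run is repeatable. It draws many pairs of samples from the same population and compares the standard deviation of the observed mean differences with √(varianceSM1 + varianceSM2). For comparison it also prints the plain sum of the two standard errors, which is noticeably larger.

```python
import random
import statistics

random.seed(1)  # fixed seed so that repeated runs give the same figures

N1, N2 = 43, 40  # hypothetical sample sizes, both above thirty
REPS = 5000      # number of pairs of samples to draw


def sample_mean(n: int) -> float:
    """Mean of n values drawn from a population with mean 0 and sd 1."""
    return statistics.fmean(random.gauss(0, 1) for _ in range(n))


# Differences between the means of two samples from the SAME population.
diffs = [sample_mean(N1) - sample_mean(N2) for _ in range(REPS)]

mean_of_diffs = statistics.fmean(diffs)
observed_sd = statistics.pstdev(diffs)
theoretical_sd = (1 / N1 + 1 / N2) ** 0.5       # sqrt(varianceSM1 + varianceSM2)
sum_of_ses = (1 / N1) ** 0.5 + (1 / N2) ** 0.5  # plain sum, for comparison only

print(f"mean of differences:     {mean_of_diffs:.3f}")
print(f"observed sd:             {observed_sd:.3f}")
print(f"sqrt of summed variance: {theoretical_sd:.3f}")
print(f"sum of standard errors:  {sum_of_ses:.3f}")
```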
Assumptions
The only assumption made is that the distribution of sample mean differences closely approximates a normal distribution when both samples exceed thirty (i.e. n1 > 30 and n2 > 30).
Testing statistic
The formula for the testing statistic, z, is the difference between the sample means divided by the standard error:
z = (SM1 - SM2)/σSM1-SM2
The resultant value of z is compared to critical values to determine whether there is any significant difference.
Critical values
The critical values for z are derived from tables of areas under the normal curve: z is the number of standard units from the mean that contain the required area as defined by the confidence interval.
See Section 8.3.13.13 for some typical critical values for the z test.
The decision rule is that the null hypothesis is rejected if the calculated value is:
greater than the critical +z value for an upper-tail test;
less than the critical -z value for a lower-tail test;
greater than +z or less than -z for a two-tail test.
(See Section 8.3.13.13).
Worked example
Assembly-line workers in London and Birmingham were investigated to see if they had significantly different earnings. Two random samples were taken. Sample 1 (from London) consisted of 43 workers with mean daily earnings of £58 and standard deviation of £6. Sample 2 (from Birmingham) consisted of 40 workers with mean earnings of £49 and standard deviation of £13. Does this indicate that similar workers in London and Birmingham have different earnings?
This is a test of two sample means and, as no direction is indicated, a two-tail test should be used.
Hypotheses:
H0: µ1=µ2
HA: µ1≠µ2
Two-tail test to see if there is any difference, direction unspecified.
Significance level: 5%
Testing statistic: z = (SM1 - SM2)/σSM1-SM2
Critical value: z = 1.96 (two-tail test at 5%)
Decision rule: reject H0 if z>1.96 or z<-1.96
Computation:
SM1 = 58
SM2 = 49
s1 = 6, so var1 = 36
s2 = 13, so var2 = 169
n1 = 43
n2 = 40
est σSM1-SM2 = √((var1/(n1-1)) + (var2/(n2-1)))
est σSM1-SM2 = √((36/42) +(169/39)) = 2.278
z = (58 - 49)/2.278 = 3.95
Decision: Reject the null hypothesis (H0: µ1=µ2) as 3.95 > 1.96 (the critical value of z in a two-tail test at 5% significance level); the mean earnings of the two samples of workers are significantly different.
Unless it can be shown that the earnings used to compute the two sample means are unrepresentative, the conclusion is that assembly-line workers in London and Birmingham have different earnings.
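The London and Birmingham computation can be reproduced with a small function. This is a sketch only: the function name is hypothetical, and it assumes, as this section does, that each sample variance is divided by the sample size minus 1 when estimating the standard error.

```python
import math


def two_sample_z(sm1: float, sm2: float, s1: float, s2: float,
                 n1: int, n2: int) -> tuple[float, float]:
    """Return (z, estimated standard error of mean differences).

    est SE = sqrt(var1/(n1 - 1) + var2/(n2 - 1)), z = (SM1 - SM2)/est SE.
    """
    est_se = math.sqrt(s1 ** 2 / (n1 - 1) + s2 ** 2 / (n2 - 1))
    return (sm1 - sm2) / est_se, est_se


# Sample 1: London (n = 43), sample 2: Birmingham (n = 40)
z, est_se = two_sample_z(sm1=58, sm2=49, s1=6, s2=13, n1=43, n2=40)
print(f"est SE = {est_se:.3f}, z = {z:.2f}")
```

Swapping the two samples only changes the sign of z, which makes no difference to a two-tail decision.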
When to use the one sample z test of proportions
This test compares a sample proportion with a claimed population proportion to see if the difference is indicative of any difference between true and claimed population proportions. It is, therefore, similar to the one-sample z test of means.
Hypotheses
H0: ∏0=∏
i.e. a null hypothesis that the claimed value of the population proportion is equal to the true population proportion. The alternative hypothesis is that it is not, which can be in one of three ways (not equal, greater or smaller):
HA: ∏0≠∏ not equal; a two-tailed test
HA: ∏>∏0 true proportion greater than the claimed value; upper-tail test
HA: ∏<∏0 true proportion less than the claimed value; lower-tail test
What we need to know
Claimed value of population proportion: ∏0
Sample proportion: p
Sample size: n
Sampling distribution
The sampling distribution is the distribution of sample proportions. The mean of the sampling distribution of proportions is equal to the population proportion.
The standard deviation of the distribution of sample proportions is called the standard error of proportions (denoted as σp).
The standard error of proportions is equal to √(∏(1-∏)/n) or, under the null hypothesis, √(∏0(1-∏0)/n), which is the form we use as we do not know ∏.
For an explanation of the concept of sampling distribution see Section 8.3.13
Assumptions
The distribution of sample proportions is in fact a binomial distribution (for more information see, for example, Stat Trek (accessed 1 December 2018)).
The binomial distribution may be said to closely approximate a normal distribution when np > 5 (where p < 0.5) or n(1-p) > 5 (where (1-p) < 0.5).
So if the sample size is 30 then p must lie between 0.17 and 0.83 if np and n(1-p) are both to be greater than 5.
Hence, for very small p (or 1-p), a larger sample size is required before the assumption of a normal distribution of sample proportions is fulfilled. However, the approximation to normal is tenuous if p is smaller than 0.1 or greater than 0.9.
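The rule of thumb above can be written as a simple check. The sketch below uses a hypothetical function name and treats the threshold of 5 as given in this section; requiring both np and n(1-p) to exceed 5 is equivalent to requiring it of whichever is the smaller.

```python
def normal_approximation_ok(n: int, p: float, threshold: float = 5.0) -> bool:
    """True when both n*p and n*(1 - p) exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold


print(normal_approximation_ok(30, 0.17))   # 30 x 0.17 = 5.1, just acceptable
print(normal_approximation_ok(30, 0.16))   # 30 x 0.16 = 4.8, too small
print(normal_approximation_ok(30, 0.05))   # a very small p needs a larger sample
print(normal_approximation_ok(150, 0.05))  # 150 x 0.05 = 7.5
```

Even when the check passes, the text's caution still applies: the approximation is tenuous for p below 0.1 or above 0.9.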
Testing statistic
The formula for the testing statistic, z, is the difference between the sample proportion (p) and the hypothesised population proportion (∏0) divided by the standard error of proportions (σp)
z = (p - ∏0)/σp
Critical values
The critical values for z are derived from tables of areas under the normal curve. See Section 8.3.13.13 for some typical critical values for the z test.
The decision rule is that the null hypothesis is rejected if the calculated value is:
greater than the critical +z value for an upper-tail test;
less than the critical -z value for a lower-tail test;
greater than +z or less than -z for a two-tail test.
Worked example
It is claimed that one out of every three trade unionists votes for the Conservative Party. A sample of 200 trade unionists selected at random showed that at the last election 78 voted Labour, 21 voted Liberal, 55 voted Conservative and 48 did not vote at all. Test the data to see if the number of trade unionists who voted Conservative is as great as is claimed.
This is a test of proportions, testing a sample proportion against a claimed population proportion. The direction of the test is specified, hence it is a one-tail test.
Hypotheses:
H0: ∏0=∏
HA: ∏<∏0
Significance level: 5%
Testing statistic: z = (p - ∏0)/σp
Critical value: z = -1.64 (lower-tail test at 5%)
Decision rule: reject H0 if z<-1.64
Computation:
∏0 = 0.333
p = 55/200 = 0.275
σp = √(∏0(1-∏0)/n) = √(0.333(1-0.333)/200) = √(0.2221/200)
= √0.00111 = 0.0333
Therefore
z = (0.275 - 0.333)/0.0333 = -0.058/0.0333 = -1.74
Decision: Reject the null hypothesis (H0: ∏0=∏) as the calculated z is less than the lower critical value (-1.74 < -1.64).
The proportion of trade unionists who vote Conservative is less than claimed. The claim lies outside the lower limit of the confidence interval for the population proportion.
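The trade unionist computation can be sketched as follows, using the figures from the computation above (55 Conservative voters in a sample of 200, claimed proportion 0.333). The function name is hypothetical; the standard error uses the claimed proportion ∏0, as in this section.

```python
import math


def one_sample_proportion_z(p: float, pi0: float, n: int) -> float:
    """z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)."""
    se = math.sqrt(pi0 * (1 - pi0) / n)
    return (p - pi0) / se


p = 55 / 200  # sample proportion voting Conservative
z = one_sample_proportion_z(p, pi0=0.333, n=200)
print(f"p = {p:.3f}, z = {z:.2f}")
print("reject H0" if z < -1.64 else "cannot reject H0")  # lower-tail test at 5%
```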
When to use the two sample z test of proportions
This test compares two sample proportions for significant difference. It shows whether or not two different sample proportions are the result of samples being taken from different or identical populations. It is therefore similar to the two-sample z test of means.
Hypotheses
H0: ∏1=∏2
i.e. the population proportions for the populations from which the two samples were drawn are the same.
The alternative hypothesis is that the two population proportions differ, which can be in one of three ways (not equal, greater or smaller):
HA: ∏1≠∏2 not equal; a two-tailed test
HA: ∏1>∏2 greater than; upper-tail test
HA: ∏1<∏2 less than; lower-tail test
What we need to know
Sample proportion for sample 1: p1
Sample proportion for sample 2: p2
Size of sample 1: n1
Size of sample 2: n2
Sampling distribution
The sampling distribution is the distribution of differences of sample proportions. The mean of this distribution is equal to zero. Again, this is a theoretical distribution and is similar to the distribution of sample mean differences (see Section 8.3.14.1.1.2)
The standard deviation of the distribution of differences of sample proportions is called the standard error of proportion differences and is denoted by σp1-p2
σp1-p2 = √((∏1(1-∏1)/n1) + (∏2(1-∏2)/n2))
which can be estimated by using the weighted average of the sample proportions (denoted by upper case P1,2; in some cases this is denoted by p-bar, i.e. a p with a bar above, but as before this does not work well on the Internet).
So P1,2 = (n1p1 + n2p2)/(n1 + n2)
est. σp1-p2 = √(P1,2(1-P1,2)(1/n1 + 1/n2))
Assumptions
The assumption is that the binomial distribution of sample proportion differences closely approximates a normal distribution. This is the case when nP1,2 > 5 (where P1,2 < 0.5) or n(1-P1,2) > 5 (where P1,2 > 0.5).
I.e. where the weighted average of proportions tends to be at the extreme edges (greater than 0.9 or less than 0.1), a sample larger than 30 is required.
Testing statistic
The formula for the testing statistic, z, is the difference between the sample proportions divided by the standard error:
z = (p1 -p2)/σp1-p2
Critical values
The critical values for z are derived from tables of areas under the normal curve. See Section 8.3.13.13 for some typical critical values for the z test.
The decision rule is that the null hypothesis is rejected if the calculated value is:
greater than the critical +z value for an upper-tail test;
less than the critical -z value for a lower-tail test;
greater than +z or less than -z for a two-tail test.
Worked example
In a recent survey of attitudes towards violence on the screen, of a sample of 50 women chosen at random 35 said that the amount of violence should be reduced. A similar survey of 80 men showed that 48 of them were in favour of reducing screen violence. Does the survey indicate a significant difference in attitudes towards screen violence according to gender?
This is a test of two sample proportions to see whether the populations from which they have been taken have the same population proportion.
Hypotheses:
H0: ∏1=∏2
HA: ∏1≠∏2
Two-tail test to see if there is any difference, direction unspecified.
Significance level: 5%
Testing statistic: z = (p1 -p2)/σp1-p2
Critical value: z = 1.96 (two-tail test at 5%)
Decision rule: reject H0 if z>1.96 or z<-1.96
Computation:
p1 = 35/50 = 0.7
p2 = 48/80 = 0.6
n1 = 50
n2 = 80
P1,2 = (n1p1 + n2p2)/(n1 + n2) = (50(0.7) + 80(0.6))/(50 + 80)
= (35 + 48)/130 = 0.63846 = 0.638
est. σp1-p2 = √(P1,2(1-P1,2)(1/n1 + 1/n2)) = √(0.638(1-0.638)(1/50 + 1/80))
= √(0.638(0.362)(0.02 + 0.0125)) = √((0.2310)(0.0325)) = √0.0075 = 0.0866
z = (p1 - p2)/est. σp1-p2 = (0.7 - 0.6)/0.0866 = 1.15
Decision: Cannot reject the null hypothesis (H0: ∏1=∏2) as 1.15 < 1.96 (the critical value of z in a two-tail test at 5% significance level); males and females do not have significantly different views.
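The screen violence example can be checked with the sketch below. The function name is hypothetical. It computes the pooled proportion as the plain weighted average of the two sample proportions, (n1p1 + n2p2)/(n1 + n2), which is the same as the total number of "successes" divided by the total sample size, and then applies the estimated standard error formula from this section.

```python
import math


def two_sample_proportion_z(x1: float, n1: int, x2: float,
                            n2: int) -> tuple[float, float, float]:
    """Return (z, pooled proportion, estimated standard error).

    x1 and x2 are the numbers in each sample with the attribute of interest.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # equals (n1*p1 + n2*p2)/(n1 + n2)
    est_se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / est_se, pooled, est_se


# Sample 1: 35 of 50 women; sample 2: 48 of 80 men
z, pooled, est_se = two_sample_proportion_z(35, 50, 48, 80)
print(f"pooled = {pooled:.4f}, est SE = {est_se:.4f}, z = {z:.2f}")
print("reject H0" if abs(z) > 1.96 else "cannot reject H0")  # two-tail at 5%
```

On these figures z comes out well inside the range -1.96 to +1.96, so the null hypothesis of equal proportions is not rejected.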
It does not matter how you get the results, whether by working them out yourself or by using a computer program; the key issue is to understand (a) which test to use in a given circumstance and (b) how to interpret the outcomes.
Unworked Examples
1. Past records at a school show that the average grade of students taking an English examination is 65% with a standard deviation of 16%. A new method of teaching is employed and a random sample of 64 students is selected. The sample average is 69%. Is there a significant increase in grade at a 5% significance level?
2. A medicine manufacturer claims that his product 'Supercure' is 70% effective in relieving a certain type of headache. A sample of 200 patients test the medicine and 125 find that Supercure relieves the ache. Is the manufacturer's claim an overstatement?
3. In 1958 a random sample of 109 men from Humberside showed their mean wage to be £15 while a sample of 400 men from Tyneside had a mean wage of £14.75. Both have standard deviations of £2. Is there a significant difference between them?
4. A sample poll of 200 college students and 250 sixth formers showed that 59% and 55% respectively were against two-tier education. Does the difference of 4% suggest that college students are significantly more opposed to two-tier education than sixth formers? (Test at 1% level).
5. The contents of a bottle of lotion is claimed to be 100cc. A random sample of 145 bottles is selected and the average amount of lotion per bottle was found to be 99cc, with a standard deviation of 4cc. Is there a significant difference between the claimed and actual content of the bottles?
6. The average age of football referees last season was estimated to be 37 years. A sample of 101 referees was taken this season and their average age was discovered to be 35, with a standard deviation of 5 years. Is the age of referees significantly lower this season?
7. A sample of 225 students from University A showed that 33% would resort to violence in order to achieve greater representation. In a similar study of 360 students at University B, 50% favour violence as a last resort. Are the students at the two establishments significantly different in their attitude towards violence? Comment on your result.
8. Eight hundred and eighty people out of one thousand six hundred indicated a preference for a Labour Government, according to an opinion poll. Is this a significant majority?
9. A survey of alcohol drinking at McGill University (Canada) was carried out in 1969 and it showed that of 713 Arts students 89.7% drank alcohol and of 202 Medicine students 91.2% drank alcohol. Was there a significantly greater use of alcohol by Medical students than Arts students at McGill University in 1969?
Answers to Unworked examples:
Question 1:
Using Social Science Statistics and inputting just the population mean (65), population variance (256), sample mean (69) and sample size (64)
Z Score Calculations
Z = (M - μ) / √(σ² / n) = (69 - 65) / √(256 / 64) = 4/2 = 2
The value of z is 2. The value of p is .02275. The result is significant at p < .05.
The calculated z statistic is greater than 1.64 (critical value for one-tail test at 5% significance level). Or, alternatively, the probability of obtaining a z value this large or larger if there were no real increase is 0.02275 (2.275%), which is lower than 5%. The student scores have increased (but whether this is down to the change in teaching method or to other factors that may have changed, not least the nature of the student cohort, is a moot point).
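The z value and one-tail probability quoted for Question 1 can be reproduced without an online calculator. The sketch below uses only Python's standard library; the variable names are illustrative, and it assumes the population standard deviation of 16 is known, so the standard error is σ/√n.

```python
import math
from statistics import NormalDist

mu0 = 65          # past average grade (population mean)
sigma = 16        # population standard deviation, so variance = 256
sample_mean = 69
n = 64

z = (sample_mean - mu0) / math.sqrt(sigma ** 2 / n)
# Upper-tail probability: chance of a z at least this large under H0
p_one_tail = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tail p = {p_one_tail:.5f}")
```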
Question 2:
z = -2.315
The null hypothesis Ho is rejected, as z < -1.64 (one-tail test of claimed proportion at 5%). Therefore, the evidence suggests the manufacturer's claim is overstated.
Question 3:
z = 1.113, which is less than 1.96 (two-tail test of two sample means at 5%), so the null hypothesis is not rejected. There is no significant difference between the earnings.
Question 4:
Using Social Science Statistics and inputting just the sample sizes and proportions, the result:
The value of z is 0.8511. The value of p is 0.19766. The result is not significant at p < .01.
Question 5:
z = -3, which is less than -1.64 (one-tail test of claimed mean at 5%), so the null hypothesis Ho is rejected. Therefore, there is enough evidence to claim that the population mean μ is less than 100.
Question 6:
z = -4, which is less than -1.64 (one-tail test of claimed mean at 5%), so the null hypothesis Ho is rejected. Therefore, the mean age of referees (μ) is significantly less than 37.
Question 7:
z = -4.04 (taking University A as sample 1), which is less than -1.96 (two-tail test of two sample proportions at 5%), so the null hypothesis Ho is rejected. Therefore, the attitude to violence at the two universities is significantly different.
Question 8:
z = 4, which is more than 1.64 (one-tail test of claimed proportion at 5%), so the null hypothesis Ho is rejected. Therefore, there is a significant majority in favour of Labour.
Question 9:
z = 0.56, which is less than 1.96 (two-tail test of two sample proportions at 5%), so the null hypothesis Ho is not rejected. Therefore, there is no significant difference in alcohol use between medical and arts students at McGill in 1969. [Note: z would be a minus value if arts students are labelled as sample 1, but as -0.56 is more than -1.96 the outcome is the same and Ho is not rejected.]