MAIN MENU

Basics

References

About Researching the Real World

Search

Contact

© Lee Harvey 2012–2024

Page updated 8 January, 2024

Citation reference: Harvey, L., 2012–2024, Researching the Real World, available at qualityresearchinternational.com/methodology
All rights belong to author.


RESEARCHING THE REAL WORLD

Orientation Observation In-depth interviews Document analysis and semiology Conversation and discourse analysis Secondary Data Surveys Experiments Ethics Research outcomes
Conclusion

Activities

Social Research Glossary

8.1 Introduction to surveys
8.2 Methodological approaches
8.3 Doing survey research
8.4 Statistical Analysis

8.4.1 Descriptive statistics
8.4.2 Exploring relationships
8.4.3 Analysing samples

8.4.3.1 Generalising from samples
8.4.3.2 Dealing with sampling error
8.4.3.3 Confidence limits
8.4.3.4 Statistical significance
8.4.3.5 Hypothesis testing
8.4.3.6 Significance tests

8.4.3.6.1 Parametric tests of significance
8.4.3.6.2 Non-parametric tests of significance

8.4.3.6.2.1 Chi-square test

8.4.3.6.2.1.1 Yates' Correction to the chi-square test
8.4.3.6.2.1.2 Does the chi-square test show dependence?
8.4.3.6.2.1.3 The problem of categorising: the 'Median Test' solution

8.4.3.6.2.2 Mann-Whitney U test
8.4.3.6.2.3 Kolmogorov-Smirnov test
8.4.3.6.2.4 H test
8.4.3.6.2.5 Sign test
8.4.3.6.2.6 Wilcoxon test
8.4.3.6.2.7 Friedman test
8.4.3.6.2.8 Q test

8.4.3.7 Summary of significance testing and association: an example

8.4.4 Report writing

8.5 Summary and conclusion

8.4.3.6.2.1 Chi-square test

Introduction to Chi-square test
Hypotheses
What we need to know
Sampling distribution
Assumptions
Testing Statistic
Critical values
Worked examples
Unworked examples

Chi is a Greek letter used to represent this test. It looks a bit like an X, especially when produced on a computer screen, which is confusing. In most computerised usages it looks like this: χ², with the χ extending below the line. The words chi-square will mainly be used below, rather than the χ² representation, to avoid ambiguity.

The chi-square test is used to test whether an observed series of values differs significantly from what was expected. It can be used for one or more samples provided that (1) the data can be divided into the same categories in each sample; (2) it is possible to establish the expected values; (3) there is sufficient sample data. The chi-square test cannot be used for related samples, nor when more than 20% of the expected frequencies are under 5. The chi-square test is inapplicable when any expected frequency is zero. How expected frequencies are calculated is explained below.

The chi-square test is relatively straightforward and is widely used. Sometimes it is used when a more powerful (precise) test would be more appropriate.

For a flow chart of what tests to use in different circumstances, see significancetestdecisionflowchart.jpg

How are the expected frequencies determined? Consider a simple example: if an unbiased coin is tossed 100 times, we would expect the coin to land heads 50 times and tails 50 times. If the coin landed heads 52 times this would be considered merely the result of sampling error. If, however, the coin landed heads 65 times one might wonder whether the coin was in fact unbiased. The chi-square test is a way of comparing the observed and expected values to see if the difference is significant.

The determination of the expected values is frequently not as simple as in the illustration above, and is the major difficulty in applying the test. The worked examples below will concentrate on the derivation of the expected frequencies, as the mechanics of calculating chi-square are straightforward, if a little laborious. (There are various programs that will do the computations, as suggested below.)

Expected frequencies can sometimes be increased by combining categories. But before any such combination is made it is necessary to consider whether the resultant combination is meaningful, and whether the analysis based upon the resultant combined categories is desirable. If combining categories results in only two categories remaining and one of these has an expected frequency of less than five, then the binomial distribution should be used to test the combined data.


Hypotheses
H0: The observed series of values do not differ significantly from those expected

HA: The observed series of values differ significantly from those expected.

The chi-square test is not a test of any parameter; it tests whether an observed distribution is significantly different in any way from that expected. It is not possible to define the direction of this difference with the chi-square test, hence there is only one alternative hypothesis: one-tail tests do not exist for chi-square.


What we need to know
To use the chi-square test we need to know all the sample data being investigated in the form of frequency tables. If more than one sample is being tested, each sample must be able to be broken down into the same categories (whether or not there is any observed frequency attributable to a particular category). Finally, we need to be able to derive the expected values for each category into which we have divided the observed frequencies.

So, in essence, chi-square requires the sample data and is not concerned with descriptive statistics such as the sample mean or variance.


Sampling distribution
The chi-square distribution is the distribution of the sum of the squares of the standard unit differences between observed and expected frequencies. The chi-square test is thus related to the normal curve but differs significantly in that it applies only to discrete variables (each category being discrete from the others). The normal curve is continuous and in certain conditions the chi-square test will lead to inaccurate results if there is no allowance for the continuous nature of the curve from which it derives. This allowance is known as Yates' Correction and is explained in 8.4.3.6.2.1.1.

The chi-square distribution can be viewed as giving the probability of a given sum of the relative squared differences between observed and expected frequencies for any given number of degrees of freedom. (See NOTE on degrees of freedom.) Thus, if the probability of occurrence of the calculated chi-square is less than the required significance level, then the observed frequencies are unlikely to be the result of simple sampling error when the sample is taken from a population for which the expected frequencies hold. The expected frequencies are compiled on the basis of the null hypothesis being tested.

For example, if the null hypothesis is that a given coin is unbiased, a sample of tosses is made and the results recorded. The expected frequencies are calculated on the basis of the null hypothesis that heads and tails will occur an equal number of times. Chi-square is then calculated on the basis of the observed and expected frequencies. The resultant chi-square value is then compared with the required critical value of chi-square for the relevant number of degrees of freedom (in this case one, discussed below). If the calculated chi-square exceeds the critical value, then, at the given significance level, it is unlikely that the observed sample results are compatible with those expected on the basis of the null hypothesis. The summed relative difference is larger than can be accounted for by sampling error.


Assumptions
The only assumptions associated with the chi-square test are those relating to Yates' Correction; that is, the test is valid when the Correction is not applied (i.e. when the number of degrees of freedom exceeds one). (See Section 8.4.3.6.2.1.1.)


Testing statistic
The formula for the testing statistic for chi-square (χ²) is:

χ² = ∑(O-E)²/E

Where

O is the observed value(s) and E is the expected value(s)
Note: the sigma sign (∑) is a summation sign and the arithmetic operation therefore has to be carried out for each pair of observed and expected frequencies and the result for each is then summed.
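The summation can be sketched in a few lines of Python. This is only an illustrative sketch: the function name chi_square is an assumption made for this example, and the figures are those of the coin illustration used in this section (63 heads and 37 tails observed, 50 of each expected).

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over each pair of observed and expected frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("the test is inapplicable when an expected frequency is zero")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Coin example: 63 heads and 37 tails observed, 50 of each expected.
statistic = chi_square([63, 37], [50, 50])
print(round(statistic, 2))  # 6.76
```

Each pair contributes its own term, so a single category with a large relative difference can dominate the total.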


Critical values
The chi-square (χ²) distribution is another distribution whose shape depends upon the number of degrees of freedom associated with the observed data.
How many degrees of freedom?

1) One sample: number of degrees of freedom equals the number of categories the data is divided into minus one.
2) More than one sample: number of degrees of freedom equals (m-1)(j-1) where m is the number of categories and j is the number of samples.

Having established the number of degrees of freedom, the critical chi-square value can be found from tables of critical values for whatever significance level is required. (Usually any computerised calculation program for chi-square will also generate the appropriate critical value.)
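The two rules for degrees of freedom, and the comparison with a critical value, might be sketched as follows. The function and variable names are assumptions made for this illustration, and the small lookup table holds only the 5% critical values quoted in this section (for 1, 2, 3 and 10 degrees of freedom); any other value would have to come from a full table.

```python
# Critical values at the 5% significance level quoted in this section,
# keyed by degrees of freedom. Other values must come from a full table.
CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 10: 18.31}


def degrees_of_freedom(categories, samples=1):
    """One sample: m - 1. More than one sample: (m - 1)(j - 1)."""
    if samples == 1:
        return categories - 1
    return (categories - 1) * (samples - 1)


def reject_null(statistic, df, table=CRITICAL_5_PERCENT):
    """Return True when the calculated chi-square exceeds the critical value."""
    return statistic > table[df]


print(degrees_of_freedom(2))     # 1
print(degrees_of_freedom(4, 2))  # 3
print(degrees_of_freedom(6, 3))  # 10
print(reject_null(6.76, 1))      # True
```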

An on-line version of the chi-square critical values table can be found, for example, at the Engineering Statistics Handbook:

https://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm (accessed 24 May 2020)


Worked examples
1. A simple one sample example to establish the principle of the chi-square test.

A coin is tossed one hundred times resulting in 63 heads. Use chi-square to test to see if the coin is fair at a 95% confidence level.

Hypotheses:

H0:The coin is fair (i.e. the observed frequencies do not differ significantly from what was expected)
HA:The coin is biased (i.e. the observed and expected frequencies differ significantly).

Significance level: 5%

Testing statistic: χ² = ∑(O-E)²/E

Critical value: Number of degrees of freedom. One sample, two categories (heads and tails) so 2-1 degrees of freedom = 1

Chi-square for one degree of freedom at 5% significance level = 3.84

Decision rule: reject H0 if χ²>3.84

Computation: First calculate expected frequencies. In this case the calculation is simple as a fair coin would result in 50 heads and 50 tails.

Observed data: 63 heads 37 tails

Expected data: 50 heads 50 tails

χ² = ∑(O-E)²/E = (63-50)²/50 +(37-50)²/50 = (13)²/50 + (-13)²/50

χ² = 169/50 + 169/50 = 338/50 = 6.76

Decision: Reject the null hypothesis (H0) as 6.76 > 3.84. It is unlikely that a fair coin would result in the distribution of heads and tails observed. This suggests that the coin is biased.
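As a rough check on this decision, the probability associated with the calculated chi-square can be sketched in Python. The sketch relies on the fact that, with one degree of freedom, the square root of chi-square is a standard normal z value, so the two-tail probability can be obtained from the standard library's math.erfc. The variable names are assumptions made for this illustration.

```python
import math

observed = [63, 37]
expected = [50, 50]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With one degree of freedom, the square root of chi-square is a standard
# normal z value, so the two-tail probability is erfc(z / sqrt(2)).
z = math.sqrt(statistic)
p_value = math.erfc(z / math.sqrt(2))

print(round(statistic, 2))  # 6.76
print(round(z, 2))          # 2.6
print(p_value < 0.05)       # True
```

The probability is below the 5% significance level, which agrees with the comparison of 6.76 against the critical value of 3.84.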


2. A two sample comparison. In the following example, the sample is broken down into four nominal categories that cannot be ordered or ranked. Chi-square is very useful for this kind of data.

A sample of 100 men and 200 women in Scotland were asked to indicate which party they voted for in the last election. The results were:

	Labour	Conservative	SNP	Other	Totals
Men	50	20	25	5	100
Women	70	40	65	25	200
Totals	120	60	90	30	300

Is there a significant difference in the distribution of votes between the men and women?

Hypotheses:

H0: The distributions of votes do not differ significantly
HA: The distributions of votes differ significantly

Significance level: 5%

Testing statistic: χ² = ∑(O-E)²/E

Critical value: Number of degrees of freedom. Two samples, (m-1)(j-1) where m is number of categories (4) and j the number of samples (2) thus (4-1)(2-1) degrees of freedom = 3

Chi-square for 3 degrees of freedom at 5% significance level = 7.81

Decision rule: reject H0 if χ²>7.81

Computation: First calculate expected frequencies. If there is no difference between men and women then the votes for each party would reflect the proportion of men and women in the total sample. Men account for 100/300, i.e. 1/3, of the sample. So the expected frequency for Labour votes among men would be 1/3(120), i.e. 40, and the other two thirds would be the expected female number, viz. 80. And so on.

Expected frequencies

	Labour	Conservative	SNP	Other	Totals
Men	1/3(120)=40	1/3(60)=20	1/3(90)=30	1/3(30)=10	100
Women	2/3(120)=80	2/3(60)=40	2/3(90)=60	2/3(30)=20	200
Totals	120	60	90	30	300

Calculating chi-square requires finding the difference between each observed value and the corresponding expected value, squaring it, dividing by the expected value, then summing all the results.

O	E	O-E	(O-E)²	(O-E)²/E	Result
50	40	10	100	100/40	2.5
20	20	0	0	0.0	0.0
25	30	-5	25	25/30	0.83
5	10	-5	25	25/10	2.5
70	80	-10	100	100/80	1.25
40	40	0	0	0.0	0.0
65	60	5	25	25/60	0.42
25	20	5	25	25/20	1.25
				Chi-square	8.75

Decision: Reject the null hypothesis (H0) as χ²>7.81. The distributions of votes of men and women in Scotland do differ at a 5% significance level.
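The derivation of the expected frequencies in this example can be sketched in Python. Taking each sample's share of the grand total and applying it to each category total is the same as multiplying the row total by the column total and dividing by the grand total, which is what the sketch does. The function names are assumptions made for this illustration.

```python
def expected_frequencies(table):
    """Expected cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_table(table):
    """Sum (O - E)^2 / E over every cell of the table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Columns: Labour, Conservative, SNP, Other.
votes = [
    [50, 20, 25, 5],   # men
    [70, 40, 65, 25],  # women
]

print(expected_frequencies(votes))
# [[40.0, 20.0, 30.0, 10.0], [80.0, 40.0, 60.0, 20.0]]
print(round(chi_square_table(votes), 2))  # 8.75
```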


3. A three sample comparison. In the following example, the sample is broken down into six ordinal categories, that is, categories that can be ordered. Chi-square can be used for multiple samples and ordinal categories.

A factory has three shifts, A, B and C. The output is divided into 6 different quality grades, I to VI, and the amount of product in each category is shown in the table below. Is the quality grade of the output of the three different shifts significantly different?

Quality Grade	I	II	III	IV	V	VI	Totals
Shift A	10	8	6	6	7	13	50
Shift B	16	16	18	12	18	20	100
Shift C	30	40	36	33	26	35	200
Totals	56	64	60	51	51	68	350

Hypotheses:

H0:The three shifts do not differ significantly
HA:The three shifts differ significantly

Significance level: 5%

Testing statistic: χ² = ∑(O-E)²/E

The chi-square test can be used as there are more than two samples of non-interval-scale data.

Critical value: Number of degrees of freedom. Three samples and six categories, (m-1)(j-1) where m is number of categories (6) and j the number of samples (3) thus (6-1)(3-1) degrees of freedom = 10

Chi-square for 10 degrees of freedom at 5% significance level = 18.31

Decision rule: reject H0 if χ²>18.31

Computation: First calculate expected frequencies. If the shifts produce equal quality distributions we would expect them to produce a proportion of each grade relative to the proportion of total production that each shift produces. Shift A produces one seventh of total output (i.e. 50/350 = 1/7). Therefore, it would be expected to produce 1/7th of the total grade I output, 1/7th of the total grade II output and so on. Similarly, Shift B produces 2/7th of total output and so would be expected to produce 2/7th of each quality grade. Shift C produces the remaining 4/7th of total output and thus would be expected to produce 4/7th of each quality grade. Hence, the following expected frequencies:

Quality Grade	I	II	III	IV	V	VI	Totals
Shift A	56/7=8	64/7	..	..	..	..	50
Shift B	56x2/7=16	64x2/7	..	..	..	..	100
Shift C	56x4/7=32	64x4/7	..	..	..	..	200
Totals	56	64	60	51	51	68	350

The resultant Observed and Expected frequencies and calculation of chi-square is as follows:

O	E	O-E	(O-E)²	(O-E)²/E
10	8.00	2.00	4.00	0.50
8	9.14	-1.14	1.30	0.14
6	8.57	-2.57	6.60	0.77
6	7.29	-1.29	1.66	0.23
7	7.29	-0.29	0.08	0.01
13	9.71	3.29	10.82	1.11
16	16.00	0.00	0.00	0.00
16	18.29	-2.29	5.24	0.29
18	17.14	0.86	0.74	0.04
12	14.58	-2.58	6.66	0.45
18	14.58	3.42	11.70	0.80
20	19.43	0.57	0.32	0.02
30	32.00	-2.00	4.00	0.13
40	36.57	3.43	11.76	0.32
36	34.29	1.71	2.92	0.09
33	29.14	3.86	14.90	0.51
26	29.14	-3.14	9.86	0.34
35	38.86	-3.86	14.90	0.38
			χ² =	6.13

Decision: Cannot reject the null hypothesis (H0) as χ²<18.31. The three shifts do not produce significantly different quality outputs at a 5% significance level.

Note that the calculation above rounds to two decimal places at each stage. This results in a very small distortion compared to computations that go to five decimal places.

The computerised chi-square calculator from Maths is Fun (accessed 1 May 2020) provides the following output:

Actual Values:
10 8 6 6 7 13
16 16 18 12 18 20
30 40 36 33 26 35

Expected Values:
8 9.14286 8.57143 7.28571 7.28571 9.71429
16 18.2857 17.1429 14.5714 14.5714 19.4286
32 36.5714 34.2857 29.1429 29.1429 38.8571

Chi-Squared Values:
0.5 0.142857 0.771429 0.226891 0.0112045 1.11135
0 0.285714 0.0428571 0.453782 0.806723 0.0168067
0.125 0.321429 0.0857143 0.510504 0.338936 0.382878

Chi-Square = 6.13407

Degrees of Freedom = 10

p = 0.803876

The difference is small and makes no difference to the decision in this case but in marginal cases premature rounding could lead to a different result.
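The full-precision figures for the three shifts can be reproduced with a short Python sketch. The function names are assumptions made for this illustration. The probability is computed with a closed form for the upper tail of the chi-square distribution that holds only when the number of degrees of freedom is even (as here, with 10); it is not a general-purpose routine.

```python
import math


def expected_frequencies(table):
    """Expected cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_table(table):
    """Sum (O - E)^2 / E over every cell, with no intermediate rounding."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def p_value_even_df(statistic, df):
    """Upper-tail probability of chi-square; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even degrees of freedom")
    half = statistic / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


shifts = [
    [10, 8, 6, 6, 7, 13],      # Shift A
    [16, 16, 18, 12, 18, 20],  # Shift B
    [30, 40, 36, 33, 26, 35],  # Shift C
]

statistic = chi_square_table(shifts)
df = (len(shifts[0]) - 1) * (len(shifts) - 1)

print(round(statistic, 4))                   # 6.1341
print(df)                                    # 10
print(round(p_value_even_df(statistic, df), 4))  # 0.8039
```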


4. The table below shows the daily newspaper most preferred by 500 respondents, divided into four groups by age and gender. Is there a significant difference in newspaper preferred according to demographic grouping?

Newspaper	A	B	C	D	E	F	G
Female under 35	10	5	0	1	20	10	4
Female 35+	16	8	14	0	30	31	1
Male under 35	50	24	18	7	49	50	2
Male 35+	54	23	18	12	21	9	13

Using the chi-square calculator from Maths is Fun, the results were:

Actual Values:
10 5 0 1 20 10 4
16 8 14 0 30 31 1
50 24 18 7 49 50 2
54 23 18 12 21 9 13

Warning: Actual Value less than 5.
Results not reliable.

Expected Values:
13 6 5 2 12 10 2
26 12 10 4 24 20 4
52 24 20 8 48 40 8
39 18 15 6 36 30 6

Chi-Squared Values:
0.692308 0.166667 5 0.5 5.33333 0 2
3.84615 1.33333 1.6 4 1.5 6.05 2.25
0.0769231 0 0.2 0.125 0.0208333 2.5 4.5
5.76923 1.38889 0.6 6 6.25 14.7 8.16667

Chi-Square = 84.5693

Degrees of Freedom = 18

p = 0

On the basis of the chi-square calculation, the very low p value shows that there is a significant difference in newspaper preference between the four groups.

Note, though, that chi-square does not tell us anything about the nature of the difference (is it age or gender or both), just that there is a difference.

Note also that some of the expected frequencies are low (four of the 28 are below 5), so the result should be treated with caution. The general rule of thumb is that if more than 20% of expected frequencies are less than 5, then the chi-square test becomes too unreliable to be valid.
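The rule of thumb about low expected frequencies can be checked directly. The Python sketch below recomputes the expected frequencies for the newspaper table and counts how many fall below 5; the function and variable names are assumptions made for this illustration.

```python
def expected_frequencies(table):
    """Expected cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Columns: newspapers A to G.
newspapers = [
    [10, 5, 0, 1, 20, 10, 4],      # Female under 35
    [16, 8, 14, 0, 30, 31, 1],     # Female 35+
    [50, 24, 18, 7, 49, 50, 2],    # Male under 35
    [54, 23, 18, 12, 21, 9, 13],   # Male 35+
]

expected = expected_frequencies(newspapers)
cells = [e for row in expected for e in row]
low = [e for e in cells if e < 5]
share_low = len(low) / len(cells)

statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(newspapers, expected)
    for o, e in zip(obs_row, exp_row)
)

print(len(low), len(cells))    # 4 28
print(round(share_low, 3))     # 0.143
print(round(statistic, 4))     # 84.5693
```

In this table four of the 28 expected frequencies are below 5 and none is zero. The calculator's warning refers to actual (observed) values below 5, which is a separate matter from the expected frequencies.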


8.4.3.6.2.1.1 Yates' Correction to the chi-square test
Yates' Correction is only applied in practice when the number of degrees of freedom is equal to one. It is used to relate the discrete data used in chi-square analysis to the continuous nature of the normal curve from which the chi-square distribution derives.

χ²(corrected) = ∑(|O-E|-0.5)²/E

In other words the absolute difference between each pair of observed and expected frequencies is reduced by 0.5 before being squared.

For example, if a coin is tossed 100 times we would expect 50 heads. If, as in worked example 1, the coin landed heads 63 times, the corrected chi-square calculation would be:

χ²(corrected) = ∑(|O-E|-0.5)²/E

χ²(corrected) = (|63-50|-0.5)²/50 + (|37-50|-0.5)²/50 = (12.5)²/50 + (12.5)²/50 = 6.25

In this case the ultimate decision would not be changed (the uncorrected value of chi-square was 6.76), thus

Decision: Reject the null hypothesis (H0) as 6.25 > 3.84. It is unlikely that a fair coin would result in the distribution of heads and tails observed. This suggests that the coin is biased.

In marginal cases, though, the decision could be reversed.

Yates' correction is rarely applied in practice as when the number of degrees of freedom equals 1, a z test of proportions is usually applicable, as there are only two categories to test with either one or two samples and the z test is not restricted to a two-tail test.

Note that when the degrees of freedom=1 then the square root of chi-square = z
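The corrected calculation, and the relationship between chi-square and z for one degree of freedom, can be sketched in Python for the coin data. The sketch takes the absolute value of each difference before subtracting 0.5, which is what the worked figures (12.5 for both heads and tails) imply. The function name is an assumption made for this illustration, and the z value is that of a test of the proportion of heads against 0.5.

```python
import math


def chi_square_yates(observed, expected):
    """Chi-square with Yates' Correction: reduce each |O - E| by 0.5 before squaring."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


observed = [63, 37]
expected = [50, 50]

corrected = chi_square_yates(observed, expected)
uncorrected = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# z test of a proportion for the same data: proportion of heads against 0.5.
n = sum(observed)
z = (observed[0] - n * 0.5) / math.sqrt(n * 0.5 * 0.5)

print(corrected)              # 6.25
print(round(uncorrected, 2))  # 6.76
print(round(z ** 2, 2))       # 6.76
```

The square of z equals the uncorrected chi-square, not the corrected one.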


8.4.3.6.2.1.2 Does the chi-square test show dependence?
The chi-square test is often used as a test to show that one variable does or does not depend on another. A significant chi-square is considered to be evidence that the two variables considered are inter-dependent.

Consider the following example. A sample of three hundred income earners of the same age, divided into three groups of equal size according to income, were asked what educational qualifications they possessed. The three random samples were drawn independently of each other. Normally, we would set up the null hypothesis that the three samples (representing three populations with different incomes) do not differ significantly with respect to the variable under observation, namely educational qualification. An alternative approach that appears popular is to set up a null hypothesis that states that income and educational qualifications are not related.

Suppose the results to the survey were as in the table below.

Income	None	BA	PhD
High	10	30	60
Medium	20	60	20
Low	60	30	10

From a cursory glance at the data it can be seen that the maximum frequencies in each sample lie in cells that fall on a diagonal, such that the majority of high income earners have PhDs and the majority of low income earners have no qualifications. The idea that income and qualification are related seems reasonable. The chi-square test is used at this juncture to 'prove' that the two variables are related. Carrying out a chi-square test for four degrees of freedom gives a significant result.

Actual Values:
10 30 60
20 60 20
60 30 10

Expected Values:
30 40 30
30 40 30
30 40 30

Chi-Squared Values:
13.3333 2.5 30
3.33333 10 3.33333
30 2.5 13.3333

Chi-Square = 108.333

Degrees of Freedom = 4

p = 0

So, the argument is that chi-square shows that in this case, income depends on qualification. However, if this conclusion is correct, what about the strength of the relationship? The chi-square test says nothing about that; nothing about the extent to which income would depend on qualification, in this example.

In fact, chi-square does not show dependence at all. Consider the rearrangement of data in the table below.

Income	None	BA	PhD
High	40	60	0
Medium	0	60	40
Low	50	0	50

Clearly, in this case there is no relationship between income and qualification, although the three samples are clearly significantly different in their distribution.

Actual Values:
40 60 0
0 60 40
50 0 50

Warning: Actual Value less than 5.
Results not reliable.

Expected Values:
30 40 30
30 40 30
30 40 30

Chi-Squared Values:
3.33333 10 30
30 10 3.33333
13.3333 40 13.3333

Chi-Square = 153.333

Degrees of Freedom = 4

p = 0

However, there is clearly no linear relationship, and seemingly no relationship at all. All we can say is that the three samples have significantly different qualifications.

This is all that our original null hypothesis hopes to ascertain. The commonly used, but incorrect, alternative null hypothesis speculating about dependence is quite clearly misleading. The significance test merely exposes the difference; it is up to the researcher to probe for the cause.

Thus, the chi-square test can only be used to show significant difference between samples with respect to a variable: it cannot be used to show that the variable differentiating the populations from which the samples were taken depends directly upon the variable being investigated.
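The point can be illustrated by running the same calculation on both income tables. In the Python sketch below, the function and variable names are assumptions made for this illustration, and the 5% critical value for four degrees of freedom (9.49) is taken from standard tables rather than from this section.

```python
def chi_square_table(table):
    """Sum (O - E)^2 / E, with E = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for row, r in zip(table, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / grand_total
            total += (o - e) ** 2 / e
    return total


# Rows: High, Medium, Low income. Columns: None, BA, PhD.
diagonal_table = [[10, 30, 60], [20, 60, 20], [60, 30, 10]]
rearranged_table = [[40, 60, 0], [0, 60, 40], [50, 0, 50]]

CRITICAL_5_PERCENT_DF4 = 9.49  # from standard chi-square tables

for table in (diagonal_table, rearranged_table):
    statistic = chi_square_table(table)
    print(round(statistic, 3), statistic > CRITICAL_5_PERCENT_DF4)
# 108.333 True
# 153.333 True
```

Both tables give a significant result, and the rearranged table gives the larger chi-square even though it shows no ordered pattern. The statistic registers difference between the samples, not the form or strength of any relationship.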


8.4.3.6.2.1.3 The problem of categorising: the 'Median Test' solution
When data of an ordinal nature is available in an uncategorised state, for example the results of three samples of students measured on an introversion scale, how do we categorise the data in order to carry out a chi-square test?

Suppose the data in the table below is to be tested for significant difference. 10 people in three different classes have introversion scores as shown. Note introversion scores are not interval (let alone ratio) scale data so that analysis of variance could not be used.

Class I	Class II	Class III
34	28	30
40	42	33
43	50	34
49	56	35
50	58	36
62	59	39
64	63	40
68	64	52
70	69	63
75	69	76

The data could be grouped into four categories, below 40, 40 to 49, 50 to 59, 60 and above. However, that would result in all expected frequencies below 5 and the chi-square test would be invalid.

An approach that would provide a valid chi-square would be to divide the results into two halves, above and below the median.

The median provides us with simple compact logic on which to base the null hypothesis. If the null hypothesis is that there is no difference between the samples, i.e. that they come from the same population, then the median of the combined samples should split each separate sample in half. Any variation from this equal split being solely due to the effects of random sampling and therefore within the bounds of probability tested by the chi-square test.

Consider the example above; the median of the combined sample is 51. Each sample consists of ten values and therefore we expect five scores above 51 and five below in each sample.

Observed	Above median	Below median
I	5	5
II	7	3
III	3	7

Expected	Above median	Below median
I	5	5
II	5	5
III	5	5

Critical chi-square for 2 degrees of freedom at the 5% significance level is 5.99

Calculated chi-square = 3.2

In this example, the samples do not differ significantly from that which was expected.
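The median split for the three classes can be sketched in Python using the standard library's statistics.median. The variable names are assumptions made for this illustration, and the sketch does not deal with values equal to the median, since none occur in these data.

```python
from statistics import median

classes = {
    "I": [34, 40, 43, 49, 50, 62, 64, 68, 70, 75],
    "II": [28, 42, 50, 56, 58, 59, 63, 64, 69, 69],
    "III": [30, 33, 34, 35, 36, 39, 40, 52, 63, 76],
}

# Median of all 30 scores taken together.
combined = [score for scores in classes.values() for score in scores]
cut = median(combined)

statistic = 0.0
counts = {}
for name, scores in classes.items():
    above = sum(1 for s in scores if s > cut)
    below = sum(1 for s in scores if s < cut)
    counts[name] = (above, below)
    expected = len(scores) / 2  # half above, half below under the null hypothesis
    statistic += (above - expected) ** 2 / expected
    statistic += (below - expected) ** 2 / expected

print(cut)                  # 51.0
print(counts)               # {'I': (5, 5), 'II': (7, 3), 'III': (3, 7)}
print(round(statistic, 2))  # 3.2
```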

A problem arises with this approach when we have values equal to the median. Should such values be counted as above the median, below the median, or ignored altogether?

As such values are observations from a sample they may not be ignored, and the simplest procedure is to divide the number of values in each sample that equal the median in half. This can sometimes become complicated if the distributions cluster around the median.

So, for relatively small samples of ordinal data the best method of testing, if using the chi-square test, is to divide the data on the basis of the combined sample median. This particular process is often referred to as the Median Test (for independent data), and as such is usually carried out in much more elaborate ways with the use of specially computed formulae, which give a chi-square value and, as far as the author can see, have no advantage over the simple approach outlined above.

Although the chi-square test is widely used, it is not a powerful test and the discussion above has suggested that there are other tests in some circumstances that are more appropriate. The following sections consider other non-parametric tests (for use with nominal, ordinal and even interval scale data).


Unworked examples
1. A sample of 300 people from three age groups were asked which of the local supermarkets they used most. The results are shown in the table below. Is there any significant difference in supermarket use for the three age groups?

Supermarket	A	B	C	D
Under 30	60	20	20	20
30 to 49	36	16	20	8
50+	54	9	5	32

2. In a survey designed to see who watches football, a sample of 100 men in four occupation groups were asked whether they watched football matches "frequently", "occasionally", or "never". Of 20 farmers interviewed, 15 "never" watched and only 2 did so "frequently". Of 30 businessmen, 8 and 12 fell in the "never" and "frequently" categories respectively. Similarly, of 20 teachers, 5 "never" watched and 10 "frequently" watched. The remaining respondents were classed as labourers, of whom 16 "frequently" watched whilst 2 "never" watched. Tabulate the results and test the data for significant difference. Explain your result.

3. The number of records sold by four different shops in a small area are given in the table below. Do the shops cater for significantly different tastes in music?

4. In order to see if Labour gain any advantage from high turnout at elections, a sample of 40 seats with an increased turnout were selected at random. Of these, 12 had a swing more than 1% above national average, 6 had a swing more than 1% below national average and the rest had a swing within 1% either way of national average. In sixty seats selected at random from those with no increase in turnout, only 8 had a swing more than 1% above national average while 14 had a swing more than 1% below national average. By comparing these two samples, see if Labour gain any significant advantage by a high turnout.

5. The students in all four years of a social science degree course were asked to fill in a questionnaire to show each student's conservatism score. The results are shown in the table below, where a low score indicates conservative and a high score non-conservative. Test the data to see if there is any significant difference in scores between years.

Scores
1st Year 2 3 3 4 5 6 7 9 9 10 11 12 13 13 15 18 20
2nd Year 3 5 8 8 9 10 11 13 14 16 17 17 18 19 20
3rd Year 4 4 5 6 9 10 12 12 13 15 16 18 19 20 20 20
4th Year 2 2 4 8 11 11 13 13 15 17 18 18 19 20

"Classical"