Social Research Glossary

Citation reference: Harvey, L., 2012-20, Social Research Glossary, Quality Research International, http://www.qualityresearchinternational.com/socialresearch/ This is a dynamic glossary and the author would welcome any e-mail suggestions for additions or amendments. Page updated 24 January, 2020, © Lee Harvey 2012–2020.

_________________________________________________________________

Significance test

core definition

A significance test is a statistical procedure that is applied to random samples to take account of sampling error.

explanatory context

Introduction

Essentially, a statistical significance test indicates whether an observed difference between samples, or a relationship between variables in a sample, could plausibly be due to random chance (sampling error) or is more likely to be indicative of a 'real' difference or a 'real' relationship.

There are a large number of tests of significance dealing with a large variety of situations. Tests of significance are usually divided into parametric tests and non-parametric tests.

Parametric tests are those that test hypotheses about population parameters, such as means, proportions, standard deviations and correlation coefficients, on the basis of the corresponding sample statistics.

Non-parametric tests compare entire distributions rather than parameters. Non-parametric statistical tests (also known as distribution-free tests) do not involve the estimation of a population parameter and generally require no assumptions about how the scores in question are distributed. Non-parametric tests are used when parametric tests are inappropriate, for example when the data are not measured on an interval scale. However, non-parametric tests lack some of the power of parametric tests.

The following determine a suitable test for a given situation:

a. whether samples are related or independent

b. the level of measurement of the sample data

c. the number of samples being compared

d. which parameter, if any, is being compared
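The criteria above can be read as a simple lookup. The sketch below is illustrative only: the function name `suggest_test` and the particular mapping are assumptions added here, not part of the glossary. It lists a few commonly taught tests, assumes that the parameter of interest for interval data is the mean (criterion d is not modelled separately), and is no substitute for checking each test's own assumptions.

```python
def suggest_test(related: bool, level: str, n_samples: int) -> str:
    """Suggest a commonly used test (hypothetical helper, not exhaustive).

    related   -- True if the samples are matched/related, False if independent
    level     -- level of measurement: "nominal", "ordinal" or "interval"
    n_samples -- number of samples being compared (2 or more)
    """
    if level not in ("nominal", "ordinal", "interval"):
        raise ValueError(f"unknown level of measurement: {level}")
    if n_samples < 2:
        raise ValueError("this sketch only covers comparisons of 2+ samples")

    many = n_samples > 2  # True when three or more samples are compared
    table = {
        # (three or more samples?, level, related?): suggested test
        (False, "interval", True): "matched-pairs t-test (parametric)",
        (False, "interval", False): "independent-samples t-test (parametric)",
        (False, "ordinal", True): "Wilcoxon signed-rank test (non-parametric)",
        (False, "ordinal", False): "Mann-Whitney U test (non-parametric)",
        (False, "nominal", True): "McNemar test (non-parametric)",
        (False, "nominal", False): "chi-square test (non-parametric)",
        (True, "interval", True): "repeated-measures ANOVA (parametric)",
        (True, "interval", False): "one-way ANOVA (parametric)",
        (True, "ordinal", True): "Friedman test (non-parametric)",
        (True, "ordinal", False): "Kruskal-Wallis test (non-parametric)",
        (True, "nominal", True): "Cochran's Q test (non-parametric)",
        (True, "nominal", False): "chi-square test (non-parametric)",
    }
    return table[(many, level, related)]


# Example: the same essay writers measured before and after a class
print(suggest_test(related=True, level="interval", n_samples=2))
```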

Level of confidence

The level of confidence (or confidence level) is the degree of reliance that may be placed on a particular interval estimate for a population parameter. It is measured as the number of times out of 100 that the confidence interval can be expected to include the 'true' parameter value, if the research were repeated a large number of times.
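The 'repeated a large number of times' idea can be illustrated by simulation. The following is a minimal sketch, not a recommended procedure: the population values (`mu`, `sigma`), the sample size and the function name `coverage` are all invented for illustration, and the population standard deviation is treated as known so that a simple normal (z) interval can be used.

```python
import random
from statistics import NormalDist, mean


def coverage(confidence=0.95, n=30, repeats=2000, mu=50.0, sigma=10.0, seed=1):
    """Fraction of repeated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    # Two-sided z multiplier, roughly 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / n ** 0.5  # sigma assumed known (illustration only)
    hits = 0
    for _ in range(repeats):
        sample_mean = mean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / repeats


# The printed proportion should lie close to the chosen confidence level
print(coverage(confidence=0.95))
```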

Level of significance

The level of significance (or significance level) is the probability of falsely rejecting the null hypothesis, that is, of declaring a result statistically significant when the null hypothesis is in fact true. It is usually expressed as a percentage or a proportion (for example 5% or 0.05). The significance level is equivalent to 100% minus the confidence level.
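A short simulation can make the 'falsely rejecting' reading concrete. In this sketch the null hypothesis is true by construction, so every rejection is a false one. The two-tailed z-test with a known standard deviation of 1 and the name `false_rejection_rate` are assumptions chosen to keep the example short.

```python
import random
from statistics import NormalDist, mean


def false_rejection_rate(alpha=0.05, n=25, repeats=4000, seed=2):
    """Share of tests that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cut-off
    rejections = 0
    for _ in range(repeats):
        # Null hypothesis: population mean is 0 (true here), sigma = 1 (known)
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / repeats


alpha = 0.05
print("significance level:", alpha)
print("equivalent confidence level (%):", 100 - 100 * alpha)
# The simulated rate should lie close to alpha
print("simulated false rejection rate:", false_rejection_rate(alpha))
```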

Critical values

Critical values are border-line values, which may be specified for any given statistical test, dividing the region of acceptance from the region of rejection of the null hypothesis.
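As an illustration, the critical values for a test based on the standard normal distribution can be computed from the significance level. This is a sketch under the assumption of a z-test. Other tests (t, chi-square and so on) have their own critical values, usually read from tables or software. The helper names `z_critical` and `decide` are hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha=0.05, two_tailed=True):
    """Borderline z value separating acceptance from rejection regions."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


def decide(z_statistic, alpha=0.05):
    """Two-tailed decision: is the statistic beyond the critical value?"""
    if abs(z_statistic) > z_critical(alpha):
        return "reject"
    return "do not reject"


print(round(z_critical(0.05), 3))                    # two-tailed, 5% level
print(round(z_critical(0.05, two_tailed=False), 3))  # one-tailed, 5% level
print(decide(2.3), "/", decide(1.2))
```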

There are very many misinterpretations of significance tests and these are explored in Greenland, S. et al., 2016.

Note that significant difference is not the same as substantive difference; see: Note on substantive difference.

analytical review

Matched T-Test: A statistical test used to compare two sets of scores for the same subject. A matched pairs T-test can be used to determine if the scores of the same participants in a study differ under different conditions. For instance, this sort of t-test could be used to determine if people write better essays after taking a writing class than they did before taking the writing class.

T-Test: A statistical test. A t-test is used to determine if the scores of two groups differ on a single variable. For instance, to determine whether writing ability differs among students in two classrooms, a t-test could be used.
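The two definitions can be contrasted with a small worked sketch. The essay scores below are invented purely for illustration, the function names are hypothetical, and the independent-groups version assumes equal variances (the pooled-variance form). Only the t statistic and its degrees of freedom are computed; the statistic would then be compared with a tabled critical value.

```python
from math import sqrt
from statistics import mean, stdev, variance


def matched_t(before, after):
    """t statistic for paired scores from the same subjects (df = n - 1)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    pooled = ((n1 - 1) * variance(group_a)
              + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Invented essay scores: the same eight writers before and after a class
before = [62, 70, 55, 68, 74, 60, 66, 71]
after = [66, 73, 59, 70, 79, 61, 70, 76]
print("matched:", matched_t(before, after))

# The same numbers treated as two unrelated classrooms
print("independent:", independent_t(before, after))
```

With these invented numbers the matched statistic is 7.0 on 7 degrees of freedom, well beyond the tabled two-tailed 5% critical value of 2.365. Treated as independent groups, the statistic is about -1.05 on 14 degrees of freedom, inside the critical value of ±2.145. The contrast shows why it matters whether samples are related or independent.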

The NHS Health News Glossary, (NHS, undated) refers to statistical significance, thus:

If the results of a test have statistical significance, it means that they are not likely to have occurred by chance alone. In such cases, we can be more confident that we are observing a 'true' result.

associated issues

Power of significance tests

How good a statistical test is at identifying differences not accounted for by sampling error is referred to as the power of the test.

Technically, the power of a statistical test is the probability that the test rejects a null hypothesis when it is false.

Accepting (failing to reject) a null hypothesis when it is false is known as a type 2 error; its probability is sometimes denoted by β, so that power equals 1 − β.

A type 1 error is rejecting a null hypothesis when it is true; the probability of a type 1 error is the level of significance (see above), sometimes denoted by α.

Different tests are more or less powerful depending on how they operate.

For any given test, the factors that affect the power of the test include the sample size and the significance level. The larger the sample, the more power the test has, which is logical as there is less chance of 'rogue' values affecting the result. The power of the test is reduced when the significance level is reduced, as there is more chance of not rejecting the null hypothesis.
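Both effects can be seen in a small calculation. The sketch below assumes a two-tailed one-sample z-test with a known population standard deviation. The effect size, standard deviation and function name `z_test_power` are illustrative assumptions rather than anything specified in the glossary.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for a true mean shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # true difference in standard errors
    # Probability that the test statistic falls in either rejection region
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


# Larger samples give more power (alpha held at 0.05)
for n in (10, 40, 160):
    print("n =", n, "power =", round(z_test_power(2.0, 10.0, n), 3))

# A stricter (smaller) significance level gives less power (n held at 40)
for alpha in (0.10, 0.05, 0.01):
    print("alpha =", alpha, "power =", round(z_test_power(2.0, 10.0, 40, alpha), 3))
```

The probability of a type 2 error for any of these settings is one minus the returned power.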

related areas

probability

statistics

Researching the Real World Section 8.3.14

Sources

Colorado State University, 1993–2013, Glossary of Key Terms available at http://writing.colostate.edu/guides/guide.cfm?guideid=90, accessed 3 February 2013, still available 27 May 2019.

Greenland, S. et al., 2016, 'Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations', European Journal of Epidemiology, 31, pp. 337–50 available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/, accessed 27 May 2019.

NHS, undated, Health News Glossary, available at https://www.nhs.uk/news/health-news-glossary/, accessed 1 June 2019.