Social Research Glossary
Citation reference: Harvey, L., 2012-24, Social Research Glossary, Quality Research International, http://www.qualityresearchinternational.com/socialresearch/
This is a dynamic glossary and the author would welcome any e-mail suggestions for additions or amendments.
_________________________________________________________________
Dispersion
Dispersion refers to the extent to which a set of data is spread out, or dispersed from the ‘average’.
There are various measures of dispersion.
The range is simply the difference between the highest and lowest value. (For example, ages running from 18 to 39 have a range of 39 − 18 = 21 years, or 22 years if both end years are counted inclusively.)
The quartile deviation (QD) is defined as half the inter-quartile range (see also deciles). It is the mean difference between the median and the first and third quartiles, that is, the average deviation of each quartile from the median.
The inter-quartile ratio is equal to the inter-quartile range divided by the median.
[This seems an odd measure because distributions with the same inter-quartile range but different medians would have quite different inter-quartile ratios.
For example:
Distribution 1:
1 2 2 4 4 5 5 9
has a median of 4 and IQ range of 5-2=3
The inter-quartile ratio is thus 3/4 = 0.75
Distribution 2:
11 12 12 14 14 15 15 19
has a median of 14 and IQ range of 15-12=3
The inter-quartile ratio is thus 3/14 = 0.21]
The inter-quartile ratio is sometimes known as the quartile dispersion coefficient and is a measure of how heterogeneous the data are.
The inter-quartile range is a measure of dispersion of a frequency distribution. When an ordered distribution is split into four equal parts, the quartiles are the values of the variable below which 25 per cent (Q1), 50 per cent (Q2) and 75 per cent (Q3) of the distribution lie. The distance between the upper quartile (75% quartile, Q3) and the lower quartile (25% quartile, Q1) is known as the inter-quartile range.
The interquartile range thus measures the range of the middle half of a distribution, ignoring extremes.
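The quartile-based measures described above (inter-quartile range, quartile deviation and inter-quartile ratio) can be sketched in a few lines of Python. This is only an illustration: there are several conventions for locating quartiles, and the sketch assumes the "median of each half" convention, which reproduces the figures for Distribution 1 and Distribution 2. The function names are hypothetical, not part of any library.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3), assuming the median-of-halves convention.

    Needs at least two values; other quartile conventions can give
    slightly different answers.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # the half below the median position
    upper = ordered[(n + 1) // 2 :]    # the half above the median position
    return median(lower), median(ordered), median(upper)

def quartile_measures(values):
    """Inter-quartile range, quartile deviation and inter-quartile ratio."""
    q1, q2, q3 = quartiles(values)
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "quartile_deviation": iqr / 2,     # half the inter-quartile range
        "inter_quartile_ratio": iqr / q2,  # inter-quartile range / median
    }

distribution_1 = [1, 2, 2, 4, 4, 5, 5, 9]
distribution_2 = [11, 12, 12, 14, 14, 15, 15, 19]

for name, data in [("Distribution 1", distribution_1),
                   ("Distribution 2", distribution_2)]:
    print(name, quartile_measures(data))
```

Both distributions have the same inter-quartile range and quartile deviation, but the ratio falls from 0.75 to about 0.21 because only the median has changed.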
The mean deviation is a measure of dispersion. It is the sum of the absolute differences (ignoring the minus signs for those below the mean) between a set of observations and their mean, divided by the number of observations. In symbols, MD = Σ|X − X̄| / n, where X̄ is the mean and n is the number of observations. This statistic is not often used because an alternative measure of dispersion around the mean, the standard deviation, is more useful.
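As a minimal sketch of the mean deviation, the following Python function averages the absolute differences from the mean. The function name and the sample scores are illustrative assumptions, not taken from any particular package.

```python
def mean_deviation(values):
    """Average absolute difference between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() drops the minus sign for values that lie below the mean
    return sum(abs(x - mean) for x in values) / n

scores = [1, 2, 2, 4, 4, 5, 5, 9]   # illustrative data, mean = 4
print(mean_deviation(scores))       # absolute deviations sum to 14, so 14 / 8
```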
The standard deviation is a measure of spread around the mean and is suitable for interval scale data.
The standard deviation is a measure of dispersion that takes into account all the values in a distribution. It is based on the deviation of each value of X from the arithmetic mean.
The standard deviation is the square root of the variance. See http://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/2-mean-and-standard-deviation (accessed 3 June 2019) for how to compute a standard deviation.
Variance is a measure of the variation within a distribution. The variance of a distribution is the mean squared deviation of each value of X from the mean.
To compute the variance:
first, compute the arithmetic mean;
second, calculate the difference between the mean and each value of the variable;
third, square these differences (they all become positive);
fourth, add them up;
fifth, divide by the number in the sample.
This gives the average squared deviation of each value of the variable (X) from the mean, that is, the variance.
Variance is, thus, equal to the square of the standard deviation (or the standard deviation is the square root of the variance).
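The five steps above translate almost line for line into code. The sketch below follows them literally, dividing by the number of values, which gives what is usually called the population variance; many textbooks and packages divide by n − 1 instead when estimating from a sample. The function names and data are illustrative assumptions.

```python
from math import sqrt

def variance(values):
    """Mean squared deviation from the mean, following the five steps."""
    n = len(values)
    mean = sum(values) / n                         # first: arithmetic mean
    differences = [x - mean for x in values]       # second: differences from the mean
    squared = [d ** 2 for d in differences]        # third: square them
    total = sum(squared)                           # fourth: add them up
    return total / n                               # fifth: divide by the number of values

def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))

scores = [1, 2, 2, 4, 4, 5, 5, 9]   # illustrative data, mean = 4
print(variance(scores))             # squared deviations sum to 44, so 44 / 8
print(standard_deviation(scores))
```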
This appears to be rather complicated just to measure the spread of a distribution, and conceptually it is a bit cumbersome. However, it is an important measure of dispersion with a large number of applications, especially in probability theory, significance testing, and correlation and regression techniques.
Computers can make this calculation in a fraction of a second, although it is a bit laborious doing it by hand.
Analysis of variance is a useful statistical testing technique. The technique splits the variation within a variable into that which is explained by one or more other variables and that which is not. Explained variance is that part of the variance of a variable that can be predicted given the values of another variable. Unexplained variance is that part of the variance that cannot be explained in this way. Unexplained variance may be attributable to other explanatory variables that have not been included in the analysis or to the unreliability of measures constructed for variables that were included.
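The split into explained and unexplained variation can be illustrated with a small sketch for the simplest case, where the explanatory variable is membership of a group. It works with sums of squared deviations (dividing each by the number of observations would give the corresponding variances). The group labels, data and function names are hypothetical, and the sketch shows only the decomposition, not a full significance test.

```python
def sum_of_squares(values, centre):
    """Sum of squared deviations of the values from a given centre."""
    return sum((x - centre) ** 2 for x in values)

def split_variation(groups):
    """Return (total, explained, unexplained) sums of squares.

    `groups` maps a group label to the list of values in that group.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)
    total = sum_of_squares(all_values, grand_mean)

    explained = 0.0
    unexplained = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # variation predicted by knowing which group a value belongs to
        explained += len(values) * (group_mean - grand_mean) ** 2
        # variation left over within the group
        unexplained += sum_of_squares(values, group_mean)
    return total, explained, unexplained

groups = {"A": [2, 4, 6], "B": [6, 8, 10]}   # hypothetical data
total, explained, unexplained = split_variation(groups)
print(total, explained, unexplained)
print("proportion explained:", explained / total)
```

In this made-up example the explained and unexplained parts add up to the total, which is the identity the technique relies on.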
The coefficient of variation is a measure of relative dispersion. It is the standard deviation of a distribution divided by the mean. It can be used to compare variation between different variables, between variables measured in different units, and between variables with different mean values. The larger the resulting coefficient of variation the greater the heterogeneity in the data and the smaller the coefficient the greater the homogeneity.
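The measure is misleading when comparing two distributions that are otherwise identical but have different base points. For example, the coefficient of variation for the same set of temperatures differs depending on whether they are measured in degrees Celsius or in degrees absolute (kelvin). Changing the base point shifts the mean but leaves the standard deviation unchanged, so the coefficient changes even though the spread of the data is the same.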
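The effect of the base point can be checked numerically. The sketch below uses Python's standard `statistics` module and a hypothetical set of temperatures, expressed first in degrees Celsius and then in kelvin (Celsius plus 273.15); the function name is an illustrative assumption.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return pstdev(values) / mean(values)

celsius = [10.0, 15.0, 20.0, 25.0, 30.0]      # hypothetical temperatures
kelvin = [t + 273.15 for t in celsius]        # same temperatures, different base point

# The spread is identical in both scales...
print(pstdev(celsius), pstdev(kelvin))
# ...but the coefficient of variation is not, because the mean has shifted.
print(coefficient_of_variation(celsius))
print(coefficient_of_variation(kelvin))
```

The coefficient is therefore best reserved for variables measured from a meaningful zero point.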
Range: The difference between the highest and lowest scores in a distribution.
Mean Deviation: A measure of variation that indicates the average deviation of scores in a distribution from the mean. It is determined by averaging the absolute values of the deviations.
Standard Deviation: A term used in statistical analysis. A measure of variation that indicates the typical distance between the scores of a distribution and the mean; it is determined by taking the square root of the average of the squared deviations in a given distribution. It can be used to indicate the proportion of data within certain ranges of scale values when the distribution conforms closely to the normal curve.
Standard Error (S.E.) of the Mean: A term used in statistical analysis. A computed value based on the size of the sample and the standard deviation of the distribution, indicating the range within which the mean of the population is likely to be from the mean of the sample at a given level of probability (Alreck, 456).
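A common way to compute the standard error of the mean is to divide the sample standard deviation by the square root of the sample size. The sketch below assumes that formula, uses the n − 1 form of the standard deviation (`statistics.stdev`), and takes 1.96 standard errors as an approximate 95 per cent range, which presumes a reasonably large sample and an approximately normal sampling distribution. The data and names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    return stdev(sample) / sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical sample, mean = 5
se = standard_error(sample)
centre = mean(sample)

# Approximate 95 per cent range for the population mean (normal approximation)
z = 1.96
interval = (centre - z * se, centre + z * se)
print(se)
print(interval)
```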
The standard deviation is a statistical term that measures how much individual scores of a given group vary from the average (mean) score of the whole group. Another way of saying this is that it measures the spread of the individual results around the average of all the results.
See also
Researching the Real World Section 8.3.12.7
DATA ANALYSIS: A BRIEF INTRODUCTION Section 9 (Downloads .pdf file)
NHS, undated,