Measures of Variability
Author(s)
David M. Lane
Prerequisites
Percentiles, Distributions, Measures of Central Tendency
Learning Objectives
- Determine the relative variability of two distributions
- Compute the range
- Compute the inter-quartile range
- Compute the variance in the population
- Estimate the variance from a sample
- Compute the standard deviation from the variance
What is Variability?
Variability refers to how "spread out"
a group of scores is.
The terms variability, spread, and dispersion are
synonyms; all three refer to how spread out a distribution is. Just
as the section on central tendency discussed measures of
the center of a distribution of scores, this chapter
discusses measures of the variability of a distribution. There are
four frequently used measures of variability: the range, the interquartile
range, the variance, and the standard deviation. The next few paragraphs
look at each of these four measures in more detail.
Range
The range is the simplest measure of variability
to calculate, and one you have probably encountered many times
in your life. The range is simply the highest score minus the
lowest score.
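As a minimal sketch of this definition, the range can be computed in Python as shown here. The function name score_range and the list of scores are hypothetical and chosen only for illustration.

```python
def score_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)


# Hypothetical scores, used only for illustration.
scores = [10, 2, 5, 6, 7, 3, 4]
print(score_range(scores))  # 10 - 2 = 8
```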
Interquartile Range
The interquartile
range (IQR) is the range of the middle 50% of the scores in
a distribution. It is computed as follows:
IQR = 75th percentile - 25th percentile
A related measure of variability is called the semi-interquartile
range. The semi-interquartile range is defined simply as the
interquartile range divided by 2. If a distribution is symmetric,
the median plus or minus the semi-interquartile range contains
half the scores in the distribution.
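The following sketch computes the interquartile range and the semi-interquartile range in Python. Several conventions exist for computing percentiles, and they can give slightly different answers for small data sets. This example assumes the "inclusive" method of the standard library function statistics.quantiles, which may not match the percentile definition used elsewhere in this text. The function names and the data are hypothetical.

```python
import statistics


def interquartile_range(scores):
    """Return the 75th percentile minus the 25th percentile.

    Assumption: percentiles are taken with the 'inclusive' method of
    statistics.quantiles; other conventions may give other values.
    """
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1


def semi_interquartile_range(scores):
    """Return the interquartile range divided by 2."""
    return interquartile_range(scores) / 2


# Hypothetical scores, used only for illustration.
scores = [2, 4, 6, 8, 10, 12, 14, 16, 18]
print(interquartile_range(scores))       # 14.0 - 6.0 = 8.0
print(semi_interquartile_range(scores))  # 4.0
```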
Variance
Variability can also be defined in terms of how
close the scores in the distribution are to the middle of the
distribution. Using the mean as the measure of the middle of the
distribution, the variance is defined as the average squared difference
of the scores from the mean.
The formula for the variance is:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where σ² is the
variance, X is a score, μ is the mean, and N is the number of scores.
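The definition of the population variance as the average squared difference from the mean can be sketched in Python as follows. The function name and the scores are hypothetical; the standard library function statistics.pvariance performs the same calculation and is shown only as a cross-check.

```python
import statistics


def population_variance(scores):
    """Average squared difference of the scores from their mean."""
    n = len(scores)
    mu = sum(scores) / n
    return sum((x - mu) ** 2 for x in scores) / n


# Hypothetical scores treated as an entire population.
scores = [1, 2, 3, 4, 5]

# Mean is 3; squared differences are 4, 1, 0, 1, 4, which sum to 10.
print(population_variance(scores))    # 10 / 5 = 2.0
print(statistics.pvariance(scores))   # same value from the standard library
```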
If the variance in a sample is used to estimate
the variance in a population, then the previous formula underestimates
the variance and the following formula should be used:
$$s^2 = \frac{\sum (X - M)^2}{N - 1}$$
where s² is the estimate
of the variance and M is the sample mean. Note that M is the mean
of a sample taken from a population with a mean of μ. Since,
in practice, the variance is usually computed in a sample, this
formula is most often used. The simulation "estimating
variance" illustrates the bias in the formula with N
in the denominator.
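The bias of the formula with N in the denominator can be illustrated with a small simulation. The sketch below is not the "estimating variance" simulation mentioned above; it is a separate, simplified illustration. It assumes a normal population with a variance of 1, samples of size 5, and a fixed random seed, all of which are arbitrary choices, and the function names are hypothetical.

```python
import random


def variance_n(scores):
    """Variance estimate with N in the denominator (biased for samples)."""
    n = len(scores)
    m = sum(scores) / n
    return sum((x - m) ** 2 for x in scores) / n


def variance_n_minus_1(scores):
    """Variance estimate with N - 1 in the denominator."""
    n = len(scores)
    m = sum(scores) / n
    return sum((x - m) ** 2 for x in scores) / (n - 1)


def average_estimates(sample_size=5, n_samples=20000, seed=0):
    """Average both estimates over many samples from a population
    whose true variance is 1 (normal, mean 0, standard deviation 1)."""
    rng = random.Random(seed)
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total_n += variance_n(sample)
        total_n_minus_1 += variance_n_minus_1(sample)
    return total_n / n_samples, total_n_minus_1 / n_samples


biased, unbiased = average_estimates()
print(biased, unbiased)
```

With samples of size 5, the average of the estimates with N in the denominator should come out near 0.8, that is, (N - 1)/N times the true variance, while the average of the estimates with N - 1 in the denominator should come out near 1.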
Standard Deviation
The standard
deviation is simply the square root of the variance. The standard
deviation is an especially useful measure of variability when
the distribution is normal or approximately normal (see Chapter
5) because the proportion of the distribution within a given number
of standard deviations from the mean can be calculated.
The symbol for the population standard deviation
is σ; the symbol for an estimate computed in a sample is
s.
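As a sketch of both points above, the code below takes the square root of the variance to obtain the standard deviation, and then computes the proportion of a normal distribution that lies within a given number of standard deviations of the mean. The function names and the scores are hypothetical; statistics.NormalDist is part of the Python standard library (version 3.8 and later).

```python
import math
import statistics


def population_sd(scores):
    """Square root of the population variance (N in the denominator)."""
    n = len(scores)
    mu = sum(scores) / n
    variance = sum((x - mu) ** 2 for x in scores) / n
    return math.sqrt(variance)


def sample_sd(scores):
    """Square root of the variance estimate (N - 1 in the denominator)."""
    n = len(scores)
    m = sum(scores) / n
    variance = sum((x - m) ** 2 for x in scores) / (n - 1)
    return math.sqrt(variance)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations
    of the mean."""
    standard_normal = statistics.NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Hypothetical scores: mean 5, squared differences sum to 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(scores))  # sqrt(32 / 8) = 2.0
print(sample_sd(scores))      # sqrt(32 / 7), slightly larger
print(proportion_within(1))   # about 0.68
print(proportion_within(2))   # about 0.95
```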