Measures of Variability

Prerequisites
Percentiles, Distributions, Measures of Central Tendency

Learning Objectives

  1. Determine the relative variability of two distributions
  2. Compute the range
  3. Compute the interquartile range
  4. Compute the variance in the population
  5. Estimate the variance from a sample
  6. Compute the standard deviation from the variance

What is Variability?

Variability refers to how "spread out" a group of scores is.

The terms variability, spread, and dispersion are synonyms: each refers to how spread out a distribution is. Just as the section on central tendency discussed measures of the center of a distribution of scores, this section discusses measures of the variability of a distribution. Four measures of variability are frequently used: the range, the interquartile range, the variance, and the standard deviation. The next few paragraphs look at each of these four measures in more detail.

Range

The range is the simplest measure of variability to calculate, and one you have probably encountered many times in your life. The range is simply the highest score minus the lowest score.
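
As a quick illustration, the range can be computed in a couple of lines of Python. The scores below are hypothetical and chosen only for this example.

```python
# Hypothetical scores, chosen only for illustration.
scores = [10, 2, 5, 6, 7, 3, 4]

# Range = highest score minus lowest score.
score_range = max(scores) - min(scores)

print(score_range)  # 8
```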

Interquartile Range

The interquartile range (IQR) is the range of the middle 50% of the scores in a distribution. It is computed as follows:

IQR = 75th percentile - 25th percentile

A related measure of variability is called the semi-interquartile range. The semi-interquartile range is defined simply as the interquartile range divided by 2. If a distribution is symmetric, the median plus or minus the semi-interquartile range contains half the scores in the distribution.
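
The following is a minimal sketch of the interquartile range and semi-interquartile range using Python's standard `statistics` module. The scores are hypothetical. Note that several conventions exist for computing percentiles, so the choice of `method="inclusive"` here is an assumption of this sketch; other conventions can give slightly different cut points for the same data.

```python
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# With n=4, quantiles() returns three cut points:
# the 25th, 50th, and 75th percentiles.
# method="inclusive" assumes the data include the most extreme
# values of interest; this is one convention among several.
q25, q50, q75 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q75 - q25        # interquartile range
semi_iqr = iqr / 2     # semi-interquartile range

print(q25, q75, iqr, semi_iqr)
```

For these scores the 25th percentile is 3 and the 75th percentile is 7, so the interquartile range is 4 and the semi-interquartile range is 2.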

Variance

Variability can also be defined in terms of how close the scores in the distribution are to the middle of the distribution. Using the mean as the measure of the middle of the distribution, the variance is defined as the average squared difference of the scores from the mean.

The formula for the variance is:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where σ² is the variance, X is an individual score, μ is the mean, and N is the number of scores.
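
The definition of the variance as the average squared difference of the scores from the mean can be checked step by step. The sketch below uses hypothetical scores and treats them as an entire population; the variable names are illustrative. It also compares the result with `statistics.pvariance` from the Python standard library.

```python
import statistics

# Hypothetical scores, treated here as the entire population.
scores = [1, 2, 3, 4, 5]

# Step 1: the mean of the scores.
mu = sum(scores) / len(scores)

# Step 2: the squared difference of each score from the mean.
squared_deviations = [(x - mu) ** 2 for x in scores]

# Step 3: the average of those squared differences (divide by N).
variance = sum(squared_deviations) / len(scores)

print(mu, squared_deviations, variance)
print(statistics.pvariance(scores))  # library value for comparison
```

Here the mean is 3, the squared differences are 4, 1, 0, 1, and 4, and their average is 2.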

If the variance in a sample is used to estimate the variance in a population, then the previous formula underestimates the variance and the following formula should be used:

$$s^2 = \frac{\sum (X - M)^2}{N - 1}$$

where s² is the estimate of the variance, M is the sample mean, and N is the number of scores in the sample. Note that M is the mean of a sample taken from a population with a mean of μ. Since, in practice, the variance is usually computed in a sample, this formula is the one most often used. The simulation "estimating variance" illustrates the bias in the formula with N in the denominator.
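
The bias from dividing by N can also be seen without a simulation, at least in a small case. The sketch below is an illustration under stated assumptions, not the simulation mentioned above: it takes a hypothetical population of five scores, enumerates every possible sample of size 2 drawn with replacement, and averages the two variance formulas over all of those samples. The helper names `variance_n` and `variance_n_minus_1` are made up for this example.

```python
import itertools
import statistics

# Hypothetical population, chosen only for illustration.
population = [1, 2, 3, 4, 5]
sigma2 = statistics.pvariance(population)  # true population variance


def variance_n(sample):
    """Sum of squared deviations from the sample mean, divided by N."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def variance_n_minus_1(sample):
    """Sum of squared deviations from the sample mean, divided by N - 1."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)


# Every ordered sample of size 2, drawn with replacement (25 samples).
samples = list(itertools.product(population, repeat=2))

# Average each estimator over all possible samples.
mean_with_n = sum(variance_n(s) for s in samples) / len(samples)
mean_with_n_minus_1 = sum(variance_n_minus_1(s) for s in samples) / len(samples)

print(sigma2, mean_with_n, mean_with_n_minus_1)
```

In this small case the population variance is 2. Averaged over all 25 samples, the formula with N in the denominator gives 1, which underestimates the variance, while the formula with N - 1 in the denominator gives 2.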

Standard Deviation

The standard deviation is simply the square root of the variance. The standard deviation is an especially useful measure of variability when the distribution is normal or approximately normal (see Chapter 5) because the proportion of the distribution within a given number of standard deviations from the mean can be calculated.

The symbol for the population standard deviation is σ; the symbol for an estimate computed in a sample is s.
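
As a brief sketch, both standard deviations can be obtained by taking the square root of the corresponding variance. The scores are hypothetical. The last lines use `statistics.NormalDist` from the Python standard library to compute the proportion of a normal distribution that lies within one and within two standard deviations of the mean.

```python
import math
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 3, 4, 5]

# Population standard deviation: square root of the variance with N.
sigma = math.sqrt(statistics.pvariance(scores))

# Sample estimate: square root of the variance with N - 1.
s = math.sqrt(statistics.variance(scores))

print(sigma, s)

# Proportion of a normal distribution within k standard deviations
# of the mean, using the standard normal (mean 0, sd 1).
z = statistics.NormalDist()
within_one_sd = z.cdf(1) - z.cdf(-1)
within_two_sd = z.cdf(2) - z.cdf(-2)

print(round(within_one_sd, 4), round(within_two_sd, 4))
```

For a normal distribution, roughly 68% of the distribution lies within one standard deviation of the mean and roughly 95% lies within two.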