Introduction
Prerequisites
Percentiles
Learning Objectives
- Create a graph to make sense of a confusing array of numbers
- Describe the bias in the 1970 draft lottery
In 1969 the war in Vietnam was at its height.
An agency called the Selective Service was charged with
finding a fair procedure to determine which young men would be
conscripted ("drafted") into the U.S. military. The
procedure was supposed to be fair in the sense of not favoring
any culturally or economically defined subgroup of American men.
It was decided that choosing "draftees" solely on the
basis of a person's birth date would be fair. A birthday
lottery was thus devised. Pieces of paper representing the 366
days of the year (including February 29) were placed in plastic
capsules, poured into a rotating drum, and then selected one at
a time. The lower the draft number, the sooner the person would
be drafted. Men with high enough numbers were not drafted at all.
Table 1 shows the order in which birth dates were
drawn from the drum (from left to right). The first number selected
was 258, which meant that someone born on the 258th day of the
year (September 14th) got a draft number of "1" and
was among the first to be drafted. The second number was 115,
so someone born on the 115th day (April 24th) got a draft number
of "2." All 366 birth dates were assigned draft numbers
in this way. Someone born on the 160th day of the year (June 8th, the last
birth date drawn) got a draft number of 366.
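The mapping from draw order to draft number can be sketched in a few lines of Python. This is only an illustration: `draw_order` holds just the first two draws quoted above (the full sequence is in Table 1), the helper name `day_to_date` is hypothetical, and the year 2000 is used solely because it is a leap year with 366 days.

```python
from datetime import date, timedelta

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def day_to_date(day_of_year):
    """Convert a day number (1-366) to a 'Month day' label in a leap year."""
    # 2000 is an arbitrary leap year, chosen so that February 29 exists.
    d = date(2000, 1, 1) + timedelta(days=day_of_year - 1)
    return f"{MONTHS[d.month - 1]} {d.day}"

# Only the first two draws mentioned in the text; Table 1 has all 366.
draw_order = [258, 115]

# The draft number is the position in the drawing, counting from 1.
draft_number = {day: pos for pos, day in enumerate(draw_order, start=1)}

for day, number in draft_number.items():
    print(f"day {day} ({day_to_date(day)}) -> draft number {number}")
```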
The intention was for every birth date to have
the same chance of coming up first as coming up second, or third,
etc. Was this reasonable expectation met, or were some times of
year more likely to get lower numbers than others? Look at Table
1 and see if you can discern the answer to this question. You'll
see that staring at the numbers in the table provides little idea
of the overall pattern, and thus does not help to decide whether
the birth dates were drawn randomly.
Things are much clearer if we graph the relation
between birth dates and draft number. There are many ways of creating
such a graph. Let's proceed as follows. First, we'll
divide the 366 birth dates into thirds (122 days each). The first
third goes from January 1 to May 1, the second from May 2 to August
31, and the last from September 1 to December 31. The three groups
of birth dates yield three groups of draft numbers. The draft
number for each birthday is the order it was picked in the drawing.
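As a rough sketch of the grouping step, the three thirds correspond to days 1-122, 123-244, and 245-366 of a leap year. The function name `birth_group` and the constant `GROUP_SIZE` are hypothetical names chosen for this example.

```python
GROUP_SIZE = 122  # 366 days split into 3 equal groups

def birth_group(day_of_year):
    """Return 0, 1, or 2 for the first, second, or last third of the year."""
    return (day_of_year - 1) // GROUP_SIZE

# Boundary days in a leap year: May 1 = 122, May 2 = 123,
# August 31 = 244, September 1 = 245.
for day in (1, 122, 123, 244, 245, 366):
    print(day, birth_group(day))
```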
Next, from each group of draft numbers we'll pick
six numbers to summarize all 122 of them. Specifically, in each
group, we determine:
- The minimum draft number of the group
- The draft number at the 25th percentile of the group
- The draft number at the 50th percentile of the group
- The draft number at the 75th percentile of the group
- The maximum draft number of the group
- The mean of the 122 draft numbers
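The six summary values listed above could be computed along the following lines. This is a minimal sketch, not the procedure used to build Figure 1: the function name `six_number_summary` is hypothetical, the data are placeholder integers rather than real lottery numbers, and the "inclusive" percentile convention of Python's `statistics.quantiles` is an assumption (other percentile definitions give slightly different values).

```python
import statistics

def six_number_summary(draft_numbers):
    """Return min, 25th/50th/75th percentiles, max, and mean of a group."""
    # method="inclusive" is one of several percentile conventions
    # (an assumption for this sketch).
    p25, p50, p75 = statistics.quantiles(draft_numbers, n=4, method="inclusive")
    return {
        "min": min(draft_numbers),
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "max": max(draft_numbers),
        "mean": statistics.mean(draft_numbers),
    }

# Placeholder data, NOT the real lottery numbers: the integers 1..122.
example_group = list(range(1, 123))
summary = six_number_summary(example_group)
print(summary)
```

Applying the same function to each of the three groups of draft numbers from Table 1 would give the three sets of values needed for the boxes.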
Each set of six numbers (one such set for each group
of birthdays) is then used to draw a box along a vertical scale
running from 1 to 400. (We go beyond 366 just to stop at a nice,
round number.) The bottom of the box is drawn at the draft number
corresponding to the 25th percentile. The top is drawn at the
draft number corresponding to the 75th percentile. The draft number
corresponding to the 50th percentile is drawn as a line inside
the box. Lines outside of the box mark the minimum and maximum
draft numbers. Finally, a plus sign is used to mark the mean.
The three boxes are then set side-by-side starting with the earliest
birth dates and finishing with the latest. This procedure gives
us the three boxes shown in Figure 1. For example, we see from
the first box that the 25th percentile of the first group is the
draft number 122 whereas the 75th percentile is 298. The 50th
percentile of the first group is 217, the mean is 210, and the
minimum and maximum draft numbers are 2 and 365.
If the draft numbers had been chosen randomly,
then the three boxes should have been about the same. However,
they differ systematically. The later in the year someone was
born, the lower their draft number was likely to have been. In
other words, the box representing those born in the first third
of the year is higher than the box representing those born in
the second third which is, in turn, higher than the box for those
born in the last third. Had there been no relationship between
birth date and draft number, the three boxes in Figure 1 would
be lined up horizontally. Apparently the plastic capsules holding
the birth dates were not shuffled sufficiently by the rotating
drum. The last ones put in tended to be the first ones pulled
out. (Which boys went to war was thus partly determined by a premature
decision to stop turning the drum.)
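To get a feel for what a fair drawing would look like, one can simulate a well-mixed drum. The sketch below is an illustration under stated assumptions, not a reconstruction of the actual lottery: it uses Python's `random.shuffle` with an arbitrary fixed seed, and the variable names are hypothetical. In a truly random drawing each group's mean draft number should fall near (1 + 366) / 2 = 183.5, whereas Figure 1 reports a mean of 210 for the first third of the year.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed so the sketch is repeatable

days = list(range(1, 367))
random.shuffle(days)  # a well-mixed drum: every ordering equally likely

# draft_number[day] is the position at which that birth date was drawn.
draft_number = {day: pos for pos, day in enumerate(days, start=1)}

group_means = []
for start in (1, 123, 245):  # first day of each third of the year
    group = [draft_number[day] for day in range(start, start + 122)]
    group_means.append(statistics.mean(group))

print(group_means)
```

Rerunning with different seeds shows how much the three means vary by chance alone, which gives a baseline for judging the differences seen in Figure 1.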
The important point is that Figure 1 brings order to a confusing
array of data. Specifically, it makes clear the relationship between
birth date and draft number. Although not everyone born late in
the year was assigned a low draft number, draft numbers did decrease
systematically with birth date. This relation is not easy to detect
from the numbers in Table 1, but the visual representation in Figure
1 makes it easy to see.
Choosing what to graph and how to graph it is often the most
important part of a statistical analysis. Even sophisticated statistical
analyses are often less revealing than a well-constructed graph.