Compute a randomization test for differences among more than two conditions

The randomization method for testing differences among more than two means is very similar to the method for exactly two means. Table 1 shows the data from a fictitious experiment with three groups.

Table 1. Fictitious data.

T1     T2     Control
 7     14     0
 8     19     2
11     21     5
12    122     9

The first step in a randomization test is to decide on a test statistic. Then we compute the proportion of the possible arrangements of the data for which that test statistic is as large as or larger than it is for the actual data. When comparing several means, it is convenient to use the F ratio. Here the F ratio is computed not to test for significance directly, but as a measure of how different the groups are. For these data, the F ratio for a one-way ANOVA is 2.06.
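As a rough check on the value quoted above, the F ratio for Table 1 can be computed directly from the between-group and within-group sums of squares. This is a minimal sketch using only the standard library; the function name `f_ratio` and the dictionary `table1` are illustrative and not part of the original text.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio for a list of groups (each a list of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Data from Table 1.
table1 = {"T1": [7, 8, 11, 12], "T2": [14, 19, 21, 122], "Control": [0, 2, 5, 9]}
f_obs = f_ratio(list(table1.values()))
print(round(f_obs, 2))  # 2.06
```

The F ratio is used here only as a descriptive measure of how far apart the group means are, so no p value from the F distribution is computed.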

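The next step is to determine how many arrangements of the data result in F ratios as large as or larger than 2.06. There are 6 arrangements that lead to the same F of 2.06: the six ways of assigning the three column labels to the three columns of data. One such arrangement is shown in Table 2. Listed by the order of the labels over the columns of Table 1, the six are: T1, T2, Control; T1, Control, T2; T2, T1, Control; T2, Control, T1; Control, T1, T2; and Control, T2, T1.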

For each of the 6 arrangements there are two changes that lead to a higher F ratio: swapping the 7 for the 9 (which gives an F of 2.08) and swapping the 8 for the 9 (which gives an F of 2.07). The former of these two is shown in Table 3.
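The two swaps described above can be checked numerically. The sketch below recomputes F after exchanging a pair of scores between groups; the helpers `f_ratio` and `swap` are hypothetical names introduced for illustration, and the F ratio uses the usual computational formulas for the sums of squares.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio via the computational sums-of-squares formulas."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    total = sum(sum(g) for g in groups)
    between_term = sum(sum(g) ** 2 / len(g) for g in groups)
    ss_between = between_term - total ** 2 / n_total
    ss_within = sum(x * x for g in groups for x in g) - between_term
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def swap(groups, a, b):
    """Return a copy of groups with the values a and b exchanged."""
    exchange = {a: b, b: a}
    return [[exchange.get(x, x) for x in g] for g in groups]

# Table 1 data in column order: T1, T2, Control.
original = [[7, 8, 11, 12], [14, 19, 21, 122], [0, 2, 5, 9]]
for a, b in [(7, 9), (8, 9)]:
    print(f"swap {a} and {b}: F = {f_ratio(swap(original, a, b)):.2f}")
# swap 7 and 9: F = 2.08
# swap 8 and 9: F = 2.07
```

Both swaps move a smaller score into Control and a larger one into T1, which pushes those two group means further apart and so raises F slightly.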

Table 2. Fictitious data with the data for T2 and Control swapped.

T1     Control     T2
 7       14        0
 8       19        2
11       21        5
12      122        9

Table 3. Data from Table 1 with the 7 and the 9 swapped.

T1     T2     Control
 9     14     0
 8     19     2
11     21     5
12    122     7

Thus, there are six arrangements, each with two swaps that lead to a larger F ratio. Therefore, the number of arrangements with an F as large as or larger than that of the actual arrangement is 6 (for the arrangements with the same F) + 12 (for the arrangements with a larger F), which makes 18 in all.
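The count of 18 can be confirmed by brute force. The sketch below visits every distinct way of assigning the 12 scores to three labeled groups of four and counts the assignments whose F is at least the observed one. It assumes integer data and uses exact fractions so that arrangements tied with the observed F are not lost to floating-point rounding; `f_ratio_exact` is an illustrative name.

```python
from fractions import Fraction
from itertools import combinations

def f_ratio_exact(groups):
    """One-way ANOVA F ratio as an exact fraction (integer data assumed)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    total = sum(sum(g) for g in groups)
    between_term = sum(Fraction(sum(g) ** 2, len(g)) for g in groups)
    ss_between = between_term - Fraction(total ** 2, n_total)
    ss_within = sum(x * x for g in groups for x in g) - between_term
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Table 1 scores in column order: T1, T2, Control.
scores = [7, 8, 11, 12, 14, 19, 21, 122, 0, 2, 5, 9]
observed = f_ratio_exact([scores[0:4], scores[4:8], scores[8:12]])

count = 0
positions = set(range(12))
for first in combinations(range(12), 4):          # scores labeled T1
    rest = sorted(positions - set(first))
    for second in combinations(rest, 4):          # scores labeled T2
        third = [i for i in rest if i not in second]  # remaining scores: Control
        groups = [[scores[i] for i in idx] for idx in (first, second, third)]
        if f_ratio_exact(groups) >= observed:
            count += 1

print(count)  # 18
```

Exhaustive enumeration is practical here only because the data set is tiny; with larger samples one would typically evaluate a random subset of arrangements instead.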

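The next step is to determine the total number of possible arrangements, that is, the number of distinct ways of dividing the nk observations into k groups of n. This can be computed from the following formula:

$$\text{Arrangements} = \frac{(nk)!}{(n!)^k} = \frac{12!}{(4!)^3} = \frac{479{,}001{,}600}{13{,}824} = 34{,}650$$

where n is the number of observations in each group (assumed to be the same for all groups), and k is the number of groups. Therefore, the proportion of arrangements with an F as large as or larger than the F of 2.06 obtained with the data is

18/34,650 = 0.0005.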

Thus, if there were no treatment effect, it is very unlikely that an F as large as or larger than the one obtained would be found.