n/3 Approximation for Too Few Runs
I’ve written about run charts in several previous posts: here, here and here for example. They are the starting point for effective analysis for most process management situations.
A run chart shows a series of numbers in time order, with a reference median line. Remember that a run consists of consecutive points on one side of the median of the series. Points on the median are ignored in counting consecutive points.
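The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original analysis: the function name `count_runs` and the sample numbers are hypothetical.

```python
from statistics import median

def count_runs(values):
    """Return (number of runs, number of points not on the median).

    Hypothetical helper for illustration only.
    """
    m = median(values)
    # Points exactly on the median are ignored when counting runs.
    sides = [v > m for v in values if v != m]
    if not sides:
        return 0, 0
    # A new run starts each time consecutive points switch sides of the median.
    runs = 1 + sum(1 for a, b in zip(sides, sides[1:]) if a != b)
    return runs, len(sides)

# Made-up series: the median is 5, and the two 5s are dropped before counting.
runs, useful_points = count_runs([5, 1, 9, 5, 2, 8])
```

The second value returned is the effective series length, the number you would look up in a table of critical values.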
The plot ‘Sealants 6-9 yrs’, pulled from a ‘small multiples’ display from last year’s oral health collaborative, shows the percentage of patients aged 6-9 years who came to the clinic for any reason and had all eligible molars sealed during the measurement month. The dashed line is the median for the series. The green solid line is the collaborative goal for this measure.
I think most people would look at the sealant chart and be willing to consider that performance improved between January 2017 and spring of 2019, especially if the clinic team can point to one or more workflow changes that they associate with the shift in the sealant percentages.
There are six runs in the sealants chart. In a series of 27 independent values from some fixed statistical distribution, how unusual are six runs? (The series has 29 values but two are on the median, so the effective number of points is 29-2=27.)
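One way to put a number on "how unusual" is the classical runs distribution for two kinds of points in random order, which is the kind of calculation that sits behind runs tables. The sketch below rests on assumptions that are mine rather than the post's: that the 27 points split 13 and 14 around the median, and that every ordering is equally likely. The function names are hypothetical.

```python
from math import comb

def runs_pmf(r, n1, n2):
    """P(exactly r runs) with n1 points above and n2 below the median,
    assuming every ordering of the points is equally likely (n1, n2 >= 1)."""
    if r < 2:
        return 0.0
    total = comb(n1 + n2, n1)
    k, odd = divmod(r, 2)
    if not odd:
        # Even r = 2k: each side contributes exactly k runs.
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:
        # Odd r = 2k + 1: one side has k + 1 runs, the other has k.
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    return ways / total

def prob_at_most(r, n1, n2):
    """P(r or fewer runs) under the same random-ordering assumption."""
    return sum(runs_pmf(j, n1, n2) for j in range(2, r + 1))

# Assumed 13/14 split of the 27 points that are not on the median.
p_six_or_fewer = prob_at_most(6, 13, 14)
```

Under these assumptions, the probability of six or fewer runs comes out well under 0.05, which agrees with the impression that the sealant series is not behaving like a random one.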
The assessment of too few runs is based on a table calculated originally by Swed and Eisenhart in a 1943 paper, available here. Perla et al. reformatted the 1943 table, which appears as their Table 1, shown to the left.
The Perla table gives the rule for too few runs in the second column. For each series length n (the number of points not on the median), a run count smaller than the value shown in column two arises no more than about 5 times in 100 when you examine a random series of data, that is, a series equivalent to random sampling from a fixed statistical distribution.
As the oral health plot suggests, improvement is likely to be signaled by too few runs. Successful actions taken by an improvement team change the causal system. A single fixed statistical distribution no longer serves as a useful model.
You don’t need to memorize Perla’s table, carry it on a card or store it on a smartphone. You can just remember to divide the length of the series n by 3. When the series length is not evenly divisible by 3, round the result down to a whole number. For a series of length 17: 17/3 = 5.66…. Rounding down, the approximation value is 5, which matches the critical value in the table.
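As a minimal sketch, the divide-by-3 rule is a single integer division in Python. The function names here are hypothetical, chosen only for illustration.

```python
def too_few_runs_limit(n):
    """n/3 approximation: divide the series length by 3 and round down.

    n is the number of points not on the median.
    """
    return n // 3

def has_too_few_runs(runs, n):
    """Signal 'too few runs' when the count falls below the approximate limit."""
    return runs < too_few_runs_limit(n)

limit_17 = too_few_runs_limit(17)        # the worked example: 17/3 rounds down to 5
sealant_signal = has_too_few_runs(6, 27) # the sealant chart: six runs in 27 points
```

For the sealant chart the approximate limit is 27/3 = 9, so six runs counts as too few.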
Across the range n=10 to 60, the only series length for which the n/3 approximation gives a value larger than the tabled values is n=12.
The approximation is conservative as n increases. That is, it provides a critical value more extreme than the nominal 0.05 level values derived by Swed and Eisenhart. If you use the approximate value, the chance that a random series shows fewer runs than that value is even less than 0.05.
What about ‘too many runs’?
A pattern of too many runs can arise in a series that represents a mixture of two systems, such as the percentage of scheduled orders completed on time, plotted across alternating day and night shifts. In practice, I have seen far fewer run charts with too many runs than with too few.
Remembering the value 3 as the divisor for too few runs, you can approximate the critical value for too many runs by multiplying the series length by 2/3, taking the whole number part of that result and adding 3.
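The same kind of sketch works for the too-many-runs rule. The function name is hypothetical, and the code only computes the approximate value. How to compare an observed run count against it should follow the convention of the published table.

```python
def too_many_runs_limit(n):
    """Approximate critical value for too many runs.

    Multiply the series length by 2/3, keep the whole-number part, add 3.
    n is the number of points not on the median.
    """
    return (2 * n) // 3 + 3

limit_27 = too_many_runs_limit(27)  # 2/3 of 27 is 18, plus 3
```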