John Beasley, MD and his colleagues at the University of Wisconsin Department of Family Medicine and Community Health and the Department of Industrial and Systems Engineering recently completed a research project. They investigated the burden of electronic medical record (EMR) systems on family practice physicians at the University of Wisconsin. This burden is one factor driving physician burn-out and dissatisfaction with medicine as a profession.

They used logs from the EMR, validated by direct observation of physician computer use. The researchers found a consistent pattern of “work after clinic”--time spent during evenings and weekends.

Physicians averaged about 10 hours a week in EMR work after clinic over the three-year period of the study.

This research is notable for two reasons. First, it made novel use of EMR logs to quantify the extent of the work after clinic phenomenon.

Second, it described three specific ideas that together could reduce primary care physician EMR work by at least as much as the 10 hours per week of work after clinic:

• Transcription with human assistance (Save 6+ hours each week).

• Paper/verbal order entry (Save 3+ hours each week).

• Automatic Log-in (Save 1+ hour each week).

What’s the prospect for successful adoption of these change ideas?

Dr. Chris Hayes has made the case that changes in health care practice are more likely to be adopted if they have relatively high perceived value to patients and providers and at the same time don’t add workload (see Chris’s web site and this 2015 article in BMJ Quality and Safety).

The perceived value and impact on time depend on each other: changes that reduce work seem likely to have more value to providers than changes that are work-neutral or, worse, add work.

Chris summarizes the situation with this picture:

By Hayes’ theory, the change ideas proposed in the UW research appear to be highly adoptable and sustainable once adopted. However, the two changes with the biggest impact require other people to do more work and be paid to do so.

In the current situation, physicians are working after clinic “for free.” Longer-term effects like fatigue and burn-out, which lead physicians to seek less than full-time positions or leave the profession altogether, are diffuse and don’t show up in a regular bi-weekly cost statement. Adding cost for support staff, on the other hand, is easy to recognize and resist in a world of cost management.

So administrators and physician leaders will have to convince themselves of the business case for the package of changes.

How to make the business case?

Use the Model for Improvement:  Run tests, starting on a small scale—involve one physician, over one or two days, to iron out logistics. Then test the changes for longer periods of time, with more providers. Measure the impact on physician time using the EMR logs, costs for support staff, and physician perception. Have the administrators and physician leaders observe the tests themselves to inform their decisions.


In the previous post I described an exercise that uses Galton’s Quincunx.

The challenge: maximize the number of results in a five-value range, over three rounds of 20 drops of the Quincunx. Based on the original exercise devised by Dr. Rob Stiratelli, the exercise starts with the “aim” of the Quincunx at the low end of the range. And at the beginning of the second and third rounds, the device has a calibration problem that shifts the aim three or four units.

Last week, we ran the exercise with four teams in a workshop. Groups A, B and C used a simulation I wrote in R and Group D used the modern Quincunx shown at left.

The exercise's participant instruction sheet is available here.

Results of the exercise


As I stated in the first post, if you can aim the funnel at the center of the desired range, you usually can get at least 16 of 20 values to fall in that range.

All four teams realized on Round 1 that the system was running on the low side of the range and made adjustments to move the aim closer to the center of the range.

However, the hidden change in the aim at the start of Round 2 caused confusion and uncertainty. Teams that had been getting almost all values in the desired range now were getting values outside the range, even though the nominal position of the meter setting was the same.

With some testing and debate, teams recovered and by the end of the second round were again getting results mostly in the desired range.

The change at the start of Round 3 caused the same challenges—initial hesitation and lack of confidence in the relationship between meter setting and output gave way to better performance, after adjustments to the aim.

Team B had the best results, but this seems to be related to a problem in the simulation (see below).

Team Lessons from the Exercise

1. No teams made a run chart of the results and the meter values. They looked at the table of numbers and did their best to look at average results and the impact of the meter setting. Our management exercise followed a 90-minute presentation and practice with run charts.  The failure to make run charts is a sobering reminder that it takes repetition and presence of mind to apply improvement methods and tools when under pressure to perform, even in a training-room exercise.

2. No team got perfect results because the Quincunx system as designed is not capable of regularly producing values in a five-point range. Team best efforts, management incentives, and public scorecards don’t change the underlying structure.

3. No teams systematically tested the relationship between meter setting and output. Experimentation comes at the cost of possibly poor results. I did not hear any clear discussion of how to test in the face of uncertain outcomes.

4. The teams did not cooperate; no one sent any representatives to other tables to try to learn what strategies seemed promising.
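Lesson 1 mentions run charts. As a minimal sketch of what a team could have checked, the Python below (not part of the workshop materials) computes the median of a round's results and the longest run of consecutive points on one side of it. It assumes the common run chart convention that six or more consecutive points on one side of the median signal a shift, and that points exactly on the median are skipped. The function name and the data are made up for illustration.

```python
from statistics import median


def longest_run(values):
    """Length of the longest run of consecutive points on one side of the median.

    Points exactly on the median are skipped: they neither extend
    nor break a run (an assumed, commonly used convention).
    """
    center = median(values)
    best = current = 0
    last_side = 0
    for v in values:
        if v == center:
            continue
        side = 1 if v > center else -1
        current = current + 1 if side == last_side else 1
        last_side = side
        best = max(best, current)
    return best


if __name__ == "__main__":
    # Made-up results: ten drops near 50, then ten drops after a hidden slip.
    results = [50, 49, 51, 50, 52, 49, 50, 51, 48, 50,
               47, 46, 48, 45, 47, 46, 47, 45, 46, 47]
    run = longest_run(results)
    print(run, "shift signal" if run >= 6 else "no shift signal")
```

Plotting the same values in time order with the median drawn as a line would show the same shift at a glance, which is the point of the run chart.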

Quincunx Properties: Variation Concepts

1. The Quincunx device shows that

a. Variation in results arises from variation in funnel position (an input measured by the meter setting) and system structure, represented by the pins.

b. In other words, variation in input and system structure causes the results to vary.

c. If we can study and identify how the changes in inputs and system conditions drive the variation in results, we can work “upstream” to reduce this variation in a cost-effective way.

2. A Model for Common Causes


a. The structure of the pins provides a physical model for what we call “common causes of variation.”

b. The built-in variation of the pins drives variation in results.

c. Each pin contributes a small but meaningful amount of variation to the results.

d. We can describe the variation that results from the pattern of pins--we expect to see a range of plausible values, without systematic patterns.

3. A Model for Special Causes

a. We can assign the specific movement of the funnel to changes in results so this movement will serve as our model for “special causes of variation.” Walter Shewhart, the inventor of control charts, used the term “assignable causes” to indicate that we may be able to assign a specific cause to this class of variation.

b. Movement of the funnel causes variation in results on top of the variation that arises from the pins.

c. The movement of the funnel can be relatively large or small; the larger the movement, the easier it is to detect.

d. As a useful approximation, the total variation in results is composed of variation from movement of the funnel plus the variation contributed by the pins.

4. General Description of “Common Causes of Variation”

Common causes of variation are system conditions and inputs with the following properties:

a. Variation of the common causes drives variation in system results.

b. Variation in the common causes (and hence variation in the results) is built in to the current structure of the system.

c. Each common cause contributes a small portion of the variation in the results.

d. Common causes of variation combine to give a range of plausible values in the results, without systematic patterns or unusual values.

e. In practice, we can mimic “variation without systematic patterns or unusual values” by models of randomness. Also, we define system variation (with respect to a particular type of result) in terms of common causes of variation.

5. General Description of “Special Causes of Variation”

Special causes of variation are system conditions and inputs with the following properties:

a. Variation of special causes leads to variation in system results.

b. The variation in results from special causes is added to the variation that comes from common causes.

c. If the variation in special causes crosses a threshold, we will see patterns of variation or unusually large or small values in the results.

d. In practice, we say we have “evidence of special causes” when we detect patterns of variation or unusual values in system results. Then, using system knowledge, we often can match the variation in the results with variation in system conditions or inputs. In such a case, we say we have identified one or more special causes of variation.

A control chart is the primary tool to help distinguish common causes of variation from special causes. See for example L.P. Provost and S.K. Murray (2011), The Health Care Data Guide: Learning from Data for Improvement, Jossey-Bass: San Francisco, especially chapters 4-8.
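To make the control chart idea concrete, here is a minimal Python sketch of one common chart type, the individuals (XmR) chart. It assumes the standard limits of the center line plus or minus 2.66 times the average moving range, calculated from a baseline period. The function names and the numbers are made up for illustration and are not results from the workshop.

```python
from statistics import mean


def individuals_limits(baseline):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)  # standard XmR constant
    return center - spread, center, center + spread


def points_outside(values, baseline):
    """Indexes of values that fall outside limits computed from the baseline."""
    lower, _, upper = individuals_limits(baseline)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


if __name__ == "__main__":
    baseline = [48, 49, 47, 48, 50, 48, 46, 49, 48, 47]  # made-up stable period
    later = [45, 44, 43, 46, 42]                         # made-up drops after a slip
    print(individuals_limits(baseline))
    print(points_outside(later, baseline))
```

Points outside the limits are evidence of special causes; in Quincunx terms, evidence that the funnel has moved rather than that the pins are doing what they always do.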

If you don’t have a physical Quincunx device, you can use the quincunx function in the animation package in R, which shows the pin variation:

My Simulation Software

My R simulation is an R shiny web app that mimics the physical Quincunx, available at https://iecodesign.shinyapps.io/Quincunx_shiny/ .

The code for the simulation is available at https://github.com/klittle314/Stiratelli_quincunx

The Admin-T and Admin-P tabs show a table and plot, respectively, for the simulation results. I do not show these tabs to participants during the simulation.

To generate a value for a given Meter Setting, the system manager clicks the “Tell System to Get Ready!” button and then clicks the “Get value” button.

The meter setting of 30 corresponds to an output value of 48, on the low end of the desired range 48-52.

After generation of 20 values, the meter setting “slips”: if the average of the first 20 values is less than 50, the meter value is offset three units lower. Otherwise, the meter is offset three units higher. Similarly, after the next 20 values, the meter slips again: if the average of the next 20 values is less than 50, the meter value is offset four units lower. Otherwise, the meter is offset four units higher.
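The slip rule can be sketched in a few lines. This is a Python illustration of the rule as described above, not the R code of the Shiny app. The names aim, slipped_offset and INITIAL_OFFSET are hypothetical, and the 1:1 mapping from meter setting to output is taken from the statement that a meter setting of 30 corresponds to an output value of 48.

```python
from statistics import mean

TARGET_CENTER = 50    # middle of the desired range 48-52
INITIAL_OFFSET = 18   # meter 30 -> output 48, so aim = meter + 18 (assumed 1:1)


def aim(meter, offset):
    """Output value the funnel points at for a given meter setting."""
    return meter + offset


def slipped_offset(offset, last_round, slip_size):
    """Apply the hidden slip after a round of results.

    If the round averaged below 50 the aim drops by slip_size units;
    otherwise it rises by slip_size units.
    """
    if mean(last_round) < TARGET_CENTER:
        return offset - slip_size
    return offset + slip_size


if __name__ == "__main__":
    round_1 = [48] * 20   # made-up results with the meter left at 30
    offset = slipped_offset(INITIAL_OFFSET, round_1, slip_size=3)
    print(aim(30, offset))   # 45: the same meter setting now aims lower
    round_2 = [51] * 20   # made-up results after the team corrects upward
    offset = slipped_offset(offset, round_2, slip_size=4)
    print(aim(30, offset))   # 49
```

Because the direction of the slip depends on the team's own average, a team that has centered the system well can still be pushed off target in either direction.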

Here’s a graph of 60 results from the Admin-P tab of the web app, with no adjustment to the meter (hence, no adjustment to the aim of the Quincunx funnel.) You can see how the system “center” changes during each set of 20.

A problem with the simulation software

If you lose connection to the server in the middle of a simulation, the web app restarts—there is no persistent memory in the version I used last week.

This restart phenomenon accounts for Team B’s good performance on rounds 2 and 3: the laptop used with this group repeatedly lost the connection to the server, so they worked with a system that never experienced a “slip” in the meter. Once they had learned that a meter value of 32 was about right, they could just keep that setting and get pretty good results.

To avoid the problem of reset, you can run the simulation locally or edit the code to allow for persistent storage, e.g. https://shiny.rstudio.com/articles/persistent-data-storage.html.


In the late 19th century, Francis Galton introduced the ideas and tools of regression and correlation to the world, derived from his studies of inherited characteristics.

To help his thinking and explanations, he built an analog simulation device, called the Quincunx. The picture at left is from Galton’s lucid description of the Quincunx on pp. 63-65 of his book Natural Inheritance (available in facsimile at http://galton.org).

The device consists of a chamber at the top that holds small metal balls and a funnel that directs the balls to drop through a field of pins, symmetrically placed.

Quincunx is a Latin word that names an arrangement of five points, like the pattern of the pips representing the number five on a common six-sided die. This pattern, repeated, gives the field of pins in the device.

As the illustration from Galton’s book shows and Galton explicitly observed, the balls falling through the field of pins will pile up in a shape that looks like a “normal” distribution.
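A short simulation shows the same pile-up. This is a minimal Python sketch, assuming each pin sends the ball half a slot left or right with equal probability and assuming ten rows of pins; the function names are hypothetical and the code is independent of any R simulator.

```python
import random
from collections import Counter

ROWS = 10  # rows of pins; an assumption for this sketch


def drop_ball(rng, rows=ROWS):
    """Return the slot where one ball lands, counted from the center (0).

    With an even number of rows, the half-slot deflections add up
    to a whole number of slots.
    """
    rights = sum(rng.random() < 0.5 for _ in range(rows))
    return rights - rows // 2


def pile_up(n_balls, seed=0):
    """Drop n_balls and count how many land in each slot."""
    rng = random.Random(seed)
    return Counter(drop_ball(rng) for _ in range(n_balls))


if __name__ == "__main__":
    counts = pile_up(500)
    half = ROWS // 2
    for slot in range(-half, half + 1):
        # One star per two balls gives a rough text histogram.
        print(f"{slot:+3d} {'*' * (counts[slot] // 2)}")
```

The counts follow a binomial distribution, which is what gives the pile its bell-like outline.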


Dr. Rob Stiratelli taught me to use a Quincunx model slightly different from Galton’s version—Rob’s modern version, pictured at left, has a funnel that an operator can move left or right, changing the aim of the dropped bead. The operator of the modern Quincunx also can drop one bead at a time rather than dumping a whole collection at once as in Galton's model.

Rob used the Quincunx to help people understand the difference between special and common causes of variation.

While these two types of variation are the basis for control charts, Rob’s exercise aimed to get people to grasp the essence of acting rationally when faced with system results that vary.

Rob’s exercise poses a challenge: maximize the number of beads that fall in a specific range of five consecutive slots. That’s not too hard to do if you can see the funnel and get the funnel aimed at the middle of that range. Given ten rows of pins and a funnel aimed at the center of the range in the Quincunx pictured here, you will consistently achieve at least 16 out of 20, and frequently 18 of 20.
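The "16 out of 20" figure can be checked with a little arithmetic. The sketch below is Python and assumes an idealized Quincunx: ten rows of pins, each sending the bead left or right with probability 1/2, so that the landing slot follows a binomial distribution. The function names are hypothetical.

```python
from math import comb

ROWS = 10        # rows of pins, as stated in the text
DROPS = 20       # beads per round
HALF_WIDTH = 2   # a five-slot range is the center slot plus two on each side


def prob_in_range(rows=ROWS, half_width=HALF_WIDTH):
    """Chance that one bead lands within half_width slots of the aim."""
    center = rows // 2
    ways = sum(comb(rows, k)
               for k in range(center - half_width, center + half_width + 1))
    return ways / 2 ** rows


def prob_at_least(successes, drops=DROPS):
    """Chance that at least `successes` of `drops` beads land in range."""
    p = prob_in_range()
    return sum(comb(drops, k) * p ** k * (1 - p) ** (drops - k)
               for k in range(successes, drops + 1))


if __name__ == "__main__":
    print(prob_in_range())     # 912/1024, about 0.89
    print(prob_at_least(16))
    print(prob_at_least(18))
```

Under these assumptions a single bead lands in range about 89% of the time, the chance of at least 16 of 20 comes out at roughly 94%, and the chance of at least 18 of 20 at a bit over 60%. A physical device will differ somewhat from the idealized model.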

But Rob made things more difficult, with two twists.

First, he covered up the front of the Quincunx. The only thing the audience could see was the bottom row of the display, where the slots were numbered. A five-slot target range could be numbered 48 to 52.

To give people a little help, Rob described the position of the funnel by means of a “meter value” in 1:1 relationship with the position.

The exercise starts with the funnel centered at the low end of the target range. In the setup I've used for years, the initial funnel position corresponds to a meter value of 30. Move the meter to 32 and the funnel will be centered at the middle of the target range.

Rob added one more twist. After the first 20 beads are dropped, he surreptitiously offset the funnel position—slipping the funnel left or right 3 positions, without telling the audience (remember, the Quincunx’s funnel is hidden from view.)

For example, at the beginning of the second round of 20 bead drops, a meter value of 30 will now point the funnel at outcome slot 45. At the beginning of the third round of 20, Rob slipped the funnel left or right 4 positions.

Teams are often confused by the offset, which mimics the fact that human-engineered systems never behave the same for long without regular maintenance and attention. How should managers respond?  There’s a tension between calling for an adjustment when the funnel is stationary (variation from the pins only) and waiting too long to make an adjustment for a system that has moved off target—either way on average leads to a lower total score.
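One side of this tension, adjusting when the funnel has not moved, can be illustrated with a small simulation. The Python sketch below assumes an idealized ten-row Quincunx with a stationary funnel already aimed at 50, and compares leaving the funnel alone with a rule that compensates for the last error after every bead (the rule Deming's funnel experiment calls rule 2, named here as an assumption, not as Rob's exercise). All names are hypothetical.

```python
import random

ROWS = 10            # rows of pins (assumed)
TARGET = 50          # center of the desired range
LOW, HIGH = 48, 52   # desired five-slot range


def pin_noise(rng):
    """Offset from the aim, in slots, caused by the pins alone."""
    return sum(rng.random() < 0.5 for _ in range(ROWS)) - ROWS // 2


def hit_rate(adjust_every_drop, drops=20000, seed=1):
    """Fraction of beads landing in range when the true funnel never slips."""
    rng = random.Random(seed)
    aim = TARGET
    hits = 0
    for _ in range(drops):
        result = aim + pin_noise(rng)
        hits += LOW <= result <= HIGH
        if adjust_every_drop:
            # Move the funnel to cancel the last bead's deviation from target.
            aim -= result - TARGET
    return hits / drops


if __name__ == "__main__":
    print("leave alone:       ", hit_rate(False))
    print("adjust every bead: ", hit_rate(True))
```

With a stationary funnel, chasing each result adds the previous bead's pin variation to the next one, so the adjusted system scores worse. The sketch does not cover the other side of the tension, waiting too long after a real slip.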

I’ve continued to use Rob’s exercise with my clients.

In the next post, I’ll summarize the results from a training class running next week and link to the workshop materials I’m using.

Computer Simulation or Physical Device?

You can buy a version of Galton’s device (e.g. http://www.qualitytng.com/). I am fortunate enough to own three of these devices, a legacy from a Ford Motor Company training project a number of years ago.

The R statistical language has a package, animation, that includes a Quincunx simulator. I thought about using that simulator for Rob’s exercise but it turned out to be easier to build my own version, as a Shiny app. I’ll use it for the first time in public next week at the training class and then post the code to GitHub.

I like having a physical Quincunx in class though it is heavy and awkward to transport.

Here’s why:

The physical version allows you to easily make changes to the system to see what happens—tilting the device or blocking off channels in the pin block with tape or paper. These changes are easy for anyone to see, requiring no knowledge of R code.

More importantly, the physical device provokes a discussion of why the outcomes of the beads dropping through the pins vary. Usually, someone in the class will say “it’s just random variation”, which allows us to ponder the difference between math models of a system and the system itself.

That’s an important lesson any day but perhaps particularly so this month in the aftermath of our recent U.S. national election.
