Hi Kathy,

Thanks for this question. It comes up fairly often.

John Tukey, father of exploratory data analysis, invented box plots in 1970, long before computers with graphical user interfaces were available, let alone commonplace. Datasets were often small enough to be dealt with by hand. Tukey wanted graphical representations that could easily be created and understood with just paper and pencil. Thus his method for constructing a box plot required almost no computation: you “remove” the median value from the distribution, then take Q1 and Q3 as the medians of the lower and upper halves that remain.

But you probably knew all that! 😉

Tukey’s method comes up with an *approximation* to the 25th or 75th percentile of the distribution. CODAP (and Fathom) instead calculate the *actual* 25th or 75th percentile for their box plots. (See Wikipedia for computational methods.) This seems appropriate in an age when doing things by hand is rarely called for. And, of course, with any reasonably sized dataset, the two methods yield nearly, if not exactly, the same result.
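
If it helps to see the difference concretely, here is a minimal Python sketch of the two approaches. The function names are my own, and the interpolation rule is an assumption: I am using the "inclusive" linear interpolation from Python's standard library as a stand-in, which may not match exactly what CODAP or Fathom do internally.

```python
from statistics import median, quantiles


def quartiles_by_halves(values):
    """Paper-and-pencil style: set the median aside (when n is odd),
    then take the median of each remaining half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]  # skips the middle value when n is odd
    return median(lower), median(upper)


def quartiles_interpolated(values):
    """25th and 75th percentiles by linear interpolation.
    Assumption: the 'inclusive' rule stands in for the software's method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


small = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(quartiles_by_halves(small))     # (2.5, 7.5)
print(quartiles_interpolated(small))  # (3.0, 7.0)

large = list(range(1, 1002))          # the integers 1 through 1001
print(quartiles_by_halves(large))     # (250.5, 751.5)
print(quartiles_interpolated(large))  # (251.0, 751.0)
```

With nine values the two methods disagree by half a unit, which is easy to see on a box plot. With 1001 values the disagreement is still half a unit, but relative to the spread of the data it is negligible.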

As an aside, I don’t find box plots useful for characterizing a single distribution. They come into their own when comparing two or, especially, many distributions.

Bill