The Box Plot
- The box plot is a graphical tool used to analyse the shape, spread and outliers of a numerical distribution.
- It consists of a box with its bottom edge at quartile 1 (Q1) and its top edge at quartile 3 (Q3), a line through the box at the median, and a line (whisker) extending from each end of the box to the smallest and largest values that lie within the lower and upper fences.
- If the median line is in the middle of the box, the distribution is approximately symmetric. If it is closer to the bottom of the box, the distribution is positively skewed; if it is closer to the top of the box, the distribution is negatively skewed.
- If the distribution has any outliers, they are shown as dots or crosses at their respective values on the y-axis, in line with the box.
Example
Note: it is also acceptable to draw box plots horizontally, in which case the values are read from the x-axis.
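The parts of a box plot described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive method: the notes do not say how the fences are calculated, so the common 1.5 x IQR rule is assumed here, and quartiles are found with the median-of-halves convention (other conventions give slightly different values). The function names, the tolerance used to call a box "approximately symmetric" and the dataset are all hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) by the median-of-halves convention.

    Assumes at least two values; for an odd count the middle value
    is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data), median(data[-half:])


def box_plot_parts(values, k=1.5):
    """Work out the box, fences, whisker ends and outliers.

    Assumption: fences sit k * IQR beyond the quartiles (k = 1.5).
    """
    q1, med, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers stop at the most extreme values still inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if not lower_fence <= v <= upper_fence)
    return {
        "box": (q1, med, q3),
        "fences": (lower_fence, upper_fence),
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }


def describe_shape(q1, med, q3, tolerance=0.1):
    """Rough shape from where the median sits in the box (heuristic)."""
    if q3 == q1:
        return "cannot tell (box has no height)"
    position = (med - q1) / (q3 - q1)  # 0 = at Q1, 1 = at Q3
    if abs(position - 0.5) <= tolerance:
        return "approximately symmetric"
    return "positively skewed" if position < 0.5 else "negatively skewed"


# Hypothetical dataset, chosen only for illustration.
data = [1, 2, 2, 2, 2.5, 3, 4, 5, 12]
parts = box_plot_parts(data)
shape = describe_shape(*parts["box"])
print(parts)
print(shape)
```

For this dataset the value 12 lies above the upper fence, so it would be drawn as a separate dot and the upper whisker would stop at 5.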
Parallel Box Plots
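- Parallel box plots consist of multiple box plots drawn against a common scale, placed side by side (or one above the other if they are drawn horizontally).
- Each box plot is given a label, listed along the axis that does not carry the scale (the x-axis for vertical box plots).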
- This visualisation allows for comparison between multiple datasets on the basis of shape, position, centre and spread.
Example
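The comparison that parallel box plots support can also be made numerically. The sketch below computes a measure of centre (median) and two measures of spread (IQR and range) for each group so they can be compared side by side. The group names, the data and the function name are hypothetical, and quartiles use the median-of-halves convention as an assumption.

```python
from statistics import median


def centre_and_spread(values):
    """Median, IQR and range for one dataset (needs at least two values)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])    # median of the lower half
    q3 = median(data[-half:])   # median of the upper half
    return {"median": median(data), "iqr": q3 - q1, "range": data[-1] - data[0]}


# Hypothetical datasets, one per box plot.
groups = {
    "Class A": [52, 58, 61, 64, 67, 70, 75],
    "Class B": [40, 55, 63, 66, 72, 81, 90],
}

summaries = {name: centre_and_spread(v) for name, v in groups.items()}
for name, s in summaries.items():
    print(f"{name}: median={s['median']}, IQR={s['iqr']}, range={s['range']}")
```

Here the two groups have similar centres (medians of 64 and 66), but Class B is far more spread out (IQR of 26 against 12), which would show up as a much taller box.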
Five Number Summary
- The five number summary of a distribution is the minimum, quartile 1, the median, quartile 3 and the maximum.
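Note: the values are always listed in this order, from smallest to largest.
Note: the minimum and maximum include any outliers, so if outliers are present they are read from the most extreme outlier rather than from the end of the whisker.
- The five number summary is most easily read off a box plot, though you will also need to know how to find these values from other displays and from raw datasets.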
Example
The five number summary for the above box plot is: 1, 2, 2.5, 3.5, 10
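Finding the five number summary from a raw dataset can be sketched as follows. This is an illustration under assumptions: the dataset and function name are hypothetical (the data is not taken from the box plot above), and quartiles use the median-of-halves convention, which may differ slightly from the convention used by a calculator or other software.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    data = sorted(values)
    half = len(data) // 2
    return (
        data[0],               # minimum, even if it is an outlier
        median(data[:half]),   # Q1: median of the lower half
        median(data),          # median of the whole dataset
        median(data[-half:]),  # Q3: median of the upper half
        data[-1],              # maximum, even if it is an outlier
    )


# Hypothetical raw dataset.
scores = [7, 3, 9, 4, 25, 5, 6, 8]
summary = five_number_summary(scores)
print(summary)  # (3, 4.5, 6.5, 8.5, 25)
```

The value 25 would be an outlier under the 1.5 x IQR rule, yet it is still reported as the maximum, in line with the note above.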