5.6:

5-Number Summary

JoVE Core
Statistics

The 5-number summary of a dataset includes the minimum value, the first quartile, the median or second quartile, the third quartile, and the maximum value.

For example, consider the number of donuts sold every month in a small shop.

First, sort the data in ascending order: the first and last values are the minimum and maximum number of donuts sold in a month. The middle value of the sorted data is the median, which is also the second quartile.

The first and third quartiles are then calculated with a quartile formula, which completes the five components of the 5-number summary.

Together, these five values give an overview of the dataset, and each describes a specific part of the data: the median identifies the center, the lower and upper quartiles bound the middle half, and the minimum and maximum show the full range of the data.

The 5-number summary can be represented visually as a boxplot, which makes it easy to see the spread and range of the data and to spot outliers.


The 5-number summary of a dataset consists of the minimum, the first quartile, the median (second quartile), the third quartile, and the maximum. These five values can be visualized as a box and whisker plot.

In a box plot, the lower and upper whiskers extend from the box to the minimum and maximum data values. The first and third quartiles form the lower and upper edges of the box, and the median is drawn as a line inside the box, which is not necessarily at its center.
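To make the mapping from the five values to the parts of a box plot concrete, here is a minimal text-only sketch. The function name `ascii_boxplot`, the symbols it draws, and the five input numbers are all hypothetical choices for illustration; they do not come from the lesson.

```python
def ascii_boxplot(minimum, q1, med, q3, maximum, width=41):
    """Draw a one-line text box plot from a 5-number summary.

    Symbols (an arbitrary choice for this sketch):
      |  whisker ends (minimum and maximum)
      -  whiskers
      [ ] box edges (first and third quartiles)
      =  inside of the box
      M  median
    """
    span = maximum - minimum
    if span <= 0:
        raise ValueError("maximum must be greater than minimum")

    def col(x):
        # Map a data value to a character column between 0 and width - 1.
        return round((x - minimum) / span * (width - 1))

    line = ["-"] * width                 # start with whiskers everywhere
    for i in range(col(q1), col(q3) + 1):
        line[i] = "="                    # fill the box between Q1 and Q3
    line[0] = "|"                        # minimum
    line[-1] = "|"                       # maximum
    line[col(q1)] = "["                  # first quartile
    line[col(q3)] = "]"                  # third quartile
    line[col(med)] = "M"                 # median (overwrites an edge if they coincide)
    return "".join(line)


# Hypothetical 5-number summary of monthly donut sales.
plot = ascii_boxplot(95, 112.5, 127.5, 147.5, 180)
print(plot)
```

In this sketch the `M` usually does not land midway between `[` and `]`, which reflects that the median need not sit at the center of the box.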

This 5-number summary is handy for a quick understanding of the spread of the data and the identification of any outliers.
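The lesson does not say how outliers are identified. One widely used convention, assumed here rather than taken from the source, flags any value more than 1.5 times the interquartile range (Q3 minus Q1) below the first quartile or above the third quartile. The function names and the data below are hypothetical.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs of the k * IQR rule (an assumed convention)."""
    iqr = q3 - q1                        # interquartile range: width of the box
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values that fall outside the fences, in their original order."""
    low, high = outlier_fences(q1, q3, k)
    return [v for v in values if v < low or v > high]


# Hypothetical monthly donut sales with one unusually busy month.
monthly_sales = [95, 105, 110, 115, 120, 125, 130, 140, 145, 150, 160, 300]
q1, q3 = 112.5, 147.5                    # quartiles of this data (median-of-halves method)

print(outlier_fences(q1, q3))            # (60.0, 200.0)
print(find_outliers(monthly_sales, q1, q3))  # [300]
```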

Consider a dataset containing the number of donuts sold in a shop. To obtain the 5-number summary, the researcher first arranges the values in ascending order to determine the minimum, maximum, and median. The first and third quartiles are then calculated with the appropriate quartile formula. Finally, the five values can be used to construct a box and whisker plot.
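The procedure above can be sketched in a few lines of Python. The lesson does not state which quartile formula it uses, so this sketch assumes the median-of-halves convention (Q1 is the median of the lower half, Q3 the median of the upper half); other conventions can give slightly different quartiles. The function name and the sales figures are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a non-empty list of numbers.

    Assumption: quartiles follow the median-of-halves convention, and the
    middle value is left out of both halves when the count is odd.
    """
    data = sorted(values)                # ascending order
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")

    half = n // 2
    lower = data[:half]                  # values below the median position
    upper = data[half + (n % 2):]        # values above the median position

    # With a single value both halves are empty, so fall back to that value.
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]


# Hypothetical number of donuts sold in each of 12 months.
donuts_sold = [120, 95, 140, 110, 180, 150, 130, 105, 160, 125, 145, 115]
print(five_number_summary(donuts_sold))  # (95, 112.5, 127.5, 147.5, 180)
```

Sorting first means the minimum and maximum are simply the first and last elements, which matches the order of steps described in the text.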