1.18: Variation: Normal Distribution, Range, and Standard Deviation
In the field of psychology, there are several ways to organize measurements of a trait, feature, or characteristic (i.e., variables). Qualitative data, such as ethnicity, can be tabulated into frequency counts that show the variety of groups in a sample or population and the proportion of cases in each. Researchers can perform a wider set of calculations on quantitative data. The mean, mode, and median, for instance, are measures of central tendency that identify a typical value of a variable within a given numerical data set. Likewise, there are a few approaches to estimating how far scores lie from each other, referred to as variability or variation, including range, variance, and standard deviation.
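As a rough illustration of the distinction above, the following sketch tabulates a qualitative variable and computes the three central tendency measures for a quantitative one. The group labels and scores are made-up values chosen only for the example, not data from any study.

```python
from collections import Counter
import statistics

# Hypothetical qualitative data: one category label per participant.
groups = ["A", "B", "A", "C", "A", "B"]
counts = Counter(groups)  # frequency count per group
proportions = {g: n / len(groups) for g, n in counts.items()}

# Hypothetical quantitative data: six test scores.
scores = [2, 4, 4, 5, 7, 8]
mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequent score

print(counts, proportions)
print(mean_score, median_score, mode_score)
```

With these illustrative values, the mean is 5, the median is 4.5, and the mode is 4.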
The range is the difference between the highest and lowest scores of a variable; it provides no details about the scores in between. A larger value denotes a wider spread of scores, but because only two scores enter the calculation, a single outlier can inflate the range and misrepresent the spread of the remaining scores. For these reasons, the range is considered a less precise measure of variation.
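The sensitivity of the range to outliers can be seen in a small sketch. The scores below are hypothetical, and `score_range` is a helper name introduced here only for illustration.

```python
def score_range(scores):
    """Return the difference between the highest and lowest score."""
    return max(scores) - min(scores)

# Hypothetical scores clustered around 50.
typical = [48, 50, 51, 52, 49]
# The same scores with one extreme value added.
with_outlier = typical + [95]

range_typical = score_range(typical)       # 52 - 48
range_outlier = score_range(with_outlier)  # 95 - 48

print(range_typical, range_outlier)
```

One added score changes the range from 4 to 47, even though the other five scores are unchanged.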
Researchers typically use variance to estimate the average squared distance of all scores in the sample or population from the mean. First, the mean is determined by dividing the sum of all the raw scores of a specific variable by the total number of scores in the sample. Subtracting the mean from each raw score produces a set of deviation scores, which are positive or negative depending on whether the score is higher or lower than the mean. Averaging the deviation scores directly is uninformative, because the positive and negative deviations cancel each other, so their sum is always zero. Squaring each deviation makes every value non-negative while still reflecting the distance between the mean and each data point. Totaling the squared deviations gives the sum of squares (SS).
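The steps described above can be sketched as follows, using a small set of hypothetical scores. The variable names are chosen for this example only.

```python
# Hypothetical raw scores for one variable.
scores = [2, 4, 4, 5, 7, 8]

# Step 1: mean = sum of raw scores / number of scores.
mean = sum(scores) / len(scores)

# Step 2: deviation score = raw score - mean (positive or negative).
deviations = [x - mean for x in scores]

# The deviations cancel out, so their sum carries no information about spread.
sum_of_deviations = sum(deviations)

# Step 3: square each deviation so that every value is non-negative.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: sum of squares (SS).
ss = sum(squared_deviations)

print(mean, deviations, sum_of_deviations, ss)
```

For these scores the mean is 5, the deviations are -3, -1, -1, 0, 2, and 3 (summing to zero), and the SS is 24.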
The SS is then divided by the total number of data points (N) when the variance describes the set of scores itself, or by the degrees of freedom (N-1) when the scores are a sample used to estimate the variance of a larger population. Dividing the sum of squared deviations in this way gives the variance, an aggregate estimate of the typical squared distance between the scores and the mean.
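A minimal sketch of the two divisors, assuming the same kind of hypothetical scores as a stand-in for real data. Python's standard `statistics` module is used only as a cross-check: `pvariance` divides by N and `variance` divides by N-1.

```python
import statistics

# Hypothetical raw scores.
scores = [2, 4, 4, 5, 7, 8]
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean (SS).
ss = sum((x - mean) ** 2 for x in scores)

# Dividing by N: variance of the scores treated as the full set of interest.
variance_n = ss / n

# Dividing by N-1 (degrees of freedom): estimate of the population
# variance when the scores are a sample.
variance_n_minus_1 = ss / (n - 1)

print(variance_n, statistics.pvariance(scores))
print(variance_n_minus_1, statistics.variance(scores))
```

Dividing by N-1 always yields the larger of the two values, and the gap shrinks as the number of scores grows.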
The square root of the variance is the standard deviation. Taking the square root reverses the squaring of deviations in the variance formula, which returns the measure to the original units of the scores. The standard deviation not only describes the general spread of scores in a population or sample, but is also used to express the distance of a particular score from the mean. If the scores follow a normal curve, the location of a score relative to the center of the curve indicates its likelihood of occurrence (probability).
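The sketch below shows one way to express a score's distance from the mean in standard deviation units (often called a z-score) and, under the assumption that scores are normally distributed, to convert that distance into a probability. The scores are hypothetical, `z_score` is a helper name introduced here, and the variance divides by N purely to keep the arithmetic simple.

```python
import math
from statistics import NormalDist

# Hypothetical raw scores.
scores = [2, 4, 4, 5, 7, 8]
n = len(scores)
mean = sum(scores) / n

# Variance (dividing by N here) and its square root, the standard deviation.
variance = sum((x - mean) ** 2 for x in scores) / n
sd = math.sqrt(variance)

def z_score(x, mean, sd):
    """Distance of score x from the mean, in standard deviation units."""
    return (x - mean) / sd

z = z_score(8, mean, sd)

# Assumption: scores follow a normal curve. The standard normal CDF then
# gives the proportion of scores expected to fall at or below this z-score.
p_at_or_below = NormalDist().cdf(z)

print(sd, z, round(p_at_or_below, 3))
```

Here the standard deviation is 2, so a score of 8 lies 1.5 standard deviations above the mean; under a normal curve, roughly 93% of scores would be expected to fall at or below it.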
Implications of Variation
A reduced range of scores in a sample or population generally corresponds to a lower variance. For example, data from females exhibit a lower spread than data from males in characteristics such as verbal performance, math performance, and height. In these cases, understanding the sources of the greater variation among males, such as environmental or biological factors, is as important as recognizing between-group differences.