# Variation: Normal Distribution, Range, and Standard Deviation

JoVE Core
Social Psychology

During a data collection project, a student gathers the heights of adult men in their city. Back in the classroom, the student graphs the frequency of heights in the sample. The resulting curve is bell-shaped, with a single peak at the center, where the mean lies. The graph is also symmetric, with half of the individuals taller and half shorter than the average. This shape is referred to as a normal distribution, or normal curve.

While a single summary value, such as the mean, is crucial to the analysis of these results, so too is the variation. Defined as the dispersion of measurements within a data set, variation describes the spread of results, giving a sense of the distance between points.

To evaluate the variation, the student first calculates the range of the results, which is the difference between the highest and lowest heights. Although the range describes the spread of the data, it can be dramatically affected by outliers, like the school's tallest basketball player, and it does not show how measurements are positioned around the mean.

To address this, the student uses an equation to compute a second gauge of variation, the standard deviation: roughly, the average amount by which measurements differ from the mean. Here, the standard deviation is two and a half inches, so men are, on average, about two and a half inches shorter or taller than the mean. Based on the properties of a normal distribution, about 68% of individuals fall within one standard deviation above or below the mean. This rises to about 95% for two standard deviations, here five inches above or below the average height, and to about 99.7% for three standard deviations.

Importantly, the lower the standard deviation, the more tightly results cluster around the mean, which produces a tall and narrow normal curve. So, a data set with a small standard deviation has low variation. Thus, means for measurements with low variability are more likely to be a reliable representation of the sample than those derived from results with high variation, which may be disproportionately affected by outliers.
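The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below is only an illustration: it uses the standard result that the share of a normal distribution within k standard deviations of the mean equals erf(k/√2). The mean height of 70 inches is an assumption made for the example, since the transcript states only the standard deviation (2.5 inches).

```python
import math


def coverage_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Assumption: the mean is hypothetical; only the 2.5-inch SD comes from the text.
mean_height = 70.0
sd_height = 2.5

for k in (1, 2, 3):
    low = mean_height - k * sd_height
    high = mean_height + k * sd_height
    print(f"within {k} SD ({low:.1f} to {high:.1f} in): {coverage_within(k):.1%}")
```

The printed shares round to the familiar 68%, 95%, and 99.7%, and the interval for two standard deviations spans five inches on either side of the assumed mean.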

## Variation: Normal Distribution, Range, and Standard Deviation

In the field of psychology, there are several ways to organize measurements of a trait, feature, or characteristic (i.e., a variable). Qualitative data, such as ethnicity, can be tabulated into frequency counts to show the proportion and variety of groups in a sample or population. Quantitative data, on the other hand, support a wider set of calculations. The mean, mode, and median, for instance, are measures of central tendency that identify a typical value of a variable within a numerical data set. Likewise, there are a few approaches to estimating the distance of scores from each other, referred to as variability or variation, including the range, variance, and standard deviation.
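As a rough illustration of the distinction above, the following sketch tabulates a qualitative variable into frequency counts and computes the three central tendency measures for a quantitative one. All values and the labels `group_a`, `group_b`, and `group_c` are made up for the example; only Python's standard library is used.

```python
from collections import Counter
import statistics

# Hypothetical qualitative data: group membership of six participants.
groups = ["group_a", "group_b", "group_a", "group_c", "group_a", "group_b"]
counts = Counter(groups)
proportions = {name: count / len(groups) for name, count in counts.items()}

# Hypothetical quantitative data: eight test scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value (mean of the two central scores)
mode_score = statistics.mode(scores)      # most frequent value

print(counts)
print(proportions)
print(mean_score, median_score, mode_score)
```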

### Range

The range is the difference between the highest and lowest scores of a variable; it provides no details about the scores in between. A high value denotes a wider spread of scores, but a single outlier can inflate the range and lead to misinterpretation. For these reasons, the range is considered a less precise measure of variation.
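The sensitivity of the range to outliers can be seen in a small sketch. The heights below are hypothetical, and `value_range` is a helper name introduced only for this example.

```python
def value_range(scores):
    """Difference between the highest and lowest score."""
    return max(scores) - min(scores)


# Hypothetical heights in inches.
heights = [66.0, 67.5, 68.0, 69.0, 70.0, 71.0, 72.5]
range_without_outlier = value_range(heights)

# Adding one unusually tall person changes the range dramatically,
# even though the other seven measurements are unchanged.
heights_with_outlier = heights + [84.0]
range_with_outlier = value_range(heights_with_outlier)

print(range_without_outlier, range_with_outlier)  # 6.5 18.0
```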

### Variance

Researchers typically use variance to estimate the average squared distance of all scores in the sample or population from the mean. First, the mean is determined by dividing the sum of all the raw scores of a specific variable by the total number of scores in the sample. Subtracting the mean from each raw score produces a set of deviation scores comprising both positive and negative values, depending on whether each score is higher or lower than the mean. Averaging the deviation scores directly is uninformative, because the positive and negative deviations cancel each other out and always sum to zero. Squaring the deviations makes every value non-negative while still reflecting the distance between the mean and each data point. Totaling the squared deviations gives the sum of squares (SS).
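The steps above can be sketched in a few lines. The scores are hypothetical and chosen so that the arithmetic is easy to follow by hand.

```python
# Hypothetical raw scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: mean = sum of raw scores / number of scores.
mean = sum(scores) / len(scores)

# Step 2: deviation scores (positive above the mean, negative below it).
deviations = [x - mean for x in scores]

# Step 3: the raw deviations cancel out, so their sum carries no information.
sum_of_deviations = sum(deviations)

# Step 4: squaring removes the signs; the total is the sum of squares (SS).
sum_of_squares = sum(d ** 2 for d in deviations)

print(mean, sum_of_deviations, sum_of_squares)  # 5.0 0.0 32.0
```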

The SS is then divided by one of two quantities: the total number of data points (N) when the variance describes only the scores at hand, or the degrees of freedom (N - 1) when the scores are a sample used to estimate the variance of a larger population. Dividing the sum of squared deviations in this way provides an aggregate estimate of the general distance between the scores and the mean.
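The two divisors can be compared directly. This sketch reuses a hypothetical set of eight scores and checks the hand computation against `statistics.pvariance` (divides by N) and `statistics.variance` (divides by N - 1) from Python's standard library.

```python
import statistics

# Hypothetical raw scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Divide by N: variance of exactly these scores.
variance_n = sum_of_squares / n

# Divide by N - 1: estimate of the variance of the population the sample came from.
variance_n_minus_1 = sum_of_squares / (n - 1)

print(variance_n, statistics.pvariance(scores))
print(variance_n_minus_1, statistics.variance(scores))
```

The N - 1 version is always slightly larger, and the gap shrinks as the number of scores grows.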

### Standard Deviation

The square root of the variance is the standard deviation. Taking the square root undoes the squaring of deviations in the variance formula and returns the result to the original units of measurement. The standard deviation not only describes the general spread of scores in a population or sample, but is also used to express the distance between a particular score and the mean. If the scores follow a normal curve, the location of a score relative to the center of the curve relates to its likelihood of occurrence (probability).
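A minimal sketch of these ideas follows, using hypothetical scores. The distance of a score from the mean in standard deviation units is commonly called a z-score, a term not used in the text above. The probability step assumes the scores are normally distributed, which a set of eight scores cannot establish; `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist

# Hypothetical raw scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n
variance = sum((x - mean) ** 2 for x in scores) / n

# Standard deviation: square root of the variance, back in the original units.
sd = math.sqrt(variance)

# Distance of one particular score from the mean, in standard deviation units.
score = 9
z = (score - mean) / sd

# Assumption: normality. Share of scores expected at or below this z-score.
share_at_or_below = NormalDist().cdf(z)

print(sd, z, round(share_at_or_below, 3))
```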

### Implications of Variation

A reduced range of scores in a sample or population generally corresponds to a decrease in variance. For example, data from females exhibit a lower spread than data from males in characteristics such as verbal performance, math performance, and height. In these cases, understanding the sources of the variation among males, such as environmental or biological factors, is as important as recognizing between-group differences.