
6.14: Central Limit Theorem

Consider the dot plots for populations with normal and uniform distributions.

The distribution of sample means for different sample sizes approaches a normal distribution as the sample size increases; this is the core principle of the central limit theorem.

Although the mean of the sample means is the same as the population mean, its standard deviation is smaller than the population standard deviation.

However, this result may not hold for populations that are not normal when the sample size is 30 or less; in that case, a larger sample is needed.

By knowing that the sample means are normally distributed, one can carry out better statistical analyses using the properties of the normal distribution.

For example, the empirical rule that applies to the normal distribution helps determine the probability that a group of people has a mean weight within one, two, or three standard deviations of the mean of the sample means.

These values can also be standardized into z-scores. For instance, one could determine the probability that a group of randomly selected people has a mean weight of less than 80 kg.
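As a sketch of that kind of calculation, the Python snippet below assumes a hypothetical population mean weight of 75 kg, a population standard deviation of 12 kg, and a sample of 36 people; these numbers are illustrative and do not come from the text above. It standardizes a sample mean of 80 kg into a z-score and looks up the probability with the standard normal cumulative distribution function.

from math import sqrt
from scipy.stats import norm

mu = 75.0     # hypothetical population mean weight (kg)
sigma = 12.0  # hypothetical population standard deviation (kg)
n = 36        # hypothetical sample size

standard_error = sigma / sqrt(n)   # standard deviation of the sample means
z = (80.0 - mu) / standard_error   # z-score for a sample mean of 80 kg
p = norm.cdf(z)                    # P(sample mean < 80 kg) under the CLT

print(f"z = {z:.2f}, P(mean weight < 80 kg) = {p:.4f}")

With these assumed values, z = 2.50 and the probability is about 0.99; different assumed values would of course give different results.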


The central limit theorem, abbreviated as CLT, is one of the most powerful and useful ideas in all of statistics. The central limit theorem for sample means says that if you repeatedly draw samples of a given size, calculate their means, and create a histogram of those means, the resulting histogram will tend to have an approximately normal bell shape. In other words, as sample sizes increase, the distribution of the sample means follows the normal distribution more closely.
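A minimal simulation sketch of this behavior, assuming an illustrative uniform population on [0, 100], sample sizes of 5 and 50, and 10,000 repeated samples (all of these choices are hypothetical, not taken from the text):

import numpy as np

rng = np.random.default_rng(0)

for n in (5, 50):
    # Draw 10,000 samples of size n from a uniform(0, 100) population,
    # one sample per row, and record each sample's mean.
    samples = rng.uniform(0, 100, size=(10_000, n))
    means = samples.mean(axis=1)
    print(f"n = {n:2d}: mean of sample means = {means.mean():.2f}, "
          f"std of sample means = {means.std(ddof=1):.2f}")

A histogram of the recorded means becomes increasingly bell-shaped as n grows, even though the underlying population is uniform rather than normal, and its spread shrinks roughly like the population standard deviation divided by the square root of n.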

The sample size, n, that is required to be "large enough" depends on the original population from which the samples are drawn (the sample size should be at least 30, or the data should come from a normal distribution). If the original population is far from normal, then more observations are needed for the sample means or sums to be normal. Sampling is done with replacement.

It would be difficult to overstate the importance of the central limit theorem in statistical theory. Knowing that data, even if its distribution is not normal, behaves in a predictable way is a powerful tool.

The normal distribution has the same mean as the original distribution and variance that equals the original variance divided by the sample size. Standard deviation is the square root of the variance, so the standard deviation of the sampling distribution is the standard deviation of the original distribution divided by the square root of n. The variable n is the number of values that are averaged together, not the number of times the experiment is done.
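In the usual textbook notation (the symbols below are standard notation rather than notation introduced in this text), these relationships can be written as

\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}}^{2} = \frac{\sigma^{2}}{n}, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

where \mu and \sigma are the mean and standard deviation of the original population, \bar{x} denotes the sample mean, and n is the sample size.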

This text is adapted from OpenStax, Introductory Statistics, Section 7.0 Central Limit Theorem.

This text is adapted from OpenStax, Introductory Statistics, Section 7.1 Central Limit Theorem for Sample Means (Averages).