8.15: F Distribution

The F test, named after the renowned statistician Sir Ronald Fisher, compares the variances of two normally distributed populations.

The F test uses the F statistic, which is the ratio of the two sample variances and is therefore never negative.

Generally, for ease of calculation, the larger sample variance is placed in the numerator and the smaller sample variance in the denominator.

As the two sample variances get closer to each other, the F statistic approaches 1.
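The ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not a full F test: the function name `f_statistic` and both data sets are made up for the example, and `statistics.variance` is used because it computes the sample variance (dividing by n - 1).

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Ratio of the two sample variances, with the larger variance in the numerator."""
    var_a = variance(sample_a)  # sample variance (divides by n - 1)
    var_b = variance(sample_b)
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller

# Hypothetical measurements, invented for illustration only
sample_a = [4.1, 5.3, 6.8, 5.9, 4.7, 6.2]
sample_b = [5.0, 5.2, 5.1, 4.9, 5.3, 5.0]

f_value = f_statistic(sample_a, sample_b)
print(f"F = {f_value:.2f}")
```

Because the larger variance always goes on top, this version of the statistic is never below 1, and it equals 1 exactly when the two sample variances are the same.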

Computing the F statistic for many pairs of random samples drawn from two independent, normally distributed populations and plotting the results yields the F distribution curve, an asymmetric, right-skewed curve similar to the chi-square distribution curve.
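The repeated-sampling idea can be tried out with a small simulation. The sketch below assumes two standard normal populations with equal variance and arbitrary sample sizes of 6 and 11. It keeps the first sample's variance in the numerator on every trial (rather than always putting the larger one on top), so values below 1 also appear. The name `simulate_f` and the crude text histogram are choices made for this example only.

```python
import random
from statistics import mean, median, variance

random.seed(0)  # fixed seed so the run is repeatable

def simulate_f(n1, n2, trials):
    """For each trial, draw two normal samples and return variance(s1) / variance(s2)."""
    values = []
    for _ in range(trials):
        s1 = [random.gauss(0.0, 1.0) for _ in range(n1)]
        s2 = [random.gauss(0.0, 1.0) for _ in range(n2)]
        values.append(variance(s1) / variance(s2))
    return values

f_values = simulate_f(n1=6, n2=11, trials=5000)

# Crude text histogram: counts of F values in unit-wide bins from 0 to 6
bins = [0] * 6
for f in f_values:
    if f < 6:
        bins[int(f)] += 1
for i, count in enumerate(bins):
    print(f"{i}-{i + 1}: {'#' * (count // 50)}")

# A mean above the median is one sign of the long right tail
print(f"mean = {mean(f_values):.2f}, median = {median(f_values):.2f}")
```

Most of the simulated values pile up near the low end, with a thin tail stretching to the right, which is the asymmetry the curve shows.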

However, unlike the chi-square distribution, the F distribution has two sets of degrees of freedom, one for the numerator and another for the denominator. The exact shape of the F distribution curve depends on these two degrees of freedom.

This distribution is used in the F test and in other methods that compare variances, such as ANOVA.

The F distribution was named after Sir Ronald Fisher, an English statistician. The F statistic is a ratio (a fraction) with two sets of degrees of freedom: one for the numerator and one for the denominator. The F distribution is derived from the Student's t distribution: when the numerator has one degree of freedom, the values of the F distribution are the squares of the corresponding values of the t distribution. The details of that derivation are beyond the level of this course. One-way ANOVA extends the t test to comparisons of more than two groups. With more than two groups, ANOVA is preferable to a series of pairwise t tests because performing multiple tests increases the likelihood of making a Type I error.
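A rough calculation shows how quickly the risk from multiple tests grows. The sketch below assumes a significance level of 0.05 for each test and, as a simplification, treats the pairwise tests as independent (real pairwise t tests on shared groups are not), so the numbers are indicative only. The function name `familywise_error_rate` is made up for this example.

```python
from math import comb

def familywise_error_rate(groups, alpha=0.05):
    """Number of pairwise tests and the chance of at least one Type I error.

    Assumes the tests are independent, which is a simplification.
    """
    pairs = comb(groups, 2)           # number of distinct pairs of groups
    rate = 1 - (1 - alpha) ** pairs   # 1 minus the chance that no test errs
    return pairs, rate

for k in (2, 3, 4, 5):
    pairs, rate = familywise_error_rate(k)
    print(f"{k} groups: {pairs} pairwise tests, error rate about {rate:.3f}")
```

With three groups there are already three pairwise tests, and under these assumptions the chance of at least one false positive is roughly 0.14 rather than 0.05. A single ANOVA F test avoids this by testing all groups at once.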

Two estimates of the variance are made to calculate the F ratio:

  1. The variance between samples: An estimate of σ² that is the variance of the sample means multiplied by n (when the sample sizes are the same). If the samples are different sizes, the variance between samples is weighted to account for the different sample sizes. This variance is also called variation due to treatment or explained variation.
  2. The variance within samples: An estimate of σ² that is the average of the sample variances (also known as a pooled variance). When the sample sizes differ, the variance within samples is weighted. This variance is also called variation due to error or unexplained variation.

Both estimates are built from sums of squares:

  • SSbetween = the sum of squares representing the variation among the different samples.
  • SSwithin = the sum of squares representing the variation within samples due to chance.
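The two variance estimates and their ratio can be sketched as follows. This assumes the standard one-way ANOVA setup, in which each sum of squares is divided by its degrees of freedom (k - 1 between samples and N - k within samples, for k groups and N observations in total) to give a mean square. The function name `one_way_f_ratio` and the data are hypothetical, chosen so the arithmetic is easy to follow.

```python
from statistics import mean

def one_way_f_ratio(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # SSbetween: squared distance of each group mean from the grand mean,
    # weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # SSwithin: squared distance of each observation from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between  # variance between samples
    ms_within = ss_within / df_within     # variance within samples (pooled)
    return ms_between / ms_within, df_between, df_within

# Hypothetical data: three groups of three observations each
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
]
f_ratio, df_b, df_w = one_way_f_ratio(groups)
print(f"F = {f_ratio:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

For this data SSbetween is 38 and SSwithin is 6, so the variance between samples is 38 / 2 = 19, the variance within samples is 6 / 6 = 1, and F is 19. Because the groups are the same size, the variance between samples also equals the variance of the three sample means multiplied by n = 3, and the variance within samples equals the average of the three sample variances.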

This text is adapted from OpenStax, Introductory Statistics, Section 13.2, The F Distribution and the F-Ratio.