8.15: F Distribution
The F distribution is named after Sir Ronald Fisher, an English statistician. The F statistic is a ratio (a fraction) with two sets of degrees of freedom: one for the numerator and one for the denominator. The F distribution is derived from the Student's t distribution; the derivation is beyond the scope of this course. When the numerator has one degree of freedom, the values of the F distribution are the squares of the corresponding values of the t distribution. One-way ANOVA extends the t test to comparisons of more than two groups. With more than two groups, ANOVA is preferable to pairwise t tests because performing multiple tests increases the likelihood of making a Type I error.
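To see how quickly the risk of a Type I error grows with repeated pairwise t tests, the short Python sketch below counts the pairwise comparisons among k groups and estimates the chance of at least one false rejection. The function name `familywise_error_rate` and the significance level of 0.05 are illustrative choices, not part of the original text. The formula assumes the tests are independent, which pairwise tests on shared groups are not, so the numbers are a rough guide only.

```python
from math import comb


def familywise_error_rate(num_groups: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one Type I error across all pairwise tests.

    Assumption: the tests are treated as independent. Pairwise t tests that
    share groups are not independent, so this is only a rough illustration.
    """
    num_tests = comb(num_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** num_tests


# Columns: number of groups, number of pairwise tests, approximate error rate
for k in (2, 3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

Under this simplifying assumption, three groups already require three tests and push the error rate to about 0.14, while five groups require ten tests and push it to about 0.40. A single ANOVA F test avoids this inflation by comparing all groups at once.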
Two estimates of the variance are made to calculate the F ratio:
- The variance between samples: An estimate of σ² equal to the variance of the sample means multiplied by n (when the sample sizes are the same). If the samples are different sizes, the variance between samples is weighted to account for the different sample sizes. This variance is also called the variation due to treatment or the explained variation.
- The variance within samples: An estimate of σ² equal to the average of the sample variances (also known as a pooled variance). When the sample sizes differ, the variance within samples is weighted. This variance is also called the variation due to error or the unexplained variation.
Each estimate is built from a sum of squares:
- SSbetween = the sum of squares representing the variation among the different samples.
- SSwithin = the sum of squares representing the variation within samples due to chance.
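The two variance estimates and the F ratio can be sketched in a few lines of Python. This is a minimal illustration, not a full ANOVA routine. It assumes the standard mean-square definitions, which the text above does not spell out: each sum of squares is divided by its degrees of freedom (k − 1 between samples and N − k within samples, for k samples and N observations in total). The function name `f_ratio` and the sample data are hypothetical.

```python
from statistics import mean, variance


def f_ratio(samples):
    """Return (F, df_between, df_within) for a list of samples.

    Each sample is a list of numbers with at least two observations.
    """
    k = len(samples)                          # number of samples (groups)
    n_total = sum(len(s) for s in samples)    # total number of observations
    grand_mean = mean(x for s in samples for x in s)

    # SSbetween: squared distance of each sample mean from the grand mean,
    # weighted by sample size so unequal samples are handled correctly.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)

    # SSwithin: squared deviations of observations from their own sample mean,
    # recovered here as (n - 1) times each sample variance.
    ss_within = sum((len(s) - 1) * variance(s) for s in samples)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between   # variance between samples
    ms_within = ss_within / df_within      # variance within samples (pooled)
    return ms_between / ms_within, df_between, df_within


# Made-up data: three samples of equal size n = 3
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value, df1, df2 = f_ratio(groups)
print(f_value, df1, df2)
```

For this made-up data, the sample means are 5, 7, and 9, and their variance is 4, so the variance between samples is 3 × 4 = 12. Each sample variance is 1, so the variance within samples is also 1. The F ratio is therefore 12, with 2 and 6 degrees of freedom, which matches the equal-sample-size descriptions in the list above.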
This text is adapted from OpenStax, Introductory Statistics, Section 13.2, The F Distribution and the F-Ratio.