10.4: One-Way ANOVA: Unequal Sample Sizes

JoVE Core
Statistics
April 30, 2023

Overview

One-way ANOVA can be performed on three or more samples of unequal sizes. However, the calculations become more complicated when the sample sizes differ. When performing ANOVA with unequal sample sizes, the F statistic is computed using the following equation:

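Equation 1

$$F = \frac{\left[\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{\bar{x}})^2\right] / (k-1)}{\left[\sum_{i=1}^{k} (n_i - 1)s_i^2\right] / \left[\sum_{i=1}^{k} (n_i - 1)\right]}$$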

In the equation, n is the sample size, x̄ is the sample mean, x̿ is the combined mean of all the observations, k is the number of samples, and s² is the sample variance. The subscript i denotes a specific sample in the dataset.
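The calculation can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own code: it assumes Equation 1 has the usual weighted form, with the between-samples variance Σ nᵢ(x̄ᵢ − x̿)² / (k − 1) in the numerator and the within-samples variance Σ (nᵢ − 1)sᵢ² / Σ (nᵢ − 1) in the denominator. The function name `f_statistic` and the height values are hypothetical.

```python
from statistics import mean, variance


def f_statistic(samples):
    """Return (F, df_between, df_within) for samples of possibly unequal sizes."""
    k = len(samples)                            # number of samples
    sizes = [len(s) for s in samples]           # n_i
    means = [mean(s) for s in samples]          # sample means
    variances = [variance(s) for s in samples]  # sample variances (n_i - 1 denominator)

    # Combined mean of all observations (not the mean of the sample means)
    grand_mean = sum(sum(s) for s in samples) / sum(sizes)

    df_between = k - 1
    df_within = sum(n - 1 for n in sizes)

    # Variance between samples: squared deviations weighted by n_i
    var_between = sum(
        n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)
    ) / df_between

    # Variance within samples: sample variances weighted by (n_i - 1)
    var_within = sum(
        (n - 1) * v for n, v in zip(sizes, variances)
    ) / df_within

    return var_between / var_within, df_between, df_within


# Hypothetical heights (cm) for three samples of sizes 3, 4, and 2
heights = [
    [160, 162, 164],
    [170, 172, 174, 176],
    [165, 167],
]

F, df1, df2 = f_statistic(heights)
print(f"F = {F:.2f} with ({df1}, {df2}) degrees of freedom")
```

For these made-up values the between-samples variance is about 107.78 and the within-samples variance is 5, so F is about 21.56 with 2 and 6 degrees of freedom. Turning F into a P-value requires the F distribution, which the Python standard library does not provide.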

Observe that both variance estimates, the variance between samples and the variance within samples, are weighted because each uses the sample sizes. In other words, the differing sample sizes in the dataset affect both estimates and, ultimately, the value of the F statistic.
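One way to see the role of the weights is to check what happens when every sample has the same size. The short derivation below is a sketch under that assumption; it takes the F statistic to be the ratio of the weighted between-samples variance to the weighted within-samples variance, and the symbols for the variance of the sample means and the pooled variance are introduced here for illustration.

```latex
% Assumption: n_1 = n_2 = \dots = n_k = n (all samples the same size)
F = \frac{\left[\sum_{i=1}^{k} n(\bar{x}_i - \bar{\bar{x}})^2\right] / (k-1)}
         {\left[\sum_{i=1}^{k} (n-1)s_i^2\right] / \left[k(n-1)\right]}
  = \frac{n \, s_{\bar{x}}^2}{s_p^2},
\qquad
s_{\bar{x}}^2 = \frac{\sum_{i=1}^{k} (\bar{x}_i - \bar{\bar{x}})^2}{k-1},
\quad
s_p^2 = \frac{1}{k}\sum_{i=1}^{k} s_i^2
```

With equal sizes the weights factor out, leaving the common sample size times the variance of the sample means, divided by the plain average of the sample variances. With unequal sizes no such cancellation occurs, so larger samples contribute more to both estimates.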

Transcript

Consider performing a one-way ANOVA test on a dataset with heights of students from three samples with unequal sample sizes.

The null hypothesis is that the mean heights of the three samples are equal, and the alternative hypothesis is that at least one of the mean heights is different.

Compute the F statistic using the ratio of the variance between samples and the variance within samples. Here, x̿ is the combined mean of all observations, x̄ᵢ is the mean of the ith sample, nᵢ is the size of the ith sample, k is the number of samples, and sᵢ² is the variance of the ith sample.

Observe that both variance estimates are weighted since they consider sample size to compute the F statistic.

The P-value is small enough to reject the null hypothesis, so we infer that at least one of the mean heights from the three samples is different.

Further, to determine which mean height is significantly different from the others, we may construct box plots, construct confidence intervals, or use multiple comparison tests.
