10.4:

One-Way ANOVA: Unequal Sample Sizes


Consider performing a one-way ANOVA test on a dataset with heights of students from three samples with unequal sample sizes.

The null hypothesis is that the mean heights of the three samples are equal, and the alternative hypothesis is that at least one of the mean heights is different.

Compute the F statistic as the ratio of the variance between samples to the variance within samples. Here, x̿ is the combined mean of all observations, x̄ᵢ is the mean of the i-th sample, nᵢ is the size of the i-th sample, k is the number of samples, and sᵢ² is the variance of the i-th sample.

Observe that both variance estimates are weighted by sample size when the F statistic is computed.

Based on the P-value, we reject the null hypothesis and infer that at least one of the mean heights of the three samples is different.

Further, to determine which mean height is significantly different from the others, we may construct box plots, construct confidence intervals, or use multiple comparison tests.

One-way ANOVA can be performed on three or more samples of unequal sizes. However, the calculations become more complicated when the sample sizes differ. When performing ANOVA with unequal sample sizes, the following equation is used:

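Equation 1:

$$F = \frac{\text{variance between samples}}{\text{variance within samples}} = \frac{\left[\dfrac{\sum n_i\,(\bar{x}_i - \bar{\bar{x}})^2}{k - 1}\right]}{\left[\dfrac{\sum (n_i - 1)\,s_i^2}{\sum (n_i - 1)}\right]}$$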

In the equation, n is the sample size, x̄ is the sample mean, x̿ is the combined mean of all observations, k is the number of samples, and s² is the sample variance. The subscript i denotes a specific sample in the dataset.

Observe that both variance estimates, the variance between samples and the variance within samples, are weighted because they use the sample sizes to calculate the F statistic. In other words, the different sample sizes in the dataset affect both variance estimates and, ultimately, the value of the F statistic.
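
To make the weighting concrete, here is a minimal Python sketch of the F statistic for samples of unequal sizes. It assumes the standard weighted form, in which the variance between samples is Σ nᵢ(x̄ᵢ − x̿)² / (k − 1) and the variance within samples is Σ (nᵢ − 1)sᵢ² / Σ (nᵢ − 1). The function name `one_way_f` and the height values are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def one_way_f(samples):
    """Return (F, df_between, df_within) for samples of possibly unequal sizes."""
    k = len(samples)                                # number of samples
    sizes = [len(s) for s in samples]               # n_i
    means = [mean(s) for s in samples]              # sample means
    variances = [variance(s) for s in samples]      # s_i^2, with n_i - 1 denominator
    n_total = sum(sizes)
    grand_mean = sum(x for s in samples for x in s) / n_total  # combined mean

    # Variance between samples: squared deviations of sample means, weighted by n_i
    var_between = sum(
        n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)
    ) / (k - 1)

    # Variance within samples: sample variances, weighted by n_i - 1
    var_within = sum(
        (n - 1) * v for n, v in zip(sizes, variances)
    ) / sum(n - 1 for n in sizes)

    return var_between / var_within, k - 1, n_total - k


# Hypothetical heights in cm for three samples of sizes 3, 4, and 5
heights = [
    [161, 162, 163],
    [162, 163, 164, 165],
    [166, 167, 168, 169, 170],
]

f_stat, df_between, df_within = one_way_f(heights)
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

The P-value would then come from the F distribution with these degrees of freedom, using a statistical table or a statistics library; this sketch does not compute it.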