8.9:

Expected Frequencies in Goodness-of-Fit Tests

JoVE Core
Statistics

The goodness-of-fit test is performed to see whether the observed frequencies are statistically consistent with the expected ones.

The goodness-of-fit test uses a null hypothesis, which assumes that the distribution is as claimed, and an alternative hypothesis that contradicts it.

If all the expected frequencies in a distribution are equal, such as when predicting the color of traffic lights, the expected frequency E is the ratio of the total number of observations, n (here six), to the number of categories, k (here three), so E = 6/3 = 2.

However, for unequal expected frequencies, such as when counting women with different hair colors, E is calculated for each category by multiplying the sum of the observed frequencies by the probability for that category.
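The two rules above can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own material: the function names are made up for this sketch, and the hair-color probabilities and counts are hypothetical values chosen only so that the arithmetic is easy to follow.

```python
def expected_equal(n, k):
    """Expected frequency per category when all k categories are equally likely."""
    return n / k


def expected_unequal(n, probabilities):
    """Expected frequency per category: n times that category's probability."""
    return [n * p for p in probabilities]


# Traffic-light example from the text: 6 observations, 3 categories.
equal_e = expected_equal(6, 3)

# Hypothetical hair-color data (illustrative only, not real survey results).
hair_probs = {"black": 0.5, "brown": 0.25, "blonde": 0.125, "red": 0.125}
observed = {"black": 90, "brown": 60, "blonde": 30, "red": 20}

# n is the sum of the observed frequencies, as described in the text.
n = sum(observed.values())
unequal_e = dict(zip(hair_probs, expected_unequal(n, hair_probs.values())))

print(equal_e)
print(unequal_e)
```

Note that the expected frequencies always add up to n, because the category probabilities add up to 1.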

In the earlier childbirth example, the expected and observed frequencies are used to calculate the chi-square value. Using the chi-square table, one can determine whether the difference between the expected and observed frequencies is statistically significant.

If this difference is large, the test statistic is large and the P-value is small, which means that the test statistic falls in the critical region. Hence, the null hypothesis is rejected. Otherwise, we fail to reject the null hypothesis.
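The decision rule above can be sketched as follows. The text does not spell out the test statistic, so this sketch assumes the standard Pearson chi-square statistic, the sum of (O − E)² / E over all categories. The observed counts are hypothetical, the 0.05 significance level is an assumption, and 5.991 is the chi-square table value for 2 degrees of freedom at that level.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for three equally likely categories.
observed = [30, 14, 16]
n = sum(observed)        # total number of observations
k = len(observed)        # number of categories
expected = [n / k] * k   # equal expected frequencies, E = n / k

chi2 = chi_square_statistic(observed, expected)
df = k - 1               # degrees of freedom for a goodness-of-fit test

# Table value for df = 2 at an assumed 0.05 significance level (right tail).
CRITICAL_VALUE = 5.991

# The test is right-tailed: large statistics fall in the critical region.
reject_null = chi2 > CRITICAL_VALUE

print(f"chi-square = {chi2:.2f}, df = {df}, reject null: {reject_null}")
```

With these made-up counts the statistic is about 7.6, which exceeds the critical value, so the null hypothesis of equal frequencies would be rejected. Counts closer to 20 in each category would give a smaller statistic and a failure to reject.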

8.9:

Expected Frequencies in Goodness-of-Fit Tests

A goodness-of-fit test is conducted to determine whether the observed frequencies are statistically consistent with the frequencies expected for the dataset. Suppose the expected frequencies for a dataset are equal, such as when predicting how often each number appears when casting a die. In that case, the expected frequency is the ratio of the total number of observations (n) to the number of categories (k).

$$E = \frac{n}{k}$$

Hence, when a fair die is cast n times, the expected frequency of each number is n/6, since each of the six faces has probability 1/6.

However, if the expected frequencies of the dataset are unequal, the expected frequency for each category is obtained by multiplying the total number of observations n by the probability p for that category.

$$E = np$$
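As a worked illustration of both cases, using assumed numbers that do not come from the lesson: a fair die cast n = 60 times, and a category with an assumed probability p = 0.3 among n = 200 observations.

```latex
E_{\text{equal}} = \frac{n}{k} = \frac{60}{6} = 10,
\qquad
E_{\text{unequal}} = n\,p = 200 \times 0.3 = 60
```

In the equal case every category shares the same expected frequency, whereas in the unequal case the calculation is repeated with each category's own probability.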