8.11:

Introduction to Test of Independence

A test of independence determines whether a contingency table's two variables are independent.

In this case, independence means that the probability of any event involving both variables can be directly obtained by multiplying their individual probabilities.
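
As a compact statement of this product rule, for two events A and B (symbols introduced here only for illustration, one event from each variable):

$$P(A \text{ and } B) = P(A)\,P(B)$$

A test of independence checks whether the observed counts are consistent with this rule holding for every combination of categories in the table.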

For example, to understand the relationship between alcohol consumption and road accident fatality, arrange the data in a two-by-two contingency table. The rows represent whether the subjects were sober or intoxicated, while the columns represent whether the road accident was fatal or nonfatal.

Data from randomly selected samples represent the observed frequencies arranged in the two-way table.

Here, E represents the expected frequency, r indicates the number of rows, and c indicates the number of columns. The expected frequency for each cell must be at least 5.

The chi-square test statistic is calculated from these expected and observed frequencies. The critical value and the P-value are then obtained from a chi-square table or software, using the appropriate degrees of freedom.

Finally, a hypothesis test is performed to determine whether alcohol consumption and road accident fatality are independent events.

In statistics, the term independence means that one can directly obtain the probability of any event involving both variables by multiplying their individual probabilities. Tests of independence are chi-square tests involving the use of a contingency table of observed (data) values.

The test statistic for a test of independence is similar to that of a goodness-of-fit test:

$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$

where:

  • O = observed values
  • E = expected values (which should be at least 5)
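
The sum over cells can be sketched in a few lines of Python. This is a minimal illustration, not code from the source: the function name `chi_square_statistic` and the counts are hypothetical, and the expected values are simply supplied by hand here.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of the contingency table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Hypothetical counts, for illustration only (not data from the text).
observed = [[10, 20], [30, 40]]
expected = [[12.0, 18.0], [28.0, 42.0]]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 4))
```

Because every term is a squared difference divided by a positive expected count, the statistic can never be negative, and it grows as the observed counts move away from the expected ones.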

A test of independence determines whether two factors are independent or not. The test is always right-tailed because of how the test statistic is calculated: each term is a squared difference, so any disagreement between observed and expected values can only increase the statistic. If the expected and observed values are not close together, the test statistic is large and falls far out in the right tail of the chi-square curve, just as in a goodness-of-fit test.

The number of degrees of freedom for the test of independence is:

$$df = (\text{number of rows} - 1)(\text{number of columns} - 1)$$
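
As a rough sketch of how the degrees of freedom feed into the right-tailed test, the snippet below handles only the two-by-two case. It relies on the fact that, with one degree of freedom, the right-tail area of the chi-square distribution equals `erfc(sqrt(x / 2))`; this shortcut does not apply to larger tables, where a chi-square table or statistical software is needed. The function names are hypothetical.

```python
import math


def degrees_of_freedom(n_rows, n_columns):
    """Degrees of freedom for a test of independence."""
    return (n_rows - 1) * (n_columns - 1)


def right_tail_p_value_df1(statistic):
    """Right-tail area P(X >= statistic) for chi-square with 1 df ONLY."""
    return math.erfc(math.sqrt(statistic / 2.0))


df = degrees_of_freedom(2, 2)  # a two-by-two table

# 3.841 is the tabulated critical value for df = 1 at the 0.05 level,
# so the right-tail area beyond it should be very close to 0.05.
p_value = right_tail_p_value_df1(3.841)
print(df, round(p_value, 3))
```

A test statistic larger than the critical value lies further out in the right tail and therefore has a smaller P-value.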

The following formula calculates the expected number (E):

$$E = \frac{(\text{row total})(\text{column total})}{\text{total number surveyed}}$$
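
The expected count for every cell can be computed directly from the row and column totals. The sketch below assumes the table is stored as a list of rows; the function name `expected_counts` and the counts themselves are hypothetical and serve only to show the arithmetic.

```python
def expected_counts(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [row_total * column_total / grand_total for column_total in column_totals]
        for row_total in row_totals
    ]


# Hypothetical counts, for illustration only (not data from the text).
observed = [[10, 20], [30, 40]]
expected = expected_counts(observed)

# The test requires every expected count to be at least 5.
all_at_least_five = all(e >= 5 for row in expected for e in row)
print(expected, all_at_least_five)
```

Here the row totals are 30 and 70, the column totals are 40 and 60, and the grand total is 100, so the top-left expected count is 30 × 40 / 100 = 12.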

This text is adapted from OpenStax, Introductory Statistics, Section 11.3, Test of Independence.