11.3: Calculating and Interpreting the Linear Correlation Coefficient
The correlation coefficient, r, developed by Karl Pearson in the early 1900s, is a numerical measure of the strength and direction of the linear association between the independent variable, x, and the dependent variable, y. It is therefore also known as the Pearson product-moment correlation coefficient. It can be calculated using the following equation:
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}$$
where n = the number of data points.
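As a minimal sketch of the calculation, the Python function below evaluates the standard computational form of Pearson's r from the sums Σx, Σy, Σxy, Σx², and Σy². The function name `pearson_r` and the sample data are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)  # number of data points

    # Sums that appear in the computational formula.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when x or y has no variation; r is undefined then.
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator


# Hypothetical data, for illustration only.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
print(round(pearson_r(x_values, y_values), 4))
```

A value of r close to +1 or -1 indicates a strong linear association, and the sign gives its direction.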
A table of 95% critical values of the sample correlation coefficient can be used to judge whether the computed value of r is significant. Compare r to the appropriate critical value in the table, which is typically looked up by degrees of freedom, n − 2. If r lies outside the interval between the negative and positive critical values, then the correlation coefficient is significant. If r is significant, then the regression (best-fit) line may be used for prediction.
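The comparison against the table can be sketched in Python as follows. This is an illustration under stated assumptions: the dictionary `CRITICAL_R_95` is a short excerpt of commonly tabulated two-tailed 95% critical values, it is keyed by degrees of freedom (assumed here to be n − 2), and the function name `is_significant` is hypothetical.

```python
# Excerpt of two-tailed 95% critical values of r, keyed by
# degrees of freedom (df = n - 2). Assumed excerpt, not a full table.
CRITICAL_R_95 = {
    1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 5: 0.754,
    6: 0.707, 7: 0.666, 8: 0.632, 9: 0.602, 10: 0.576,
}


def is_significant(r, n, table=CRITICAL_R_95):
    """Return True if r falls outside (-critical, +critical)."""
    df = n - 2  # degrees of freedom for the correlation test
    if df not in table:
        raise KeyError(f"no critical value stored for df = {df}")
    critical = table[df]
    # Significant when r is NOT between the negative and positive
    # critical values.
    return r < -critical or r > critical


# Hypothetical examples: the same r can be significant or not,
# depending on the sample size.
print(is_significant(0.70, 5))   # df = 3, critical value 0.878
print(is_significant(0.70, 10))  # df = 8, critical value 0.632
```

The two calls show that r = 0.70 is not significant with 5 data points but is significant with 10, because the critical value shrinks as the sample size grows.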
The Coefficient of Determination
The quantity r² is called the coefficient of determination. It is the square of the correlation coefficient and is usually stated as a percent rather than in decimal form. It has the following interpretation in the context of the data:
r², when expressed as a percent, represents the percent of variation in the dependent (predicted) variable y that can be explained by variation in the independent (explanatory) variable x using the regression (best-fit) line.
1 − r², when expressed as a percent, represents the percent of the variation in y that is NOT explained by variation in x using the regression line. This can be seen as the scattering of the observed data points about the regression line.
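The two interpretations above amount to simple arithmetic on r, sketched here in Python. The function name `variation_split` and the example value r = 0.8 are hypothetical.

```python
def variation_split(r):
    """Return (explained, unexplained) variation in y as percents."""
    r_squared = r * r                       # coefficient of determination
    explained = 100 * r_squared             # percent explained by x
    unexplained = 100 * (1 - r_squared)     # percent not explained by x
    return explained, unexplained


# Hypothetical correlation coefficient, for illustration only.
explained, unexplained = variation_split(0.8)
print(f"{explained:.1f}% explained, {unexplained:.1f}% unexplained")
```

For r = 0.8, about 64% of the variation in y is explained by variation in x using the regression line, and about 36% is not.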
This text is adapted from OpenStax, Introductory Statistics, Section 12.3: The Regression Equation.