Chapter 11: Correlation and Regression


11.3: Calculating and Interpreting the Linear Correlation Coefficient


Consider the data set of carbon dioxide levels versus the annual temperature over a specific period. The scatter plot of the data points shows a probable linear pattern between the two variables.

To confirm a straight-line pattern, the linear correlation coefficient, r, is calculated.

First, x², y², and the product xy are computed for each data point, and each of these columns is summed. The number of data points, n, is 7.

From these values, the coefficient of correlation is calculated.

Whether the computed value of r is statistically significant can be judged against a table of critical values.

At a significance level of 0.05 with n = 7, the critical value is 0.754.

Since the absolute value of r exceeds the critical value, there is sufficient evidence to support the conclusion that there is a linear correlation between the variables.

The value of r² indicates that 76.2% of the variation in annual temperature can be explained by the linear relationship between carbon dioxide levels and annual temperature.
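The narration reports r² but not r itself. As a quick check, assuming the 76.2% figure is r² rounded to three decimal places, the magnitude of r can be recovered and compared with the critical value of 0.754 (the sign of r cannot be determined from r² alone):

$$|r| = \sqrt{r^2} = \sqrt{0.762} \approx 0.873 > 0.754$$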


The correlation coefficient, r, developed by Karl Pearson in the early 1900s, is a numerical measure of the strength and direction of the linear association between the independent variable, x, and the dependent variable, y. It is also known as the Pearson product-moment correlation coefficient. It can be calculated using the following equation:

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where n = the number of data points.
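As a minimal sketch, the sums-based form of Pearson's formula (built from Σx, Σy, Σxy, Σx², Σy², and n) can be written in Python as follows. The function name `pearson_r` and the sample data are hypothetical, invented for illustration; they are not the carbon dioxide data set from the video.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient via the sums-based formula:
    r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")

    # Column totals, as in the worked table: x, y, xy, x^2, y^2.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return numerator / denominator

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))  # 0.775
```

For these made-up points the sums are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, so r = 30 / √1500 ≈ 0.775.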

The table of 95% critical values of the sample correlation coefficient indicates whether the computed value of r is significant. Compare r to the appropriate critical value in the table. If r is not between the negative and positive critical values (that is, if the absolute value of r exceeds the critical value), then the correlation coefficient is significant. If r is significant, then the regression line may reasonably be used for prediction.
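The comparison against the critical value table might be sketched as below. The dictionary holds a short excerpt of the standard 95% table and is assumed here to be keyed by degrees of freedom (n − 2), which reproduces the value 0.754 quoted for n = 7. The names `CRITICAL_VALUES_95` and `is_significant` are hypothetical.

```python
# Excerpt of the 95% critical values for r, keyed by degrees of freedom (n - 2).
# Assumption: a two-tailed test at the 0.05 significance level.
CRITICAL_VALUES_95 = {
    3: 0.878,
    4: 0.811,
    5: 0.754,
    6: 0.707,
    7: 0.666,
    8: 0.632,
}

def is_significant(r, n, table=CRITICAL_VALUES_95):
    """Return True if r lies outside the interval [-critical, +critical]."""
    df = n - 2
    if df not in table:
        raise KeyError(f"no critical value in this excerpt for n = {n}")
    critical = table[df]
    return abs(r) > critical

print(is_significant(0.873, 7))   # True: 0.873 > 0.754
print(is_significant(-0.50, 7))   # False: 0.50 < 0.754
```

Using the absolute value keeps the rule symmetric, so a strong negative correlation is treated the same way as a strong positive one.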

The Coefficient of Determination

The quantity r² is called the coefficient of determination. It is the square of the correlation coefficient and is usually stated as a percent rather than as a decimal. It has an interpretation in the context of the data:

r², when expressed as a percent, represents the percent of variation in the dependent (predicted) variable y that can be explained by variation in the independent (explanatory) variable x using the regression (best-fit) line.

1 – r², when expressed as a percentage, represents the percent of the variation in y that is NOT explained by variation in x using the regression line. This can be seen as the scattering of the observed data points about the regression line.
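The two percentages can be computed directly from r, as in the short sketch below. The function name `variation_split` is hypothetical, and the input is the magnitude of r implied by the 76.2% figure from the carbon dioxide example, taken as an assumption since r itself is not stated.

```python
import math

def variation_split(r):
    """Return (explained %, unexplained %) of the variation in y for a given r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    r_squared = r * r
    return 100 * r_squared, 100 * (1 - r_squared)

# Assumed magnitude of r, recovered from the reported r^2 of 0.762.
r = math.sqrt(0.762)
explained, unexplained = variation_split(r)
print(f"explained: {explained:.1f}%, unexplained: {unexplained:.1f}%")
```

With r² = 0.762, about 76.2% of the variation in y is explained by the regression line and the remaining 23.8% shows up as scatter of the observed points about that line.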

This text is adapted from OpenStax, Introductory Statistics, Section 12.3, The Regression Equation.
