11.7: Residual Plots

JoVE Core
Statistics

Consider the scatter plot of airfare versus crude oil price per barrel fitted with a linear regression line. 

Here, the residual is the difference between the y-value of the data point and the predicted y-value from the regression equation.
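
The residual calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the video: the price and airfare numbers are invented, and the helper names `fit_line` and `residuals` are hypothetical.

```python
# Hypothetical data, invented for illustration only (not taken from the source).
oil_price = [40.0, 50.0, 60.0, 70.0, 80.0]     # x: crude oil price per barrel
airfare = [310.0, 335.0, 345.0, 370.0, 390.0]  # y: airfare


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the fitted line."""
    b0, b1 = fit_line(xs, ys)
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


res = residuals(oil_price, airfare)
for x, e in zip(oil_price, res):
    print(f"x = {x:5.1f}  residual = {e:+.2f}")
```

Plotting each residual against its x-value gives the residual plot. For a least-squares line, the residuals always sum to zero (up to rounding), so the points fall on both sides of the horizontal zero line.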

When these residuals are plotted against the x-values (here, the crude oil price), the resulting graph is called a residual plot. This plot helps in deciding whether the regression equation is a good model.

As the points in this residual plot are scattered around the horizontal zero line with no obvious pattern, the regression line is a good fit.

Any systematic pattern, such as a curve, indicates that the regression equation is not a good model.

For example, residuals that are predominantly positive in some ranges of x and negative in others indicate a nonlinear trend, for which a linear regression equation is not a good fit.
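
The sign pattern of such a nonlinear trend can be seen in a small sketch. It assumes made-up data that follow a curve (y = x squared) and fits a straight line to them anyway. The helper name `line_residuals` is hypothetical.

```python
# Hypothetical curved data (y = x^2), invented for illustration only.
xs = [float(x) for x in range(9)]
ys = [x ** 2 for x in xs]


def line_residuals(xs, ys):
    """Fit a least-squares line and return observed minus predicted y."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


res = line_residuals(xs, ys)

# One character per data point, in order of increasing x.
signs = "".join("+" if e > 0 else "-" for e in res)
print(signs)
```

For these data, the residuals are positive at both ends of the x range and negative in the middle. On a residual plot this appears as a U-shaped curve, not as random scatter.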

Also, a thickening of the residual plot as it is viewed from left to right means that the spread of the residuals changes with x. This also indicates that the regression line is not a good model.
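
A rough numerical check for thickening is to compare the typical residual size in the left half of the plot with that in the right half. The sketch below uses invented data whose scatter grows with x. The helper names are hypothetical, and comparing mean absolute residuals is a crude illustration, not a formal statistical test.

```python
# Hypothetical data whose scatter grows with x, invented for illustration only.
xs = [float(x) for x in range(1, 11)]
ys = [2.0 * x + (0.5 * x if i % 2 == 0 else -0.5 * x)
      for i, x in enumerate(xs)]


def line_residuals(xs, ys):
    """Fit a least-squares line and return observed minus predicted y."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def mean_abs(values):
    """Average absolute value, used here as a simple measure of spread."""
    return sum(abs(v) for v in values) / len(values)


res = line_residuals(xs, ys)
half = len(res) // 2          # xs is already sorted in increasing order
left_spread = mean_abs(res[:half])
right_spread = mean_abs(res[half:])
print(f"left spread = {left_spread:.2f}, right spread = {right_spread:.2f}")
```

A right-hand spread much larger than the left-hand spread corresponds to a residual plot that fans out from left to right.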

A residual plot is a graph used to check the results of a correlation and regression analysis. It helps verify the requirements for drawing specific conclusions about correlation and regression. To obtain the residual plot, the residual for each data value is calculated first. The residual is the observed y-value minus the predicted y-value obtained from the regression equation, so its magnitude is the vertical distance between the data point and the regression line.
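
In symbols, the residual can be written as follows. The notation is an assumption, since the text does not define any: b_0 and b_1 stand for the intercept and slope of the regression line, and n for the number of data points.

```latex
\hat{y}_i = b_0 + b_1 x_i, \qquad e_i = y_i - \hat{y}_i, \qquad i = 1, \dots, n
```

Here e_i is positive when the data point lies above the regression line and negative when it lies below.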

When the residuals are plotted against the variable x, the result is called a residual plot. The pattern formed by the points in such a plot can be used to draw inferences about the data set. For example, if the points are scattered around the horizontal zero line with no obvious pattern, the regression line is a good fit for the data set of x and y values. Conversely, a nonlinear pattern, with predominantly positive residuals in some ranges and negative residuals in others, indicates that the regression equation is not a good model for the given x and y values. Additionally, a residual plot that thickens when viewed from left to right indicates that the regression line is not a good model.