
11.10:

Multiple Regression

Multiple regression is a statistical tool used to analyze the relationship among three or more variables.

Multiple regression can be expressed as a single equation that estimates the linear relationship between one response (dependent) variable and two or more predictor (independent) variables.
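As a sketch of what such an equation looks like, the form below uses conventional notation that is assumed here rather than taken from the source: \hat{y} is the predicted value of the response variable, x_1 through x_k are the k predictor variables, b_0 is the intercept, and b_1 through b_k are the estimated coefficients.

```latex
\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k
```

With two predictors (k = 2), this reduces to an intercept plus one term for each predictor.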

For example, the water consumption of athletes is positively correlated with both the temperature and the total amount of time practiced.

Here, the temperature and the total amount of time practiced are the predictor variables that can be independently set. The water consumption is the response variable as it depends on the other two variables.  

Since the manual calculation of the multiple regression equation is generally complex, software is used to solve it. 
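A minimal sketch of how software might solve for the equation, assuming Python with NumPy and ordinary least squares. All of the numbers are hypothetical and invented for illustration; the water values are generated from assumed coefficients (0.2, 0.05, 0.6) so that the fit can be checked against them.

```python
import numpy as np

# Hypothetical data, invented for illustration: eight practice sessions.
temperature = np.array([18.0, 21.0, 24.0, 27.0, 30.0, 33.0, 22.0, 29.0])  # degrees C
practice_time = np.array([1.0, 1.5, 1.0, 2.0, 2.5, 2.0, 3.0, 1.5])        # hours

# Water consumption built from assumed coefficients (no noise added),
# so the least-squares fit should recover them.
water = 0.2 + 0.05 * temperature + 0.6 * practice_time                    # litres

# Design matrix: a column of ones for the intercept, then one column per predictor.
X = np.column_stack([np.ones_like(temperature), temperature, practice_time])

# Ordinary least squares: find the coefficients minimizing the squared residuals.
coeffs, _, _, _ = np.linalg.lstsq(X, water, rcond=None)
b0, b1, b2 = coeffs

print(f"water = {b0:.3f} + {b1:.3f} * temperature + {b2:.3f} * practice_time")
```

Because the hypothetical data contain no noise, the estimated coefficients match the assumed ones up to floating-point rounding; real measurements would not fit this cleanly.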

The multiple coefficient of determination, R², is calculated to measure how well the equation fits the data set. For example, an R² of 0.97 means that the changes in temperature and the total amount of time practiced can explain 97% of the variation in water consumption.
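The calculation behind the coefficient of determination can be sketched as follows, assuming the usual definition for a least-squares fit: one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are hypothetical.

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y - y_pred) ** 2)      # variation left unexplained by the model
    ss_tot = np.sum((y - y.mean()) ** 2)    # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted values, invented for illustration.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.1, 2.9, 4.2, 4.8]

print(r_squared(observed, predicted))
```

For these values the residual sum of squares is 0.10 and the total sum of squares is 5.0, so the model explains about 98% of the variation.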

However, R² never decreases when another predictor variable is added, even one unrelated to the response, so it generally increases as more variables are used.

In such cases, the adjusted coefficient of determination is calculated, which accounts for the sample size and the number of predictor variables.
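A short sketch of the adjustment, assuming the commonly used formula: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k is the number of predictor variables. The function name and the sample size of 10 are hypothetical; the R² of 0.97 and the two predictors come from the water consumption example.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    # Penalize R^2 according to how many predictors were used relative to n.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# R^2 = 0.97 with two predictors; a sample size of 10 is assumed for illustration.
print(round(adjusted_r_squared(0.97, n=10, k=2), 4))  # → 0.9614
```

The adjusted value is slightly lower than the unadjusted R², and the gap widens as k grows relative to n.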

Multiple regression assesses a linear relationship between one response or dependent variable and two or more independent variables. It has many practical applications.

Farmers can use multiple regression to predict crop yield from more than one factor, such as water availability, fertilizer, and soil properties. Here, the crop yield is the response or dependent variable, as it depends on the other, independent variables. The analysis involves constructing scatter plots, fitting a multiple linear regression equation, and calculating the multiple coefficient of determination, R². Suppose water and fertilizer are the independent variables and the value of R² is 0.96; one can then interpret that variation in water and fertilizer together explains 96% of the variation in the crop yield.

However, the value of R² does not decrease as independent variables are added, whether or not they are informative. So, an adjusted coefficient of determination that accounts for both the sample size and the number of independent variables is used during analysis.
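The effect can be checked numerically with a small sketch, assuming Python with NumPy. The crop data below are simulated and hypothetical, and `noise_var` is a deliberately irrelevant predictor added only to show how the two measures respond.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# Simulated, hypothetical data: yield depends on water and fertilizer plus random error.
water = rng.uniform(0, 10, n)
fertilizer = rng.uniform(0, 5, n)
crop_yield = 1.0 + 2.0 * water + 1.5 * fertilizer + rng.normal(0, 1.0, n)
noise_var = rng.normal(size=n)  # unrelated to crop yield by construction

def fit_r2(predictors, y):
    """Fit by least squares and return (R^2, adjusted R^2)."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    k = len(predictors)
    adj = 1.0 - (1.0 - r2) * (len(y) - 1) / (len(y) - k - 1)
    return r2, adj

r2_base, adj_base = fit_r2([water, fertilizer], crop_yield)
r2_more, adj_more = fit_r2([water, fertilizer, noise_var], crop_yield)

print(f"2 predictors: R2 = {r2_base:.4f}, adjusted R2 = {adj_base:.4f}")
print(f"3 predictors: R2 = {r2_more:.4f}, adjusted R2 = {adj_more:.4f}")
```

The R² of the larger model is never below that of the smaller one, while the adjusted value rises only if the extra predictor improves the fit by more than the penalty for adding it.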