Question 1

What is the difference between explained and unexplained deviation in regression analysis?

Accepted Answer

Explained deviation is the vertical distance between the predicted y-value and the sample mean of y, representing variation that the relationship between the variables can explain. Unexplained deviation, or the residual, is the vertical distance between the observed data point and the predicted y-value, representing variation due to chance or other unmeasured variables. For each data point, the two add up to the total deviation, which is the vertical distance between the observed y-value and the sample mean.
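As a rough illustration of the three deviations for a single data point, here is a minimal Python sketch. The data values and the helper names `fit_line` and `deviations` are made up for this example, and the line is assumed to be an ordinary least-squares fit.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def deviations(x, y, xs, ys):
    """Return (explained, unexplained, total) deviation for the point (x, y)."""
    b0, b1 = fit_line(xs, ys)
    y_bar = mean(ys)
    y_hat = b0 + b1 * x          # predicted y-value on the regression line
    explained = y_hat - y_bar    # predicted value vs. sample mean
    unexplained = y - y_hat      # observed value vs. predicted value (residual)
    total = y - y_bar            # observed value vs. sample mean
    return explained, unexplained, total

# Illustrative data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

explained, unexplained, total = deviations(5, 5, xs, ys)
print(explained, unexplained, total)
```

For any point, the explained and unexplained deviations printed here should sum to the total deviation, up to floating-point rounding.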

Question 2

How does the coefficient of determination measure regression model fit?

Accepted Answer

The coefficient of determination, or r-squared, is calculated by dividing explained variation by total variation. It represents the proportion of variation in the dependent variable that the regression line can explain. A higher r-squared value indicates the regression model explains more of the variation in the data, providing a measure of how well the model fits the observed data.
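The ratio described above can be sketched in a few lines of Python. The function name `r_squared` and the data are assumptions for illustration, and the regression line is assumed to be a least-squares fit.

```python
from statistics import mean

def r_squared(xs, ys):
    """Coefficient of determination: explained variation / total variation."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Least-squares slope and intercept.
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    predicted = [b0 + b1 * x for x in xs]
    explained_variation = sum((p - y_bar) ** 2 for p in predicted)
    total_variation = sum((y - y_bar) ** 2 for y in ys)
    return explained_variation / total_variation

# Illustrative data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)
print(round(r2, 3))
```

With this made-up data the result is about 0.6, which would be read as the regression line explaining roughly 60% of the variation in y.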

Question 3

What does the slope of a regression line tell us about two variables?

Accepted Answer

The slope describes the rate of change of the dependent variable with respect to the independent variable. It indicates how much the predicted value of the dependent variable (y) changes, on average, for every one-unit increase in the independent variable (x). A positive slope shows a direct relationship, while a negative slope indicates an inverse relationship between the variables.
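To make the sign and size of the slope concrete, the sketch below computes a least-squares slope for two small data sets, one rising and one falling. The function name `slope` and both data sets are invented for this example.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope: average change in y per one-unit increase in x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Illustrative data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys_rising = [2, 4, 5, 4, 5]    # y tends to increase with x
ys_falling = [5, 4, 5, 4, 2]   # y tends to decrease with x

b1_rising = slope(xs, ys_rising)
b1_falling = slope(xs, ys_falling)

print(b1_rising)   # positive: direct relationship
print(b1_falling)  # negative: inverse relationship
```

Here the first slope is about 0.6, so the predicted y rises by roughly 0.6 for each one-unit increase in x, and the second is about -0.6.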

Question 4

How do you predict y-values when variables are uncorrelated?

Accepted Answer

When variables are uncorrelated, the best-predicted value of y for any given x-value is the sample mean, y-bar. This is because no linear relationship exists between the variables, so the regression line provides no predictive advantage over using the mean. When linear correlation exists, you substitute the x-value into the regression equation for a more accurate prediction.
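The decision rule above can be sketched as a small Python function. The name `best_prediction` and the flag `linearly_correlated` are hypothetical. The sketch assumes the caller has already decided, for example with a hypothesis test on the correlation coefficient, whether a linear correlation exists.

```python
from statistics import mean

def best_prediction(x, xs, ys, linearly_correlated):
    """Predict y at x: use the sample mean when there is no linear
    correlation, otherwise use the least-squares regression equation."""
    y_bar = mean(ys)
    if not linearly_correlated:
        return y_bar  # the regression line offers no advantage over the mean
    x_bar = mean(xs)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
          / sum((xi - x_bar) ** 2 for xi in xs))
    b0 = y_bar - b1 * x_bar
    return b0 + b1 * x

# Illustrative data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

without_correlation = best_prediction(4, xs, ys, linearly_correlated=False)
with_correlation = best_prediction(4, xs, ys, linearly_correlated=True)
print(without_correlation, with_correlation)
```

In the uncorrelated case the prediction is the same for every x-value, because it never uses x at all.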

Question 5

What role does the y-intercept play in a regression equation?

Accepted Answer

The y-intercept describes the value of the dependent variable when the independent variable equals zero. It represents the point where the regression line crosses the y-axis on a scatter plot. Together with the slope, the y-intercept defines the complete regression equation used to predict y-values from x-values.
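As a small sketch of how the intercept fits into the equation, the Python below computes a least-squares slope and intercept and checks that the prediction at x = 0 equals the intercept. The names `fit_line` and `predict` and the data are assumptions made for this example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    # The least-squares line passes through (x_bar, y_bar),
    # which fixes the intercept once the slope is known.
    b0 = y_bar - b1 * x_bar
    return b0, b1

def predict(x, b0, b1):
    """Predicted y-value from the regression equation."""
    return b0 + b1 * x

# Illustrative data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(b0, predict(0, b0, b1))  # both are the value where the line crosses the y-axis
```

Note that x = 0 lies outside the range of this made-up data, so the intercept here is mainly a component of the equation and not necessarily a meaningful prediction on its own.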

Question 6

How are explained, unexplained, and total variation calculated in regression?

Accepted Answer

Each deviation type is squared and summed across all data points to yield the corresponding variation. Explained variation is the sum of squared differences between predicted y-values and the sample mean. Unexplained variation is the sum of squared residuals. Total variation is the sum of squared differences between observed y-values and the sample mean. For a least-squares regression line, total variation equals explained variation plus unexplained variation, and the ratio of explained to total variation is the coefficient of determination, which is the square of the linear correlation coefficient.
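The three sums of squares can be sketched in Python as follows. The function name `variations` and the data are invented for illustration, and the line is assumed to be a least-squares fit, which is the case where total variation equals explained plus unexplained variation.

```python
from statistics import mean

def variations(xs, ys):
    """Return (explained, unexplained, total) variation for a least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    predicted = [b0 + b1 * x for x in xs]

    explained = sum((p - y_bar) ** 2 for p in predicted)          # predicted vs. mean
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # squared residuals
    total = sum((y - y_bar) ** 2 for y in ys)                     # observed vs. mean
    return explained, unexplained, total

# Illustrative data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

explained, unexplained, total = variations(xs, ys)
print(explained, unexplained, total)
print(abs(explained + unexplained - total) < 1e-9)  # identity holds up to rounding
```

For this made-up data the values come out to roughly 3.6, 2.4, and 6, so explained divided by total is about 0.6.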

Question 7

Why might a regression model leave some variation unexplained?

Accepted Answer

Unexplained variation occurs because the relationship between two variables cannot account for all observed differences. This residual variation may result from random chance, measurement error, or the influence of other variables not included in the model. Understanding unexplained variation helps identify whether additional variables or a different model structure might improve predictions.
