Question 1

What is the difference between explained and unexplained deviation in regression analysis?

Accepted Answer

Explained deviation is the vertical distance between the predicted y-value and the sample mean of y; it represents the variation that the relationship between the variables accounts for. Unexplained deviation, or the residual, is the vertical distance between the actual data point and the predicted y-value; it represents variation due to chance or unmeasured variables. The two add up to the total deviation, the vertical distance between the actual data point and the sample mean.
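As a small illustration, the sketch below splits one observation's total deviation into its explained and unexplained parts. The function name `split_deviation` and the numbers (an observed y of 5, a predicted y of 5.2, a sample mean of 4) are made up for the example and do not come from the question.

```python
def split_deviation(y_observed, y_predicted, y_mean):
    """Return (explained, unexplained, total) deviation for one data point."""
    explained = y_predicted - y_mean        # predicted value vs. sample mean
    unexplained = y_observed - y_predicted  # residual: observed vs. predicted
    total = y_observed - y_mean             # observed value vs. sample mean
    return explained, unexplained, total


# Hypothetical point: observed y = 5, predicted y = 5.2, sample mean = 4
explained, unexplained, total = split_deviation(5.0, 5.2, 4.0)

# The two parts add up to the total deviation (up to floating-point rounding)
assert abs((explained + unexplained) - total) < 1e-12
```

Note that the residual is negative here: the point sits slightly below the line, even though it is above the mean.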

Question 2

How does the coefficient of determination measure regression model fit?

Accepted Answer

The coefficient of determination, or r-squared, is the explained variation divided by the total variation. It is the proportion of variation in the dependent variable that the regression line explains, so it ranges from 0 to 1. The closer r-squared is to 1, the more of the variation the model explains and the better it fits the observed data.
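Written as a formula, the ratio looks like this. The notation is an assumption, since the answer does not fix one: $\hat{y}_i$ stands for the predicted values, $y_i$ for the observed values, and $\bar{y}$ for the sample mean.

```latex
r^2 = \frac{\text{explained variation}}{\text{total variation}}
    = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}
```

For instance, with a hypothetical explained variation of 3.6 and total variation of 6, r-squared would be 0.6, meaning the line accounts for 60% of the variation in y.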

Question 3

What does the slope of a regression line tell us about two variables?

Accepted Answer

The slope describes the rate of change of the dependent variable with respect to the independent variable. It indicates how much the dependent variable (y) changes on average for every one-unit increase in the independent variable (x). A positive slope means y tends to increase as x increases, while a negative slope means y tends to decrease as x increases.
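One way to see this is to compute the slope directly. The sketch below assumes the usual least-squares line and uses made-up sample data; `least_squares_slope` is an illustrative name, not something from the answer.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, divided by the sum of squared x deviations
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Made-up sample data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope = least_squares_slope(xs, ys)
print(f"slope = {slope:.2f}")  # average change in y per one-unit increase in x
```

With these numbers the slope is positive, so y tends to rise as x rises; reversing the order of `ys` would give a negative slope.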

Question 4

How do you predict y-values when variables are uncorrelated?

Accepted Answer

When variables are uncorrelated, the best predicted value of y for any given x-value is the sample mean, y-bar. This is because no linear relationship exists between the variables, so the regression line provides no predictive advantage over using the mean. When a linear correlation does exist, you substitute the x-value into the regression equation for a more accurate prediction.
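The rule can be sketched as a small function. The flag `has_linear_correlation` is a hypothetical stand-in for however you decide that a linear correlation exists (for example, a significance test on r); the sketch does not perform that test, and the slope, intercept, and mean passed in are made-up values.

```python
def predict_y(x, slope, intercept, y_mean, has_linear_correlation):
    """Best predicted y for a given x under the rule described above."""
    if has_linear_correlation:
        # Use the regression equation: y-hat = intercept + slope * x
        return intercept + slope * x
    # No linear correlation: the sample mean is the best prediction for any x
    return y_mean


# Made-up values for illustration
with_line = predict_y(3.5, slope=0.6, intercept=2.2, y_mean=4.0,
                      has_linear_correlation=True)
without_line = predict_y(3.5, slope=0.6, intercept=2.2, y_mean=4.0,
                         has_linear_correlation=False)
print(with_line, without_line)
```

In the uncorrelated case the x-value is ignored entirely, which is the point of the rule.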

Question 5

What role does the y-intercept play in a regression equation?

Accepted Answer

The y-intercept describes the value of the dependent variable when the independent variable equals zero. It represents the point where the regression line crosses the y-axis on a scatter plot. Together with the slope, the y-intercept defines the complete regression equation used to predict y-values from x-values.
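In symbols, assuming the common notation $b_0$ for the y-intercept and $b_1$ for the slope (the answer itself does not name them), the least-squares regression line and its intercept are:

```latex
\hat{y} = b_0 + b_1 x, \qquad b_0 = \bar{y} - b_1 \bar{x}
```

Setting $x = 0$ gives $\hat{y} = b_0$, which is why the intercept is the predicted value of y when x equals zero.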

Question 6

How are explained, unexplained, and total variation calculated in regression?

Accepted Answer

Each deviation type is squared and summed across all data points to yield the corresponding variation. Explained variation is the sum of squared differences between the predicted y-values and the sample mean. Unexplained variation is the sum of squared residuals. Total variation is the sum of squared differences between the observed y-values and the sample mean. For a least-squares regression line, total variation equals explained variation plus unexplained variation, and the ratio of explained to total variation is the coefficient of determination, r-squared.
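The calculation can be sketched in a few lines of Python. This assumes a simple least-squares line fitted to made-up data; the function name `variations` is illustrative only.

```python
def variations(xs, ys):
    """Return (explained, unexplained, total) variation for a least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Fit the least-squares line: y-hat = intercept + slope * x
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    predicted = [intercept + slope * x for x in xs]

    explained = sum((p - y_mean) ** 2 for p in predicted)           # predicted vs. mean
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # squared residuals
    total = sum((y - y_mean) ** 2 for y in ys)                      # observed vs. mean
    return explained, unexplained, total


# Made-up sample data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
explained, unexplained, total = variations(xs, ys)

# For a least-squares line, explained + unexplained equals total
assert abs((explained + unexplained) - total) < 1e-9
r_squared = explained / total
```

Dividing `explained` by `total` at the end gives the proportion of variation the line accounts for.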

Question 7

Why might a regression model leave some variation unexplained?

Accepted Answer

Unexplained variation occurs because the relationship between two variables cannot account for all observed differences. This residual variation may result from random chance, measurement error, or the influence of other variables not included in the model. Understanding unexplained variation helps identify whether additional variables or a different model structure might improve predictions.

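11.8 : Variation

11.1 : Correlation

11.2 : Coefficient of Correlation

11.3 : Calculating and Interpreting the Linear Correlation Coefficient

11.4 : Regression Analysis

11.5 : Outliers and Influential Points

11.6 : Residuals and the Least-Squares Property

11.7 : Residual Plots

11.9 : Prediction Intervals

11.10 : Multiple Regression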