Linear Regression Residuals and Their Assumptions

Karthik
2 min read · Jul 5, 2020

What is a Residual?

A residual is the difference between the observed value of the dependent variable and the value predicted by the model, that is, residual = observed - predicted. Linear regression makes a few assumptions about these residuals; let’s discuss them.
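To make the definition concrete, here is a minimal sketch in plain Python. The data values and the helper names `fit_line` and `residuals` are made up for illustration; the line is fitted with ordinary least squares for a single predictor.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def residuals(x, y):
    """Residual = observed value minus predicted value."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

res = residuals(x, y)
print([round(r, 2) for r in res])
```

When the model includes an intercept, the residuals of a least squares fit sum to zero (up to floating point error), which is a quick sanity check on the calculation.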

Assumptions of Residuals

There are 4 main assumptions regarding residuals:

Linear relationship between the independent variables and Y (the dependent variable), so that the residuals show no systematic pattern

Error terms are normally distributed, with mean zero

Error terms are independent of each other

Homoscedasticity

Linear Relationship

The relationship between the independent variables and the dependent variable Y must be linear. We draw a scatter plot of the residuals against the predicted values of Y. If the points scatter randomly around zero, with no curve or trend, the assumption is satisfied.
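A minimal sketch of what a violation of linearity looks like, assuming deliberately curved, made-up data and a hypothetical helper `fit_and_residuals`: a straight line is fitted to values that are actually quadratic, and the residuals are printed next to the fitted values.

```python
from statistics import mean

def fit_and_residuals(x, y):
    """Fit y = intercept + slope * x by least squares; return (fitted, residuals)."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * a for a in x]
    return fitted, [b - f for b, f in zip(y, fitted)]

# Made-up data: the true relationship is quadratic, not linear
x = list(range(1, 11))
y_curved = [xi ** 2 for xi in x]

fitted, res = fit_and_residuals(x, y_curved)
for f, r in zip(fitted, res):
    print(f"fitted={f:8.2f}  residual={r:8.2f}")
```

The residuals come out positive at both ends and negative in the middle. That U shape, rather than a random scatter around zero, is the kind of pattern that suggests a straight line is not the right model.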

Normal Distribution

The residuals must be normally distributed with a mean of 0. We draw a distribution plot (a histogram or density plot) of the residuals. If the distribution is roughly bell-shaped, centered on 0, and not skewed, the assumption is satisfied.
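Alongside the plot, the mean and skewness of the residuals can be computed directly. The sketch below uses made-up residual values and a hypothetical helper `skewness`, which is the third central moment divided by the second central moment raised to the power 1.5. A skewness near 0 is consistent with a symmetric distribution but does not by itself prove normality.

```python
from statistics import mean

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up residuals: one symmetric set, one with a long right tail
symmetric = [-1.2, -0.7, -0.3, -0.1, 0.0, 0.1, 0.3, 0.7, 1.2]
right_skewed = [-1.0, -1.0, -1.0, -1.0, 4.0]

for name, res in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    print(name, "mean:", round(mean(res), 3), "skewness:", round(skewness(res), 3))
```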

Independence

The residuals should not depend on each other. We should not observe any trend in the residuals obtained. A scatter plot of the residuals, in the order the observations were collected, can help us understand this better.

[Figure: scatter plot of the residuals] No trend can be observed.
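One numerical companion to the plot is the Durbin-Watson statistic, which is not mentioned above and is brought in here only as an illustration. It divides the sum of squared differences between consecutive residuals by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation. The residual lists below are made up.

```python
def durbin_watson(res):
    """Durbin-Watson statistic for residuals listed in observation order."""
    num = sum((res[t] - res[t - 1]) ** 2 for t in range(1, len(res)))
    den = sum(e ** 2 for e in res)
    return num / den

# Made-up residuals: a steady upward trend and a strict sign flip
trending = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print("trending:", round(durbin_watson(trending), 3))
print("alternating:", round(durbin_watson(alternating), 3))
```

Both made-up series are dependent: the trending one lands close to 0 and the alternating one close to 4, whereas independent residuals would be expected to give a value around 2.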

Homoscedasticity

Error terms must have constant variance. We look at a scatter plot with the predicted values of Y on the horizontal axis and the standardized residuals on the vertical axis. If the residuals do not fan out as the predicted values increase, the equal variance assumption is met.
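A rough numerical version of the "fanning out" check is to compare the variance of the residuals for the smaller predicted values with the variance for the larger ones. This split-half comparison is only a crude sketch, not a formal test, and the helper `spread_ratio` and the data are made up for illustration.

```python
from statistics import pvariance

def spread_ratio(predicted, res):
    """Residual variance in the upper half of predictions divided by the lower half."""
    pairs = sorted(zip(predicted, res))          # order residuals by predicted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    return pvariance(high) / pvariance(low)

# Made-up predicted values and two sets of residuals
predicted = [1, 2, 3, 4, 5, 6, 7, 8]
even = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
fanning = [0.1, -0.1, 0.2, -0.2, 0.8, -0.8, 1.6, -1.6]

print("even spread ratio:", round(spread_ratio(predicted, even), 2))
print("fanning spread ratio:", round(spread_ratio(predicted, fanning), 2))
```

A ratio close to 1 is what constant variance would look like; a ratio far above 1 means the residuals spread out as the predicted values grow.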

This is a brief overview of linear regression residuals. Cost functions, gradient descent, and solving regression problems using Python are explained in other posts.
