Linear Regression Mean Squared Error
In linear regression, mean squared error (MSE) is a commonly used measure of the difference between the predicted values and the actual values. MSE is calculated by taking the sum of the squared differences between the predicted and actual values, and then dividing by the number of samples.
Here's the formula for MSE:
MSE = (1/n) * ∑(ŷᵢ - yᵢ)^2
Where the sum runs over the samples i = 1, ..., n and:
- n is the number of samples
- ŷᵢ is the predicted value for the i-th sample
- yᵢ is the actual value for the i-th sample
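To see the formula in action, here is a minimal from-scratch sketch in plain Python. The sample values are illustrative (they reappear in the library example below):
y_true = [1, 2, 3, 4]          # actual values y
y_pred = [1.5, 2.5, 2.9, 4.1]  # predicted values ŷ
n = len(y_true)
# Sum the squared differences, then divide by the number of samples.
mse = sum((yp - yt) ** 2 for yt, yp in zip(y_true, y_pred)) / n
print(mse)  # prints roughly 0.13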
MSE is a measure of the overall error of the model, and a lower MSE indicates a better fit. However, MSE is sensitive to outliers, meaning that a few extreme values can significantly influence the MSE.
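To illustrate this sensitivity, the sketch below uses a made-up extreme value in the last position; because the error is squared, a single outlier can inflate the MSE by several orders of magnitude:
def mse(actual, predicted):
    # Mean of the squared differences between actual and predicted values.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

y_pred = [1.5, 2.5, 2.9, 4.1]
print(mse([1, 2, 3, 4], y_pred))   # roughly 0.13
print(mse([1, 2, 3, 40], y_pred))  # roughly 322.33, one extreme value dominates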
To calculate MSE in Python, you can use the mean_squared_error() function from the sklearn.metrics module:
from sklearn.metrics import mean_squared_error
y_true = [1, 2, 3, 4]          # actual values
y_pred = [1.5, 2.5, 2.9, 4.1]  # predicted values
mse = mean_squared_error(y_true, y_pred)
print(mse)  # prints roughly 0.13
In this example, the MSE is approximately 0.13: the squared errors are 0.25, 0.25, 0.01, and 0.01, and their mean is 0.52 / 4 = 0.13. This small value indicates that the predicted values are close to the actual values.
Overall, MSE is a useful measure of the performance of a linear regression model, and it can be used to compare different models and choose the one with the lowest MSE. However, it's important to keep its limitations in mind, particularly its sensitivity to outliers, and to use it in combination with other measures of model performance.
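As one way to combine metrics, the sketch below computes MSE alongside mean absolute error (MAE) and R², both also available in sklearn.metrics. The choice of companion metrics here is an illustration, not something the discussion above prescribes:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = [1, 2, 3, 4]
y_pred = [1.5, 2.5, 2.9, 4.1]
print(mean_squared_error(y_true, y_pred))   # roughly 0.13, penalizes large errors heavily
print(mean_absolute_error(y_true, y_pred))  # 0.3, less sensitive to outliers
print(r2_score(y_true, y_pred))             # roughly 0.896, fraction of variance explained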