R-squared, also known as the coefficient of determination, is a statistical measure that represents the proportion of the variance in the dependent variable that is predictable from the independent variable(s). It provides a measure of how well observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.
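In simple linear regression (a single independent variable), the R-squared value equals the square of the Pearson correlation coefficient (r) between x and y, which is also the correlation between the observed and predicted values of the dependent variable: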
\[R^2 = r^2 = \left(\frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}\right)^2\]
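Where:
n is the number of data points
\(\sum xy\) is the sum of the products of the paired x and y values
\(\sum x\) and \(\sum y\) are the sums of the x values and the y values
\(\sum x^2\) and \(\sum y^2\) are the sums of the squared x values and the squared y values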
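R-squared values range from 0 to 1, where:
0 means the model explains none of the variance in the dependent variable
1 means the model explains all of the variance in the dependent variable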
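As an illustration, the correlation-based formula above can be sketched in plain Python. The function name `r_squared_from_sums` is hypothetical, chosen for this example, and the sketch assumes a single independent variable with paired lists of equal length.

```python
from math import sqrt


def r_squared_from_sums(x, y):
    """Square of the Pearson correlation between x and y, using raw sums."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))   # sum of paired products
    sum_x2 = sum(a * a for a in x)              # sum of squared x values
    sum_y2 = sum(b * b for b in y)              # sum of squared y values

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # r is undefined when either variable has zero variance
        raise ValueError("x or y has zero variance")

    return (numerator / denominator) ** 2
```

For perfectly linear data such as `r_squared_from_sums([1, 2, 3], [2, 4, 6])`, the result is 1.0, matching the upper end of the range.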
Let's calculate R-squared for the dataset:
X: 1, 2, 3, 4, 5
Y: 2, 4, 5, 4, 5
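For this dataset, n = 5, \(\sum x = 15\), \(\sum y = 20\), \(\sum xy = 66\), \(\sum x^2 = 55\), and \(\sum y^2 = 86\). Substituting into the formula:
\[R^2 = \left(\frac{5(66) - (15)(20)}{\sqrt{[5(55) - 15^2][5(86) - 20^2]}}\right)^2 = \left(\frac{30}{\sqrt{(50)(30)}}\right)^2 = \frac{900}{1500} = 0.6\]
Therefore, 60% of the variance in Y can be explained by its linear relationship with X.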
This scatter plot represents the example dataset. The red line indicates the best fit line, and the closeness of the points to this line visually represents the strength of the correlation and, consequently, the R-squared value.
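The best-fit line and the "proportion of variance explained" reading of R-squared can be cross-checked with a short sketch. It assumes an ordinary least-squares line and the standard decomposition R² = 1 − SS_res / SS_tot; the names `fit_line`, `r_squared_from_fit`, `ss_res`, and `ss_tot` are hypothetical labels introduced for this example and do not come from the page.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared_from_fit(x, y):
    """R-squared as 1 - (residual sum of squares / total sum of squares)."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    predicted = [slope * a + intercept for a in x]
    ss_res = sum((b - p) ** 2 for b, p in zip(y, predicted))  # unexplained variation
    ss_tot = sum((b - mean_y) ** 2 for b in y)                # total variation
    return 1 - ss_res / ss_tot


# Example dataset from this page
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
print(f"best-fit line: y = {slope:.2f}x + {intercept:.2f}")
print(f"R-squared: {r_squared_from_fit(x, y):.4f}")
```

With a single independent variable, this residual-based value agrees with the squared correlation coefficient, so either route can be used to check the other.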