R-Squared Calculator
Enter observed Y values and predicted Ŷ values to calculate R² and adjusted R² — with residual sum of squares, total sum of squares, and a plain-language interpretation of your model's goodness of fit.
Tips & Notes
- ✓R² always increases (or stays the same) when you add more predictors — even useless ones. Use adjusted R² when comparing models with different numbers of predictors to avoid overfitting.
- ✓A high R² does not mean the model is correct. You can get R²=0.95 with a completely wrong functional form if the data happens to fit a line. Always examine residual plots alongside R².
- ✓R² can be negative in rare cases — when a model fits worse than a horizontal line at the mean. This signals a severely misspecified model, not a mathematical error.
- ✓In time-series regression, R² is often artificially high due to trends in both variables. Two unrelated trending variables (GDP and sea level) can produce R²>0.90 with no causal relationship.
- ✓Perfect R²=1.0 in real data is a red flag, not a success. It usually means the model is overfit, circular, or the predicted values were derived from the observed values rather than estimated independently.
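The first tip above, that raw R² never decreases when a predictor is added, can be checked numerically. A minimal NumPy sketch (the data, seed, and helper names are illustrative, not part of the calculator):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)   # y truly depends only on x
noise = rng.normal(size=n)         # a useless extra predictor

def r_squared(y, X):
    """R² from an OLS fit of y on the columns of X (intercept added)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

def adjusted(r2, n, k):
    """Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2_one = r_squared(y, x.reshape(-1, 1))             # x only
r2_two = r_squared(y, np.column_stack([x, noise]))  # x + noise

# Raw R² never decreases when a column is added; adjusted R²
# charges for the extra parameter and typically does not improve.
assert r2_two >= r2_one
adj_one, adj_two = adjusted(r2_one, n, 1), adjusted(r2_two, n, 2)
```

Swapping in different random seeds changes the exact values, but `r2_two >= r2_one` holds every time, which is why adjusted R² is the right yardstick for comparing models of different sizes.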
Common Mistakes
- ✗Concluding that high R² means the model is correct or causal. R²=0.90 means 90% of variance is explained by the statistical relationship — it says nothing about whether X causes Y or whether the model form is appropriate.
- ✗Comparing R² across models with different numbers of predictors. Adding any variable, even a random one, increases R². Always use adjusted R² for model comparison — it penalizes for unnecessary complexity.
- ✗Expecting the same R² benchmark across all fields. R²=0.30 is poor for predicting physical measurements but strong for predicting human behavior. Field context determines what constitutes a good R².
- ✗Using R² to evaluate non-linear models without transformation. Standard R² assumes a linear model. For non-linear models, R² can be misleading — use other fit statistics or ensure you are computing it correctly for the model type.
- ✗Treating R² as the only diagnostic. A model with R²=0.85 can still have heteroscedasticity, non-normal residuals, or influential outliers that violate assumptions and make predictions unreliable.
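The first and last mistakes above can be seen together in one toy fit: a straight line through curved data still scores a high R², while the residuals reveal the misspecification. A sketch with illustrative data (truly quadratic, fit with a line):

```python
import numpy as np

x = np.linspace(0, 10, 50)
y = x ** 2                      # truly quadratic relationship

# Least-squares straight line: y ≈ intercept + slope·x
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

ss_res = np.sum(resid ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot        # high R² despite the wrong model form

# The residuals are not random noise: positive at both ends,
# negative in the middle -- a systematic pattern R² alone hides.
```

Here `r2` comes out above 0.9 even though the linear form is wrong, which is why residual plots belong next to every reported R².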
R-Squared Calculator Overview
R-squared (R²) answers the most fundamental question about any regression model: how much of the variation in your outcome variable does the model actually explain? An R² of 0.75 means 75% of the variation in Y is captured by the model — the remaining 25% comes from factors not included or from random noise. R² is the most widely reported measure of model fit, appearing in everything from simple linear regression to complex machine learning models.
R² formula — proportion of variance explained:
R² = 1 − (SS_res / SS_tot) = 1 − Σ(yᵢ − ŷᵢ)² / Σ(yᵢ − ȳ)²
EX: Observed Y: [3,5,7,9,11], Predicted Ŷ: [3,5,7,9,11] → SS_res=0, SS_tot=40 → R² = 1−(0/40) = 1.00 (perfect fit)
Residual sum of squares (SS_res) — unexplained variation:
SS_res = Σ(yᵢ − ŷᵢ)²
EX: Observed [4,7,9], Predicted [5,6,10] → residuals: −1,+1,−1 → SS_res = 1+1+1 = 3
Total sum of squares (SS_tot) — total variation in Y:
SS_tot = Σ(yᵢ − ȳ)²
EX: Y=[4,7,9], ȳ=6.67 → deviations: −2.67,+0.33,+2.33 → SS_tot = 7.11+0.11+5.44 = 12.67 → R² = 1−(3/12.67) = 0.763
Adjusted R² — penalizes for adding unnecessary predictors:
Adjusted R² = 1 − [(1−R²)(n−1) / (n−k−1)]
EX: R²=0.80, n=50, k=3 predictors → Adj R² = 1−[(0.20×49)/46] = 1−0.213 = 0.787 — slightly lower, penalizing for 3 predictors
R² interpretation by field — what counts as a good fit:
| R² Range | General Interpretation | Typical in Social Sciences | Typical in Engineering |
|---|---|---|---|
| 0.90 – 1.00 | Excellent fit | Rare and suspicious | Expected for physical models |
| 0.70 – 0.89 | Good fit | Strong result | Acceptable |
| 0.50 – 0.69 | Moderate fit | Reasonable | May need improvement |
| 0.30 – 0.49 | Weak fit | Often published | Poor |
| 0.00 – 0.29 | Very weak | May still be useful | Unacceptable |
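The formulas above translate directly into code. A plain-Python sketch of the calculator's core, reproducing the worked SS_res/SS_tot example (function names are illustrative):

```python
def r_squared(observed, predicted):
    """R² = 1 − SS_res / SS_tot."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R² = 1 − (1 − R²)(n − 1) / (n − k − 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2 = r_squared([4, 7, 9], [5, 6, 10])      # worked example above → ≈ 0.763
adj = adjusted_r_squared(0.80, n=50, k=3)  # worked example above → ≈ 0.787
```

Both calls match the hand-computed examples: SS_res = 3 and SS_tot ≈ 12.67 give R² ≈ 0.763, and the adjusted formula shaves 0.80 down to ≈ 0.787.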
Frequently Asked Questions
What does R² actually measure?
R² measures the proportion of variance in Y that is explained by the model. R²=0.70 means 70% of the variation in the outcome variable is captured by the predictors — the other 30% comes from unmeasured factors or random noise. R²=0 means the model explains nothing (no better than predicting the mean every time). R²=1 means the model explains everything perfectly, with zero residual error.
What is adjusted R², and when should I use it?
R² always increases when you add predictors, even irrelevant ones. Adjusted R² penalizes for model complexity using the formula: Adj R² = 1−[(1−R²)(n−1)/(n−k−1)]. Adding a predictor that improves fit increases adjusted R²; adding an irrelevant predictor decreases it. When comparing models with different numbers of predictors, always use adjusted R² — raw R² will always favor the more complex model.
Can R² be negative?
Yes, though it is rare. R² = 1 − SS_res/SS_tot becomes negative when SS_res > SS_tot — when the model's predictions are worse than simply predicting ȳ (the mean) for every observation. This signals a severely misspecified model. In practice, R² < 0 means you should abandon the current model entirely — a horizontal line at the mean would be more accurate.
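A few lines make the negative case concrete (the numbers here are made up for illustration):

```python
observed = [1, 2, 3]
predicted = [3, 3, 0]   # worse than just guessing the mean (2) every time

mean_y = sum(observed) / len(observed)
ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # 4+1+9 = 14
ss_tot = sum((y - mean_y) ** 2 for y in observed)                # 1+0+1 = 2
r2 = 1 - ss_res / ss_tot   # 1 − 14/2 = −6.0, because SS_res > SS_tot
```

The arithmetic is unremarkable; the point is that nothing in the formula forces R² to stay above zero once the model underperforms the mean.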
What counts as a good R² value?
It depends entirely on the field. Physical and engineering models often require R² > 0.95 — the relationship between voltage and current, for example, should be nearly perfect. In social science, economics, and behavioral research, R² = 0.30 is often considered meaningful because human behavior has many unmeasured drivers. Never judge R² without knowing the context and what factors are realistically predictable.
How is R² related to the correlation coefficient r?
For simple linear regression (one predictor), R² = r². If Pearson r = 0.85, then R² = 0.7225 — the predictor explains 72.25% of outcome variance. This relationship only holds exactly for simple linear regression with one X variable. For multiple regression with several predictors, R² is still meaningful but cannot be reduced to a single correlation coefficient.
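The identity R² = r² for one predictor is easy to verify numerically; a NumPy sketch with illustrative simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 1.5 * x + rng.normal(size=30)

# Pearson correlation between x and y
r = np.corrcoef(x, y)[0, 1]

# R² from a one-predictor least-squares fit
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)
r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

assert np.isclose(r ** 2, r2)   # R² = r² holds exactly for one predictor
```

With two or more predictors the analogue is the multiple correlation between y and the fitted values, not any single pairwise r.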
Is a higher R² always better?
No. Maximizing R² leads to overfitting — a model that fits the current data perfectly but predicts new data poorly. Adding more predictors always increases R², even if those predictors are meaningless. Good modeling balances fit (R²) against parsimony (fewer predictors). Use adjusted R², cross-validation, or information criteria (AIC, BIC) to find the model that generalizes well, not just the one that fits the training data best.