R-Squared Calculator
Enter observed Y values and predicted Ŷ values to calculate R² and adjusted R² — with residual sum of squares, total sum of squares, and a plain-language interpretation of your model's goodness of fit.
Tips & Notes
- ✓R² always increases (or stays the same) when you add more predictors — even useless ones. Use adjusted R² when comparing models with different numbers of predictors to avoid overfitting.
- ✓A high R² does not mean the model is correct. You can get R²=0.95 with a completely wrong functional form if the data happens to fit a line. Always examine residual plots alongside R².
- ✓R² can be negative in rare cases — when a model fits worse than a horizontal line at the mean. This signals a severely misspecified model, not a mathematical error.
- ✓In time-series regression, R² is often artificially high due to trends in both variables. Two unrelated trending variables (GDP and sea level) can produce R²>0.90 with no causal relationship.
- ✓Perfect R²=1.0 in real data is a red flag, not a success. It usually means the model is overfit, circular, or the predicted values were derived from the observed values rather than estimated independently.
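The first tip above, that raw R² never decreases when a predictor is added, can be checked numerically. A minimal NumPy sketch (the data, seed, and helper names are illustrative, not part of the calculator):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)   # y truly depends only on x
noise = rng.normal(size=n)         # a useless extra predictor

def r_squared(y, X):
    """R² from an OLS fit of y on the columns of X (intercept added)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

def adjusted(r2, n, k):
    """Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2_one = r_squared(y, x.reshape(-1, 1))             # x only
r2_two = r_squared(y, np.column_stack([x, noise]))  # x + noise

# Raw R² never decreases when a column is added; adjusted R²
# charges for the extra parameter and typically does not improve.
assert r2_two >= r2_one
adj_one, adj_two = adjusted(r2_one, n, 1), adjusted(r2_two, n, 2)
```

Swapping in different random seeds changes the exact values, but `r2_two >= r2_one` holds every time, which is why adjusted R² is the right yardstick for comparing models of different sizes.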
Common Mistakes
- ✗Concluding that high R² means the model is correct or causal. R²=0.90 means 90% of variance is explained by the statistical relationship — it says nothing about whether X causes Y or whether the model form is appropriate.
- ✗Comparing R² across models with different numbers of predictors. Adding any variable, even a random one, increases R². Always use adjusted R² for model comparison — it penalizes for unnecessary complexity.
- ✗Expecting the same R² benchmark across all fields. R²=0.30 is poor for predicting physical measurements but strong for predicting human behavior. Field context determines what constitutes a good R².
- ✗Using R² to evaluate non-linear models without transformation. Standard R² assumes a linear model. For non-linear models, R² can be misleading — use other fit statistics or ensure you are computing it correctly for the model type.
- ✗Treating R² as the only diagnostic. A model with R²=0.85 can still have heteroscedasticity, non-normal residuals, or influential outliers that violate assumptions and make predictions unreliable.
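The first and last mistakes above can be seen together in one toy fit: a straight line through curved data still scores a high R², while the residuals reveal the misspecification. A sketch with illustrative data (truly quadratic, fit with a line):

```python
import numpy as np

x = np.linspace(0, 10, 50)
y = x ** 2                      # truly quadratic relationship

# Least-squares straight line: y ≈ intercept + slope·x
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

ss_res = np.sum(resid ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot        # high R² despite the wrong model form

# The residuals are not random noise: positive at both ends,
# negative in the middle -- a systematic pattern R² alone hides.
```

Here `r2` comes out above 0.9 even though the linear form is wrong, which is why residual plots belong next to every reported R².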
R-Squared Calculator Overview
R-squared (R²) answers the most fundamental question about any regression model: how much of the variation in your outcome variable does the model actually explain? An R² of 0.75 means 75% of the variation in Y is captured by the model — the remaining 25% comes from factors not included or from random noise. R² is the most widely reported measure of model fit, appearing in everything from simple linear regression to complex machine learning models.
R² formula — proportion of variance explained:
R² = 1 − (SS_res / SS_tot) = 1 − Σ(yᵢ − ŷᵢ)² / Σ(yᵢ − ȳ)²
EX: Observed Y: [3,5,7,9,11], Predicted Ŷ: [3,5,7,9,11] → SS_res=0, SS_tot=40 → R² = 1−(0/40) = 1.00 (perfect fit)
Residual sum of squares (SS_res) — unexplained variation:
SS_res = Σ(yᵢ − ŷᵢ)²
EX: Observed [4,7,9], Predicted [5,6,10] → residuals: −1,+1,−1 → SS_res = 1+1+1 = 3
Total sum of squares (SS_tot) — total variation in Y:
SS_tot = Σ(yᵢ − ȳ)²
EX: Y=[4,7,9], ȳ=6.67 → deviations: −2.67,+0.33,+2.33 → SS_tot = 7.11+0.11+5.44 = 12.67 → R² = 1−(3/12.67) = 0.763
Adjusted R² — penalizes for adding unnecessary predictors:
Adjusted R² = 1 − [(1−R²)(n−1) / (n−k−1)]
EX: R²=0.80, n=50, k=3 predictors → Adj R² = 1−[(0.20×49)/46] = 1−0.213 = 0.787 — slightly lower, penalizing for 3 predictors
R² interpretation by field — what counts as a good fit:
| R² Range | General Interpretation | Typical in Social Sciences | Typical in Engineering |
|---|---|---|---|
| 0.90 – 1.00 | Excellent fit | Rare and suspicious | Expected for physical models |
| 0.70 – 0.89 | Good fit | Strong result | Acceptable |
| 0.50 – 0.69 | Moderate fit | Reasonable | May need improvement |
| 0.30 – 0.49 | Weak fit | Often published | Poor |
| 0.00 – 0.29 | Very weak | May still be useful | Unacceptable |
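The formulas above translate directly into code. A plain-Python sketch of the calculator's core, reproducing the worked SS_res/SS_tot example (function names are illustrative):

```python
def r_squared(observed, predicted):
    """R² = 1 − SS_res / SS_tot."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R² = 1 − (1 − R²)(n − 1) / (n − k − 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2 = r_squared([4, 7, 9], [5, 6, 10])      # worked example above → ≈ 0.763
adj = adjusted_r_squared(0.80, n=50, k=3)  # worked example above → ≈ 0.787
```

Both calls match the hand-computed examples: SS_res = 3 and SS_tot ≈ 12.67 give R² ≈ 0.763, and the adjusted formula shaves 0.80 down to ≈ 0.787.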
Frequently Asked Questions
What does R² actually measure?
R² measures the proportion of variance in Y that is explained by the model. R²=0.70 means 70% of the variation in the outcome variable is captured by the predictors — the other 30% comes from unmeasured factors or random noise. R²=0 means the model explains nothing (no better than predicting the mean every time). R²=1 means the model explains everything perfectly, with zero residual error.
What is adjusted R², and when should I use it?
R² always increases when you add predictors, even irrelevant ones. Adjusted R² penalizes for model complexity using the formula: Adj R² = 1−[(1−R²)(n−1)/(n−k−1)]. Adding a predictor that improves fit increases adjusted R²; adding an irrelevant predictor decreases it. When comparing models with different numbers of predictors, always use adjusted R² — raw R² will always favor the more complex model.
Can R² be negative?
Yes, though it is rare. R² = 1 − SS_res/SS_tot becomes negative when SS_res > SS_tot — when the model's predictions are worse than simply predicting ȳ (the mean) for every observation. This signals a severely misspecified model. In practice, R² < 0 means you should abandon the current model entirely — a horizontal line at the mean would be more accurate.
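A few lines make the negative case concrete (the numbers here are made up for illustration):

```python
observed = [1, 2, 3]
predicted = [3, 3, 0]   # worse than just guessing the mean (2) every time

mean_y = sum(observed) / len(observed)
ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # 4+1+9 = 14
ss_tot = sum((y - mean_y) ** 2 for y in observed)                # 1+0+1 = 2
r2 = 1 - ss_res / ss_tot   # 1 − 14/2 = −6.0, because SS_res > SS_tot
```

The arithmetic is unremarkable; the point is that nothing in the formula forces R² to stay above zero once the model underperforms the mean.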
What counts as a good R² value?
It depends entirely on the field. Physical and engineering models often require R² > 0.95 — the relationship between voltage and current, for example, should be nearly perfect. In social science, economics, and behavioral research, R² = 0.30 is often considered meaningful because human behavior has many unmeasured drivers. Never judge R² without knowing the context and what factors are realistically predictable.
How is R² related to the correlation coefficient r?
For simple linear regression (one predictor), R² = r². If Pearson r = 0.85, then R² = 0.7225 — the predictor explains 72.25% of outcome variance. This relationship only holds exactly for simple linear regression with one X variable. For multiple regression with several predictors, R² is still meaningful but cannot be reduced to a single correlation coefficient.
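The identity R² = r² for one predictor is easy to verify numerically; a NumPy sketch with illustrative simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 1.5 * x + rng.normal(size=30)

# Pearson correlation between x and y
r = np.corrcoef(x, y)[0, 1]

# R² from a one-predictor least-squares fit
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)
r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

assert np.isclose(r ** 2, r2)   # R² = r² holds exactly for one predictor
```

With two or more predictors the analogue is the multiple correlation between y and the fitted values, not any single pairwise r.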
Is a higher R² always better?
No. Maximizing R² leads to overfitting — a model that fits the current data perfectly but predicts new data poorly. Adding more predictors always increases R², even if those predictors are meaningless. Good modeling balances fit (R²) against parsimony (fewer predictors). Use adjusted R², cross-validation, or information criteria (AIC, BIC) to find the model that generalizes well, not just the one that fits the training data best.