Correlation Coefficient Calculator

Enter paired X and Y values to calculate the Pearson correlation coefficient (r), coefficient of determination (r²), and strength of linear relationship — with step-by-step formula breakdown and interpretation.

Tips & Notes

  • Correlation measures only linear relationships. Two variables can have r = 0 but still have a strong non-linear relationship (e.g., quadratic). Always plot your data before concluding there is no relationship.
  • Outliers can dramatically inflate or deflate r. One extreme data point can make an otherwise weak correlation appear strong, or make a strong correlation disappear. Plot your data and check for outliers first.
  • Correlation requires at least 3 data points, but a meaningful estimate typically needs 20 or more. With n=5, |r| must exceed about 0.88 to be statistically significant at p < 0.05 (two-tailed); with n=100, even r=0.20 can be statistically significant.
  • Use Spearman rank correlation instead of Pearson when data is heavily skewed, contains outliers, or is measured on an ordinal scale (ranks rather than numerical measurements).
  • r² indicates practical importance, not just statistical significance. r=0.30 sounds weak, and r²=0.09 means X explains only 9% of Y's variation, yet that may still be practically meaningful depending on the field.
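
The first two tips can be checked with a few lines of Python. This is only an illustrative sketch: the data points are made up for the example, and `pearson_r` is a helper name chosen here, not something the calculator exposes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Tip 1: a perfect quadratic relationship (y = x squared) with zero linear correlation.
quad_x = [-2, -1, 0, 1, 2]
quad_y = [x ** 2 for x in quad_x]
r_quadratic = pearson_r(quad_x, quad_y)

# Tip 2: made-up data with a weak correlation, then the same data plus one extreme point.
base_x = [1, 2, 3, 4, 5]
base_y = [2, 4, 1, 5, 3]
r_base = pearson_r(base_x, base_y)
r_with_outlier = pearson_r(base_x + [20], base_y + [20])

print(f"quadratic data:  r = {r_quadratic:.2f}")
print(f"without outlier: r = {r_base:.2f}")
print(f"with outlier:    r = {r_with_outlier:.2f}")
```

With these numbers the quadratic data gives r = 0 even though Y is fully determined by X, and the single added point lifts r from 0.30 to roughly 0.97.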

Common Mistakes

  • Concluding causation from correlation. r = 0.9 between two variables means they are strongly related — not that one causes the other. Always consider confounding variables and plausible mechanisms before claiming causation.
  • Using Pearson correlation for non-linear data. If a scatterplot shows a curved relationship, r can be near 0 even when the relationship is strong and predictable. Pearson only detects straight-line relationships.
  • Ignoring sample size when interpreting r. At the 5% level, r=0.4 with n=10 is not statistically significant, while r=0.2 with n=200 is. Always check the p-value or confidence interval for the correlation, especially with small samples.
  • Assuming that r=0 means the variables are independent. Zero Pearson correlation only means no linear relationship. Variables can have r=0 and still be perfectly predictable from each other through a non-linear formula.
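  • Comparing correlations from different datasets without considering the range of the data. r = 0.7 in one study and r = 0.7 in another do not necessarily describe equally strong relationships, because the spread of X and Y in each sample affects the calculated r. Restricting the range of X, for example, usually lowers r even when the underlying relationship is unchanged.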
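
To make the sample-size mistake concrete, the sketch below computes an approximate 95% confidence interval for r using the Fisher z-transformation. This is one common approximation, not necessarily what the calculator does, and it assumes roughly bivariate-normal data. The function name `fisher_ci` is made up for this example.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z-transformation."""
    if n <= 3:
        raise ValueError("the Fisher approximation needs n > 3")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                      # transform r to the z scale
    se = 1 / math.sqrt(n - 3)              # approximate standard error of z
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Build the interval on the z scale, then transform back to the r scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low_small, high_small = fisher_ci(0.4, 10)    # r = 0.4 with n = 10
low_large, high_large = fisher_ci(0.2, 200)   # r = 0.2 with n = 200

print(f"r=0.4, n=10:  [{low_small:.2f}, {high_small:.2f}]")
print(f"r=0.2, n=200: [{low_large:.2f}, {high_large:.2f}]")
```

The interval for r = 0.4 with n = 10 spans zero, while the interval for r = 0.2 with n = 200 stays above zero, which matches the significance statements in the list above.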

Correlation Coefficient Calculator Overview

The Pearson correlation coefficient (r) measures the strength and direction of the linear relationship between two variables. It ranges from −1 to +1: a value of +1 means a perfect positive linear relationship (every increase in X comes with a fixed, positive change in Y), −1 means a perfect negative linear relationship, and 0 means no linear relationship. Correlation is one of the most used statistics in research, business, and science, and one of the most frequently misinterpreted.

Pearson correlation coefficient formula:

r = Σ[(xᵢ−x̄)(yᵢ−ȳ)] / √[Σ(xᵢ−x̄)² × Σ(yᵢ−ȳ)²]
Example: Hours studied (X): [1,2,3,4,5] | Exam scores (Y): [50,60,70,80,90] → x̄=3, ȳ=70 → r = 1.00 (perfect positive correlation: every extra hour adds exactly 10 points)
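
The formula translates almost term by term into code. The following Python sketch applies it to the hours-studied example; the function and variable names are chosen for this illustration only.

```python
import math

def pearson_r(xs, ys):
    """r = sum((x - x_mean)(y - y_mean)) / sqrt(sum((x - x_mean)^2) * sum((y - y_mean)^2))"""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of paired deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]
scores = [50, 60, 70, 80, 90]
r = pearson_r(hours, scores)
print(f"r = {r:.2f}")
```

For this data the numerator is 100 and the two sums of squares are 10 and 1000, so r = 100 / √10000 = 1.00.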

Coefficient of determination (r²), variance explained:

r² = r × r
Example: r = 0.85 → r² = 0.7225 → Study time explains 72.25% of the variation in exam scores. The remaining 27.75% comes from other factors.
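
One way to see why r² is read as "variance explained" is to fit a least-squares line and compare r × r with 1 − (residual variation / total variation). The sketch below does this on made-up study data; the dataset and the helper names are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def variance_explained(xs, ys):
    """Share of the variation in ys accounted for by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the line is fitted, versus total variation in ys.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: hours studied and exam scores with some scatter.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 71, 69, 85, 88]

r = pearson_r(hours, scores)
r_squared = r * r
explained = variance_explained(hours, scores)
print(f"r = {r:.3f}, r squared = {r_squared:.3f}, variance explained = {explained:.3f}")
```

The two quantities agree up to floating-point rounding, which holds for any simple linear fit with an intercept.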

Interpreting correlation strength:

|r| value     Interpretation   r² (variance explained)   Practical meaning
0.00 – 0.09   Negligible       0 – 1%                    Essentially no linear relationship
0.10 – 0.29   Weak             1 – 8%                    Small, often practically insignificant
0.30 – 0.49   Moderate         9 – 24%                   Noticeable relationship, other factors important
0.50 – 0.69   Strong           25 – 48%                  Meaningful relationship for most purposes
0.70 – 0.89   Very strong      49 – 79%                  One variable is a good predictor of the other
0.90 – 1.00   Near perfect     81 – 100%                 Variables move almost in lockstep
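
The bands in the table can be expressed as a small lookup. The sketch below assumes each band is half-open (for example, 0.10 up to but not including 0.30), which is an interpretation of the table rather than something it states. `describe_strength` is a hypothetical helper name.

```python
def describe_strength(r):
    """Map a correlation coefficient to the strength label used in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")
    magnitude = abs(r)  # strength depends on |r|; the sign only gives direction
    if magnitude < 0.10:
        return "Negligible"
    if magnitude < 0.30:
        return "Weak"
    if magnitude < 0.50:
        return "Moderate"
    if magnitude < 0.70:
        return "Strong"
    if magnitude < 0.90:
        return "Very strong"
    return "Near perfect"

for value in (0.05, -0.25, 0.30, 0.65, -0.85, 0.99):
    print(f"r = {value:+.2f}: {describe_strength(value)}")
```

These cut-offs are conventions rather than laws, so a label such as "Weak" should be read alongside the field and the sample size.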

Correlation measures the strength and direction of a linear relationship between two variables, not causation. A strong r value, even r = 0.99, does not mean that one variable causes the other. Both variables might be driven by a third unmeasured factor, the relationship might be coincidental, or the causation might run in the opposite direction from what is assumed. Establishing causation generally requires controlled experiments or a careful causal study design; an observed correlation alone is not enough.

r² (the coefficient of determination) is often more useful than r alone because it reports what fraction of the variance in one variable is explained by its linear relationship with the other. r = 0.7 means r² = 0.49, so the linear relationship accounts for 49% of the variation. The other 51% comes from other factors or random variation.

Always plot the data before interpreting r; two datasets can have identical r values but completely different scatter patterns.
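
A well-known illustration of identical r values with different scatter patterns is Anscombe's quartet. The sketch below uses the first two of its four datasets (values as commonly published); the first looks like a noisy straight line when plotted, the second like a smooth curve. The helper name `pearson_r` is chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Anscombe's quartet, datasets I and II (they share the same x values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_linear = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_curved = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print(f"dataset I:  r = {r_linear:.3f}")
print(f"dataset II: r = {r_curved:.3f}")
```

Both datasets give r of about 0.816, so the coefficient alone cannot tell the straight-line pattern from the curved one.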

Frequently Asked Questions

r = 0.8 indicates a very strong positive linear relationship between the two variables: as X increases, Y tends to increase. r² = 0.64 means that 64% of the variation in Y is explained by its linear relationship with X; the remaining 36% comes from other factors or random variation. In most fields, r = 0.8 is large enough to be practically significant.

Correlation shows that two variables move together — it says nothing about why. Classic example: shoe size correlates with reading ability in children (r ≈ 0.8) because both increase with age. Older children have bigger feet AND read better; foot size does not cause reading ability. Always consider: is there a direct causal mechanism? Could a third variable (confounder) drive both? Could the causation run the other way?
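
The shoe-size example can be imitated with simulated data. In the sketch below, age drives both shoe size and reading score, and neither affects the other. All numbers are synthetic, the units are arbitrary, and the partial-correlation formula is a standard way to remove the linear effect of a third variable; it is shown here as an illustration, not as a feature of the calculator.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(xs, ys, zs):
    """Correlation between xs and ys after removing the linear effect of zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)  # fixed seed so the run is repeatable
ages = [rng.uniform(5, 12) for _ in range(2000)]
# Synthetic measurements: each depends on age plus independent noise.
shoe_size = [age + rng.gauss(0, 1) for age in ages]
reading = [age + rng.gauss(0, 1) for age in ages]

r_raw = pearson_r(shoe_size, reading)
r_given_age = partial_r(shoe_size, reading, ages)
print(f"shoe size vs reading:           r = {r_raw:.2f}")
print(f"same, controlling for age: partial r = {r_given_age:.2f}")
```

The raw correlation comes out strong, while the partial correlation given age is close to zero, which is the signature of a confounder driving both variables.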

r (correlation coefficient) measures the direction and strength of a linear relationship, ranging from −1 to +1. r² (coefficient of determination) measures what proportion of Y's variation is explained by X, ranging from 0 to 1 (or 0% to 100%). Example: r=0.6 → r²=0.36, meaning X explains 36% of Y's variation. r² is never negative, whatever the direction of the correlation, so it shows how much variation is explained but not whether Y rises or falls with X. It is often the more practically meaningful measure.

For a statistically reliable Pearson r, you generally need at least 20–30 paired data points. With fewer points, the estimate is unstable — a single additional data point can change r substantially. With n=5, you need r > 0.88 for statistical significance at p < 0.05. With n=30, r > 0.36 is significant. Always report the sample size alongside any correlation coefficient.
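
The significance thresholds quoted above follow from the t-test for a correlation: with df = n − 2, the smallest significant |r| is t / √(t² + df), where t is the critical t value. The sketch below hard-codes three two-tailed 5% critical values from a standard t table; the dictionary and function names are made up for the example.

```python
import math

# Two-tailed critical t values at p = 0.05, keyed by degrees of freedom (df = n - 2).
# Taken from a standard t table; only the sample sizes used below are included.
T_CRITICAL_05 = {3: 3.182, 8: 2.306, 28: 2.048}

def critical_r(n):
    """Smallest |r| that is significant at p < 0.05 (two-tailed) for n paired points."""
    df = n - 2
    if df not in T_CRITICAL_05:
        raise ValueError(f"no tabulated t value for df={df} in this sketch")
    t = T_CRITICAL_05[df]
    # Invert t = r * sqrt(df / (1 - r^2)) to solve for r.
    return t / math.sqrt(t * t + df)

for n in (5, 10, 30):
    print(f"n = {n:2d}: |r| must exceed {critical_r(n):.3f}")
```

This gives thresholds of roughly 0.88 for n = 5, 0.63 for n = 10, and 0.36 for n = 30.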

Use Spearman correlation when: data contains outliers that would distort Pearson r, data is measured on an ordinal scale (rankings rather than numbers), data is heavily skewed rather than normally distributed, or you want to detect any monotonic relationship (not just linear). Spearman converts values to ranks then computes correlation on the ranks — it is more robust but less sensitive to precise linear relationships.
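
The "convert to ranks, then correlate the ranks" step can be sketched as follows. Tied values receive the average of the ranks they span, which is the usual convention; the helper names are chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks in the original order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but non-linear relationship: y = x cubed.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")
```

Because Y always rises with X, Spearman's coefficient is exactly 1, while Pearson's r is about 0.943 since the points do not lie on a straight line.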

Correlation can be negative. A negative correlation (−1 ≤ r < 0) means that as X increases, Y tends to decrease. r=−0.85 is just as strong as r=+0.85, only in the opposite direction. Examples: more exercise hours → lower resting heart rate (negative correlation); higher altitude → lower temperature (negative); more study time → fewer errors on a test (negative). The absolute value |r| measures strength; the sign indicates direction.