Correlation is the single most important number in portfolio construction. It determines how much diversification you get, how your hedge performs, how much capital you need, and whether your multi-asset model makes any sense.
The Markowitz mean-variance portfolio has optimal weights that depend on the covariance matrix — a matrix built entirely from pairwise correlations and volatilities. The VaR of a portfolio depends on correlations. The price of a basket option depends on correlations. The value of a CDO tranche is driven almost entirely by default correlation. In every multi-asset context, correlation is what transforms a collection of individual assets into a portfolio — and getting it wrong is one of the most common sources of financial losses.
But correlation is also dangerously easy to misuse. Pearson correlation only measures linear dependence. Zero correlation does not mean independence. Correlations are unstable (they spike during crises — exactly when diversification matters most). Sample correlations from short histories are extremely noisy. This page covers what correlation measures, what it misses, and how to avoid the traps.
Covariance
Definition
The covariance between X and Y is:
$$\operatorname{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)] = E[XY] - \mu_X \mu_Y$$
Covariance measures the degree of linear co-movement between two variables. Positive covariance means they tend to move together; negative means they move oppositely.
Sample covariance:
$$\widehat{\operatorname{Cov}}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$$
Properties
Cov(X,X)=Var(X)
Cov(X,Y)=Cov(Y,X) (symmetric)
Cov(aX+b,Y)=aCov(X,Y) (linear in each argument)
Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y)
The last property is why covariance matters for portfolios: the risk of a portfolio is not the sum of individual risks — the cross-term 2Cov can increase or decrease total risk.
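The sample formula and the variance-of-a-sum identity can be checked numerically. The sketch below uses synthetic data; the seed, sample size, and the helper name `sample_cov` are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, arbitrary choice
x = rng.normal(size=1_000)
y = 0.5 * x + rng.normal(size=1_000)    # y co-moves with x by construction

def sample_cov(a, b):
    """Unbiased sample covariance with the 1/(n-1) normalisation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)

# Var(X+Y) versus Var(X) + Var(Y) + 2 Cov(X,Y)
var_sum = sample_cov(x + y, x + y)
identity_rhs = sample_cov(x, x) + sample_cov(y, y) + 2 * sample_cov(x, y)
print(var_sum, identity_rhs)
```

The two printed numbers should agree up to floating-point rounding, since the identity holds exactly for sample moments as well as population moments.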
The covariance matrix
For n assets with returns (R_1, …, R_n), the covariance matrix is:
$$\Sigma_{ij} = \operatorname{Cov}(R_i, R_j)$$
This n×n symmetric, positive semi-definite matrix encodes all pairwise linear dependencies. The portfolio variance is:
$$\operatorname{Var}(R_p) = w^{\top} \Sigma w$$
where w is the vector of portfolio weights. Markowitz optimisation minimises this quadratic form subject to a return target — the entire framework operates on Σ.
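As a rough illustration of the quadratic form, the following sketch computes portfolio variance for three assets. The volatilities, correlations, and weights are hypothetical numbers chosen only for the example.

```python
import numpy as np

# Hypothetical annualised vols and correlations for three assets.
vols = np.array([0.20, 0.15, 0.10])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
cov = np.outer(vols, vols) * corr       # Sigma_ij = rho_ij * sigma_i * sigma_j
w = np.array([0.5, 0.3, 0.2])           # portfolio weights, summing to 1

port_var = w @ cov @ w                  # w^T Sigma w
port_vol = np.sqrt(port_var)
weighted_avg_vol = w @ vols             # what the risk would be with all rho = 1

print(f"portfolio vol = {port_vol:.4f}, weighted-average vol = {weighted_avg_vol:.4f}")
```

With all correlations below 1 and positive weights, the portfolio volatility comes out below the weighted average of the individual volatilities.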
Estimation challenge: An n×n covariance matrix has n(n+1)/2 unique entries. For n=500 stocks, that is 125,250 parameters estimated from, say, 252 daily observations. The sample covariance matrix is singular (n>T) or poorly conditioned (n comparable to T). This is why shrinkage estimators (Ledoit-Wolf: shrink toward a structured target like the identity or single-factor model), factor models (reduce dimensionality to a few systematic factors), and random matrix theory (filter noise eigenvalues) are essential in practice.
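The singularity problem and the effect of shrinkage can be sketched on synthetic data. This is not the Ledoit-Wolf estimator itself: the shrinkage intensity `delta` below is hand-picked for illustration, whereas Ledoit-Wolf derives it from the data. The dimensions and seed are also assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 60, 100                                   # fewer observations than assets
returns = rng.normal(scale=0.01, size=(T, n))    # synthetic i.i.d. returns

sample = np.cov(returns, rowvar=False)           # n x n, rank at most T - 1
rank = np.linalg.matrix_rank(sample)

delta = 0.2                                      # hand-picked, NOT the Ledoit-Wolf optimum
target = np.eye(n) * np.trace(sample) / n        # scaled-identity shrinkage target
shrunk = (1 - delta) * sample + delta * target

min_eig_sample = np.linalg.eigvalsh(sample).min()
min_eig_shrunk = np.linalg.eigvalsh(shrunk).min()
print(rank, min_eig_sample, min_eig_shrunk)
```

The sample matrix is rank-deficient, so it cannot be inverted for mean-variance optimisation; blending in the identity target lifts every eigenvalue away from zero.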
Pearson correlation
Definition
The Pearson correlation coefficient normalises covariance to [−1,+1]:
$$\rho_{X,Y} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}$$
ρ=+1: perfect positive linear relationship (Y=aX+b, a>0).
ρ=−1: perfect negative linear relationship (Y=aX+b, a<0).
ρ=0: no linear relationship (but possibly strong nonlinear dependence — see pitfalls below).
Correlation matrix
The correlation matrix C has entries C_ij = ρ_ij, with 1s on the diagonal. It is related to the covariance matrix by:
Σ=DCD
where D=diag(σ1,…,σn). This decomposition separates individual volatilities (in D) from the dependence structure (in C).
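A minimal sketch of the decomposition, going from a covariance matrix to its volatilities and correlation matrix and back again. The matrix entries are hypothetical.

```python
import numpy as np

cov = np.array([[0.0400, 0.0090, 0.0020],
                [0.0090, 0.0225, 0.0030],
                [0.0020, 0.0030, 0.0100]])   # hypothetical covariance matrix

vols = np.sqrt(np.diag(cov))                 # sigma_i
D = np.diag(vols)
D_inv = np.diag(1.0 / vols)

corr = D_inv @ cov @ D_inv                   # C = D^{-1} Sigma D^{-1}
rebuilt = D @ corr @ D                       # Sigma = D C D
print(np.round(corr, 3))
```

Keeping D and C separate is convenient for stress tests: volatilities can be scaled in D without touching the dependence structure in C, and vice versa.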
A valid correlation matrix must be symmetric positive semi-definite with diagonal entries equal to 1. In practice, ad hoc adjustments to correlation estimates (e.g., expert overrides, stress-test modifications) can produce matrices that violate positive semi-definiteness — requiring "nearest valid correlation matrix" algorithms.
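One simple repair is eigenvalue clipping followed by rescaling to a unit diagonal. The sketch below is an assumption-laden illustration: it does not find the nearest correlation matrix in any norm (dedicated algorithms such as alternating projections do that), and the function name `clip_to_valid_corr`, the floor `eps`, and the input matrix are all hypothetical.

```python
import numpy as np

def clip_to_valid_corr(c, eps=1e-8):
    """Clip eigenvalues at eps, then rescale so the diagonal is 1.
    A simple repair, not the nearest matrix in any norm."""
    c = (c + c.T) / 2                               # enforce symmetry
    vals, vecs = np.linalg.eigh(c)
    fixed = (vecs * np.clip(vals, eps, None)) @ vecs.T
    d = np.sqrt(np.diag(fixed))
    fixed = fixed / np.outer(d, d)                  # restore 1s on the diagonal
    return (fixed + fixed.T) / 2

# Hypothetical override: three pairwise values that cannot coexist.
bad = np.array([[ 1.0,  0.9, -0.9],
                [ 0.9,  1.0,  0.9],
                [-0.9,  0.9,  1.0]])
repaired = clip_to_valid_corr(bad)
print(np.linalg.eigvalsh(bad).min(), np.linalg.eigvalsh(repaired).min())
```

The input has a negative eigenvalue, so it would imply negative variance for some portfolio; the repaired matrix does not, at the cost of moving the off-diagonal entries away from the requested values.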
Rank correlations
Pearson correlation measures linear dependence. Rank correlations measure monotonic dependence — they detect whether Y tends to increase when X increases, regardless of the functional form.
Spearman's rank correlation
Replace data values with their ranks R(xi), R(yi) and compute Pearson correlation on the ranks:
$$\rho_S = 1 - \frac{6\sum_{i} d_i^2}{n(n^2-1)}, \qquad d_i = R(x_i) - R(y_i)$$
Spearman's ρS detects any monotonic relationship, not just linear. If Y=e^X (monotonic but nonlinear), Pearson ρ<1 but Spearman ρS=1.
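The exponential example can be reproduced with a few lines of NumPy. The helper names `ranks` and `spearman` are illustrative, and the rank-difference formula as coded assumes no ties, which holds for continuous simulated data.

```python
import numpy as np

def ranks(a):
    """Ranks 1..n; assumes no ties (continuous data)."""
    order = np.argsort(a)
    r = np.empty(len(a))
    r[order] = np.arange(1, len(a) + 1)
    return r

def spearman(x, y):
    """Spearman's rho via the rank-difference formula (no ties)."""
    d = ranks(x) - ranks(y)
    n = len(x)
    return 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = np.exp(x)                           # monotonic but nonlinear in x

pearson = np.corrcoef(x, y)[0, 1]
rho_s = spearman(x, y)
print(f"Pearson = {pearson:.3f}, Spearman = {rho_s:.3f}")
```

Because the exponential preserves ordering, every rank difference is zero and Spearman's coefficient is exactly 1, while Pearson's is pulled below 1 by the curvature.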
Kendall's tau
$$\tau = \frac{(\text{concordant pairs}) - (\text{discordant pairs})}{\binom{n}{2}}$$
A pair (xi,yi),(xj,yj) is concordant if xi>xj and yi>yj (or both less), and discordant otherwise. Kendall's τ is more robust to outliers than Spearman's ρS and has better statistical properties for small samples.
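The definition translates directly into a pair-counting loop. This is a quadratic-time sketch for small samples that assumes no ties; the function name `kendall_tau` and the toy data are illustrative.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau by direct pair counting; O(n^2), assumes no ties."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)       # positive if the pair is ordered the same way
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]                     # two adjacent swaps relative to x
tau = kendall_tau(x, y)
print(tau)
```

Here 8 of the 10 pairs are concordant and 2 are discordant, giving τ = (8 − 2)/10 = 0.6.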
Finance application: In copula modelling, the relationship between copula parameters and Kendall's τ is often available in closed form (e.g., for the Clayton copula: τ=θ/(θ+2)), making τ the preferred measure for calibrating copulas.
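Inverting the Clayton relationship gives θ = 2τ/(1 − τ), so an estimated τ maps straight to a copula parameter. A minimal sketch follows; the function names are hypothetical, and the range check reflects the assumption that only positive dependence is being modelled.

```python
def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2); this sketch covers 0 < tau < 1 only."""
    if not 0 < tau < 1:
        raise ValueError("expected 0 < tau < 1 (positive dependence)")
    return 2 * tau / (1 - tau)

def clayton_tau_from_theta(theta):
    """Closed-form Kendall's tau for the Clayton copula."""
    return theta / (theta + 2)

theta = clayton_theta_from_tau(0.5)     # tau = 0.5 corresponds to theta = 2
print(theta, clayton_tau_from_theta(theta))
```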
Correlation ≠ dependence
This distinction is the most important message of this page.
Zero correlation does not imply independence
If X∼N(0,1) and Y=X², then Cov(X,Y)=E[X³]−E[X]E[X²]=E[X³]=0 (by symmetry). So ρ=0. But Y is a deterministic function of X, so the two are as dependent as two variables can be.
Finance example: Stock returns (X) and realised volatility (Y=|X| or X²) have near-zero Pearson correlation, but are strongly dependent. The leverage effect (negative returns increase vol) adds further nonlinear dependence.
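A quick simulation makes the point concrete. The sketch draws standard normal "returns", squares them, and compares the Pearson correlation of the raw pair with that of a transformed pair; sample size and seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=200_000)            # stand-in for returns
y = x**2                                # deterministic function of x

rho_xy = np.corrcoef(x, y)[0, 1]                # close to 0
rho_absx_y = np.corrcoef(np.abs(x), y)[0, 1]    # large once the sign is removed

print(f"corr(X, X^2)   = {rho_xy:+.4f}")
print(f"corr(|X|, X^2) = {rho_absx_y:+.4f}")
```

The first correlation is near zero even though Y is known exactly once X is known; the dependence only becomes visible to Pearson after a transformation that removes the symmetry.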
Independence implies zero correlation (but not conversely)
If X and Y are independent, then Cov(X,Y)=0. The converse, that zero correlation implies independence, holds only in special cases such as jointly normal variables: for bivariate normals, ρ=0⟺ independent. Normal marginals alone are not enough; outside joint normality, ρ=0 by itself does not establish independence.
Tail dependence
Two assets may have moderate overall correlation but very different behaviour in the tails. Tail dependence measures the probability of joint extreme events:
$$\lambda_U = \lim_{u \to 1^-} P\left(Y > F_Y^{-1}(u) \mid X > F_X^{-1}(u)\right)$$
The Gaussian copula has λU=0 (zero tail dependence — extreme events are asymptotically independent). The Student's t-copula has λU>0 (positive tail dependence — joint crashes are possible). This difference was at the heart of the 2008 CDO crisis: models with zero tail dependence dramatically underestimated the probability of widespread simultaneous defaults.
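The contrast can be seen empirically by estimating the conditional exceedance probability at a high but finite quantile. The sketch below simulates a bivariate normal and a bivariate Student's t with the same correlation parameter; ρ = 0.5, ν = 3, u = 0.99, the sample size, and the helper name `joint_exceedance` are all illustrative assumptions. A finite-u estimate only approximates the limit in the definition.

```python
import numpy as np

rng = np.random.default_rng(4)
n, rho, nu, u = 400_000, 0.5, 3, 0.99            # illustrative choices
chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

z = rng.standard_normal((n, 2)) @ chol.T         # bivariate normal sample
scale = np.sqrt(rng.chisquare(nu, size=n) / nu)  # common mixing variable
t = (rng.standard_normal((n, 2)) @ chol.T) / scale[:, None]  # bivariate Student's t

def joint_exceedance(sample, u):
    """Empirical P(Y > q_Y(u) | X > q_X(u)) using sample quantiles."""
    qx, qy = np.quantile(sample, u, axis=0)
    x_tail = sample[:, 0] > qx
    return np.mean(sample[x_tail, 1] > qy)

gauss_dep = joint_exceedance(z, u)
t_dep = joint_exceedance(t, u)
print(f"Gaussian: {gauss_dep:.3f}, Student's t: {t_dep:.3f}")
```

Because the estimate depends only on ranks, it compares the two copulas rather than the marginals. Pushing u closer to 1 (with a larger sample) should drive the Gaussian figure toward zero while the t figure levels off at a positive value.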
Correlation in quant finance
Portfolio diversification
For a two-asset portfolio with weights (w,1−w):
$$\sigma_p^2 = w^2\sigma_1^2 + (1-w)^2\sigma_2^2 + 2w(1-w)\rho\,\sigma_1\sigma_2$$
When ρ<1 and both weights are positive (0<w<1), σp<wσ1+(1−w)σ2: the portfolio is less risky than the weighted average of individual risks. This is diversification. At ρ=−1, perfect hedging is possible: σp=0 when w=σ2/(σ1+σ2).
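The formula is easy to tabulate across correlation values. In this sketch the volatilities (20% and 10%) and the 50/50 weight are hypothetical, and `two_asset_vol` is an illustrative helper name.

```python
import math

def two_asset_vol(w, s1, s2, rho):
    """Portfolio volatility for weights (w, 1 - w)."""
    var = w**2 * s1**2 + (1 - w)**2 * s2**2 + 2 * w * (1 - w) * rho * s1 * s2
    return math.sqrt(max(var, 0.0))     # guard against tiny negative rounding

s1, s2, w = 0.20, 0.10, 0.5             # hypothetical vols and weight
for rho in (1.0, 0.5, 0.0, -0.5, -1.0):
    print(f"rho={rho:+.1f}  vol={two_asset_vol(w, s1, s2, rho):.4f}")

w_hedge = s2 / (s1 + s2)                # weight that removes all risk at rho = -1
print(f"hedge weight={w_hedge:.4f}  vol={two_asset_vol(w_hedge, s1, s2, -1.0):.6f}")
```

At ρ = +1 the volatility equals the weighted average of the two volatilities, and it falls steadily as ρ decreases. The zero-risk weight at ρ = −1 follows from setting wσ1 = (1 − w)σ2.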
The crisis problem: Correlations tend to increase during market stress. Assets that appeared diversifying (ρ=0.3) become highly correlated (ρ=0.8) during crashes — exactly when diversification is most needed. This "correlation breakdown" is one of the most documented and dangerous phenomena in portfolio management.
Factor models and beta
In the CAPM / single-factor model:
$$R_i = \alpha_i + \beta_i R_m + \varepsilon_i$$
The beta is:
$$\beta_i = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)} = \rho_{i,m}\,\frac{\sigma_i}{\sigma_m}$$
Beta decomposes a stock's risk into systematic (βiRm) and idiosyncratic (εi) components. This is a regression coefficient — see Linear Regression — but it is fundamentally a correlation-based quantity.
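The equivalence of the covariance form, the correlation form, and the regression slope can be verified on simulated data. The return series below are synthetic, with a true beta of 1.3 built in by assumption.

```python
import numpy as np

rng = np.random.default_rng(5)
r_m = rng.normal(0.0, 0.01, size=2_000)                     # synthetic market returns
r_i = 0.0002 + 1.3 * r_m + rng.normal(0.0, 0.008, 2_000)    # true beta 1.3 by construction

beta_cov = np.cov(r_i, r_m)[0, 1] / np.var(r_m, ddof=1)     # Cov / Var
rho = np.corrcoef(r_i, r_m)[0, 1]
beta_rho = rho * r_i.std(ddof=1) / r_m.std(ddof=1)          # rho * sigma_i / sigma_m
beta_ols = np.polyfit(r_m, r_i, 1)[0]                       # slope of the regression line

print(beta_cov, beta_rho, beta_ols)
```

All three numbers coincide up to rounding, and each sits near 1.3 with sampling error that shrinks as the history lengthens.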
Implied correlation
In index options, the index variance equals the weighted sum of individual variances plus covariance terms. Given individual implied vols and the index implied vol, you can solve for the implied correlation — the market's priced-in average pairwise correlation. This is traded via correlation swaps and dispersion trades.
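Under the simplifying assumption of one common pairwise correlation ρ, the index variance is σ_I² = Σ wᵢ²σᵢ² + ρ Σ_{i≠j} wᵢwⱼσᵢσⱼ, which can be solved for ρ. The sketch below builds an index volatility from a known ρ and recovers it; the weights, volatilities, and function name `implied_correlation` are hypothetical.

```python
import numpy as np

def implied_correlation(index_vol, weights, vols):
    """Solve sigma_I^2 = sum w_i^2 s_i^2 + rho * sum_{i != j} w_i w_j s_i s_j for rho."""
    w, s = np.asarray(weights, float), np.asarray(vols, float)
    own = np.sum((w * s) ** 2)                  # variance terms
    cross = np.sum(w * s) ** 2 - own            # sum over i != j of w_i w_j s_i s_j
    return (index_vol**2 - own) / cross

weights = np.array([0.40, 0.35, 0.25])          # hypothetical index weights
vols = np.array([0.25, 0.30, 0.20])             # hypothetical single-name implied vols

rho_true = 0.45                                 # used only to construct the index vol
cov = rho_true * np.outer(vols, vols)
np.fill_diagonal(cov, vols**2)
index_vol = np.sqrt(weights @ cov @ weights)

rho_implied = implied_correlation(index_vol, weights, vols)
print(rho_implied)
```

In practice the inputs would be quoted implied volatilities at matching strikes and maturities, which this sketch does not model.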
The diversification benefit increases dramatically as correlation decreases. At ρ=−1, the portfolio is nearly risk-free.
Example 2: correlation instability
Average pairwise equity correlation in the S&P 500: approximately 0.25–0.35 in calm markets, rising to 0.60–0.80 during crises (2008, 2020). A portfolio optimised assuming ρ=0.30 will underperform badly when ρ jumps to 0.70 — the optimised weights assumed more diversification than actually exists.
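To put rough numbers on this, consider an equal-weighted portfolio of n assets that share one volatility σ and one pairwise correlation ρ, for which σp² = σ²(1/n + (1 − 1/n)ρ). The portfolio size and volatility below are hypothetical, and real portfolios are neither equal-weighted nor homogeneous.

```python
import math

def equal_weight_vol(n, sigma, rho):
    """Vol of an equal-weighted portfolio with common vol and common pairwise rho."""
    return sigma * math.sqrt(1 / n + (1 - 1 / n) * rho)

n, sigma = 50, 0.20                      # hypothetical portfolio
calm = equal_weight_vol(n, sigma, 0.30)
stress = equal_weight_vol(n, sigma, 0.70)
print(f"calm vol={calm:.3f}, stress vol={stress:.3f}, ratio={stress / calm:.2f}")
```

As n grows the 1/n term vanishes and portfolio volatility approaches σ√ρ, so the correlation level sets a floor on risk that adding more names cannot remove.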
Example 3: spurious correlation
Two unrelated trending time series (e.g., global temperature and S&P 500 level over 30 years) can show r>0.9 simply because both trend upward. This is spurious correlation caused by non-stationarity. The fix: compute correlation on returns (differenced data), not levels.
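The effect is easy to reproduce with two independent random walks that share nothing but a positive drift. The drift size, series length, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
T = 5_000
# Two independent series, each with a positive drift (hypothetical parameters).
a = np.cumsum(0.2 + rng.normal(size=T))
b = np.cumsum(0.2 + rng.normal(size=T))

corr_levels = np.corrcoef(a, b)[0, 1]                     # dominated by the shared trend
corr_changes = np.corrcoef(np.diff(a), np.diff(b))[0, 1]  # the increments are independent

print(f"levels: {corr_levels:.3f}, changes: {corr_changes:.3f}")
```

The level correlation is high purely because both series drift upward, while the correlation of the differenced series is close to zero, matching how the data were generated.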
Common confusions and pitfalls
"Correlation = dependence." No. Correlation measures linear (Pearson) or monotonic (Spearman/Kendall) association. Variables can be strongly dependent but uncorrelated (the X vs X² example). Always think about what type of dependence you care about.
"Correlation is stable over time." Correlations are notoriously unstable, especially in crises. Rolling-window correlations, DCC-GARCH models, and regime-switching models address this, but it remains a fundamental problem.
"The sample correlation matrix is a good estimator." For large n (many assets) relative to T (observations), the sample correlation matrix is overwhelmed by estimation noise. Eigenvalues are biased: the largest are too large, the smallest too small (some may be negative). Shrinkage and factor-based methods are essential.
"Correlation of returns = correlation of levels." Computing Pearson correlation on price levels rather than returns is almost always wrong — trending prices produce spuriously high correlations. Always use returns (or log-returns) for financial correlation analysis.
"High correlation implies causation." No. Two stocks may be highly correlated because they share a common factor (e.g., both are tech stocks), not because one causes the other to move.
Where this goes next
Correlation and dependence connect to:
Moments and Summary Statistics: Covariance is a cross-moment; the correlation matrix is the normalised version of the covariance matrix.
Normal Distribution: For bivariate normals, ρ=0⟺ independence. The conditional distribution X∣Y has mean μX+ρ(σX/σY)(Y−μY) — the regression line.
Multiple Integrals: The covariance integral ∬(x−μX)(y−μY)p(x,y)dxdy and the bivariate normal CDF Φ2(a,b;ρ) are double integrals.
Student's t-Distribution: The t-copula introduces tail dependence that the Gaussian copula misses.
Uniform Distribution: Copula theory transforms marginals to U(0,1) to isolate the dependence structure from the marginals.
Partial Derivatives: The sensitivity of portfolio variance to correlation changes (∂σp²/∂ρ = 2w₁w₂σ₁σ₂) is a partial derivative; this sensitivity is the portfolio's correlation risk.