# Description / Pitfalls / Stylized Facts

Correlation is a well-known concept for measuring the linear relationship between two random variables.

It plays a major role in a number of classical approaches in finance: the CAPM as well as the APT rely on correlation as a measure of the dependence between financial assets. In the multivariate Black-Scholes (BS) model, the correlation of the log-returns is used as a measure of the dependence between assets.

The main reason for the importance of correlation in these frameworks is that the random variables (RVs) considered follow, under an appropriate transformation, a multivariate normal distribution.

Correlation as a measure of dependence fully determines the dependence structure for normal distributions and, more generally, elliptical distributions, while it fails to do so outside this class. Even within this class, correlation has to be handled with care: while a correlation of zero for multivariate normally distributed RVs implies independence, a correlation of zero for, for example, t-distributed RVs does not imply independence.

Hence, approaches relying on multivariate Brownian motions and transformations thereof naturally determine the dependence structure via correlation. More general measures of dependence help to avoid the pitfalls described below.

For two RVs X and Y with finite and positive variances, their correlation is defined as

Corr(X, Y) = Cov(X, Y) / sqrt(Var(X) Var(Y))

Properties of correlation

- It is a number in [-1, 1], and it equals 1 or -1 if and only if X and Y are linearly related, i.e. Y = a + bX for constants a, b with b ≠ 0. The correlation is 1 for b > 0 and -1 for b < 0.

- For constants a, b: Corr(X + a, Y + b) = Corr(X, Y)

- If X and Y are independent, then Corr(X, Y) = 0. The converse does not hold: Corr(X, Y) = 0 only means that X and Y are uncorrelated, not that they are independent. If (X, Y) has a bivariate normal distribution, zero correlation does imply independence of X and Y. Otherwise, this implication is typically wrong, even when X and Y are each normally distributed (but (X, Y) does not have a bivariate normal distribution).

- For two RVs belonging to a given class of elliptical distributions, which includes the normal and Student t-distributions, correlation (together with the marginal distributions) fully determines the dependence structure. However, note that uncorrelated t-distributed random variables are not independent.

- If X is m-dimensional and Y is n-dimensional, then Cov(X, Y) is the m x n matrix with entries Cov(Xi, Yj). Σ = Cov(X, X) is called the covariance matrix of X. It is symmetric and positive semidefinite, that is, x^T Σ x ≥ 0 for all x in R^m.

Moreover, one has: Cov(a + BX, c + DY) = B Cov(X, Y) D^T for constant vectors a, c and matrices B, D of suitable dimensions.

- Similarly, Corr(X, Y) has entries Corr(Xi, Yj). The correlation matrix Corr(X, X) is symmetric and positive semidefinite.

Correlation is invariant under increasing linear transformations: Corr(a + bX, c + dY) = Corr(X, Y) if bd > 0; if bd < 0, only the sign changes.
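The invariance properties and the matrix identity above can be checked numerically; a minimal sketch with NumPy (all sample sizes, matrices, and constants are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated scalar samples.
x = rng.standard_normal(100_000)
y = 0.5 * x + rng.standard_normal(100_000)

# Corr(a + b*X, c + d*Y) = Corr(X, Y) when b*d > 0 ...
base = np.corrcoef(x, y)[0, 1]
same = np.corrcoef(2.0 + 3.0 * x, -1.0 + 0.5 * y)[0, 1]
assert abs(base - same) < 1e-10

# ... and only the sign changes when b*d < 0.
flipped = np.corrcoef(x, -2.0 * y)[0, 1]
assert abs(base + flipped) < 1e-10

# Cov(a + BX, c + DY) = B Cov(X, Y) D^T holds for the sample
# cross-covariance as well (exactly, up to rounding).
X = rng.standard_normal((2, 5_000))
Y = rng.standard_normal((3, 5_000)) + 0.3 * X[:1]
B, D = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
a, c = rng.standard_normal((2, 1)), rng.standard_normal((3, 1))

def cross_cov(U, V):
    # Upper-right block of the joint sample covariance matrix.
    joint = np.cov(np.vstack([U, V]))
    return joint[:U.shape[0], U.shape[0]:]

assert np.allclose(cross_cov(a + B @ X, c + D @ Y),
                   B @ cross_cov(X, Y) @ D.T)
```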

Correlation Pitfalls

- A correlation of 0 is not equivalent to independence.

For (X, Y) jointly normal, Corr(X, Y) = 0 implies independence of X and Y. In general, this is not true; even perfectly dependent RVs can have zero correlation.

For example: X ~ N(0,1) and Y = X^2. Then Corr(X, Y) = 0 (since Cov(X, X^2) = E[X^3] = 0), while X and Y are clearly not independent.
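This can be verified by simulation; a short sketch (the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(42)

x = rng.standard_normal(1_000_000)
y = x ** 2  # a deterministic function of x, hence fully dependent on it

# Cov(X, X^2) = E[X^3] = 0 for X ~ N(0, 1), so the sample correlation
# is close to zero despite the perfect dependence.
r = np.corrcoef(x, y)[0, 1]
assert abs(r) < 0.02
```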

- Correlation is invariant under linear transformations, but not under general transformations.

For example: two lognormal RVs have a different correlation than the underlying normal RVs.
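A sketch illustrating this with a bivariate normal pair and its exponentials (the correlation 0.7 and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.7  # correlation of the underlying normals

# Draw bivariate standard normal samples with correlation rho.
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000)

corr_normal = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
corr_lognormal = np.corrcoef(np.exp(z[:, 0]), np.exp(z[:, 1]))[0, 1]

# For standard normal margins the lognormal correlation is
# (exp(rho) - 1) / (e - 1), which is strictly below rho for rho in (0, 1).
print(corr_normal, corr_lognormal)
```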

- For given marginal distributions of X and Y and a given correlation in [-1, 1], it is in general not possible to construct a joint distribution with exactly that correlation: the attainable correlations may form a strict subinterval of [-1, 1].
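A classical illustration from the risk-management literature: for X = exp(Z) and Y = exp(sigma * Z) with Z ~ N(0, 1), Y is an increasing function of X (perfect positive dependence), yet the correlation can be made arbitrarily small by choosing sigma large, so high correlations are simply not attainable for these margins. The closed-form expression can be evaluated directly:

```python
import math

def comonotone_lognormal_corr(sigma: float) -> float:
    """Correlation of X = exp(Z) and Y = exp(sigma * Z), Z ~ N(0, 1).

    Y is an increasing function of X (perfect positive dependence),
    yet the correlation tends to 0 as sigma grows.
    """
    num = math.exp(sigma) - 1.0
    den = math.sqrt((math.e - 1.0) * (math.exp(sigma ** 2) - 1.0))
    return num / den

for sigma in (1.0, 2.0, 5.0):
    print(sigma, comonotone_lognormal_corr(sigma))
# sigma = 1 gives correlation 1 (then Y = X); larger sigma pushes the
# maximal attainable correlation for these margins toward 0.
```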

- A small correlation does not imply a small degree of dependence.

Stylized Facts

- Correlation clustering: periods of high (low) correlation are likely to be followed by periods of high (low) correlation.

- Asymmetry and comovement with volatility: high volatility in falling markets goes hand in hand with a strong increase in correlation, but this is not the case for rising markets. Notably, this reduces opportunities for diversification in stock market declines.

Estimating Correlation

The estimation of correlation in financial data is a delicate task as the underlying distribution typically has heavy tails.

If this is the case, it is preferable to use robust methods instead of nonrobust methods like the sample correlation.
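One simple robust alternative is a rank-based estimator such as Spearman's rho. A sketch comparing it with the sample (Pearson) correlation on simulated heavy-tailed data (t-distributed with 2 degrees of freedom; the common-factor construction is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Heavy-tailed data with a common factor, so the true dependence
# between x and y is positive.
common = rng.standard_t(2, size=n)
x = common + rng.standard_t(2, size=n)
y = common + rng.standard_t(2, size=n)

def ranks(a):
    # Simple ranks; no tie handling needed for continuous data.
    return np.argsort(np.argsort(a))

pearson = np.corrcoef(x, y)[0, 1]
# Spearman's rho: Pearson correlation of the ranks. Because it
# depends on the data only through the ranks, single extreme
# observations cannot distort it.
spearman = np.corrcoef(ranks(x), ranks(y))[0, 1]

print(pearson, spearman)
```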

Acknowledging that correlation changes over time, a number of approaches for dynamic correlation have been developed.
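The simplest such approach is a rolling-window estimate; more refined models (e.g. EWMA-weighted correlations or DCC-GARCH) replace the flat window with decaying weights. A minimal sketch on simulated returns (window length and parameters are arbitrary choices):

```python
import numpy as np

def rolling_corr(x, y, window):
    """Sample correlation over a moving window of fixed length."""
    out = np.full(len(x), np.nan)
    for t in range(window, len(x) + 1):
        out[t - 1] = np.corrcoef(x[t - window:t], y[t - window:t])[0, 1]
    return out

# Simulated return series with true correlation 0.6
# (Var(y) = 0.36 + 0.64 = 1, Cov(x, y) = 0.6).
rng = np.random.default_rng(3)
x = rng.standard_normal(500)
y = 0.6 * x + 0.8 * rng.standard_normal(500)

c = rolling_corr(x, y, window=60)
print(np.nanmean(c))  # the windowed estimates fluctuate around 0.6
```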