Correlation & Copulas

Generate correlated multivariate data using copula models.

Overview

Copulas allow you to:

Generate correlated random variables
Control the dependence structure independently from marginals
Model different types of dependence (tail dependence, asymmetry)
Create realistic multivariate distributions

Correlation Functions

Pearson Correlation

Compute Pearson correlation between two arrays:

from superstore import pearsonCorrelation

# x and y are equal-length numeric arrays
correlation = pearsonCorrelation(x, y)
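For reference, the quantity being computed can be sketched in plain Python. This is not superstore's implementation; `pearson_reference` is a hypothetical name, and the sketch assumes the usual sample Pearson formula (covariance divided by the product of standard deviations).

```python
import math

def pearson_reference(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)

# Illustrative data: y is close to 2 * x, so r should be close to 1
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_reference(x, y)
```

A helper like this can serve as a cross-check on small inputs when validating results from the library function.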

Bivariate Sampling

Generate correlated bivariate samples:

from superstore import sampleBivariate

# Generate correlated pairs with rho=0.7
x, y = sampleBivariate(n=1000, rho=0.7)
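One common way to produce such pairs is to mix two independent standard normals. The sketch below assumes normal marginals, which this page does not confirm for `sampleBivariate`; `sample_bivariate_normal` is a hypothetical stand-in that only uses the standard library.

```python
import math
import random

def sample_bivariate_normal(n, rho, seed=None):
    """Draw n pairs of standard normals whose correlation is rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        xs.append(z1)
        # y shares a fraction rho of z1 and gets the rest from independent noise
        ys.append(rho * z1 + scale * z2)
    return xs, ys

xs, ys = sample_bivariate_normal(20000, 0.7, seed=42)
```

Because both outputs have unit variance, the average of `x * y` over a large sample should land near the requested rho.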

Copula Models

Gaussian Copula

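The Gaussian copula induces dependence through a multivariate normal distribution. It has no tail dependence: for any correlation strictly between -1 and 1, the probability that one variable is extreme given that another is extreme shrinks to zero as you move further into the tail.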

from superstore import GaussianCopula

# Create a Gaussian copula with correlation matrix
copula = GaussianCopula(
    correlation=[[1.0, 0.7, 0.3],
                 [0.7, 1.0, 0.5],
                 [0.3, 0.5, 1.0]]
)

# Generate 1000 correlated uniform samples
u = copula.sample(n=1000)
# u has shape (1000, 3), each column is marginally Uniform(0,1)

Properties:

Symmetric dependence
No tail dependence
Easy to parameterize with correlation matrix
Good default when dependence is roughly elliptical and joint extremes are not a concern
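The construction behind a Gaussian copula can be sketched in three steps: factor the correlation matrix, correlate independent normals with that factor, then push each coordinate through the standard normal CDF. The code below is a standard-library illustration under that textbook recipe, not the library's implementation; `cholesky` and `gaussian_copula_sample` are hypothetical helper names.

```python
import math
import random
from statistics import NormalDist

def cholesky(matrix):
    """Lower-triangular factor of a positive definite matrix (list of lists)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("correlation matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def gaussian_copula_sample(correlation, n, seed=None):
    """Return n rows of correlated values, each marginally Uniform(0, 1)."""
    rng = random.Random(seed)
    lower = cholesky(correlation)
    dim = len(correlation)
    phi = NormalDist().cdf  # standard normal CDF
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Multiply by the Cholesky factor to impose the correlation structure
        correlated = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        rows.append([phi(value) for value in correlated])
    return rows

correlation = [[1.0, 0.7, 0.3], [0.7, 1.0, 0.5], [0.3, 0.5, 1.0]]
u = gaussian_copula_sample(correlation, n=1000, seed=0)
```

The Cholesky step also doubles as a validity check: a matrix that is not positive definite cannot serve as a correlation matrix.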

Clayton Copula

The Clayton copula has lower tail dependence - extreme low values are more likely to occur together. Useful for:

Credit risk (joint defaults)
Insurance (correlated claims)
Portfolio risk (market crashes)

from superstore import ClaytonCopula

# Create a Clayton copula with theta=2.0
# Higher theta = stronger dependence
copula = ClaytonCopula(theta=2.0, dim=3)

# Generate samples
u = copula.sample(n=1000)

Properties:

Asymmetric dependence
Lower tail dependence (crashes happen together)
No upper tail dependence
theta > 0 for positive dependence
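A Clayton copula with theta > 0 can be sampled with the Marshall-Olkin frailty construction: draw one shared Gamma(1/theta) variable per row, then transform independent exponentials with it. The sketch below is a standard-library illustration of that technique; `clayton_sample` is a hypothetical name, and whether the library samples this way is not stated on this page.

```python
import random

def clayton_sample(theta, dim, n, seed=None):
    """Sample n rows from a dim-dimensional Clayton copula (theta > 0)."""
    if theta <= 0:
        raise ValueError("this sketch covers theta > 0 only")
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Shared frailty: Gamma with shape 1/theta and scale 1
        v = rng.gammavariate(1.0 / theta, 1.0)
        # Each coordinate applies the generator (1 + t) ** (-1 / theta) to E / v
        rows.append([(1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta) for _ in range(dim)])
    return rows

u = clayton_sample(theta=2.0, dim=3, n=1000, seed=0)
```

The shared frailty is what produces the lower tail clustering: a small `v` pushes every coordinate in the row toward zero at once. In the bivariate case the lower tail dependence coefficient is 2 ** (-1 / theta).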

Frank Copula

The Frank copula has no tail dependence but can model both positive and negative dependence:

from superstore import FrankCopula

# Positive dependence
copula = FrankCopula(theta=5.0, dim=2)

# Negative dependence
copula = FrankCopula(theta=-5.0, dim=2)

u = copula.sample(n=1000)

Properties:

Symmetric dependence
No tail dependence
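Can model negative dependence (theta < 0) in the bivariate case; in higher dimensions a Frank copula is only valid for theta > 0
Good when dependence is concentrated in the center of the distribution rather than in the tails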
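In two dimensions the Frank copula can be sampled by conditional inversion: draw the first coordinate uniformly, then invert the conditional CDF of the second. The sketch below follows that approach with only the standard library; `frank_sample_bivariate` is a hypothetical name and is not taken from the library.

```python
import math
import random

def frank_sample_bivariate(theta, n, seed=None):
    """Sample n pairs from a bivariate Frank copula (theta != 0)."""
    if theta == 0:
        raise ValueError("theta = 0 is the independence limit; use plain uniforms")
    rng = random.Random(seed)
    g = math.expm1(-theta)  # exp(-theta) - 1
    pairs = []
    for _ in range(n):
        u1 = rng.random()
        w = rng.random()  # target value of the conditional CDF
        a = math.exp(-theta * u1)
        # Closed-form inverse of the conditional CDF of u2 given u1
        u2 = -math.log1p(w * g / (w + (1.0 - w) * a)) / theta
        pairs.append((u1, u2))
    return pairs

positive = frank_sample_bivariate(5.0, 5000, seed=1)
negative = frank_sample_bivariate(-5.0, 5000, seed=1)
```

The same formula handles both signs of theta, which is why negative dependence comes for free in the bivariate case.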

Gumbel Copula

The Gumbel copula has upper tail dependence - extreme high values are more likely to occur together. Useful for:

Flood modeling (extreme rainfall)
Insurance (extreme losses)
Finance (market bubbles)

from superstore import GumbelCopula

# Create a Gumbel copula with theta=3.0
# theta >= 1, higher = stronger dependence
copula = GumbelCopula(theta=3.0, dim=2)

u = copula.sample(n=1000)

Properties:

Asymmetric dependence
Upper tail dependence (booms happen together)
No lower tail dependence
theta >= 1
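To get a feel for what a given theta means, it helps to translate it into Kendall's tau and the upper tail dependence coefficient. The helpers below use the standard bivariate Gumbel formulas (tau = 1 - 1/theta, lambda_U = 2 - 2^(1/theta)); the function names are hypothetical and not part of the library.

```python
def gumbel_kendall_tau(theta):
    """Kendall's tau implied by a bivariate Gumbel copula."""
    if theta < 1.0:
        raise ValueError("Gumbel requires theta >= 1")
    return 1.0 - 1.0 / theta

def gumbel_theta_from_tau(tau):
    """Invert the tau relation to pick theta for a target rank correlation."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    return 1.0 / (1.0 - tau)

def gumbel_upper_tail_dependence(theta):
    """Upper tail dependence coefficient; 0 at theta = 1 (independence)."""
    if theta < 1.0:
        raise ValueError("Gumbel requires theta >= 1")
    return 2.0 - 2.0 ** (1.0 / theta)

tau = gumbel_kendall_tau(3.0)
lam_upper = gumbel_upper_tail_dependence(3.0)
```

For the theta=3.0 used above, this gives tau of about 0.67 and an upper tail coefficient of about 0.74.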

Choosing a Copula

Copula	Lower Tail	Upper Tail	Use Case
Gaussian	No	No	General correlation
Clayton	Yes	No	Joint crashes, defaults
Frank	No	No	Weak/negative dependence
Gumbel	No	Yes	Joint extremes (high)
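When choosing between the rows above for real data, a rough finite-sample diagnostic is to compare how often both variables fall in the same tail. The sketch below assumes the data has already been converted to uniform pairs (for example via ranks); `joint_tail_frequencies` is a hypothetical helper, and the estimate at a fixed quantile is only a proxy for the asymptotic tail dependence in the table.

```python
def joint_tail_frequencies(pairs, q=0.05):
    """Estimate P(U2 < q | U1 < q) and P(U2 > 1-q | U1 > 1-q) from uniform pairs."""
    low_base = [p for p in pairs if p[0] < q]
    high_base = [p for p in pairs if p[0] > 1.0 - q]
    lower = (sum(1 for p in low_base if p[1] < q) / len(low_base)
             if low_base else float("nan"))
    upper = (sum(1 for p in high_base if p[1] > 1.0 - q) / len(high_base)
             if high_base else float("nan"))
    return lower, upper

# Two deterministic extremes for orientation
grid = [(i + 0.5) / 100 for i in range(100)]
comonotone = joint_tail_frequencies([(x, x) for x in grid])
countermonotone = joint_tail_frequencies([(x, 1.0 - x) for x in grid])
```

A lower value well above the upper one points toward Clayton, the reverse toward Gumbel, and two values near q toward a copula without tail dependence.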

Combining Copulas with Marginals

Copula samples have Uniform(0, 1) marginals. To obtain other marginals, apply each target distribution's inverse CDF (ppf) to the corresponding column:

from superstore import GaussianCopula
from scipy.stats import norm, lognorm

# Generate correlated uniforms
copula = GaussianCopula(correlation=[[1.0, 0.8], [0.8, 1.0]])
u = copula.sample(n=10000)

# Transform to different marginal distributions
x = norm.ppf(u[:, 0], loc=0, scale=1)      # Standard normal
y = lognorm.ppf(u[:, 1], s=0.5, scale=100)  # Log-normal

# x and y keep the copula's rank dependence but now have different marginals
# (their Pearson correlation will generally differ somewhat from 0.8)
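The reason the dependence survives this step is that an inverse CDF is monotone increasing, so it never reorders a column, and rank-based measures such as Spearman's rho or Kendall's tau depend only on ordering. The standard-library sketch below checks this for one uniform column; the closed forms mirror the normal and log-normal parameters used above, and `ranks` is a hypothetical helper.

```python
import math
import random
from statistics import NormalDist

def ranks(values):
    """Rank of each element (0 = smallest), in the original order."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

rng = random.Random(1)
# Keep values strictly inside (0, 1) so the inverse CDF is defined
u_col = [min(max(rng.random(), 1e-12), 1.0 - 1e-12) for _ in range(1000)]

std_normal = NormalDist(0.0, 1.0)
x = [std_normal.inv_cdf(p) for p in u_col]                          # standard normal
y = [100.0 * math.exp(0.5 * std_normal.inv_cdf(p)) for p in u_col]  # log-normal, s=0.5, scale=100

same_order = ranks(u_col) == ranks(x) == ranks(y)
```

Since every column keeps its ordering, any rank correlation between columns of `u` carries over unchanged to the transformed variables.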

Examples

Correlated Asset Returns

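import numpy as np
from scipy.stats import t
from superstore import GaussianCopula

# Create correlated returns for 4 assets
correlation = [
    [1.0, 0.6, 0.3, 0.2],
    [0.6, 1.0, 0.4, 0.3],
    [0.3, 0.4, 1.0, 0.5],
    [0.2, 0.3, 0.5, 1.0],
]
copula = GaussianCopula(correlation=correlation)
u = copula.sample(n=252)  # roughly one trading year of daily observations

# Transform to Student-t returns (fat tails).
# A t variable with df degrees of freedom has variance df / (df - 2),
# so rescale to unit variance before applying the 2% daily volatility.
df = 5
returns = t.ppf(u, df=df) * 0.02 / np.sqrt(df / (df - 2))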

Credit Risk Defaults

from superstore import ClaytonCopula

# Strong lower tail dependence for joint defaults
copula = ClaytonCopula(theta=3.0, dim=5)
u = copula.sample(n=10000)

# Transform to default indicators
default_threshold = 0.03  # 3% default probability
defaults = u < default_threshold  # Boolean array
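To see how much the copula choice matters here, the probability that every obligor defaults together can be computed in closed form from the Clayton CDF and compared with the independent case. The sketch below uses the standard Clayton formula with equal thresholds; `clayton_joint_lower_probability` is a hypothetical helper name.

```python
def clayton_joint_lower_probability(p, theta, dim):
    """P(U_1 < p, ..., U_dim < p) under a Clayton copula with theta > 0."""
    if theta <= 0:
        raise ValueError("this formula is for theta > 0")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return (dim * p ** (-theta) - dim + 1) ** (-1.0 / theta)

p = 0.03  # same 3% marginal default probability as above
all_default_clayton = clayton_joint_lower_probability(p, theta=3.0, dim=5)
all_default_independent = p ** 5
```

With these parameters all five names default together with probability of about 1.75% under the Clayton model, against roughly 2.4e-8 under independence.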

Insurance Claims

from superstore import GumbelCopula
from scipy.stats import pareto

# Upper tail dependence for extreme claims
copula = GumbelCopula(theta=2.5, dim=3)
u = copula.sample(n=5000)

# Transform to Pareto claims
claims = pareto.ppf(u, b=2.0, scale=10000)
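The Pareto transform has a simple closed form, which makes it easy to see how uniforms near 1 turn into very large claims. The sketch below reproduces the quantile function for the parameterization used by `scipy.stats.pareto` with `loc=0`; `pareto_quantile` is a hypothetical helper name.

```python
def pareto_quantile(u, b, scale):
    """Inverse CDF of a Pareto distribution with shape b and minimum value scale."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    return scale * (1.0 - u) ** (-1.0 / b)

# Same parameters as the example above
median_claim = pareto_quantile(0.5, b=2.0, scale=10000)
upper_quartile_claim = pareto_quantile(0.75, b=2.0, scale=10000)
extreme_claim = pareto_quantile(0.999, b=2.0, scale=10000)
```

Because the quantile grows without bound as u approaches 1, the upper tail clustering produced by the Gumbel copula translates directly into several very large claims arriving together.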

API Reference

See the full API Reference for all copula classes.