@ofgulban
Created February 2, 2022 10:54
Simulating the spurious correlation of ratios. Run this script multiple times and see the correlations between variables.
"""Simulate the spurious correlation of ratios.

Run this script multiple times and see the correlations between variables.

Reference
---------
- Pearson, K. (1896). Mathematical Contributions to the Theory of Evolution.
  On a Form of Spurious Correlation Which May Arise When Indices Are Used in
  the Measurement of Organs. Proceedings of the Royal Society of London, 60,
  489–498. <https://doi.org/10.1098/rspl.1896.0076>
"""
import numpy as np
# Sample size
N = 1000
# Draw three independent samples, each uniform on [0, 1)
x = np.random.random(N)
y = np.random.random(N)
z = np.random.random(N)
# Compute the pairwise correlations between x, y, and z
corr_xy = np.corrcoef(x, y)[0, 1]
corr_yz = np.corrcoef(y, z)[0, 1]
corr_xz = np.corrcoef(x, z)[0, 1]
# Form two ratios that share the denominator y
# (y can be arbitrarily close to 0, so u and v are heavy-tailed)
u = x / y
v = z / y
# Compute ratio correlations
corr_uv = np.corrcoef(u, v)[0, 1]
print(f"Correlation xy = {corr_xy:.2f}")
print(f"Correlation yz = {corr_yz:.2f}")
print(f"Correlation xz = {corr_xz:.2f}")
print(f"Correlation uv = {corr_uv:.3f}")
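
The script above prints a single draw, so the result has to be judged by rerunning it by hand. The sketch below is one possible extension, not part of the original gist: it repeats the simulation many times and compares the average ratio correlation with the first-order approximation from Pearson (1896) for uncorrelated variables, r ≈ V_y² / sqrt((V_x² + V_y²)(V_z² + V_y²)), where V is the coefficient of variation. The function names, the fixed seed, the number of repeats, and the choice of normal variables with mean 10 and standard deviation 1 (so that y stays far from zero) are all assumptions made for illustration.

```python
import numpy as np


def ratio_correlation(rng, n=1000, mean=10.0, sd=1.0):
    """Return corr(x/y, z/y) for independent normal x, y, z (hypothetical helper)."""
    x = rng.normal(mean, sd, n)
    y = rng.normal(mean, sd, n)
    z = rng.normal(mean, sd, n)
    return np.corrcoef(x / y, z / y)[0, 1]


def pearson_approximation(cv_x, cv_y, cv_z):
    """First-order approximation of the spurious correlation for uncorrelated x, y, z."""
    return cv_y**2 / np.sqrt((cv_x**2 + cv_y**2) * (cv_z**2 + cv_y**2))


# Fixed seed so the sketch is reproducible; the value itself is arbitrary.
rng = np.random.default_rng(0)

# Repeat the simulation instead of rerunning the script by hand.
simulated = np.array([ratio_correlation(rng) for _ in range(200)])

# Coefficient of variation is sd / mean = 1 / 10 for all three variables.
expected = pearson_approximation(0.1, 0.1, 0.1)

print(f"Mean simulated correlation uv = {simulated.mean():.3f}")
print(f"Pearson approximation         = {expected:.3f}")
```

With equal coefficients of variation the approximation reduces to 0.5, even though x, y, and z are independent. The uniform samples used in the original script have a much larger coefficient of variation and a denominator that can approach zero, so the first-order approximation should not be expected to describe them well.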