@Nydhal
Last active December 31, 2018 17:30
Based on this tweet: DATA SCIENCE TRICK DU JOUR https://twitter.com/nntaleb/status/1079029317814890497
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Corr_Matrix_Generation.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/Nydhal/7ca44108cacca4caabe107760fc5d66e/notebook.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"metadata": {
"id": "mZ3umbNkwSea",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"##### Imports and formatting"
]
},
{
"metadata": {
"id": "GvjS0Wo3wMlC",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import numpy as np\n",
"\n",
"# A symmetric matrix is positive definite if and only if its Cholesky decomposition exists\n",
"def PositiveDefiniteMatrix(V):\n",
"    try:\n",
"        np.linalg.cholesky(V)\n",
"        return True\n",
"    except np.linalg.LinAlgError:\n",
"        return False"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "AeHhM2MyejaY",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## Generating Correlation Matrices for Monte Carlo Simulations\n",
"**TRICK:** How to build a square correlation/covariance matrix that is **always** positive definite?\n",
"\n",
"**BACKGROUND:** Correlation matrices are used for simulating distributions.\n",
"\n",
"For univariate distros, we can control the spread with a single parameter: the scale (or standard deviation).\n",
"\n",
"For multivariate distros (which we often need), there is no equivalent single parameter to control the matrix.\n",
"\n",
"Having many parameters can produce inconsistencies (the matrix is no longer positive definite).\n",
"\n",
"Instead of plugging in parameter values and then testing for positive definiteness, here is a technique that lets us control a **dependence coefficient** of the matrix: the first (largest) eigenvalue. In other words, a low first eigenvalue (a flat eigenvalue vector) means low average dependence; a high first eigenvalue means high average dependence. (**Flat eigenvalue vector:** all eigenvalues are close to 1.)\n",
"\n",
"**TECHNIQUE:** You randomize: draw an m-by-n matrix of random numbers (m is the number of samples per variable, n is the number of variables) and take its correlation matrix. The ratio of m to n determines the dependence coefficient. The result is always positive semidefinite, but it is strictly positive definite only when m > n (more samples than variables); with m <= n it is singular, as the m=3, n=4 run below shows.\n",
"\n",
"**THEORY:** We use the property that small samples are more random than large ones: with few samples per variable the sample correlations are large by chance, and with many samples they shrink toward 0.\n",
"\n",
"---\n",
"\n",
"m is the number of data points (rows) per variable\n",
"\n",
"n is the number of variables (columns)\n",
"\n",
"runs is the number of Monte Carlo runs"
]
},
{
"metadata": {
"id": "6H7OU7c4eZCP",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"def Generate_CorrMatrix(m,n):\n",
" # M has m rows (observations) and n columns (variables)\n",
" M = np.random.normal(0,1,(m,n))\n",
" V = np.corrcoef(M,rowvar=False) # columns represent variables, rows contain observations\n",
" print('\\n Matrix M: \\n' + str(np.matrix(M)))\n",
" print('\\n Correlation matrix V: \\n' + str(np.matrix(V)))\n",
" print('\\n Eigenvalues: \\n' + str(np.linalg.eig(V)[0]) + '\\n')\n",
" print('\\n Positive definite?')\n",
" return PositiveDefiniteMatrix(V)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "-HUZ7XgDgTay",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### High dependence"
]
},
{
"metadata": {
"id": "AfeDmdFjWmaw",
"colab_type": "code",
"outputId": "21d9e427-1061-4153-cdb7-a31f14957473",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 323
}
},
"cell_type": "code",
"source": [
"Generate_CorrMatrix(m=3,n=4)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"\n",
" Matrix M: \n",
"[[ 0.49604558 -0.12469985 -0.48376557 0.17952923]\n",
" [ 0.12426166 -0.87510485 -0.82415182 0.63250532]\n",
" [ 0.20679043 1.27963211 -1.38098702 -0.52312303]]\n",
"\n",
" Correlation matrix V: \n",
"[[ 1. 0.03949668 0.64103297 -0.08874863]\n",
" [ 0.03949668 1. -0.74159578 -0.99878211]\n",
" [ 0.64103297 -0.74159578 1. 0.70759398]\n",
" [-0.08874863 -0.99878211 0.70759398 1. ]]\n",
"\n",
" Eigenvalues: \n",
"[ 1.31547183e+00 2.68452817e+00 -7.68290261e-17 -3.46469883e-16]\n",
"\n",
"\n",
" Positive definite?\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"False"
]
},
"metadata": {
"tags": []
},
"execution_count": 4
}
]
},
{
"metadata": {
"id": "qvhFbXLc11VE",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### Low dependence"
]
},
{
"metadata": {
"id": "zSgXmqyNskGG",
"colab_type": "code",
"outputId": "435693d3-533b-435f-ee4c-1f944872118c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1972
}
},
"cell_type": "code",
"source": [
"Generate_CorrMatrix(m=100,n=4)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"\n",
" Matrix M: \n",
"[[ 7.44257067e-01 -4.07941163e-01 -6.21043462e-01 1.49493184e+00]\n",
" [ 8.87712295e-01 5.23999197e-01 9.53517715e-01 7.11473854e-01]\n",
" [ 2.19576519e+00 5.94888053e-01 -7.82697410e-01 2.97605180e-02]\n",
" [ 1.88319942e+00 2.72331710e-01 -9.06314109e-01 8.69821533e-01]\n",
" [ 9.71816163e-01 -1.51318972e+00 6.39172532e-01 -1.67087534e+00]\n",
" [-1.85013036e-01 1.18760048e+00 2.29679462e+00 -3.08092895e-01]\n",
" [ 4.16335343e-01 4.18222613e-01 9.71624301e-01 -1.94404109e+00]\n",
" [-6.43134468e-01 5.42125298e-02 -5.13627915e-01 -1.34559720e+00]\n",
" [-9.29825545e-01 3.03277452e-01 1.57590974e-01 -1.57726177e-01]\n",
" [ 1.60757228e-01 -1.47630843e+00 -1.09106744e+00 5.21524028e-01]\n",
" [-2.20663795e-01 -8.74704880e-01 6.60263932e-02 -1.18236881e-01]\n",
" [-1.81737435e-01 1.18368252e+00 -4.92011890e-01 -1.25471747e+00]\n",
" [ 3.72031932e-01 3.44652384e-01 1.59392552e+00 -2.52252035e-01]\n",
" [ 2.74507146e-01 3.82285397e-01 2.45456678e-01 -3.05868835e-01]\n",
" [-9.35944184e-01 1.94311072e+00 -7.67482058e-01 -3.65733186e-01]\n",
" [ 5.44869343e-01 8.32422172e-01 2.18946273e+00 -8.22378054e-01]\n",
" [ 7.39324357e-01 1.26578068e+00 -2.55304708e-01 6.66221995e-02]\n",
" [ 2.49665609e-01 -6.23838643e-03 -5.47313694e-01 2.25226719e+00]\n",
" [-2.10594141e+00 1.22273971e+00 1.81145481e+00 -9.13260186e-01]\n",
" [-1.18554142e-01 -1.96256086e+00 -2.92573786e+00 1.72793283e-01]\n",
" [-2.17898684e+00 6.37373456e-01 -5.04658267e-01 -1.49475119e-01]\n",
" [-2.46616083e-03 -1.08696277e-01 -1.52217343e-02 -1.60383269e-01]\n",
" [-1.52339518e+00 3.53958115e-01 1.84463479e+00 -2.29935302e+00]\n",
" [ 2.67370878e+00 -5.75769710e-01 2.19450975e-01 -1.19057602e+00]\n",
" [-1.17646729e+00 -3.44059997e-01 -9.16394686e-01 -1.43803092e+00]\n",
" [ 2.44148402e-01 2.03164124e-01 -1.93565928e-01 -1.35132208e+00]\n",
" [-3.87333910e-01 5.02721715e-02 -1.74216232e+00 -1.38884850e+00]\n",
" [-1.73202334e-01 5.15551019e-01 -2.58590972e+00 -6.45633412e-01]\n",
" [-6.44109377e-01 1.10096616e+00 -5.18815936e-01 -1.34922312e+00]\n",
" [-8.96324991e-02 -1.00413657e+00 -9.40838335e-01 7.67878437e-01]\n",
" [ 2.40212768e-01 2.74298785e-03 -6.14983230e-01 -7.90714508e-01]\n",
" [-3.91395506e-01 1.89712695e+00 -2.58952152e+00 1.09735342e+00]\n",
" [ 8.07132845e-01 6.54849118e-01 -7.71417663e-01 4.55095758e-01]\n",
" [ 1.32034474e+00 2.42162863e-01 6.44116154e-01 6.59926301e-01]\n",
" [-1.02900481e-01 -5.52782925e-01 1.00285883e+00 -1.00397718e+00]\n",
" [ 8.79476663e-01 -1.40239248e+00 4.89912596e-01 9.99075238e-01]\n",
" [-3.80133642e-01 1.50193807e+00 1.59229285e+00 1.85231938e+00]\n",
" [-5.15015064e-01 -3.43150415e-01 8.50410068e-02 -1.23129376e+00]\n",
" [-9.02908850e-02 2.62087220e+00 -1.08379201e+00 -8.80347295e-01]\n",
" [-9.02153697e-02 2.44609329e+00 1.01503722e+00 -2.72497148e-01]\n",
" [ 2.43495657e-01 -3.43122551e-01 1.83330828e+00 -4.66803745e-01]\n",
" [ 7.24043171e-01 -2.09900913e-02 -7.81824830e-02 -9.86411695e-01]\n",
" [-2.33386816e-01 -1.22612201e+00 9.59099517e-01 1.15679446e+00]\n",
" [ 2.82912110e-01 6.03017237e-01 1.07251238e+00 -1.67985489e-01]\n",
" [-5.97339347e-01 3.14536016e-01 -1.29208806e+00 -1.01875627e+00]\n",
" [ 4.06977406e-01 -4.05885876e-01 -4.46752661e-01 -2.54343008e-01]\n",
" [-4.45903011e-01 -1.15208973e+00 -2.13888473e-01 1.20708337e+00]\n",
" [ 2.46929909e-01 2.04672053e-01 -5.82940789e-01 -3.52147900e-01]\n",
" [ 1.14890988e+00 4.14682278e-01 -1.01476878e+00 -1.32425331e-01]\n",
" [ 1.09851645e+00 4.46633342e-01 -6.55128053e-01 -1.57661670e+00]\n",
" [ 1.02520173e+00 4.11930690e-01 1.34707976e-01 9.95555428e-01]\n",
" [-2.84178746e-01 8.88604992e-03 4.70047196e-01 -1.10175954e+00]\n",
" [-2.32528980e-01 1.21251988e-01 2.21750532e-01 -1.52695561e+00]\n",
" [ 5.38177545e-01 -8.17226506e-01 1.08634877e+00 -2.37651582e+00]\n",
" [-2.49612920e+00 5.03430305e-01 3.10842395e-01 6.48705626e-01]\n",
" [ 5.86673518e-01 -4.09947664e-01 2.32181557e-01 -4.47580742e-01]\n",
" [ 5.03303346e-01 -1.02591774e+00 2.07316264e-01 1.62338518e-01]\n",
" [-9.72764286e-01 -2.01439083e+00 -9.79079945e-01 1.02023835e+00]\n",
" [-7.14951662e-01 -9.48340146e-01 -6.44965104e-01 1.40855408e-01]\n",
" [ 1.51074183e-01 -2.07518294e+00 6.72028302e-01 -2.76351699e-01]\n",
" [-2.03352591e+00 -4.04680159e-01 4.10554142e-01 1.32195970e-01]\n",
" [ 2.36905355e-01 2.41568642e-01 8.80694512e-01 -1.07875039e+00]\n",
" [-1.35912256e+00 7.62561671e-01 1.11876336e+00 1.22679992e+00]\n",
" [-1.01339018e+00 -9.16472744e-01 -7.39288968e-02 -9.92599547e-03]\n",
" [ 1.58961744e+00 4.47423372e-01 2.14907138e+00 3.54096289e-02]\n",
" [-1.63292215e-01 6.35194068e-01 2.60631628e+00 1.69180958e+00]\n",
" [ 2.03853762e+00 -5.56118607e-01 1.79652485e+00 -3.13946197e-01]\n",
" [ 4.83880124e-01 3.27403005e-01 -8.18287067e-01 -2.74069339e-01]\n",
" [ 3.93691814e-01 -6.02243131e-01 -9.04658819e-01 2.54267429e-01]\n",
" [-1.36312854e+00 -7.31385331e-01 -6.79530614e-01 -4.59816219e-01]\n",
" [-6.84912071e-01 4.37926026e-01 -1.55621753e+00 1.69675828e+00]\n",
" [-2.35359227e-01 -5.05406299e-01 1.31034472e+00 5.90890041e-01]\n",
" [ 1.46452179e+00 9.74996477e-01 -4.85487402e-01 1.83955984e-02]\n",
" [-8.77331254e-01 -6.53880120e-01 -1.35581005e+00 -1.47409389e+00]\n",
" [ 7.47195501e-01 -1.13014129e+00 -1.18182644e+00 -6.86380235e-01]\n",
" [-4.51766734e-01 -3.55618858e-01 -2.22113879e-01 -5.42652490e-01]\n",
" [ 8.34295341e-01 1.39363899e+00 -4.66266915e-01 -4.82545163e-01]\n",
" [ 6.32104853e-01 1.26480266e+00 -4.53728046e-01 1.57358707e+00]\n",
" [-1.50672962e+00 -9.58941847e-01 -1.45570029e+00 1.19597982e+00]\n",
" [ 5.50430466e-01 -4.08747783e-01 1.53765260e+00 8.78875220e-02]\n",
" [ 6.07472937e-01 6.61498377e-01 4.74552271e-01 1.98951127e+00]\n",
" [-1.61906212e+00 5.48859690e-01 3.18812153e-01 2.49961532e+00]\n",
" [-3.39355435e-01 -1.52421773e-01 1.90980405e+00 -1.99394295e+00]\n",
" [-4.54492793e-01 1.72676994e+00 5.17432384e-01 -3.16569998e-01]\n",
" [ 6.16010866e-01 1.01542901e+00 1.78960549e+00 5.42581448e-01]\n",
" [ 1.41235844e+00 1.61282975e+00 4.22999579e-01 -2.16987014e-02]\n",
" [ 7.60885326e-01 -5.57022533e-02 -1.49072248e+00 -1.85456311e+00]\n",
" [-2.61796571e-01 -2.33805080e+00 2.11749944e-01 5.49361227e-01]\n",
" [-4.53022065e-01 -6.71962666e-01 5.75895559e-01 -9.70950440e-01]\n",
" [ 9.97945724e-02 8.06397077e-01 1.60872641e+00 1.43433137e+00]\n",
" [-4.10260370e-01 1.12417187e+00 -1.05204501e+00 -1.15117349e+00]\n",
" [-2.72286780e-01 -5.29221476e-03 8.06342995e-02 7.84653884e-01]\n",
" [-1.64652038e-01 -1.45689731e+00 1.08808393e+00 -1.13411953e+00]\n",
" [ 2.37609378e-01 8.67528426e-01 -3.84856164e-01 6.09507555e-01]\n",
" [ 9.99558632e-01 5.63997569e-01 -8.08549859e-01 1.81242408e+00]\n",
" [ 2.83467599e-01 -3.80576803e-01 -2.04698929e+00 -1.67597362e+00]\n",
" [ 6.80655499e-01 -8.59413398e-01 3.24625686e-02 1.46031757e+00]\n",
" [-2.16455316e+00 7.46593662e-01 1.41972846e+00 -6.56037189e-01]\n",
" [-6.13663959e-01 -5.40153963e-01 -6.62107962e-01 9.38372495e-01]\n",
" [ 7.55988167e-01 -1.38565463e+00 -3.90831310e-01 -7.12290338e-02]]\n",
"\n",
" Correlation matrix V: \n",
"[[ 1. -0.01097257 0.01755376 0.01479656]\n",
" [-0.01097257 1. 0.10412532 0.01567658]\n",
" [ 0.01755376 0.10412532 1. -0.01258119]\n",
" [ 0.01479656 0.01567658 -0.01258119 1. ]]\n",
"\n",
" Eigenvalues: \n",
"[0.88761319 0.99343832 1.01453559 1.1044129 ]\n",
"\n",
"\n",
" Positive definite?\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 5
}
]
},
{
"metadata": {
"id": "cFI7QvYMIWyZ",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
""
],
"execution_count": 0,
"outputs": []
}
]
}
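
The notebook's notes mention `runs`, the number of Monte Carlo runs, but the code above never uses it. The following is a minimal sketch, not part of the original notebook, of how the first eigenvalue could be averaged over several runs to see how the number of samples per variable drives the dependence coefficient. The function names `random_corr_matrix`, `is_positive_definite`, and `mean_top_eigenvalue` are hypothetical, and the seeded `np.random.default_rng` generator is an assumption made for reproducibility.

```python
import numpy as np


def random_corr_matrix(m, n, rng):
    """Sample correlation matrix of n variables from m Gaussian observations."""
    data = rng.standard_normal((m, n))  # rows: observations, columns: variables
    return np.corrcoef(data, rowvar=False)


def is_positive_definite(v):
    """Return True when the Cholesky factorization of v succeeds."""
    try:
        np.linalg.cholesky(v)
        return True
    except np.linalg.LinAlgError:
        return False


def mean_top_eigenvalue(m, n, runs, seed=0):
    """Average the largest eigenvalue over `runs` random correlation matrices."""
    rng = np.random.default_rng(seed)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order,
    # so the last entry is the largest one.
    tops = [np.linalg.eigvalsh(random_corr_matrix(m, n, rng))[-1]
            for _ in range(runs)]
    return float(np.mean(tops))


n = 4  # number of variables, as in the notebook's examples
for m in (5, 20, 100, 1000):
    print(m, round(mean_top_eigenvalue(m, n, runs=200), 3))
```

With few samples per variable the average first eigenvalue should sit well above 1 (high dependence), and it should approach 1 as m grows (flat eigenvalue vector, low dependence).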