@comcon1
Created February 26, 2021 12:51
Generate point cloud around the specified point with log-normal distribution
#!/usr/bin/env python3
import numpy as np
import pandas as pd
import math
import sys
print("Using pandas ver "+pd.__version__)
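try:
    # expect exactly two arguments: sigma and the output file name
    if len(sys.argv) != 3:
        raise ValueError("wrong number of arguments")
    sigma = float(sys.argv[1])
    outf = sys.argv[2]
except ValueError:
    print("Please run ./genParCloud.py [sigma] [output.csv]")
    sys.exit(1)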
# an example point
x = np.array([ 0.0009825491410708322, 7.598713300486712e-05, 4.224027984344428, 10.3581808483661,
0.214155408367835, 5.6896492525618285, 0.0004998229601811062, 3.98329388118285,
3.2188620413130344, 0.0008449489223685573, 0.00033669412739256114, 36.82257415654826,
0.5313070938965305, 1.97336462453e-05, 6.75165317732e-05, 1.4323111044574848])
l_x = np.log(x)
s_x = [sigma]*x.shape[0]
# generate distribution over every dimension
nsamp = 1000
pgen = pd.DataFrame(np.zeros((nsamp,x.shape[0])))
pdist = np.inf  # sentinel: forces the first pass of the loop below
it = 0
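while pdist > 0.2:
    it += 1
    for i in range(x.shape[0]):
        # mu = ln(x[i]) - sigma^2/2 makes the log-normal mean equal to x[i]
        jj = np.random.lognormal(l_x[i]-s_x[i]**2/2., s_x[i], nsamp)
        pgen[i] = jj
    # largest relative deviation of the sample means from the target point
    pdist = np.max( np.abs ( 1 - pgen.apply(np.mean) / x ) )
    #print(jj)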
print("Found co-distributions after %d iterations" % (it))
print( np.abs ( 1 - pgen.apply(np.mean) / x ) )
# generate point cloud based on generated distributions
nCombs = 100000
parCombs = pd.DataFrame( np.zeros((nCombs,x.shape[0])) )
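for i in range(nCombs):
    # pick one random row of pgen per dimension; the diagonal of the selected
    # rows takes column j from the j-th picked row
    parCombs.iloc[i,:] = pgen.iloc[ [np.random.randint(0,nsamp) for _ in range(x.shape[0])], :].to_numpy().diagonal()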
print("Head of combinations table")
print(parCombs.head())
parCombs.to_csv(outf, index=True, na_rep='nan')

comcon1 commented Feb 26, 2021

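Note 1: The log-normal distribution is shifted so that its mean coincides with the given point: for each coordinate the underlying normal distribution has mean ln(x) - sigma^2/2 rather than ln(x).
Note 2: The per-dimension sampling is repeated in a loop until the sample mean of every dimension deviates from the target by no more than 20%, so that the actual statistical properties of the generated sequence are not far from the designed ones.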
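
The two notes above can be sketched in a compact, vectorized form. This is only an illustration under some assumptions, not the script's own code: the function name `sample_mean_matched_lognormal` and the parameters `tol`, `max_iter` and `rng` are hypothetical, and it uses NumPy's `Generator.lognormal` with broadcasting instead of the per-column loop in the script.

```python
import numpy as np


def sample_mean_matched_lognormal(target, sigma, nsamp=1000, tol=0.2,
                                  max_iter=1000, rng=None):
    """Draw log-normal samples whose per-dimension sample mean is within
    a relative tolerance `tol` of `target` (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    target = np.asarray(target, dtype=float)
    # Note 1: E[exp(N(mu, sigma^2))] = exp(mu + sigma^2/2),
    # so mu = ln(target) - sigma^2/2 puts the mean at the target point.
    mu = np.log(target) - sigma ** 2 / 2.0
    for attempt in range(1, max_iter + 1):
        # shape (nsamp, ndim): one column per dimension
        samples = rng.lognormal(mean=mu, sigma=sigma,
                                size=(nsamp, target.size))
        # Note 2: redraw until every sample mean is close enough to the target
        rel_err = np.abs(1.0 - samples.mean(axis=0) / target)
        if rel_err.max() <= tol:
            return samples, attempt
    raise RuntimeError("tolerance not reached within max_iter redraws")


# example with a made-up 3-dimensional point
point = np.array([1e-3, 4.2, 36.8])
samples, n_iter = sample_mean_matched_lognormal(
    point, sigma=0.5, rng=np.random.default_rng(0))
rel_err = np.abs(1.0 - samples.mean(axis=0) / point)
print("accepted after", n_iter, "draw(s); max relative error:", rel_err.max())
```

The `max_iter` cap is an addition of this sketch: for large sigma the sample mean of 1000 draws can stay far from the target for many redraws, and the cap turns an endless loop into an explicit error.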