@tristanwietsma
Last active October 9, 2023 18:55
Access glmnet through RPy2
import numpy as np
import rpy2.robjects as ro
import rpy2.robjects.numpy2ri as n2r
n2r.activate()
r = ro.r
r.library('glmnet')
# input files (for this example) need to have header and NO index column
X = np.loadtxt('./x.csv', dtype=float, delimiter=',', skiprows=1)
y = np.loadtxt('./y.csv', dtype=int, delimiter=',', skiprows=1)
y = ro.FactorVector(list(y.transpose())) # use factors
trained_model = r['cv.glmnet'](X, y, nfolds=3, family="binomial")
lambda_ = np.asanyarray(trained_model.rx2('lambda'))
cvm_ = np.asanyarray(trained_model.rx2('cvm'))
cvsd_ = np.asanyarray(trained_model.rx2('cvsd'))
lambda_min = np.asanyarray(trained_model.rx2('lambda.min'))[0]
min_cvm = cvm_[np.argwhere(lambda_ == lambda_min)[0][0]]
idx = np.argwhere(cvm_ < min_cvm + 0.1*cvsd_)
idx[0]
fit = trained_model.rx2('glmnet.fit')
# np.asarray avoids n2r.ri2numpy, which is not available in rpy2 3.x
beta = np.asarray(r['as.matrix'](fit.rx2('beta')))
# compare magnitudes so that negative coefficients are also kept
relvars = np.argwhere(np.abs(beta[:,idx[0]].transpose()[0]) > 1e-5)
print(relvars.transpose()[0])
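
The lambda selection step in the script (take the first lambda whose mean CV error is within 0.1 standard errors of the minimum) can be tried without R. The following is only a sketch: `select_lambda_index` and its `tol` parameter are hypothetical names, the arrays are made-up values rather than glmnet output, and it matches `lambda_min` by nearest value instead of the exact float equality used in the script.

```python
import numpy as np

def select_lambda_index(lambda_, cvm, cvsd, lambda_min, tol=0.1):
    """Index of the first lambda whose CV error is below
    min_cvm + tol * cvsd (hypothetical helper, not part of glmnet)."""
    lambda_ = np.asarray(lambda_, dtype=float)
    cvm = np.asarray(cvm, dtype=float)
    cvsd = np.asarray(cvsd, dtype=float)
    # Assumption: nearest-value match replaces the exact == comparison.
    min_idx = int(np.argmin(np.abs(lambda_ - lambda_min)))
    min_cvm = cvm[min_idx]
    # Same criterion as the script: cvm_ < min_cvm + 0.1 * cvsd_
    candidates = np.flatnonzero(cvm < min_cvm + tol * cvsd)
    return int(candidates[0])

# Made-up values standing in for the 'lambda', 'cvm' and 'cvsd' fields.
lambda_ = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])
cvm = np.array([1.40, 1.10, 1.02, 1.00, 1.05])
cvsd = np.array([0.30, 0.30, 0.30, 0.30, 0.30])

idx = select_lambda_index(lambda_, cvm, cvsd, lambda_min=0.125)
print(idx, lambda_[idx])
```

Because glmnet orders its lambda sequence from largest to smallest, taking the first qualifying index favours the most heavily regularised model that is still close to the best CV error.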
@abhishek-ghose
I finally figured this out, so I'm leaving a note here for others. The right datatype is FloatVector, and the weight vector can be cast to it. For example, here is a vector of 1s with one entry per data point in X: rpy2.robjects.FloatVector([1.0] * numpy.shape(X)[0])
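
Here is a rough sketch of how that could fit into the gist, assuming the vector is meant for the `weights` argument of `cv.glmnet`. The helper names `uniform_weights` and `fit_weighted` are made up for illustration, and the R call needs a working R installation with glmnet, so it is wrapped in a function and imported lazily.

```python
def uniform_weights(n_obs):
    """List of n_obs ones, one weight per observation (hypothetical helper)."""
    return [1.0] * int(n_obs)

def fit_weighted(X, y, weights, nfolds=3):
    """Sketch of a weighted cv.glmnet call; requires rpy2, R and glmnet."""
    # Imported here so uniform_weights() works without an R installation.
    import rpy2.robjects as ro
    import rpy2.robjects.numpy2ri as n2r
    n2r.activate()
    ro.r.library('glmnet')
    # Cast the Python weights to an R numeric vector, as the comment suggests.
    r_weights = ro.FloatVector([float(w) for w in weights])
    # Assumption: labels are passed as strings to build the factor.
    r_y = ro.FactorVector([str(v) for v in y])
    return ro.r['cv.glmnet'](X, r_y, weights=r_weights,
                             nfolds=nfolds, family="binomial")

weights = uniform_weights(5)
print(weights)
```

With X and y loaded as in the gist, the call would look like `fit_weighted(X, y, uniform_weights(X.shape[0]))`. Equal weights should match the unweighted fit, so this is mostly a template for swapping in real per-observation weights.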