@LeiG
Last active April 8, 2022 14:06
Python and .RData files
import rpy2.robjects as robjects
import pandas.rpy.common as com
import pandas as pd

## load .RData and convert to pd.DataFrame
# load() returns the names of the objects stored in the file
robj = robjects.r.load('test.RData')
# iterate over the datasets in the file
for sets in robj:
    myRData = com.load_data(sets)
    # convert to DataFrame
    if not isinstance(myRData, pd.DataFrame):
        myRData = pd.DataFrame(myRData)

## save pd.DataFrame to R dataframe
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  index=["one", "two", "three"])
r_dataframe = com.convert_to_r_dataframe(df)
@Franck-Dernoncourt

The pandas.rpy module was removed in pandas 0.20, so the pandas.rpy.common import in this gist fails on newer versions.

@ofajardo

ofajardo commented Dec 27, 2018

As an alternative for those who would prefer not to install R just for this task (rpy2 requires it), there is a new package, "pyreadr", which reads RData and Rds files directly into Python without an R installation.

It is a wrapper around the C library librdata, so it is very fast.

You can install it easily with pip:

pip install pyreadr

As an example you would do:

import pyreadr

result = pyreadr.read_r('/path/to/file.RData') # also works for Rds

# done! let's see what we got
# result is a dictionary: the keys are the object names and the values are
# the corresponding Python objects (pandas data frames)
print(result.keys()) # let's check what objects we got
df1 = result["df1"] # extract the pandas data frame for object df1

The repo is here: https://github.com/ofajardo/pyreadr
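
Since the result is a plain dictionary keyed by object name, a small helper can make the lookup a bit safer. The following is only a sketch: `load_r_objects` and `pick_object` are hypothetical names introduced for illustration and are not part of pyreadr. The only pyreadr call used is `read_r`, as in the example above.

```python
def load_r_objects(path):
    """Read an .RData or .rds file into a mapping of object name -> data frame."""
    # Imported lazily so pick_object() below can be used without pyreadr installed.
    import pyreadr
    return pyreadr.read_r(path)


def pick_object(result, name=None):
    """Return one object from a read_r-style mapping (hypothetical helper).

    With a name, look it up and report the available names on failure.
    Without a name, return the only object if there is exactly one.
    """
    if name is not None:
        if name not in result:
            raise KeyError(f"{name!r} not found; available objects: {list(result)}")
        return result[name]
    if len(result) != 1:
        raise ValueError(
            f"expected exactly one object, found {len(result)}: {list(result)}"
        )
    # An .rds file holds a single unnamed object, which pyreadr stores under the key None.
    return next(iter(result.values()))
```

With this in place, `df1 = pick_object(load_r_objects('/path/to/file.RData'), 'df1')` would fetch the same data frame as above, and calling `pick_object(...)` without a name should cover the single-object Rds case.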
