@westonpace
Created August 26, 2020 22:50
Example attempting to write a Parquet file with pyarrow using the new filesystems API
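import pyarrow.fs as pafs
import pyarrow.parquet as pq

# Root a SubTreeFileSystem at C:\ so the paths below are resolved relative to it.
filesystem = pafs.LocalFileSystem()
subtree_filesystem = pafs.SubTreeFileSystem('C:\\', filesystem)
in_path = 'Users\\westpace\\in.parquet'
out_path = 'Users\\westpace\\out.parquet'

# Read the input file through the new filesystem API.
table = pq.read_table(in_path, filesystem=subtree_filesystem)
print(f'Read in {table.num_rows} rows')

# Attempt to write the table back out through the same filesystem.
pq.write_table(table, out_path, filesystem=subtree_filesystem)
# Alternative attempt using the dataset writer:
# pq.write_to_dataset(table, out_path, filesystem=subtree_filesystem)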