@jepma
Last active June 21, 2017 05:22
Read a Parquet file and use the data via pandas
import numpy as np
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
# read parquet-file
table = pq.read_table("FILENAME_HERE")
table_pd = table.to_pandas()
# retrieve the columns (iterating the schema yields one field per column)
parquet_columns = table.schema
for parquet_column in parquet_columns:
    print(parquet_column)
# iterate over rows
for index, row in table_pd.iterrows():
    print(row)