@vepetkov
Created May 7, 2019 15:20
Read a local ORC file in Python and convert it to a pandas DataFrame
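import pandas as pd  # not used directly, but to_pandas() requires pandas to be installed
import pyarrow.orc as orc

# Open the ORC file in binary mode; the context manager closes it when done
with open('/hive/warehouse/000000_0', 'rb') as file0:
    data0 = orc.ORCFile(file0)
    # Read only the needed columns and convert the Arrow table to a pandas DataFrame
    df0 = data0.read(columns=['_col10', '_col50']).to_pandas()

# Summary statistics for the selected columns
df0.describe()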
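
Hive-written ORC files often carry positional column names such as `_col10`, so it can help to check the file's schema before asking for specific columns. The following is a minimal sketch of that idea, not part of the original snippet: the helper names `missing_columns` and `read_orc_columns` are hypothetical, and it assumes `ORCFile.schema.names` lists the file's column names, as it does for a pyarrow `Schema`.

```python
from typing import Iterable, List


def missing_columns(available: Iterable[str], requested: Iterable[str]) -> List[str]:
    """Return the requested column names that are not in the file, in request order."""
    known = set(available)
    return [name for name in requested if name not in known]


def read_orc_columns(path: str, columns: List[str]):
    """Read selected columns from a local ORC file into a pandas DataFrame."""
    # Imported here so missing_columns() stays usable without pyarrow installed.
    import pyarrow.orc as orc

    with open(path, 'rb') as f:
        orc_file = orc.ORCFile(f)
        # ORCFile.schema is a pyarrow Schema; .names lists the column names.
        absent = missing_columns(orc_file.schema.names, columns)
        if absent:
            raise KeyError(f"columns not found in {path}: {absent}")
        return orc_file.read(columns=columns).to_pandas()
```

With these helpers, the read above could be written as `df0 = read_orc_columns('/hive/warehouse/000000_0', ['_col10', '_col50'])`, which raises a `KeyError` naming any absent column instead of failing inside the ORC reader.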