@coppeliaMLA
Created February 27, 2014 09:44
Hive seems to struggle with file headers when loading flat files. Here's a bit of Python that trims the first line (i.e. the column header line) from every file in a directory. Note that it overwrites each file in place.
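import os

# Directory containing the flat files; every file in it is rewritten in place.
data_dir = 'put your directory in here'
for filename in os.listdir(data_dir):
    path = os.path.join(data_dir, filename)
    # Read all lines, keeping the line endings.
    with open(path, 'r') as fin:
        data = fin.read().splitlines(True)
    # Write everything back except the first (header) line.
    with open(path, 'w') as fout:
        fout.writelines(data[1:])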
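The snippet above reads each file fully into memory, which may be a problem for large flat files. Below is a minimal sketch of a streaming variant, assuming Python 3.3 or later (for `os.replace`). The function names `strip_header` and `strip_headers_in_dir` are illustrative and not part of the original gist. It writes the trimmed data to a temporary file in the same directory and then swaps it into place, so an interrupted run should not leave a half-written file.

```python
import os
import shutil
import tempfile


def strip_header(path):
    """Remove the first line of the file at `path`, streaming the rest."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # Binary mode keeps the original line endings and encoding untouched.
    with open(path, 'rb') as fin, tempfile.NamedTemporaryFile(
            'wb', dir=dir_name, delete=False) as fout:
        fin.readline()                 # discard the header line
        shutil.copyfileobj(fin, fout)  # copy the remainder in chunks
        tmp_path = fout.name
    os.replace(tmp_path, path)         # swap the trimmed copy into place


def strip_headers_in_dir(data_dir):
    """Apply strip_header to every regular file directly inside `data_dir`."""
    for filename in os.listdir(data_dir):
        path = os.path.join(data_dir, filename)
        if os.path.isfile(path):       # skip subdirectories
            strip_header(path)
```

Calling `strip_headers_in_dir('put your directory in here')` mirrors the loop in the original snippet. As with the original, running it twice removes two lines, so it should be run only once per set of files.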