@oluies
Created January 3, 2017 08:19
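# Dictionary to map Spark data type class names to Hive column types.
# Only these five types are covered; any other Spark type raises a KeyError below.
d = {'StringType': 'STRING', 'DoubleType': 'DOUBLE', 'IntegerType': 'INT', 'DateType': 'DATE', 'LongType': 'BIGINT'}
# Convert the DataFrame schema to a Hive column list (df is assumed to be an existing Spark DataFrame).
# type(...).__name__ yields the class name ('StringType') however the PySpark version
# renders str(field.dataType); newer versions may render it as 'StringType()'.
schemastring = ', '.join([field.name + ' ' + d[type(field.dataType).__name__] for field in df.schema.fields])
hivetablename = 'mortgage_all'
# Placeholders: LOCATION is output_path + filename with no separator added,
# so output_path should end with '/' if the two are separate path components.
output_path = 'path'
filename = 'filename'
# Build the Hive DDL string (this only builds the statement; it does not run it)
ddl = """CREATE EXTERNAL TABLE IF NOT EXISTS %s(%s) STORED AS ORC LOCATION '%s'""" % (hivetablename, schemastring, output_path + filename)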
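To see what the statement looks like without a Spark session, here is a minimal sketch of the same mapping in plain Python. `Field`, `build_ddl`, and the sample column names are hypothetical stand-ins introduced for illustration, not part of the gist or of PySpark. Unlike the snippet above, it raises a descriptive error for a Spark type that has no entry in the mapping.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark StructField: a column name plus the
# class name of its Spark data type (e.g. 'StringType').
Field = namedtuple('Field', ['name', 'type_name'])

# Same five-entry mapping as the gist.
SPARK_TO_HIVE = {'StringType': 'STRING', 'DoubleType': 'DOUBLE',
                 'IntegerType': 'INT', 'DateType': 'DATE', 'LongType': 'BIGINT'}


def build_ddl(table, fields, location):
    """Build the CREATE EXTERNAL TABLE statement; raise on unmapped types."""
    columns = []
    for f in fields:
        if f.type_name not in SPARK_TO_HIVE:
            raise ValueError('No Hive mapping for Spark type %s (column %s)'
                             % (f.type_name, f.name))
        columns.append('%s %s' % (f.name, SPARK_TO_HIVE[f.type_name]))
    return ("CREATE EXTERNAL TABLE IF NOT EXISTS %s(%s) STORED AS ORC LOCATION '%s'"
            % (table, ', '.join(columns), location))


# Made-up example columns, used only to show the shape of the output.
sample_fields = [Field('loan_id', 'LongType'),
                 Field('rate', 'DoubleType'),
                 Field('state', 'StringType')]
sample_ddl = build_ddl('mortgage_all', sample_fields, 'path' + 'filename')
print(sample_ddl)
```

With a real DataFrame, the `fields` argument could be built from `df.schema.fields` by pairing each `field.name` with `type(field.dataType).__name__`.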