@Wavewash
Last active August 29, 2015 14:01
Convert a Quickbase CSV file to a JSON file that Solr can consume. This Python script is meant to be run as a command-line utility that takes the input CSV file and the output file as arguments. To import the resulting JSON file into Solr, I had success with the post.jar that ships with the Solr examples. So an example import command would look …
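To make the conversion concrete, here is a minimal sketch of the CSV-to-JSON mapping described above. The helper name `rows_to_solr_docs` and the sample columns and values are made up for illustration and are not part of the gist; the only behavior taken from the script is that header names are lowercased and spaces become underscores.

```python
import csv
import io
import json


def rows_to_solr_docs(csv_text):
    """Turn CSV text into a list of dicts keyed by normalized header names."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, None) or []
    # "Record ID" -> "record_id": lowercase, spaces replaced by underscores
    fields = [h.lower().replace(" ", "_") for h in headers]
    return [dict(zip(fields, row)) for row in reader]


# Hypothetical sample input, not taken from a real Quickbase export.
sample = "Record ID,Project Name\n1,Alpha\n2,Beta\n"
docs = rows_to_solr_docs(sample)
print(json.dumps(docs))
```

Each CSV row becomes one JSON object, and the whole file becomes a single JSON array of those objects.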
__author__ = 'mo kakwan'

import sys
import csv
import json
import time

# both the input and output paths are required
if len(sys.argv) < 3:
    print("USAGE: input.csv output.json")
    sys.exit(1)

inputfile = sys.argv[1]
outputfile = sys.argv[2]

solrDataObject = []
with open(inputfile, 'r') as f:
    reader = csv.reader(f)
    # build the formatted header names to use in the Solr JSON file
    headers = next(reader, None)  # returns the headers or `None` if the input is empty
    formatted_headers = []
    if headers:
        for heading in headers:
            formatted_headers.append(heading.lower().replace(" ", "_"))
    # print(formatted_headers)
    for row in reader:
        dataObject = {}
        # pair each value with its header; cells beyond the last header are ignored
        for field, data in zip(formatted_headers, row):
            # change this for whatever your date line may be coming in as from your datasource
            if field.endswith("_dt"):
                # format the date time in a format that solr is okay with
                dateData = time.strptime(data, "%m-%d-%Y %I:%M %p")
                dataObject[field] = time.strftime("%Y-%m-%dT%H:%M:%SZ", dateData)
            else:
                dataObject[field] = data
        solrDataObject.append(dataObject)

# write all documents out as a single JSON array
with open(outputfile, "w") as fo:
    fo.write(json.dumps(solrDataObject))
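The date handling in the script calls `time.strptime` on every `_dt` column, so a blank cell raises a `ValueError`. Below is a minimal sketch of that same conversion pulled out into a function. The name `to_solr_date` and the choice to return `None` for blank values are assumptions for illustration, not something the gist does; the format strings are the ones the script uses.

```python
import time


def to_solr_date(value, fmt="%m-%d-%Y %I:%M %p"):
    """Convert a date string to the ISO 8601 form Solr expects, or None if blank."""
    value = value.strip()
    if not value:
        return None  # assumption: blank dates are skipped rather than raising
    parsed = time.strptime(value, fmt)
    # The trailing Z marks UTC; no timezone conversion is applied here.
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", parsed)


print(to_solr_date("05-21-2014 02:30 PM"))
```

Because the value is only reformatted, the output is correct only if the source timestamps are already in UTC; otherwise an offset would have to be applied before formatting.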