@Elsaveram
Created September 3, 2018 22:53
Convert json to pandas dataframe
import json
import pandas as pd
from glob import glob
import matplotlib.pyplot as plt

# Convert a JSON string to a flat Python dictionary
def convert(x):
    ob = json.loads(x)
    # Iterate over a copy because keys are added and deleted along the way
    for k, v in ob.copy().items():
        if isinstance(v, list):
            # Join list items into one comma-separated string
            ob[k] = ','.join(str(item) for item in v)
        elif isinstance(v, dict):
            # Promote nested keys to top-level "parent_child" columns
            for kk, vv in v.items():
                ob['%s_%s' % (k, kk)] = vv
            del ob[k]
    return ob

# Convert every line-delimited JSON file in the current directory to CSV
for json_filename in glob('*.json'):
    csv_filename = '%s.csv' % json_filename[:-5]
    print('Converting %s to %s' % (json_filename, csv_filename))
    with open(json_filename, encoding='utf-8') as f:
        df = pd.DataFrame([convert(line) for line in f])
    df.to_csv(csv_filename, encoding='utf-8', index=False)

# Load the converted CSV files into pandas DataFrames
review = pd.read_csv('yelp_academic_dataset_review.csv')
business = pd.read_csv('yelp_academic_dataset_business.csv')
checkin = pd.read_csv('yelp_academic_dataset_checkin.csv')
tip = pd.read_csv('yelp_academic_dataset_tip.csv')
user = pd.read_csv('yelp_academic_dataset_user.csv')

aljen15 commented Apr 12, 2020

Hi! Is there a way to take only, let's say, 20% of the data points from the JSON file into the DataFrame, without loading the whole JSON file? It's 50 GB and my computer just crashes.
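
One possible approach to the question above, sketched under assumptions: the gist reads one JSON object per line, so lines could be streamed and kept with a fixed probability, which means the whole file is never held in memory. The helper name `sample_json_lines` and its parameters `fraction` and `seed` are hypothetical and not part of the gist. The file still has to be read through once, and the result is roughly (not exactly) the requested fraction.

```python
import json
import random


def sample_json_lines(path, fraction=0.2, seed=0):
    """Yield roughly `fraction` of the records in a line-delimited JSON file.

    Hypothetical helper: lines are streamed one at a time, so memory use
    is driven by the sampled records rather than by the whole file.
    """
    rng = random.Random(seed)  # seeded so the same sample can be reproduced
    with open(path, encoding='utf-8') as f:
        for line in f:
            # Skip blank lines, and decide before parsing so that
            # discarded lines never go through json.loads
            if line.strip() and rng.random() < fraction:
                yield json.loads(line)
```

The sampled records could then be flattened and loaded in the same way as in the gist, for example with `pd.DataFrame(list(sample_json_lines('yelp_academic_dataset_review.json')))`.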

@elviejoyady

Hello, I hope you are well.

I wanted to ask how I could use this code to convert from JSON to Python. I currently have several Swagger files in JSON, and I don't know how to apply the code to convert these files.
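
A possible starting point for the question above, with caveats: the gist expects one JSON object per line, whereas a Swagger file is normally a single nested JSON document, so it would likely need to be loaded whole and flattened recursively instead. The helper name `flatten` is hypothetical and not part of the gist; the underscore-joined keys simply follow the naming used in `convert`.

```python
import json


def flatten(obj, prefix=''):
    """Recursively flatten nested dicts into one dict with underscore-joined keys.

    Hypothetical helper, not part of the gist. Lists are kept as values
    rather than expanded.
    """
    flat = {}
    for key, value in obj.items():
        name = '%s_%s' % (prefix, key) if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects, carrying the parent key as prefix
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# A tiny Swagger-like document, inlined so the sketch runs on its own
doc = json.loads('{"swagger": "2.0", "info": {"title": "Demo", "version": "1.0"}}')
print(flatten(doc))
```

For a file on disk, the document would be read with `json.load(f)` inside a `with open(..., encoding='utf-8') as f:` block before being passed to `flatten`.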
