@rafaan
Created February 24, 2015 20:24
Convert Nested JSON to Pandas DataFrame and Flatten List in a Column
import json
import pandas as pd

# Raw string so that '\f' is not read as a form-feed escape.
with open(r'C:\filename.json') as f:
    data = json.load(f)

df = pd.DataFrame(data)
# pd.json_normalize is the top-level name since pandas 1.0
# (formerly pandas.io.json.json_normalize); a list of dicts is a safe input.
normalized_df = pd.json_normalize(df['nested_json_object'].tolist())


def flattenColumn(frame, column):
    '''column is a string of the column's name.
    For each value in the column's element (which might be a list),
    duplicate the rest of the columns in that row once per value.
    '''
    # Build [row index, item] pairs; .items() replaces the removed .iteritems().
    column_flat = pd.DataFrame(
        [[i, c_flattened] for i, y in frame[column].apply(list).items() for c_flattened in y],
        columns=['I', column])
    column_flat = column_flat.set_index('I')
    # drop(columns=...) replaces the positional axis argument in drop(column, 1).
    return frame.drop(columns=column).merge(column_flat, left_index=True, right_index=True)


new_df = flattenColumn(normalized_df, 'column_name')
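To see what the flattening step does, here is a minimal sketch on made-up data. The records and the field names `id`, `info`, `name`, and `tags` are assumptions for illustration only and do not come from the gist; the helper is restated with current pandas calls so the block runs on its own.

```python
import pandas as pd

# Hypothetical sample records (not from the gist).
records = [
    {"id": 1, "info": {"name": "a", "tags": ["x", "y"]}},
    {"id": 2, "info": {"name": "b", "tags": ["z"]}},
]

# Nested dicts become dotted column names: id, info.name, info.tags
normalized = pd.json_normalize(records)


def flattenColumn(frame, column):
    """Repeat each row once per element of the list stored in `column`."""
    pairs = [
        (idx, item)
        for idx, values in frame[column].apply(list).items()
        for item in values
    ]
    flat = pd.DataFrame(pairs, columns=["I", column]).set_index("I")
    # The merge is an inner join on the index, so a row whose list is
    # empty has no partner in `flat` and is dropped from the result.
    return frame.drop(columns=column).merge(flat, left_index=True, right_index=True)


flat_df = flattenColumn(normalized, "info.tags")
print(flat_df)

# For comparison: the built-in DataFrame.explode (pandas 0.25+).
exploded = normalized.explode("info.tags")
```

On this sample, both approaches yield three rows, one per tag. They differ on empty lists: `explode` keeps such a row with a missing value, while the merge-based helper drops it.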
@Andrewm4894pmc

Thank you - something like this would be great as an option to pass to json_normalize