
@dvdme
Last active March 8, 2021 17:12
"""
Example usage of the function `group_json_by_key`
"""
import json

JSON_EXAMPLE_STR = """
{
    "color": {
        "0": "yellow",
        "1": "red"
    },
    "fruit": {
        "0": "banana",
        "1": "strawberry"
    },
    "grows_on": {
        "0": "tree",
        "1": "shrub"
    }
}
"""


def group_json_by_key(group_by_key, json_str):
    """Groups JSON output from pandas by a given group key.

    When a pandas DataFrame is exported to JSON with the default
    orient argument 'columns', the table is encoded column by column.
    Sometimes this is not the most desirable format.
    This function regroups the row values into a JSON object whose
    keys are the values found in the group key column.

    Examples:
        Input:
            {
                "color": {
                    "0": "yellow",
                    "1": "red"
                },
                "fruit": {
                    "0": "banana",
                    "1": "strawberry"
                },
                "grows_on": {
                    "0": "tree",
                    "1": "shrub"
                }
            }
        Output:
            {
                "banana": {
                    "color": "yellow",
                    "grows_on": "tree"
                },
                "strawberry": {
                    "color": "red",
                    "grows_on": "shrub"
                }
            }

    Args:
        group_by_key (str): The key (column) to group by
        json_str (str): The JSON string to use as input

    Returns:
        str: Reformatted JSON string

    Raises:
        KeyError: If `group_by_key` is not present in `json_str`
    """
    parsed_json = json.loads(json_str)
    grouped_json = {}
    if group_by_key not in parsed_json:
        raise KeyError(f"{group_by_key} not present in input json")
    group_dict = parsed_json[group_by_key]
    # group_key is the row index; group_value is that row's value in the
    # group column. Rows sharing a group_value overwrite earlier ones.
    for group_key, group_value in group_dict.items():
        grouped_json[group_value] = {}
        for item_key, item_value in parsed_json.items():
            if item_key == group_by_key:
                continue
            grouped_json[group_value].update({item_key: item_value[group_key]})
    return json.dumps(grouped_json)


if __name__ == "__main__":
    ret_val = group_json_by_key("fruit", JSON_EXAMPLE_STR)
    parsed = json.loads(ret_val)
    print(json.dumps(parsed, indent=4, sort_keys=True))
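
The regrouping step can also be applied to an already-parsed dictionary instead of a JSON string. The following is a minimal sketch of that idea, not part of the original gist: the name `group_columns_by_key` is hypothetical, and it assumes every column holds a value for each row index found in the group column.

```python
import json


def group_columns_by_key(group_by_key, columns):
    """Dict-in, dict-out variant of the grouping step (hypothetical helper)."""
    if group_by_key not in columns:
        raise KeyError(f"{group_by_key} not present in input")
    grouped = {}
    for row_index, group_value in columns[group_by_key].items():
        # Collect this row's value from every column except the group column.
        # Rows that share a group value replace the earlier entry.
        grouped[group_value] = {
            column: values[row_index]
            for column, values in columns.items()
            if column != group_by_key
        }
    return grouped


columns = {
    "color": {"0": "yellow", "1": "red"},
    "fruit": {"0": "banana", "1": "strawberry"},
    "grows_on": {"0": "tree", "1": "shrub"},
}
grouped = group_columns_by_key("fruit", columns)
print(json.dumps(grouped, indent=4, sort_keys=True))
```

Working on dictionaries avoids a parse and serialize round trip when the data is already in memory; the string-based function stays the better fit when the input comes directly from a JSON export.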