Golang - Arbitrary JSON Array Parsing and Type Switch

##Golang Type Switch - Arbitrary JSON Array Parsing

I'm writing this mostly as a reference for myself. It could also be helpful to people who are new to Go.

####Note 1: Until Problem 3 we will assume we are dealing with JSON whose key and value data types we already know. Only in Problem 3 will we look at how a type switch is used to parse 100% arbitrary JSON.

####Note 2: I know the examples given here can easily be solved by declaring appropriate structs and decoding the PUT JSON straight into them. But as I'm not able to come up with a better scenario, I'm going to stick with this one to explain arbitrary JSON parsing.
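For reference, the struct-based approach that Note 2 alludes to could look roughly like the sketch below. It uses the profile JSON from the examples in this post; the helper name `mergeUpdate` is made up for illustration. It relies on the fact that json.Unmarshal only sets the fields that are present in the input.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person mirrors the profile JSON used in the examples of this post.
type Person struct {
	Email   string `json:"email"`
	Zip     string `json:"zip"`
	Country string `json:"country"`
}

// mergeUpdate (hypothetical helper) decodes update on top of p.
// json.Unmarshal only sets the fields present in the input, so fields
// missing from the update keep their current values.
func mergeUpdate(p Person, update string) (Person, error) {
	err := json.Unmarshal([]byte(update), &p)
	return p, err
}

func main() {
	stored := Person{Email: "foo@gmail.com", Zip: "94112", Country: "USA"}
	updated, err := mergeUpdate(stored, `{"zip":"11111","country":"India"}`)
	if err != nil {
		fmt.Println("bad update:", err)
		return
	}
	fmt.Printf("%+v\n", updated) // Email is untouched; Zip and Country change
}
```

The rest of this post deliberately avoids this shortcut so that the map and assertion machinery stays visible.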

###Back Story

In the last post (https://goo.gl/uVTj5B), I wrote about the variable naming conventions that control whether a variable is accessible across packages in Go. That post was written in the context of accessing JSON sent via an HTTP POST. Continuing with the same example, this one is about parsing that JSON.

In Go, parsing a predefined or known JSON structure is pretty easy and straightforward. We declare a struct type with the same structure as the JSON and decode the http.Request.Body into it.

Eg: JSON:

    {
    "email": "foo@gmail.com",
    "zip": "94112",
    "country": "USA"
    }

To parse this we would do

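type Person struct {
	// Fields must be exported (capitalised) for encoding/json to see them,
	// and the key inside a struct tag must be quoted.
	Email   string `json:"email"`
	Zip     string `json:"zip"`
	Country string `json:"country"`
}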

and then

func decodeJSON(rw http.ResponseWriter, r *http.Request) {
	// r.Body is a stream (io.ReadCloser), not a string, so printing r.Body
	// directly will not show the JSON. Not our concern right now.
	decoder := json.NewDecoder(r.Body)
	var person Person
	err := decoder.Decode(&person)
	if err != nil {
		// Malformed JSON is the client's fault, so answer with a 400.
		http.Error(rw, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Println(person)
}

However, when it comes to parsing arbitrary JSON of unknown structure, Go is a huge pain in the ar**.

Before I get to the issue: Go lets a variable of type interface{} (the empty interface) hold a value of any type. It is kind of like generics, but not exactly. To get the concrete value back out, we have to 'assert' it, which means telling Go the actual data type we expect the stored value to have.
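A minimal sketch of what storing and asserting looks like, independent of JSON. The helper name `asString` is made up for illustration. It uses the two-value form of the assertion, which reports a mismatch through a boolean instead of panicking.

```go
package main

import "fmt"

// asString (hypothetical helper) reports whether v holds a string
// and returns that string if so.
func asString(v interface{}) (string, bool) {
	s, ok := v.(string) // two-value form: ok is false on a mismatch, no panic
	return s, ok
}

func main() {
	var v interface{} = "94112" // an interface{} can hold a string...
	s, ok := asString(v)
	fmt.Println(s, ok)

	v = 94112 // ...or an int, or anything else
	s, ok = asString(v)
	fmt.Println(s == "", ok)
	// The single-value form v.(string) would panic at this point,
	// because v now holds an int.
}
```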

###Problem 1 - Simple JSON

Here is a scenario where I needed to parse JSON of arbitrary structure:

In an HTTP PUT request, the user sends us a piece of JSON and asks us to update the original data with the new values.

Original JSON:

    {
    "email": "foo@gmail.com",
    "zip": "94112",
    "country": "USA"
    }

JSON from the HTTP PUT:

    {
    "zip":"11111",
    "country":"India"
    }

Expected final JSON:

    {
    "email": "foo@gmail.com",
    "zip": "11111",
    "country": "India"
    }

###Solution

Now obviously, if we scale this up a bit, the number of variations of JSON the user can send via PUT is huge (i.e. hundreds of different fields to change). So we need to parse this arbitrary JSON, and we are going to do it via interface{}. Let's have a look.

In Go, all we need to do is declare a variable like this:

var arbitrary_json map[string]interface{}

This is a map whose keys are of type string and whose values are of type interface{} (i.e. anything).

I'm going to put the whole program down here, with comments in the middle to explain what's going on.

####Code (import "github.com/drone/routes") :

type Person struct {
	Email   string `json:"email"`
	Zip     string `json:"zip"`
	Country string `json:"country"`
}

// The stored profile (the "Original JSON" above) that PUT requests update.
var p = Person{"foo@gmail.com", "94112", "USA"}

func main() {
	mux := routes.New()
	mux.Put("/profile", PutProfile)
	http.Handle("/", mux)
	http.ListenAndServe(":3000", nil)
}

func PutProfile(w http.ResponseWriter, r *http.Request) {
	received_JSON, err := ioutil.ReadAll(r.Body) // This reads the raw request body
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Here we tell the decoder what shape to produce: a map with keys of type
	// string and values of some arbitrary type. The map is declared per request,
	// because Unmarshal keeps existing entries when it is given a non-nil map.
	var arbitrary_json map[string]interface{}
	if err := json.Unmarshal(received_JSON, &arbitrary_json); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	// Now we know the received JSON is in a map. So we iterate.
	for key, value := range arbitrary_json {
		switch key {
		case "zip":
			// This is the final part, where we assert what the data type of the
			// value inside the interface{} should be. In this case, string.
			// (This form panics if the value is not a string.) Yes, we have to
			// cover all the fields. I know! This is a nightmare when the JSON
			// gets really big.
			p.Zip = value.(string)
		case "country":
			p.Country = value.(string)
		case "email":
			p.Email = value.(string)
		}
	}
}

From the code above we see that we have to deal with types in two places: once when we declare the map (string keys, interface{} values) for Unmarshal to fill, and again when we assert each value with value.(string). This allows us to parse JSON of arbitrary structure (not 100% arbitrary though, since we knew every value was a string here).
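Since value.(string) panics when a client sends, say, a number for 'zip', a more defensive variant of the same loop could use the two-value assertion and return an error instead. This is a sketch without the HTTP parts; `applyUpdate` is a made-up name, and note that it rejects any non-string value, including ones under unknown keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	Email   string `json:"email"`
	Zip     string `json:"zip"`
	Country string `json:"country"`
}

// applyUpdate (hypothetical helper) copies the known string fields
// found in raw onto p. Any non-string value is reported as an error.
func applyUpdate(p *Person, raw []byte) error {
	var update map[string]interface{}
	if err := json.Unmarshal(raw, &update); err != nil {
		return err
	}
	for key, value := range update {
		s, ok := value.(string) // two-value form: no panic on a mismatch
		if !ok {
			return fmt.Errorf("field %q: expected string, got %T", key, value)
		}
		switch key {
		case "email":
			p.Email = s
		case "zip":
			p.Zip = s
		case "country":
			p.Country = s
		}
	}
	return nil
}

func main() {
	p := Person{Email: "foo@gmail.com", Zip: "94112", Country: "USA"}
	if err := applyUpdate(&p, []byte(`{"zip":"11111","country":"India"}`)); err != nil {
		fmt.Println("update rejected:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```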

Now, this procedure gets much more complicated as we try to handle nested JSON and JSON arrays (notice we have been dealing with flat JSON objects only).

###Problem 2 - Nested JSON

Again, consider that the user sends us a JSON string to update data. Here is the data:

Original JSON:

    {
    	"email": "foo@gmail.com",
    	"zip": "94112",
    	"music": {
        	"spotify_user_id": "someid"
    	}
    }

JSON from the HTTP PUT:

    {
        "zip": "11111",
        "music": {
            "spotify_user_id": "someotherid"
        }
    }

Expected final JSON:

    {
    	"email": "foo@gmail.com",
    	"zip": "11111",
    	"music": {
        	"spotify_user_id": "someotherid"
    	}
    }

####Solution

Here, to access the 'email' or 'zip' values we can use the same method as above. To access 'spotify_user_id', we have to repeat the procedure, but one level deeper.

Here is the code. I have commented it, so I won't explain it in much detail:

####Code

// Yes, we declare struct Music separately
type Music struct {
	Spotify_user_id string `json:"spotify_user_id"`
}

type Person struct {
	Email string `json:"email"`
	Zip   string `json:"zip"`
	Music Music  `json:"music"`
}

// The stored profile (the "Original JSON" above) that PUT requests update.
var p = Person{"foo@gmail.com", "94112", Music{"someid"}}

func main() {
	mux := routes.New()
	mux.Put("/profile", PutProfile)
	http.Handle("/", mux)
	http.ListenAndServe(":3000", nil)
}

func PutProfile(w http.ResponseWriter, r *http.Request) {
	received_JSON, err := ioutil.ReadAll(r.Body) // This reads the raw request body
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Here we tell the decoder what shape to produce: a map with keys of type
	// string and values of some arbitrary type.
	var arbitrary_json map[string]interface{}
	if err := json.Unmarshal(received_JSON, &arbitrary_json); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	// Now we know the received JSON is in a map. So we iterate.
	for key, value := range arbitrary_json {
		switch key {
		case "zip":
			// Assert what the data type of the value inside the interface{}
			// should be. In this case, string. Yes, we have to cover all the
			// fields. I know! This is a nightmare when the JSON gets really big.
			p.Zip = value.(string)
		case "email":
			p.Email = value.(string)
		case "music":
			// A nested JSON object decodes to another map[string]interface{},
			// so we assert that type to get at 'spotify_user_id'.
			music := value.(map[string]interface{})
			// We have to iterate again, one level deeper.
			for keya, valuea := range music {
				if keya == "spotify_user_id" {
					// We assert the type here again, finally getting the value we want
					p.Music.Spotify_user_id = valuea.(string)
				}
			}
		}
	}
}
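The "assert a map, then assert a string" step can also be pulled out into a small helper so that missing keys or unexpected types do not panic. The sketch below assumes exactly one level of nesting; the name `nestedString` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nestedString (hypothetical helper) returns m[outer][inner] if the outer
// value is a JSON object and the inner value is a string.
func nestedString(m map[string]interface{}, outer, inner string) (string, bool) {
	// A missing key yields a nil interface, for which the assertion is false.
	obj, ok := m[outer].(map[string]interface{})
	if !ok {
		return "", false
	}
	s, ok := obj[inner].(string)
	return s, ok
}

func main() {
	raw := `{"zip":"11111","music":{"spotify_user_id":"someotherid"}}`
	var update map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &update); err != nil {
		fmt.Println("bad JSON:", err)
		return
	}
	id, ok := nestedString(update, "music", "spotify_user_id")
	fmt.Println(id, ok)
}
```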

###Problem 3 - 100% arbitrary JSON and JSON Arrays

Now, in the previous scenarios, we knew that the value was going to be a string. So everywhere we did value.(string) and valuea.(string) to tell Go that the type of value and valuea is string.

What if we don't know what type the value is actually going to be? One would say we can use reflection to find out what data type a variable holds at runtime. Great! We seem to have a solution. But let's take it down a notch. Let's assume the value is going to be a JSON array. More specifically, a JSON array of strings.

Let's look at an example again:

Original JSON:

    {
    	"email": "foo@gmail.com",
    	"zip": "94112",
    	"tv_shows": ["show1","show2","show3"]
    }

JSON from the HTTP PUT:

    {
        "zip": "11111",
        "tv_shows": ["anothershow1","anothershow2","anothershow3"]
    }

Expected final JSON:

    {
    	"email": "foo@gmail.com",
    	"zip": "11111",
    	"tv_shows": ["anothershow1","anothershow2","anothershow3"]
    }

####Expectation

Great. Let's start parsing it.

As usual, we would create a variable of type map[string]interface{} to access the 'email' and 'zip' fields. Now, how do we access the 'tv_shows' field? Here lies the problem.

At first glance, one might ask: why not declare a variable of type map[string][]interface{}? Well, this doesn't actually work. The declaration itself compiles, but json.Unmarshal then returns an error, because string fields such as 'zip' cannot be stored in a []interface{} value.
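A small experiment makes the map[string][]interface{} behaviour concrete. As far as I can tell, the declaration compiles fine and the failure only shows up in the error returned by json.Unmarshal once a non-array value such as 'zip' is present. The helper name `decodeAsSlices` is made up for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAsSlices (hypothetical helper) tries to decode a JSON object whose
// values are all expected to be arrays.
func decodeAsSlices(raw string) (map[string][]interface{}, error) {
	var m map[string][]interface{}
	err := json.Unmarshal([]byte(raw), &m)
	return m, err
}

func main() {
	// Every value is an array: this decodes without an error.
	_, err := decodeAsSlices(`{"tv_shows":["show1","show2"]}`)
	fmt.Println("arrays only:", err)

	// "zip" is a string, which does not fit into []interface{}.
	_, err = decodeAsSlices(`{"zip":"11111","tv_shows":["show1","show2"]}`)
	fmt.Println("mixed values:", err)
}
```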

Okay, approach 2. Since an interface{} accepts any data, we keep map[string]interface{} and, when asserting the value, we assert it as []string instead of string. Let's have a look at some code for that.

// Assume Person now also has a field: Tv_shows []string `json:"tv_shows"`

func PutProfile(w http.ResponseWriter, r *http.Request) {
	received_JSON, err := ioutil.ReadAll(r.Body) // This reads the raw request body
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	var arbitrary_json map[string]interface{}
	if err := json.Unmarshal(received_JSON, &arbitrary_json); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for key, value := range arbitrary_json {
		switch key {
		case "zip":
			p.Zip = value.(string)
		case "email":
			p.Email = value.(string)
		case "tv_shows":
			// Here we assert as []string. This should solve the problem, right? NO.
			p.Tv_shows = value.([]string)
		}
	}
}

####Reality

The above code simply does not work. Doing value.([]string) compiles, but it panics at runtime: encoding/json decodes every JSON array into []interface{}, never into []string, so the assertion fails. Like WTH, right!? So how are you then going to get access to not just the array but also its contents?
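One way to see what is going on is to print the dynamic type of the decoded value with the %T verb and to try the []string assertion in its two-value form, which does not panic. This is only a diagnostic sketch; `inspect` is a made-up helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inspect (hypothetical helper) decodes raw into a generic map and reports
// the dynamic type stored under key, plus whether it asserts as []string.
func inspect(raw, key string) (typeName string, isStringSlice bool, err error) {
	var m map[string]interface{}
	if err = json.Unmarshal([]byte(raw), &m); err != nil {
		return "", false, err
	}
	v := m[key]
	_, isStringSlice = v.([]string) // two-value form: false instead of a panic
	return fmt.Sprintf("%T", v), isStringSlice, nil
}

func main() {
	raw := `{"zip":"11111","tv_shows":["anothershow1","anothershow2"]}`
	typeName, ok, err := inspect(raw, "tv_shows")
	if err != nil {
		fmt.Println("bad JSON:", err)
		return
	}
	fmt.Println("dynamic type:", typeName)
	fmt.Println("asserts as []string:", ok)
}
```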

That's exactly the problem that haunted me for hours! And what's more, what if you don't know whether the values in the array are strings or something else? How would you then assert their type?

####Solution

The solution is to use something called a type switch. Here is the official documentation: https://golang.org/doc/effective_go.html#type_switch

Before I continue: a type switch lets you discover the data type of a variable at runtime, just like reflection does, but differently. It doesn't hand you the type as a value. Instead, you list candidate types in case clauses, and Go runs the first clause that matches the value's actual type (this is some really loopy stuff).
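To make the comparison with reflection concrete, here is a small sketch that classifies the same values both ways. The function name `kindOf` is made up; reflect.TypeOf returns the type as a value you can print, while the type switch only lets you branch on it.

```go
package main

import (
	"fmt"
	"reflect"
)

// kindOf (hypothetical helper) branches on the dynamic type of v.
func kindOf(v interface{}) string {
	switch v.(type) { // no variable is bound here; we only care which case matches
	case string:
		return "string"
	case float64:
		return "float64"
	case bool:
		return "bool"
	default:
		return "something else"
	}
}

func main() {
	values := []interface{}{"94112", 3.5, true, 42}
	for _, v := range values {
		// reflect.TypeOf gives the type itself; kindOf only tells us
		// which of our guesses matched.
		fmt.Printf("type switch: %-14s reflect: %s\n", kindOf(v), reflect.TypeOf(v))
	}
}
```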

There are, however, a few odd characteristics to type switches:

  1. The .(type) form can only be used in a switch statement. It can't be used standalone.

  2. It doesn't actually return the data type, but lets us list our guesses for what it could be (in case clauses).

  3. The syntax makes NO SENSE WHATSOEVER!!!

Let's have a look at an example of using a type switch.

var t interface{}
t = functionOfSomeType()
switch t := t.(type) {
case string:
	fmt.Printf("string %q\n", t) // t has type string
case bool:
	fmt.Printf("boolean %t\n", t) // t has type bool
case int:
	fmt.Printf("integer %d\n", t) // t has type int
}

So here, t is basically a variable of unknown type. This line is where all the magic happens, and it is also the most confusing:

switch t := t.(type)

Here, what does t.(type) mean?

At first glance, and given that we are talking about detecting variable types, we can assume that this is an expression that returns the type of the variable t. And looking at the case clauses, we can more or less confirm this:

case string:
case bool:
...

Assuming we are correct, here is what immediately came to my mind:

  1. Shouldn't t := t.(type) give you a type mismatch error or something?
  2. Doesn't t := t.(type) overwrite t with something (I don't know what, but it looks like it should)?
  3. WORST OF ALL, t is being used as a proper variable inside the case clauses (string, int, etc.)!! WTH is going on!!
  4. In the case clause, t is used to decide the data type, and inside the case, it's being used as a variable. My head is spinning.

Okay, calm down. I'm going to try explaining this based on something I read somewhere. I forget where.

When you do,

switch t := t.(type)

a couple of things happen.

#####Explanation of Type Switch

The 't' on the left-hand side of := is a new variable that shadows the original interface{} 't'. Inside each case clause, this new 't' has the concrete type named by that case. Which means if 't' is in reality a string, then inside the 'case string' block, the variable 't' can be used as a string!!!

This means 't' can hold any data type, but it's up to us to guess what it could be and write the appropriate case clauses to handle it. If we actually get a float64 and we don't implement a 'case float64' block (or a default), then nothing will happen.
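The explanation above can be checked with a tiny function in which each case uses an operation that only makes sense for that case's type. The function name `twice` is made up for this sketch.

```go
package main

import "fmt"

// twice (hypothetical helper) doubles numbers and repeats strings.
// It compiles only because t has a different static type in each case.
func twice(v interface{}) interface{} {
	switch t := v.(type) {
	case float64:
		return t * 2 // t is a float64 here, so arithmetic is allowed
	case string:
		return t + t // t is a string here, so + concatenates
	default:
		return nil // no guess matched; without this clause nothing would run
	}
}

func main() {
	fmt.Println(twice(2.5))
	fmt.Println(twice("ab"))
	fmt.Println(twice(true))
}
```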

Since the type switch is so special in its syntax and underlying behaviour,

t := t.(type)

cannot be written standalone. It would cause a compile error. The statement (technically a short variable declaration wrapped around a special form of type assertion) always has to be in a switch header. Only

switch t := t.(type)

will work. Phew, explaining that was really hard. If you don't understand, sorry. I really tried.

####Code

Okay, let's see how type switches actually help us solve accessing JSON arrays. I have left comments in the code for understanding, so I'm not going to explain more.

func PutProfile(w http.ResponseWriter, r *http.Request) {
	received_JSON, err := ioutil.ReadAll(r.Body) // This reads the raw request body
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	var arbitrary_json map[string]interface{}
	if err := json.Unmarshal(received_JSON, &arbitrary_json); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for key, value := range arbitrary_json {
		switch key {
		case "zip":
			p.Zip = value.(string)
		case "email":
			p.Email = value.(string)
		case "tv_shows":
			switch vv := value.(type) {
			// Here we know value is going to be []interface{}. Generally you have
			// to implement a case for every single data type. Yes. This is
			// horrible stuff.
			case []interface{}:
				p.Tv_shows = nil // Clear the old list
				// Now we can access the individual elements of the JSON array.
				// The blank identifier _ discards the index, which avoids the
				// 'i declared but not used' compile error.
				for _, u := range vv {
					p.Tv_shows = append(p.Tv_shows, u.(string)) // Assert and append
				}
				fmt.Println(p) // There! p is updated!
			}
		}
	}
}
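Putting Problem 3 together without the HTTP layer, a self-contained sketch might look like the following. It switches on the type of every value rather than trusting the key, and it returns an error instead of panicking. The names `applyUpdate` and `TvShows` are made up for this sketch and are not the names used in the code above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	Email   string   `json:"email"`
	Zip     string   `json:"zip"`
	TvShows []string `json:"tv_shows"`
}

// applyUpdate (hypothetical helper) applies the fields found in raw to p.
// Unknown keys are ignored; unsupported value types are reported.
func applyUpdate(p *Person, raw []byte) error {
	var update map[string]interface{}
	if err := json.Unmarshal(raw, &update); err != nil {
		return err
	}
	for key, value := range update {
		switch v := value.(type) {
		case string: // JSON strings decode to string
			switch key {
			case "email":
				p.Email = v
			case "zip":
				p.Zip = v
			}
		case []interface{}: // JSON arrays decode to []interface{}
			if key != "tv_shows" {
				continue
			}
			shows := make([]string, 0, len(v))
			for _, item := range v {
				s, ok := item.(string)
				if !ok {
					return fmt.Errorf("tv_shows: expected string, got %T", item)
				}
				shows = append(shows, s)
			}
			p.TvShows = shows
		default:
			return fmt.Errorf("field %q: unsupported type %T", key, value)
		}
	}
	return nil
}

func main() {
	p := Person{Email: "foo@gmail.com", Zip: "94112", TvShows: []string{"show1", "show2", "show3"}}
	raw := []byte(`{"zip":"11111","tv_shows":["anothershow1","anothershow2","anothershow3"]}`)
	if err := applyUpdate(&p, raw); err != nil {
		fmt.Println("update rejected:", err)
		return
	}
	out, err := json.Marshal(p)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Building the new slice in a local variable and assigning it at the end means a bad element leaves the existing tv_shows list untouched.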

Never in my life have I experienced such trouble trying to parse a bit of JSON. It's almost as if Go tries to make the task as hard as possible. Hopefully, this helps.


muly commented May 31, 2016

Thanks a lot for writing this up. This is exactly what I was wondering about, and after reading your explanation it became clear in my mind. There are some typos and some errors in your notes. I forked this gist and am trying to fix them. I'll submit the changes for merge once I finish fixing them. Thanks again.


errordeveloper commented Dec 21, 2016

This might help:

package main

import (
	"encoding/json"
	"fmt"
)

func dumpJSON(v interface{}, kn string) {
	iterMap := func(x map[string]interface{}, root string) {
		var knf string
		if root == "root" {
			knf = "%q:%q"
		} else {
			knf = "%s:%q"
		}
		for k, v := range x {
			dumpJSON(v, fmt.Sprintf(knf, root, k))
		}
	}

	iterSlice := func(x []interface{}, root string) {
		var knf string
		if root == "root" {
			knf = "%q:[%d]"
		} else {
			knf = "%s:[%d]"
		}
		for k, v := range x {
			dumpJSON(v, fmt.Sprintf(knf, root, k))
		}
	}

	switch vv := v.(type) {
	case string:
		fmt.Printf("%s => (string) %q\n", kn, vv)
	case bool:
		fmt.Printf("%s => (bool) %v\n", kn, vv)
	case float64:
		fmt.Printf("%s => (float64) %f\n", kn, vv)
	case map[string]interface{}:
		fmt.Printf("%s => (map[string]interface{}) ...\n", kn)
		iterMap(vv, kn)
	case []interface{}:
		fmt.Printf("%s => ([]interface{}) ...\n", kn)
		iterSlice(vv, kn)
	default:
		fmt.Printf("%s => (unknown?) ...\n", kn)
	}
}

func main() {
	b := []byte(`
		[{
			"Name":"Wednesday",
			"Age":6,
			"Parents": [
				"Gomez",
				"Morticia",
				{
					"meh": false,
					"set": [
						1, 
						"2",
						[
							3.000001,
							"4",
							{
								"none": false
							}
						]
					]
				}
			],
			"foo": {
				"foo": "bar",
				"baz": 1,
				"box": true
			}
		}]
	`)

	var f interface{}
	if err := json.Unmarshal(b, &f); err != nil {
		panic(err)
	}
	dumpJSON(f, "root")
}

https://play.golang.org/p/WOMMJvUXUA


mindc commented Apr 10, 2017

I added another case to the switch statement to handle the JSON null value:

	switch vv := v.(type) {
	case string:
		fmt.Printf("%s => (string) %q\n", kn, vv)
	case bool:
		fmt.Printf("%s => (bool) %v\n", kn, vv)
	case float64:
		fmt.Printf("%s => (float64) %f\n", kn, vv)
	case map[string]interface{}:
		fmt.Printf("%s => (map[string]interface{}) ...\n", kn)
		iterMap(vv, kn)
	case []interface{}:
		fmt.Printf("%s => ([]interface{}) ...\n", kn)
		iterSlice(vv, kn)
	case nil: // added case
		fmt.Printf("%s => (nil) null\n", kn)
	default:
		fmt.Printf("%s => (unknown?) ...\n", kn)
	}

https://play.golang.org/p/qIMDnIUkVB
