Custom JSON unmarshaler for a GraphQL client

(This post was originally posted as part of the GopherAcademy Advent 2017 series.)

In this post, I will tell a story of how I had to build a custom JSON unmarshaler for the needs of a GraphQL client library in Go. I'll start with the history of how everything started, build motivation for why a custom JSON unmarshaler was truly needed, and then describe how it was implemented. This is going to be a long journey, so strap yourself in, and here we go!

History of GraphQL Client Library in Go

GraphQL is a data query language for APIs, developed internally by Facebook in 2012, and made publicly available in 2015. It can be used as a replacement for, or in addition to, REST APIs. In some ways, it offers significant advantages compared to REST APIs, making it an attractive option. Of course, as with any newer technology, it's less mature and has some weaknesses in certain areas.

In May of 2017, I set out to build the first GraphQL client library for Go. Back then, only GraphQL server libraries existed in Go. My main motivation was wanting to be able to access GitHub GraphQL API v4 from my Go code, which had just come out of early access back then. I also knew that a general-purpose GraphQL client library would be useful, enabling Go projects to access any GraphQL API. There were GraphQL clients available in other languages, and I didn't want Go users to be missing out.

I spent a week on the initial investigation and research into what a Go client for GraphQL could look like. GraphQL is strongly typed, which is a good fit for Go. However, it also has some more advanced query syntax and features that play better with more dynamic languages, so I had my share of concerns whether a good client in Go would even be viable. Fortunately, at the end of that week, I found that a reasonable client in Go was indeed possible, and pushed a working initial prototype that had most basic functionality implemented, with a plan for how to implement and address the remaining features.

I documented the history of my findings and design decisions made in this issue. I've also given a talk (slides here) about the rationale and thinking behind many of the API and design decisions that went into the library. In this post, I want to talk about something I haven't covered before: implementing a custom JSON unmarshaler specifically for the needs of the GraphQL client library, in order to improve support for GraphQL unions.

JSON Unmarshaling Task at Hand

Unmarshaling JSON into a structure is a very common and well understood problem. It's already implemented inside the encoding/json package in the Go standard library. Given that JSON is such a well specified standard, why would anyone need to implement their own JSON unmarshaler?

To answer that, I need to provide a little context about how GraphQL works. The GraphQL client begins by sending a request containing a GraphQL query, for example:

query {
	me {
		name
		bio
	}
}

The GraphQL server receives it, processes it, and sends a JSON-encoded response for that query. The response contains a data object and potentially other miscellaneous fields. We're primarily interested in the data object, which looks like this:

{
	"me": {
		"name": "gopher",
		"bio": "The Go gopher."
	}
}

Notice it has the same shape as the query. That's why the graphql package was designed so that to make a query, you start by defining a Go struct variable. That variable then both defines the GraphQL query that will be made, and gets populated with the response data from the GraphQL server:

var query struct {
	Me struct {
		Name string
		Bio  string
	}
}
err := client.Query(ctx, &query, nil)
if err != nil {
	// Handle error.
}
fmt.Println(query.Me.Name)
fmt.Println(query.Me.Bio)

// Output:
// gopher
// The Go gopher.

Initially, encoding/json was used for unmarshaling the GraphQL response into the query structure. It worked well for most things, but there were some problems.

Motivation for Custom JSON Unmarshaler

There were at least 3 clear problems with encoding/json for unmarshaling GraphQL responses into the query structure. These served as motivation to write a custom JSON unmarshaler for graphql needs.

  1. The largest problem with using encoding/json became apparent when looking to support the GraphQL unions feature. In GraphQL, a union is a type that can be any one of several object types.

    query {
        mascot(language: "Go") {
            ... on Human {
                name
                height
            }
            ... on Animal {
                name
                hasTail
            }
        }
    }
    

    In this query, we're asking for information about Go's mascot. We don't know in advance what exact type it is, but we know what types it can be. Depending on whether it's an Animal or Human, we ask for additional fields of that type.

    To express that GraphQL query, you can create the following query struct:

    var query struct {
        Mascot struct {
            Human struct {
                Name   string
                Height float64
            } `graphql:"... on Human"`
            Animal struct {
                Name    string
                HasTail bool
            } `graphql:"... on Animal"`
        } `graphql:"mascot(language: \"Go\")"`
    }
    

    The JSON-encoded response from GraphQL server will contain:

    {
        "mascot": {
            "name": "Gopher",
            "hasTail": true
        }
    }
    

    You can see that in this case the shape of the response doesn't quite align with the query struct. GraphQL inlines or embeds the fields from Animal into the "mascot" object. The encoding/json unmarshaler will not be able to handle that in the way we'd want, and the fields in the query struct will be left unset. See proof on the playground.

    You could try to work around it by using Go's embedded structs. If you define query as:

    type Human struct {
        Name   string
        Height float64
    }
    type Animal struct {
        Name    string
        HasTail bool
    }
    var query struct {
        Mascot struct {
            Human  `graphql:"... on Human"`  // Embedded struct.
            Animal `graphql:"... on Animal"` // Embedded struct.
        } `graphql:"mascot(language: \"Go\")"`
    }
    

    That gets you almost the right results, but there's a significant limitation at play. Both Human and Animal structs have a field with the same name, Name.

    According to the encoding/json unmarshaling rules:

    If there are multiple fields at the same level, and that level is the least nested (and would therefore be the nesting level selected by the usual Go rules), the following extra rules apply:

    1. Of those fields, if any are JSON-tagged, only tagged fields are considered, even if there are multiple untagged fields that would otherwise conflict.

    2. If there is exactly one field (tagged or not according to the first rule), that is selected.

    3. Otherwise there are multiple fields, and all are ignored; no error occurs.

    Multiple fields are ignored. So, Name would be left unset. See proof on the playground.

    An initial reaction might be that it's a bug or flaw in encoding/json package and should be fixed. However, upon careful consideration, this is a very ambiguous situation, and there's no single clear "correct" behavior. The encoding/json unmarshaler makes a sensible compromise for generic needs, not GraphQL-specific needs.

  2. To have additional control over the GraphQL query that is generated from the query struct, the graphql struct field tag can be used. It allows overriding how a given struct field gets encoded in the GraphQL query. Suppose the user happens to use a field with a name that doesn't match that of the GraphQL field:

    var query struct {
        Me struct {
            Photo string `graphql:"avatarUrl(width: 194, height: 180)"`
        }
    }
    

    The JSON-encoded response from GraphQL server will contain:

    {
        "me": {
            "avatarUrl": "https://golang.org/doc/gopher/run.png"
        }
    }
    

    As a result, query.Me.Photo will not be populated, since the field is "avatarUrl" in the response, and the Go struct has a field named "Photo", which doesn't match.

    This happens because encoding/json unmarshaler is unaware of the graphql struct field tags.

  3. Consider if the user supplied a query struct that happened to contain json struct field tags, for example:

    var query struct {
        Me struct {
            Name string `json:"full_name"`
        }
    }
    

    (Suppose the user wants to serialize the response later, or uses some struct that happens to have json tags defined for other reasons.)

    The JSON-encoded response from GraphQL server will contain:

    {
        "me": {
            "name": "gopher"
        }
    }
    

    As a result, query.Me.Name will not be populated, since the Go struct has a JSON tag calling it "full_name", but the field is "name" in the response, which doesn't match.

    This happens because encoding/json unmarshaler is affected by json struct field tags.
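The three problems above can be reproduced with nothing but encoding/json. The following is a minimal sketch rather than code from the graphql package; the helper names (unionNamed, unionEmbedded, graphqlTag, jsonTag) are made up for this illustration, and the inputs are the responses shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Human struct {
	Name   string
	Height float64
}

type Animal struct {
	Name    string
	HasTail bool
}

const mascotJSON = `{"mascot": {"name": "Gopher", "hasTail": true}}`

// unionNamed shows problem 1 with named fragment fields: the keys "name"
// and "hasTail" match neither "Human" nor "Animal", so nothing is set.
func unionNamed() (name string, hasTail bool) {
	var query struct {
		Mascot struct {
			Human  Human  `graphql:"... on Human"`
			Animal Animal `graphql:"... on Animal"`
		} `graphql:"mascot(language: \"Go\")"`
	}
	if err := json.Unmarshal([]byte(mascotJSON), &query); err != nil {
		panic(err)
	}
	return query.Mascot.Animal.Name, query.Mascot.Animal.HasTail
}

// unionEmbedded shows problem 1 with embedded structs: HasTail is promoted
// and set, but the two conflicting Name fields are both ignored.
func unionEmbedded() (name string, hasTail bool) {
	var query struct {
		Mascot struct {
			Human  `graphql:"... on Human"`
			Animal `graphql:"... on Animal"`
		} `graphql:"mascot(language: \"Go\")"`
	}
	if err := json.Unmarshal([]byte(mascotJSON), &query); err != nil {
		panic(err)
	}
	return query.Mascot.Animal.Name, query.Mascot.Animal.HasTail
}

// graphqlTag shows problem 2: encoding/json does not know that the
// graphql tag maps Photo to "avatarUrl".
func graphqlTag() string {
	var query struct {
		Me struct {
			Photo string `graphql:"avatarUrl(width: 194, height: 180)"`
		}
	}
	data := `{"me": {"avatarUrl": "https://golang.org/doc/gopher/run.png"}}`
	if err := json.Unmarshal([]byte(data), &query); err != nil {
		panic(err)
	}
	return query.Me.Photo
}

// jsonTag shows problem 3: the json tag makes encoding/json look for
// "full_name", which the GraphQL response does not contain.
func jsonTag() string {
	var query struct {
		Me struct {
			Name string `json:"full_name"`
		}
	}
	if err := json.Unmarshal([]byte(`{"me": {"name": "gopher"}}`), &query); err != nil {
		panic(err)
	}
	return query.Me.Name
}

func main() {
	name, hasTail := unionNamed()
	fmt.Printf("named fragments: Name=%q HasTail=%v\n", name, hasTail) // Name="" HasTail=false
	name, hasTail = unionEmbedded()
	fmt.Printf("embedded structs: Name=%q HasTail=%v\n", name, hasTail) // Name="" HasTail=true
	fmt.Printf("graphql tag: Photo=%q\n", graphqlTag())                 // Photo=""
	fmt.Printf("json tag: Name=%q\n", jsonTag())                        // Name=""
}
```

In every case json.Unmarshal returns a nil error, which is part of what makes these mismatches easy to miss.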

This motivation led to the conclusion that for GraphQL-specific needs, a custom JSON unmarshaler is unavoidably needed.

Implementing a Custom JSON Unmarshaler

Discarding a well written, thoroughly tested, battle proven JSON unmarshaler in the Go standard library and writing one from scratch is not a decision to be taken lightly. I spent considerable time looking at my options and comparing their trade-offs.

Writing it from scratch would've been the last option to consider. I could've made a fork of encoding/json and modified it. But that would mean having to maintain a fork of encoding/json and keep it up to date with any upstream changes.

The key insight was that the process of JSON unmarshaling consists of two independent parts: parsing JSON, and populating the target struct fields with the parsed values. The JSON that GraphQL servers respond with is completely standard, specification-compliant JSON. I didn't need to make any changes there. It was only the behavior of populating target struct fields that I needed to customize.

In Go 1.5, the encoding/json package exposed a JSON tokenizer API to the outside world. A JSON tokenizer parses JSON and emits a sequence of JSON tokens, which are higher-level and easier to work with compared to the original byte stream. I could make use of this to avoid having to parse the JSON myself.

The encoding/json JSON tokenizer is available via the Token method of json.Decoder struct:

// Token returns the next JSON token in the input stream.
// At the end of the input stream, Token returns nil, io.EOF.
//
// ...
func (dec *Decoder) Token() (Token, error)

Calling Token repeatedly on an input like this:

{
	"Message": "Hello",
	"Array": [1, 2, 3],
	"Number": 1.234
}

Produces this sequence of JSON tokens, followed by io.EOF error:

json.Delim: {
string: "Message"
string: "Hello"
string: "Array"
json.Delim: [
float64: 1
float64: 2
float64: 3
json.Delim: ]
string: "Number"
float64: 1.234
json.Delim: }
io.EOF error

Great! We don't have to deal with all the low-level nuances of parsing JSON strings, escaped characters, quotes, floating point numbers, and so on. We'll be able to reuse the JSON tokenizer from encoding/json for all that. Now, we just need to build our unmarshaler on top of it.
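If you want to see the token stream for yourself, here is a small sketch that calls Token in a loop until io.EOF. The tokens helper is a name I made up for this example; everything else is the public encoding/json API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// tokens returns a "type: value" description of every JSON token in input.
func tokens(input string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			// End of input: all tokens have been read.
			return out, nil
		} else if err != nil {
			return out, err
		}
		out = append(out, fmt.Sprintf("%T: %v", tok, tok))
	}
}

func main() {
	const input = `{
	"Message": "Hello",
	"Array": [1, 2, 3],
	"Number": 1.234
}`
	toks, err := tokens(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range toks {
		fmt.Println(t)
	}
}
```

Note that numbers come out as float64 here because UseNumber was not called on the decoder.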

Let's start by defining and iterating on our decoder struct that contains the necessary state. We know we're going to base it on a JSON tokenizer. To make it very clear we're only ever using the Token method and nothing else from json.Decoder, we can make it a small interface. This is our starting point:

// decoder is a JSON decoder that performs custom unmarshaling
// behavior for GraphQL query data structures. It's implemented
// on top of a JSON tokenizer.
type decoder struct {
	tokenizer interface {
		Token() (json.Token, error)
	}
}

And the exported unmarshal function will look like this:

// UnmarshalGraphQL parses the JSON-encoded GraphQL response
// data and stores the result in the GraphQL query data
// structure pointed to by v.
//
// The implementation is created on top of the JSON tokenizer
// available in "encoding/json".Decoder.
func UnmarshalGraphQL(data []byte, v interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	err := (&decoder{tokenizer: dec}).Decode(v)
	return err
}

We create a new JSON decoder around data, which will act as our JSON tokenizer. Then we decode a single JSON value into v, and return error, if any.

Pop Quiz: Is there a difference in behavior between unmarshaling and decoding a single JSON value?

Here's a pop quiz. Suppose you have some JSON data and you're looking to unmarshal it into a Go variable. You could do one of two things:

err := json.Unmarshal(data, &v)
err := json.NewDecoder(r).Decode(&v)

They have slightly different signatures; json.Unmarshal takes a []byte while json.NewDecoder accepts an io.Reader. We know that the decoder is meant to be used on streams of JSON values from a reader, but if we only care about reading one JSON value, is there any difference in behavior between them?

In other words, is there an input for which the two would behave differently? If so, what would such an input be?

This was something I didn't quite know the answer to, not before I set out on this journey. But now it's very clear. Yes, the behavior indeed differs: it differs in how the two handle the remaining data after the first JSON value. Decode will read just enough to decode the JSON value and stop there. Unmarshal also decodes the value, but it additionally checks that there's no extraneous data following the first JSON value (other than whitespace). If there is any, it returns an error of the form "invalid character 'x' after top-level value".
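One input that tells the two apart is a pair of JSON objects back to back. Here is a quick sketch of that experiment; the compare helper is a name I made up for it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// compare feeds the same input to json.Unmarshal and to a single
// json.Decoder.Decode call, and returns the error from each.
func compare(input string) (unmarshalErr, decodeErr error) {
	var a, b map[string]int
	unmarshalErr = json.Unmarshal([]byte(input), &a)
	decodeErr = json.NewDecoder(strings.NewReader(input)).Decode(&b)
	return unmarshalErr, decodeErr
}

func main() {
	// Two top-level JSON values, one after the other.
	unmarshalErr, decodeErr := compare(`{"a": 1} {"b": 2}`)
	fmt.Println("Unmarshal:", unmarshalErr) // Non-nil: complains about the data after the first value.
	fmt.Println("Decode:   ", decodeErr)    // Nil: stops after the first value.
}
```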


To stay true to unmarshaling behavior, we perform a check to ensure there are no additional JSON tokens following our top-level JSON value; if there are, that's an error:

func UnmarshalGraphQL(data []byte, v interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	err := (&decoder{tokenizer: dec}).Decode(v)
	if err != nil {
		return err
	}
	tok, err := dec.Token()
	switch err {
	case io.EOF:
		// Expect to get io.EOF. There shouldn't be any more
		// tokens left after we've decoded v successfully.
		return nil
	case nil:
		return fmt.Errorf("invalid token '%v' after top-level value", tok)
	default:
		return err
	}
}

Ok, now let's figure out all the remaining state we need to keep track of in the decoder struct. We will implement unmarshaling with an iterative algorithm rather than recursive, and keep all relevant state in decoder struct.

We know that the JSON tokenizer provides us with one token at a time. So, it's up to us to track whether we're in the middle of a JSON object or array. Imagine you get a string token. If the preceding token was [, then this string is an element of an array. But if the preceding token was {, then this string is the key of an object, and the following token will be its value. We'll use parseState json.Delim to track that.

We'll also keep a reference to the value where we want to unmarshal JSON into, say, a v reflect.Value field (short for "value").

What we have so far is:

type decoder struct {
	tokenizer interface {
		Token() (json.Token, error)
	}

	// What part of input JSON we're in the middle of:
	// '{' is object, '[' is array. Zero value means neither.
	parseState json.Delim

	// Value where to unmarshal.
	v reflect.Value
}

That's a good start, but what happens when we encounter a ] or } token? That means we leave the current array or object, and... end up in the parent, whatever that was, if any.

JSON values can be nested. Objects inside arrays inside other objects. We will change parseState to be a stack of states parseState []json.Delim. Whenever we get to the beginning of a JSON object or array, we push to the stack, and when we get to end, we pop off the stack. Top of the stack is always the current state.

We need to apply the same change to v, so we know where to unmarshal into after end of array or object. We'll also make it a stack and rename to vs []reflect.Value (short for "values").

Now we have something that should be capable of unmarshaling deeply nested JSON values:

type decoder struct {
	tokenizer interface {
		Token() (json.Token, error)
	}

	// Stack of what part of input JSON we're in the middle of:
	// '{' is object, '[' is array. Empty stack means neither.
	parseState []json.Delim

	// Stack of values where to unmarshal. The top of stack
	// is the reflect.Value where to unmarshal next JSON value.
	vs []reflect.Value
}
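To make the role of the parseState stack concrete, here is a small sketch that is separate from the library code. It walks the token stream and uses a stack of json.Delim values to tell object keys, object values and array elements apart. The classify function and the extra wantKey stack are assumptions of this sketch, not part of the decoder struct above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// classify labels every scalar token in input as an object key, an
// object value, an array element, or a top-level value.
func classify(input string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	var (
		parseState []json.Delim // Stack of '{' and '[' we're currently inside.
		wantKey    []bool       // Parallel stack: is the next token in this object a key?
		out        []string
	)
	// valueDone notes that a value was completed inside the current
	// container, so an enclosing object expects a key next.
	valueDone := func() {
		if n := len(parseState); n > 0 && parseState[n-1] == '{' {
			wantKey[n-1] = true
		}
	}
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		} else if err != nil {
			return out, err
		}
		n := len(parseState)
		switch tok := tok.(type) {
		case json.Delim:
			switch tok {
			case '{', '[':
				// Entering an object or array: push.
				parseState = append(parseState, tok)
				wantKey = append(wantKey, tok == '{')
			case '}', ']':
				// Leaving it: pop, and we're back in the parent.
				parseState = parseState[:n-1]
				wantKey = wantKey[:n-1]
				valueDone()
			}
		default:
			switch {
			case n == 0:
				out = append(out, fmt.Sprintf("top-level value %v", tok))
			case parseState[n-1] == '[':
				out = append(out, fmt.Sprintf("array element %v", tok))
			case wantKey[n-1]:
				out = append(out, fmt.Sprintf("object key %v", tok))
				wantKey[n-1] = false
			default:
				out = append(out, fmt.Sprintf("object value %v", tok))
				wantKey[n-1] = true
			}
		}
	}
}

func main() {
	labels, err := classify(`{"Message": "Hello", "Array": [1, 2]}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range labels {
		fmt.Println(l)
	}
}
```

Popping without a length check is safe here because Token guarantees that the delimiters it returns are properly nested and matched.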

That should be enough for now. Let's look at the code for unmarshaling next.

Remember that the UnmarshalGraphQL function calls decoder.Decode method. Decode will accept v, set up the decoder state, and call decode, where the actual decoding logic will take place.

// Decode decodes a single JSON value from d.tokenizer into v.
func (d *decoder) Decode(v interface{}) error {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Ptr {
		return fmt.Errorf("cannot decode into non-pointer %T", v)
	}
	d.vs = []reflect.Value{rv.Elem()}
	return d.decode()
}

decode is implemented as an iterative algorithm that uses the state in decoder struct. This is the entire algorithm at a high level:

// decode decodes a single JSON value from d.tokenizer into d.vs.
func (d *decoder) decode() error {
	// The loop invariant is that the top of the d.vs stack
	// is where we try to unmarshal the next JSON value we see.
	for len(d.vs) > 0 {
		tok, err := d.tokenizer.Token()
		if err == io.EOF {
			return io.ErrUnexpectedEOF
		} else if err != nil {
			return err
		}

		// Process the token. Potentially decode a JSON value,
		// or handle one of {, }, [, ] tokens.
		switch tok := tok.(type) {
			...
		}
	}
	return nil
}

There's a big outer loop. At the top of the loop, we call d.tokenizer.Token to get the next JSON token. The loop invariant is that the top of the vs stack is where we unmarshal the next JSON value we get from Token. The loop condition is len(d.vs) > 0, meaning we have some value to unmarshal into. When the vs stack becomes empty, that means we've reached the end of the JSON value we're decoding, so we break out and return nil error.

Each loop iteration makes a call to Token and processes the token:

  • If it's a value, it's unmarshaled into the value at the top of vs stack.
  • If it's an opening of an array or object, then the parseState and vs stacks are pushed to.
  • If it's the ending of an array or object, those stacks are popped.

That's basically it. The rest of the code is details: managing the parseState and vs stacks, checking for graphql struct field tags, handling all the error conditions, etc. But the algorithm is conceptually simple and easy to understand at this high level.
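To show the loop end to end, here is a heavily simplified sketch of the single-stack version. It is not the code from the library: it supports only JSON objects, strings, booleans, numbers and null, matches keys to struct fields by case-insensitive name, and ignores graphql tags entirely. The names decodeSketch and fieldByName are made up for this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"reflect"
	"strings"
)

// fieldByName returns the field of struct s whose name matches key,
// ignoring case, or the zero reflect.Value if there is none.
func fieldByName(s reflect.Value, key string) reflect.Value {
	for i := 0; i < s.NumField(); i++ {
		if strings.EqualFold(s.Type().Field(i).Name, key) {
			return s.Field(i)
		}
	}
	return reflect.Value{}
}

// decodeSketch decodes a single JSON value from input into v.
func decodeSketch(input string, v interface{}) error {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Ptr {
		return fmt.Errorf("cannot decode into non-pointer %T", v)
	}
	dec := json.NewDecoder(strings.NewReader(input))
	var parseState []json.Delim      // Stack of open '{' delimiters.
	vs := []reflect.Value{rv.Elem()} // Stack of values to unmarshal into.

	for len(vs) > 0 {
		tok, err := dec.Token()
		if err == io.EOF {
			return io.ErrUnexpectedEOF
		} else if err != nil {
			return err
		}
		// Inside an object, a string token is a key. Push the matching
		// field (possibly invalid) and read the token for its value.
		if key, ok := tok.(string); ok && len(parseState) > 0 {
			top := vs[len(vs)-1]
			var field reflect.Value
			if top.Kind() == reflect.Struct {
				field = fieldByName(top, key)
			}
			vs = append(vs, field)
			if tok, err = dec.Token(); err != nil {
				return err
			}
		}
		switch tok := tok.(type) {
		case string, bool, float64:
			// A scalar value: store it at the top of the stack and pop.
			dst := vs[len(vs)-1]
			if dst.IsValid() && dst.CanSet() {
				src := reflect.ValueOf(tok)
				if !src.Type().AssignableTo(dst.Type()) {
					return fmt.Errorf("cannot unmarshal %T into %s", tok, dst.Type())
				}
				dst.Set(src)
			}
			vs = vs[:len(vs)-1]
		case nil:
			vs = vs[:len(vs)-1] // JSON null: leave the value untouched.
		case json.Delim:
			switch tok {
			case '{':
				parseState = append(parseState, tok)
			case '}':
				parseState = parseState[:len(parseState)-1]
				vs = vs[:len(vs)-1]
			default:
				return errors.New("arrays are not supported in this sketch")
			}
		}
	}
	return nil
}

func main() {
	var query struct {
		Me struct {
			Name string
			Bio  string
		}
	}
	err := decodeSketch(`{"me": {"name": "gopher", "bio": "The Go gopher."}}`, &query)
	fmt.Println(query.Me.Name, "|", query.Me.Bio, "|", err)
}
```

Keys with no matching field push an invalid reflect.Value, so their values, including whole nested objects, are read and discarded without disturbing the stack discipline.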

Except... We're still missing one critical aspect of making it handle the GraphQL-specific needs that we set out to resolve originally.

Let's recall the GraphQL unions example, where the JSON-encoded GraphQL server response contained:

{
	"mascot": {
		"name": "Gopher",
		"hasTail": true
	}
}

And we're trying to unmarshal it into:

var query struct {
	Mascot struct {
		Human struct {
			Name   string
			Height float64
		} `graphql:"... on Human"`
		Animal struct {
			Name    string
			HasTail bool
		} `graphql:"... on Animal"`
	} `graphql:"mascot(language: \"Go\")"`
}

The behavior we want is to unmarshal "Gopher" string into all matching fields, which are these two:

  • query.Mascot.Human.Name
  • query.Mascot.Animal.Name

But the top of our vs stack only contains one value... What do we do?

We must go deeper. Cue the music from Inception, and get ready to replace vs []reflect.Value with vs [][]reflect.Value!

Multiple Stacks of Values

That's right, to be able to deal with having potentially multiple places to unmarshal a single JSON value into, we have a slice of slices of reflect.Values. Essentially, we have multiple (1 or more) []reflect.Value stacks. decoder now looks like this:

type decoder struct {
	tokenizer interface {
		Token() (json.Token, error)
	}

	// Stack of what part of input JSON we're in the middle of:
	// '{' is object, '[' is array. Empty stack means neither.
	parseState []json.Delim

	// Stacks of values where to unmarshal. The top of each stack
	// is the reflect.Value where to unmarshal next JSON value.
	//
	// The reason there's more than one stack is because we
	// might be unmarshaling a single JSON value into multiple
	// GraphQL fragments or embedded structs, so we keep track
	// of them all.
	vs [][]reflect.Value
}

We need to modify decode to create additional stacks whenever we encounter an embedded struct or a GraphQL fragment (field with graphql:"... on Type" tag), do some additional bookkeeping to manage multiple stacks of values, and check for additional error conditions if our stacks run empty. Aside from that, the same algorithm continues to work.

I think getting the data structure to contain just the right amount of information to resolve the task was the most challenging part of getting this to work. Once it's there, the rest of the algorithm details fall into place.
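The fan-out itself can be illustrated in isolation. The sketch below is not how the library is written (it is recursive and works on a single key rather than maintaining stacks), but it shows the core idea: for one JSON key, collect every field that should receive the value by looking through embedded structs and fields tagged as GraphQL fragments. The names isFragment, targets and setString, and the simplification of matching by field name only, are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// isFragment reports whether f is an embedded struct or a field whose
// graphql tag starts with "...", i.e. an inline fragment.
func isFragment(f reflect.StructField) bool {
	tag := strings.TrimSpace(f.Tag.Get("graphql"))
	return f.Anonymous || strings.HasPrefix(tag, "...")
}

// targets returns every field reachable from struct s that a JSON value
// with the given key should be unmarshaled into.
func targets(s reflect.Value, key string) []reflect.Value {
	var out []reflect.Value
	for i := 0; i < s.NumField(); i++ {
		f := s.Type().Field(i)
		switch {
		case isFragment(f) && f.Type.Kind() == reflect.Struct:
			// Fragment fields are inlined in the response, so
			// look for the key inside them.
			out = append(out, targets(s.Field(i), key)...)
		case strings.EqualFold(f.Name, key):
			out = append(out, s.Field(i))
		}
	}
	return out
}

// setString stores val into every matching string field of the struct
// that ptr points to, and returns how many fields were set.
func setString(ptr interface{}, key, val string) int {
	n := 0
	for _, v := range targets(reflect.ValueOf(ptr).Elem(), key) {
		if v.Kind() == reflect.String && v.CanSet() {
			v.SetString(val)
			n++
		}
	}
	return n
}

func main() {
	var mascot struct {
		Human struct {
			Name   string
			Height float64
		} `graphql:"... on Human"`
		Animal struct {
			Name    string
			HasTail bool
		} `graphql:"... on Animal"`
	}
	n := setString(&mascot, "name", "Gopher")
	fmt.Println(n, mascot.Human.Name, mascot.Animal.Name) // 2 Gopher Gopher
}
```

In the real decoder, each of those matching fields becomes the top of its own stack in vs, so nested objects and arrays inside a fragment keep working too.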

If you'd like to learn even more of the low-level details of the implementation, I invite you to look at the source code of package github.com/shurcooL/graphql/internal/jsonutil. It should be easy to read now.

Payoff

Let's quickly revisit our original GraphQL unions example that wasn't working with the standard encoding/json unmarshaler. When we replace json.Unmarshal with jsonutil.UnmarshalGraphQL, the Name fields get populated! That's good news, it means we didn't do all that work for nothing.

See proof on the playground.

jsonutil.UnmarshalGraphQL also takes graphql struct field tags into account when unmarshaling, and doesn't get misled by json field tags. Best part is we're reusing the rigorous JSON tokenizer of encoding/json and its public API, so no need to deal with maintaining a fork. If a need to apply further GraphQL-specific changes to unmarshaling behavior arises in the future, it will be easy to do so.

Conclusion

It has been a lot of fun implementing the GraphQL client library for Go, and trying to make the best API design decisions. I enjoyed using the tools that Go gives me to tackle this task. Even after using Go for 4 years, it's still the most fun programming language for me to use, and I'm feeling the same joy I did back when I was just starting out!

I'm finding GraphQL to be a pretty neat new technology. Its strongly typed nature is a great fit for Go. APIs that are created with it can be a pleasure to use. Keep in mind that GraphQL shines most when you're able to replace multiple REST API calls with a single carefully crafted GraphQL query. This requires high quality and completeness of the GraphQL schema, so not all GraphQL APIs are made equal.

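Note that there are two GraphQL client packages to choose from: the general-purpose graphql package described in this post, and githubv4, a client specifically for the GitHub GraphQL API v4.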

I've had a chance to actually use githubv4 for real tasks in some of my Go projects, and it was a pleasant experience. That said, their GraphQL API v4 is still missing many things present in GitHub REST API v3, so I couldn't do as much with it as I would've liked. They're working on expanding it, and it'll be even better when fully complete.

If you want to play around with GraphQL or take a stab at creating your own API with it, you'll need a GraphQL server library. I would suggest considering the github.com/neelance/graphql-go project as a starting point (if you want a complete list of options, see here). Then, you can use any GraphQL client to execute queries, including the graphql package from this post.

If you run into any issues, please report in the issue tracker of the corresponding repository. For anything else, I'm @dmitshur on Twitter.

Happy holidays, and enjoy using Go (and GraphQL) in the upcoming new year!
