@samwelkanda
Last active March 13, 2023 07:41

GRAPHQL

GraphQL is a specification (spec) for client-server communication. A spec describes the capabilities and characteristics of a language.

GraphQL is a query language for your APIs. A GraphQL query asks only for the data that it needs.

It was developed by Lee Byron, Nick Schrock, and Dan Schafer to solve problems with Facebook's mobile apps.

History of Data Transport

In the 1960s, remote procedure call (RPC) was invented. An RPC was initiated by the client, which sent a request message to a remote computer to do something. The remote computer sent a response to the client.

In the late 1990s, Simple Object Access Protocol (SOAP) emerged at Microsoft. SOAP used XML to encode a message and HTTP as a transport. SOAP also used a type system and introduced the concept of resource-oriented calls for data.

REST was defined in 2000 in Roy Fielding’s doctoral dissertation at University of California–Irvine. He described a resource-oriented architecture in which users would progress through web resources by performing operations such as GET, PUT, POST, and DELETE.

Initially, REST was used with XML. AJAX was originally an acronym that stood for Asynchronous JavaScript And XML, because the response data from an Ajax request was formatted as XML (it is now a freestanding word, spelled “Ajax”). This created a painful step for web developers: the need to parse XML responses before the data could be used in JavaScript. Soon after, JavaScript Object Notation (JSON) was developed and standardized by Douglas Crockford.

Why GraphQL over REST

REST is procedural. GraphQL is declarative. The client describes its data requirements, and your services describe their capabilities. Your data graph maps requirements to capabilities.

Benefits of one graph:

  1. App development. Both mobile and web. Feature parity and consistency across platforms.
  2. Partner enablement (public APIs). Introduces an abstraction layer that creates loose coupling, which makes it easier to move from a monolith to microservices.
  3. Business intelligence. Easy to gain insights based on how your graph is used.
  4. Product management
  5. Auditing and compliance
  6. Partner enablement
  7. Access control
  8. Demand control
  9. Change management
  10. Developer tools
  11. Performance optimization
  12. Provisioning and Load prediction

10 GraphQL Values

Integrity Values
  1. One graph. The more people using the graph, the more valuable it is.
  2. Federated implementation. Decoupled development.
  3. Track the schema in a registry. Basically git for the schema as it evolves.
Agility Values
  1. Abstract schema. Decoupled from the way the backend is implemented. It should be oriented around product needs. Add things only when needed.
  2. Use an agile approach for schema development.
  3. Iteratively improve performance. Premature optimization is the root of all evil.
  4. Use graph metadata to empower developers.
  4. Use graph metadata to empower developers
Operational Values
  1. Access control and demand control

  2. Structured logging

  3. Separate the GraphQL layer from the service layer.

  4. Operations

GraphQL Servers

A GraphQL server exposes a schema that describes its API, including queries to fetch data and mutations to modify data. This allows clients to specify their data requirements with queries and send them to one GraphQL endpoint, instead of collecting information from multiple endpoints as is typical with REST. A GraphQL schema is strongly typed, which unlocks great developer tooling.

GraphQL is language/server agnostic which means it can be implemented in any language of choice.

GraphQL Clients

GraphQL clients have emerged to speed the workflow for developer teams and improve the efficiency and performance of applications. They handle tasks like network requests, data caching, and injecting data into the user interface. There are many GraphQL clients, but the leaders in the space are Relay and Apollo.

Relay is Facebook’s client that works with React and React Native. Relay aims to be the connective tissue between React components and the data that is fetched from the GraphQL server. Relay is used by Facebook, GitHub, Twitch, and more.

Apollo Client was developed at Meteor Development Group and is a community-driven effort to build more comprehensive tooling around GraphQL. Apollo Client supports all major frontend development platforms and is framework agnostic. Apollo also develops tools that assist with the creation of GraphQL services, the performance enhancement of backend services, and the monitoring of GraphQL API performance. Companies including Airbnb, CNBC, The New York Times, and Ticketmaster use Apollo Client in production.

The GraphQL Query Language

Forty-five years before GraphQL was open sourced, an IBM employee, Edgar M. Codd, released a fairly brief paper with a very long name: “A Relational Model of Data for Large Shared Data Banks”.

Soon after that, IBM began working on a relational database that could be queried using Structured English Query Language, or SEQUEL, which later became known only as SQL. SQL, or Structured Query Language, is a domain-specific language used to access, manage, and manipulate data in a database. SQL introduced the idea of accessing multiple records with a single command. It also made it possible to access any record with any key, not just with an ID. The commands that could be run with SQL were very streamlined: SELECT, INSERT, UPDATE, and DELETE. That’s all you can do to data. With SQL, we can write a single query that can return connected data across multiple data tables in a database.

GraphQL takes the ideas that were originally developed to query databases and applies them to the internet. A single GraphQL query can return connected data. Like SQL, you can use GraphQL queries to change or remove data. Even though they are both query languages, GraphQL and SQL are completely different. They are intended for completely different environments. You send SQL queries to a database. You send GraphQL queries to an API. SQL data is stored in data tables. GraphQL data can be stored anywhere: a database, multiple databases, file systems, REST APIs, WebSockets, even other GraphQL APIs.

SQL is a query language for databases. GraphQL is a query language for the internet. GraphQL and SQL also have entirely different syntax. Instead of SELECT, GraphQL uses Query to request data. This operation is at the heart of everything we do with GraphQL. Instead of INSERT, UPDATE, or DELETE, GraphQL wraps all of these data changes into one data type: the Mutation. Because GraphQL is built for the internet, it includes a Subscription type that can be used to listen for data changes over socket connections. SQL doesn’t have anything like a subscription.

There are three types of operations that GraphQL models:

  • query – a read‐only fetch.
  • mutation – a write followed by a fetch.
  • subscription – a long‐lived request that fetches data in response to source events.

Designing a GraphQL Schema

Before breaking ground on your new API, you need to think about, talk about, and formally define the data types that your API will expose. This collection of types is called a schema. Schema First is a design methodology that will get all of your teams on the same page about the data types that make up your application. The backend team will have a clear understanding about the data that it needs to store and deliver.

At the core of any GraphQL server is a schema. The schema defines types and their relationships. It also specifies which queries can be made against the server.

GraphQL comes with a language that we can use to define our schemas, called the Schema Definition Language, or SDL. GraphQL schema documents are text documents that define the types available in an application, and they are later used by both clients and servers to validate GraphQL requests.

GraphQL presents your objects to the world as a graph structure rather than a more hierarchical structure to which you may be accustomed. In order to create this representation, Graphene needs to know about each type of object which will appear in the graph.

This graph also has a root type through which all access begins. This is the Query class.

Schema first development is a recommended approach for building applications with GraphQL that involves the frontend and backend teams agreeing on a schema first, which serves as a contract between the UI and the backend before any API development commences. GraphQL schemas are at their best when they are designed around the needs of client applications.

A Case for Schema-first GraphQL APIs

Code-first (also sometimes called resolver-first) is a process where the GraphQL schema is implemented programmatically and the SDL version of the schema is a generated artifact of that.

Advantages

  • Potentially less boilerplate
  • Automatic type generation
  • Your code and SDL are always in sync
  • There are existing solutions for all major languages
  • Great for simple data access

Disadvantages

  • Backend-Frontend collaboration is harder
  • Too much code reuse
  • CRUD everywhere
  • Framework lock-in

Treating a schema as a product of business code means that every change in the backend can cause interface changes. These are fine for backend development, since everything builds up from core business entities, but they introduce code dependency between the client and server side, making it implicitly the client’s responsibility to align with changes. According to the Dependency Inversion Principle (DIP), high-level modules should not depend on low-level modules; both should depend on abstractions. Secondly, abstractions should not depend on details; details should depend on abstractions.

How, then, do we apply it to our GraphQL service architecture? The most obvious solution is to write the SDL first, then give it to both the frontend and backend sides to implement independently. When using code-first solutions, very low-level details are too easily introduced into the schema and data formatting, details that would never occur to us as a valid option if we tried to come up with the schema shape first.

It’s clear that APIs must be driven by their client use cases, ease of use, and the need to allow simultaneous frontend and backend implementation efforts. All of this comes almost for free when using schema-first GraphQL development for its contract-based approach.

Advantages

  • Schema acts as a contract between the frontend and the backend
  • Client needs come before any implementation details
  • Uses common GraphQL know-how
  • Harder to expose implementation details in your API.
  • Easier to maintain
  • Flexible architecture behind the schema
  • QA benefits

Disadvantages

Hacks

  1. Use aliases to rename keys in the response object instead of using the field name queried. You can also give an alias to the top-level field of a query.
{
  king: user(id: 4) {
    id
    name
    smallPic: profilePic(size: 64)
    bigPic: profilePic(size: 1024)
  }
}
  2. Fragments allow for the reuse of common repeated selections of fields, reducing duplicated text in the document. Inline fragments can be used directly within a selection to apply a type condition when querying against an interface or union.
query withNestedFragments {
  user(id: 4) {
    friends(first: 10) {
      ...friendFields
    }
    mutualFriends(first: 10) {
      ...friendFields
    }
  }
}

fragment friendFields on User {
  id
  name
  ...standardProfilePic
}

fragment standardProfilePic on User {
  profilePic(size: 50)
}

Fragments must specify the type they apply to. In this example, friendFields can be used in the context of querying a User. Fragments cannot be specified on any input value (scalar, enumeration, or input object). Fragments can be specified on object types, interfaces, and unions.

Types

The core unit of any GraphQL Schema is the type. In GraphQL, a type represents a custom object and these objects describe your application’s core features.

The kinds of type are:

ScalarType - A scalar represents a primitive value, like a string or an integer. GraphQL provides a number of built‐in scalars (Int, Float, String, Boolean, ID), but type systems can add additional scalars with semantic meaning. For example, a GraphQL system could define a scalar called Time which, while serialized as a string, promises to conform to ISO‐8601. Another example of a potentially useful custom scalar is Url, which serializes as a string, but is guaranteed by the server to be a valid URL.

scalar Time
scalar Url

ObjectType - Define a set of fields, where each field is another type in the system, allowing the definition of arbitrary type hierarchies. A field of an Object type may be a Scalar, Enum, another Object type, an Interface, or a Union. Additionally, it may be any wrapping type whose underlying base type is one of those five.

type Person {
  name: String
  age: Int
  picture: Url
  relationship: Person
}

InterfaceType - defines a list of fields; Object types that implement that interface are guaranteed to implement those fields.

UnionType - defines a list of possible types; similar to interfaces, whenever the type system claims a union will be returned, one of the possible types will be returned.

EnumType - used in cases where the type specifies the space of valid responses.

InputObjectType - allows the schema to define exactly what data is expected. Oftentimes it is useful to provide complex structs as inputs to GraphQL field arguments or variables.

All of the types so far are assumed to be both nullable and singular. A GraphQL schema may describe that a field represents a list of another type; the List type is provided for this reason, and wraps another type. Similarly, the Non-Null type wraps another type, and denotes that the resulting value will never be null (and that an error cannot result in a null value). These two types are referred to as “wrapping types”; non‐wrapping types are referred to as “named types”. A wrapping type has an underlying named type, found by continually unwrapping the type until a named type is found.
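
The unwrapping rule can be sketched in a few lines of Python. This is only an illustration that works on SDL type references written as strings; the helper name unwrap_named_type is hypothetical and is not part of any GraphQL library.

```python
def unwrap_named_type(type_ref: str) -> str:
    """Strip List ([...]) and Non-Null (!) wrappers until a named type remains."""
    ref = type_ref.strip()
    while True:
        if ref.endswith("!"):
            # Non-Null wrapper: drop the trailing exclamation point
            ref = ref[:-1].strip()
        elif ref.startswith("[") and ref.endswith("]"):
            # List wrapper: drop the surrounding brackets
            ref = ref[1:-1].strip()
        else:
            # Nothing left to unwrap, so this is the named type
            return ref


print(unwrap_named_type("[Int!]!"))  # Int
print(unwrap_named_type("Person"))   # Person
```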

For example, a social media application consists of Users and Posts. A blog would consist of Categories and Articles. The types represent your application’s data. A type has fields that represent the data associated with each object. Each field returns a specific type of data.

A schema is a collection of type definitions. You can write your schemas in a JavaScript file as a string or in any text file. These files usually carry the .graphql extension.

Let’s define the first GraphQL object type in our schema file—the Photo:

type Photo {
		id:	ID!
		name:	String!
		url:	String!
		description:	String
}

Between the curly brackets, we’ve defined the Photo’s fields. Each field contains data of a specific type. We have defined only one custom type in our schema, the Photo, but GraphQL comes with some built-in types that we can use for our fields. These built-in types are called scalar types.

The exclamation point specifies that the field is non-nullable, which means that the id, name, and url fields must return some data in each query. The description is nullable, which means that photo descriptions are optional. When queried, this field could return null.

GraphQL’s built in scalar types (Int, Float, String, Boolean, ID) are very useful, but there might be times when you want to define your own custom scalar types. A scalar type is not an object type. It does not have fields. However, when implementing a GraphQL service, you can specify how custom scalar types should be validated; for example:

scalar	DateTime
type Photo {
	id:	ID!
	name:	String!
	url:	String!
	description:	String
	created:	DateTime!
}

Here, we have created a custom scalar type: DateTime. Now we can find out when each photo was created. Any field marked DateTime will return a JSON string, but we can use the custom scalar to make sure that string can be serialized, validated, and formatted as an official date and time.

Enums

Enumeration types, or enums, are scalar types that allow a field to return a restricted set of string values. When you want to make sure that a field returns one value from a limited set of values, you can use an enum type.

Let’s create an enum type called PhotoCategory that defines the type of photo that is being posted from a set of five possible choices: SELFIE, PORTRAIT, ACTION, LANDSCAPE, or GRAPHIC:

enum PhotoCategory {
		SELFIE
		PORTRAIT
		ACTION
		LANDSCAPE
		GRAPHIC
}

You can use enumeration types when defining fields. Let’s add a category field to our Photo object type:

type Photo {
	id:	ID!
	name:	String!
	url:	String!
	description:	String
	created:	DateTime!
	category:	PhotoCategory!
}

Connections and Lists

When you create GraphQL schemas, you can define fields that return lists of any GraphQL type. Lists are created by surrounding a GraphQL type with square brackets.

  • [Int] A list of nullable integer values
  • [Int!] A list of non-nullable integer values
  • [Int]! A non-nullable list of nullable integer values
  • [Int!]! A non-nullable list of non-nullable integer values

Most list definitions are non-nullable lists of non-nullable values. This is because we typically do not want values within our list to be null.
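
To make the difference between the four combinations concrete, here is a small Python sketch that checks a value against a type reference. It is not how a GraphQL server validates results; the function name conforms is hypothetical, and it only understands Int, the list brackets, and the exclamation point.

```python
from typing import Any


def conforms(value: Any, type_ref: str) -> bool:
    """Return True if value is allowed by a reference built from Int, [] and !."""
    if type_ref.endswith("!"):
        # Non-Null wrapper: null is rejected, otherwise check the wrapped type
        return value is not None and conforms(value, type_ref[:-1])
    if value is None:
        # Without a Non-Null wrapper every type accepts null
        return True
    if type_ref.startswith("[") and type_ref.endswith("]"):
        # List wrapper: every item must conform to the inner type
        inner = type_ref[1:-1]
        return isinstance(value, list) and all(conforms(item, inner) for item in value)
    # Named type: this sketch only knows Int (bool is excluded on purpose)
    return type_ref == "Int" and isinstance(value, int) and not isinstance(value, bool)


print(conforms([1, None], "[Int]"))    # True
print(conforms([1, None], "[Int!]"))   # False
print(conforms(None, "[Int!]"))        # True
print(conforms(None, "[Int!]!"))       # False
```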

Arguments

Arguments can be added to any field in GraphQL. They allow us to send data that can affect the outcome of our GraphQL operations.

The Query type contains fields that will list allUsers or allPhotos, but what happens when you want to select only one User or one Photo? You can send that information along with your query as an argument:

type Query {
	...
	User(githubLogin:	ID!):	User!
	Photo(id:	ID!):	Photo!
}

Just like a field, an argument must have a type.

Recipes for Building GraphQL API's in Python

Design Guidelines

  1. Never expose implementation details in your API. They don't belong there. Instead, the API should expose the actual business domain relationships.
  2. It is easier to add elements in your API than to remove them. Deliberate carefully on what needs to go into your API.
  3. Group closely related fields together into their own type. Don't be afraid to create types that do not exist in your model as long as it helps to present your data.
  4. Look into the future to envision a time when a list-field might need to be paginated
  5. Use object references instead of ID fields, i.e. create a type for the object itself (e.g. an image). Object references allow the client to traverse relations in one query.
  6. Choose field names based on what makes sense. Not based on implementation or what was in legacy APIs.
  7. Use enums for fields which can only take a specific set of values.
  8. The API should provide business logic, not just data. Complex calculations should be done on the server, in one place, not on the client, in many places.
  9. Use a payload return type for your mutation.
  10. Mutations should provide user/business level errors via a userErrors field in the mutation payload.
  11. Most payload fields should be nullable.
  12. Design around use cases, not data. Use helper fields/behaviour driven fields to help the client, e.g. isAuthorized.
  13. Stay away from building a one-size fits all schema.
  14. Use result types to define possible errors and union types to combine these into what exactly will be returned.
  15. Always start with a high-level view of the objects and their relationships before you deal with specific fields.
  16. Design your API around the business domain, not the implementation, user-interface, or legacy APIs.
  17. Most of your major identifiable business objects (e.g. products, collections, etc) should implement Node. It hints to the client that this object is persisted and retrievable by the given ID, which allows the client to accurately and efficiently manage local caches and other tricks.
  18. Write separate mutations for separate logical actions on a resource.
  19. When writing separate mutations for relationships, consider whether it would be useful for the mutations to operate on multiple elements at once.
  20. Use weaker types for inputs (e.g. String instead of Email) when the format is unambiguous and client-side validation is complex. This lets the server run all non-trivial validations at once and return the errors in a single place in a single format, simplifying the client.
  21. Use stronger types for inputs (e.g. DateTime instead of String) when the format may be ambiguous and client-side validation is simple. This provides clarity and encourages clients to use stricter input controls (e.g. a date-picker widget instead of a free-text field).
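
Guidelines 9 to 11 and 20 can be pictured with a framework-free Python sketch. Everything here is hypothetical and only illustrates the shape: the names UserError, CreateUserPayload and create_user, the path field, and the validation rules are assumptions, not part of Graphene or any other library. The payload carries a nullable result plus a userErrors list, and the server runs all validations before answering.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UserError:
    path: List[str]  # where in the input the problem was found
    message: str     # human-readable, safe to show to the user


@dataclass
class CreateUserPayload:
    # Nullable on purpose: on failure the user is None and user_errors is filled
    user: Optional[Dict[str, str]] = None
    user_errors: List[UserError] = field(default_factory=list)


def create_user(input: Dict[str, str]) -> CreateUserPayload:
    """Run every validation on the server and report all problems at once."""
    errors = []
    email = input.get("email", "")
    if "@" not in email:
        errors.append(UserError(["input", "email"], "Email is invalid"))
    if not input.get("name"):
        errors.append(UserError(["input", "name"], "Name is required"))
    if errors:
        return CreateUserPayload(user=None, user_errors=errors)
    return CreateUserPayload(user={"id": "1", "name": input["name"], "email": email})


bad = create_user({"email": "nope"})
print(bad.user, [e.message for e in bad.user_errors])
```

Because the errors travel in the payload rather than in the top-level GraphQL errors array, the client can handle them with the same typed query it uses for the success case.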

Project structure

  • The API lives in a single directory: graphql
  • API modules mirror the structure of Django apps
  • api.py imports all API modules and exposes the Schema

Single module

May contain:

  • __init__.py
  • enums.py
  • filters.py
  • mutations.py - definitions of mutation classes
  • resolvers.py - resolver functions for queries
  • schema.py - gathers all types, mutations, queries and exposes the part of schema specific to this module
  • types.py - defines Graphene models, mapping models to types
  • utils.py

Authentication

Use JSON Web Tokens (JWTs) to authenticate users. A JWT is a JSON object that has been base64url-encoded and then signed with either a symmetric shared key or a public/private key pair. The JWT can contain information such as the subject or user_id, when the token was issued, and when it expires.

One thing to keep in mind: while the JWT is signed, JWTs are usually not encrypted (although you can optionally encrypt them). This means any data that is in the token can be read by anyone who has access to the token. It is good practice to place identifiers such as a user_id in the token, but not personally identifiable information like an email or social security number.

One of the benefits of JWTs is that they can be used without a backing store. All the information required to authenticate the user is contained within the token itself.

Libraries include django-graphql-jwt, which uses an HTTP request header to pass the token that authorizes requests. Access to particular queries or mutations can be restricted with decorators provided by django-graphql-jwt.
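
The points about signing, readability, and stateless verification can be seen in a standard-library sketch of an HS256 token. This is for illustration only and is not how django-graphql-jwt is implemented; the helper names and the shared secret are made up, and a real service should rely on a maintained JWT library, which also validates the header.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def encode_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode("utf-8"))
             for p in (header, claims)]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def read_claims_unverified(token: str) -> dict:
    # No key needed: the payload is only encoded, not encrypted
    return json.loads(_b64url_decode(token.split(".")[1]))


def verify_jwt(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    signing_input, _, signature = token.rpartition(".")
    if not hmac.compare_digest(_sign(signing_input, secret), _b64url_decode(signature)):
        raise ValueError("signature mismatch")
    claims = read_claims_unverified(token)
    current = time.time() if now is None else now
    if "exp" in claims and claims["exp"] <= current:
        raise ValueError("token expired")
    return claims  # no database lookup was needed


token = encode_jwt({"sub": "42", "iat": 1000, "exp": 2000}, b"shared-secret")
print(read_claims_unverified(token)["sub"])  # 42
```

Anyone holding the token can call something like read_claims_unverified, which is why only identifiers belong in the claims.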

Error Handling

Use unified error handling in your API.

Database Query Optimization

Use graphene-django-optimizer. It lets you dynamically join related tables using Django's select_related and prefetch_related.

File uploads

Use graphene-file-upload.

Pagination

Relay cursor connections

  • A connection is a paginated field on an object, for example the friends field on a user or the comments field on a blog post.
  • An edge is a line that connects two nodes together, representing some kind of relationship between the two nodes. It has metadata about one object in the paginated list, and includes a cursor to allow pagination starting from that object.
  • A node represents the actual object you were looking for. The circles in the graph are called “nodes”.
  • pageInfo lets the client know if there are more pages of data to fetch. The Relay specification does not require it to report the total number of items, because the client cache doesn’t need that info. It would be up to the developer to expose that information through another field.

An example of all four of those is the following query:

{
  user {
    id
    name
    friends(first: 10, after: "opaqueCursor") {
      edges {
        cursor
        node {
          id
          name
        }
      }
      pageInfo {
        hasNextPage
      }
    }
  }
}

Why name a list of edges a “connection” though? A connection is a way to get all of the nodes that are connected to another node in a specific way. In this case we want to get all of the nodes connected to our users that are friends. Another connection might be between a user node to all of the posts that they liked.

In graph theory, an edge can have properties of its own which act effectively as metadata. For example if we have a “liked” edge between a user and a post we might want to include the time at which the user liked that post.

type UserFriendsEdge {
  cursor: String!
  node: User
  friendedAt: DateTime
}

Finally, we need to add the connection back to our User type.

type User {
  id: ID!
  name: String
  friendsConnection(
    first: Int,
    after: String,
    last: Int,
    before: String
  ): UserFriendsConnection
}
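
What a resolver for such a connection field might return can be sketched in plain Python. This is a simplified assumption, not Relay's or Graphene's implementation: it supports only forward pagination (first and after), uses the list offset as the cursor, and the helper names are hypothetical. The base64 encoding is there only to keep the cursor opaque to clients.

```python
import base64
from typing import Any, Dict, List, Optional


def encode_cursor(offset: int) -> str:
    return base64.b64encode(f"offset:{offset}".encode("ascii")).decode("ascii")


def decode_cursor(cursor: str) -> int:
    return int(base64.b64decode(cursor).decode("ascii").split(":", 1)[1])


def friends_connection(friends: List[Dict[str, Any]], first: int,
                       after: Optional[str] = None) -> Dict[str, Any]:
    # Start right after the item the cursor points at, or at the beginning
    start = decode_cursor(after) + 1 if after is not None else 0
    page = friends[start:start + first]
    edges = [{"cursor": encode_cursor(start + i), "node": node}
             for i, node in enumerate(page)]
    return {
        "edges": edges,
        "pageInfo": {"hasNextPage": start + len(page) < len(friends)},
    }


friends = [{"id": str(i), "name": f"friend{i}"} for i in range(5)]
page1 = friends_connection(friends, first=2)
page2 = friends_connection(friends, first=2, after=page1["edges"][-1]["cursor"])
print([e["node"]["id"] for e in page2["edges"]])  # ['2', '3']
```

Offsets are the simplest cursor to demonstrate, but they shift when rows are inserted or deleted; a cursor derived from a stable sort key avoids that.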

Graphene Limitations

  • Lacks input validation
  • No standard for returning errors
  • A lot of boilerplate if the number of mutations is large.
  • Fully fledged API requires third party libraries
  • No query-cost calculation to prevent malicious queries
  • Impossible to serve multiple schemas from one URL, e.g. a private and a public API

Schema Stitching

Allows you to build a distributed graph.

Mutations design

Naming

Name your mutations verb first, followed by the object, or “noun,” if applicable. Use camelCase, e.g. createUser, likePost, updateComment.

Specificity

Make mutations as specific as possible. Mutations should represent semantic actions that might be taken by the user whenever possible. sendPasswordResetEmail is much better than sendEmail(type: PASSWORD_RESET)

Input Object

Use a single, required, unique input object type as an argument for easier mutation execution on the client. Mutations should only ever have one input argument. That argument should be named input and should have a non-null unique input object type. The reason is that it is much easier to use client-side. The client is only required to send one variable per mutation instead of one for every argument on the mutation.

mutation MyMutation($input: UpdatePostInput!) {
  updatePost(input: $input) { ... }
}

The next thing you should do is nest the input object as much as possible. Nesting allows you to fully embrace GraphQL’s power to be your version-less API. Nesting gives you room on your object types to explore new schema designs as time goes on. You can easily deprecate sections of the API and add new names in a conflict-free space.
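
From the client side, the single input variable could look like the following Python sketch, which builds the JSON body of a GraphQL-over-HTTP request. The selection set (post and id), the nested patch field, and the helper name build_request_body are hypothetical; the original leaves the shape of UpdatePostInput unspecified.

```python
import json

# Assumed mutation document; the selected fields are placeholders
UPDATE_POST = """
mutation MyMutation($input: UpdatePostInput!) {
  updatePost(input: $input) {
    post { id }
  }
}
"""


def build_request_body(post_id: str, patch: dict) -> str:
    # Exactly one variable, named input; the changes are nested under patch
    variables = {"input": {"id": post_id, "patch": patch}}
    return json.dumps({"query": UPDATE_POST, "variables": variables})


body = build_request_body("7", {"title": "New title"})
print(sorted(json.loads(body)["variables"]))  # ['input']
```

Adding a new optional field later only changes the contents of the input object, so the mutation signature and the client's variable list stay the same.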

Unique payload type

Use a unique payload type for each mutation and add the mutation’s output as a field to that payload type. Just like when you design your input, nesting is a virtue for your GraphQL payload. This will allow you to add multiple outputs over time and metadata fields like clientMutationId or userErrors.

Nesting

Use nesting to your advantage wherever it makes sense. Even if you only want to return a single thing from your mutation, resist the temptation to return that one type directly. It is hard to predict the future, and if you choose to return only a single type now you remove the future possibility to add other return types or metadata to the mutation.

A complete example of good mutation schema:

type Todo {
  id: ID!
  text: String
  completed: Boolean
}

schema {
  # The query types are omitted so we can focus on the mutations!
  mutation: RootMutation
}

type RootMutation {
  createTodo(input: CreateTodoInput!): CreateTodoPayload
  toggleTodoCompleted(input: ToggleTodoCompletedInput!): ToggleTodoCompletedPayload
  updateTodoText(input: UpdateTodoTextInput!): UpdateTodoTextPayload
  completeAllTodos(input: CompleteAllTodosInput!): CompleteAllTodosPayload
}

# `id` is generated by the backend, and `completed` is automatically
# set to false.
input CreateTodoInput {
  # I would nest, but there is only one field: `text`. It would not
  # be hard to make `text` nullable and deprecate the `text` field,
  # however, if in the future we decide we have more fields.
  text: String!
}

type CreateTodoPayload {
  # The todo that was created. It is nullable so that if there is
  # an error then null won’t propagate past the `todo`.
  todo: Todo
}

# We only accept the `id` and the backend will determine the new
# `completed` state of the todo. This prevents edge-cases like:
# “set the todo’s completed status to true when its completed
# status is already true” in the type system!
input ToggleTodoCompletedInput {
  id: ID!
}

type ToggleTodoCompletedPayload {
  # The updated todo. Nullable for the same reason as before.
  todo: Todo
}

# This is a specific update mutation instead of a general one, so I
# don’t nest with a `patch` field like I demonstrated earlier.
# Instead I just provide one field, `newText`, which signals intent.
input UpdateTodoTextInput {
  id: ID!
  newText: String!
}

type UpdateTodoTextPayload {
  # The updated todo. Nullable for the same reason as before.
  todo: Todo
}

input CompleteAllTodosInput {
  # This mutation does not need any fields, but we have the space for
  # input anyway in case we need it in the future.
}

type CompleteAllTodosPayload {
  # All of the todos we completed.
  todos: [Todo]
  # If we decide that in the future we want to use connections we may
  # also add a `todoConnection` field.
}

Nullability

Rules:

  • For input arguments and fields, adding non-null is a breaking change.
  • For output fields, removing non-null from a field is a breaking change.
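
The two rules can be written down as a small Python check. It is a sketch under simplifying assumptions: the function name is hypothetical, types are given as SDL strings, and only the outermost Non-Null wrapper is compared, so changes inside a list are ignored.

```python
def is_breaking_nullability_change(position: str, old_type: str, new_type: str) -> bool:
    """position is "input" (arguments, input fields) or "output" (object fields)."""
    was_non_null = old_type.strip().endswith("!")
    is_non_null = new_type.strip().endswith("!")
    if position == "input":
        # Clients that used to omit the value would start failing
        return not was_non_null and is_non_null
    if position == "output":
        # Clients that never handled null could now receive it
        return was_non_null and not is_non_null
    raise ValueError("position must be 'input' or 'output'")


print(is_breaking_nullability_change("input", "String", "String!"))   # True
print(is_breaking_nullability_change("output", "String", "String!"))  # False
```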

When to use non-null:

  • On field arguments, where the field doesn’t make any sense if that argument is not passed. For example, a getRestaurantById(id: ID!)
  • It’s almost always recommended to set the items inside a list to be non-null

When to avoid non-null:

  • In any field arguments or input types that are added to a field. Given that this is a backwards-incompatible change, it’s clearer to simply add an entirely new field, since the new required argument probably represents a significant change in functionality.
  • In object type fields where the data is fetched from a separate data source.

Ariadne

Resolvers

In Ariadne, a resolver is any Python callable that accepts two positional arguments (obj and info):

def example_resolver(obj: Any, info: GraphQLResolveInfo):
    return obj.do_something()

In Ariadne every field resolver is called with at least two arguments: the query's parent object, and the query's execution info that usually contains a context attribute. The context is GraphQL's way of passing additional information from the application to its query resolvers.

The default GraphQL server implementation provided by Ariadne defines info.context as a Python dict containing a single key named request containing a request object. We can use this in our resolver:

from ariadne import QueryType, gql, make_executable_schema

type_defs = gql("""
    type Query {
        hello: String!
    }
""")

# Create QueryType instance for Query type defined in our schema...
query = QueryType()

# ...and assign our resolver function to its "hello" field.
@query.field("hello")
def resolve_hello(_, info):
    request = info.context["request"]
    user_agent = request.headers.get("user-agent", "guest")
    return "Hello, %s!" % user_agent
    
schema = make_executable_schema(type_defs, query)

# most of your future APIs will likely pass a list of bindables instead, for example:
# make_executable_schema(type_defs, [query, user, mutations, fallback_resolvers])

Notice that we are discarding the first argument in our resolver.

Apollo Federation

Apollo Federation is an architecture for composing multiple GraphQL services into a single graph. It is based on a declarative composition programming model that allows proper separation of concerns. This design allows teams to implement an enterprise-scale shared data graph as a set of loosely coupled, separately maintained GraphQL services.

The @apollo/federation package provides the primitives needed to implement composable GraphQL schemas. The @apollo/gateway package provides a federated GraphQL gateway that constructs the composed schema and executes queries against it by issuing GraphQL subqueries to one or more underlying services.

Apollo Federation introduces an important new principle to modular schema design: proper separation by concern. It allows you to extend an existing type with additional fields, using GraphQL's extend type functionality. That means we can break up a schema across boundaries that correspond to features or team structure.

Example

# accounts service
type User @key(fields: "id") {
  id: ID!
  username: String!
}

extend type Query {
  me: User
}

# products service
type Product @key(fields: "upc") {
  upc: String!
  name: String!
  price: Int
}

extend type Query {
  topProducts(first: Int = 5): [Product]
}

# reviews service
type Review {
  body: String
  author: User @provides(fields: "username")
  product: Product
}

extend type User @key(fields: "id") {
  id: ID! @external
  reviews: [Review]
}

extend type Product @key(fields: "upc") {
  upc: String! @external
  reviews: [Review]
}

Then the server. There is no user code in the gateway, just a reference to each of the federated services that make up the graph.

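const { ApolloServer } = require('apollo-server');
const { ApolloGateway } = require('@apollo/gateway');

// The gateway only needs the name and URL of each federated service.
const gateway = new ApolloGateway({
  serviceList: [
    { name: 'accounts', url: 'http://localhost:4001' },
    { name: 'products', url: 'http://localhost:4002' },
    { name: 'reviews', url: 'http://localhost:4003' }
  ]
});
const server = new ApolloServer({ gateway });
server.listen();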

Now we can query the composed schema, just as if it had been implemented as a monolith.

# a query that touches all three services
query {
  me {
    username
    reviews {
      body
      product {
        name
        upc
      }
    }
  }
}

Concepts

Entities and keys

An entity is a type that can be referenced by another service. Entities create connection points between services and form the basic building blocks of a federated graph. Entities have a primary key whose value uniquely identifies a specific instance of the type. Declaring an entity is done by adding a @key directive to the type definition. The directive takes one argument specifying the key:

type Product @key(fields: "upc") {
  upc: String!
  name: String!
  price: Int
}

In this example, the @key directive tells the Apollo query planner that a particular instance of Product can be fetched if you have its upc. Keys can be any field (not just ID) and need not be globally unique.

In some cases there may be multiple ways of referring to an entity, such as when we refer to a user either by ID or by email. Therefore, the programming model allows types to define multiple keys, which indicates they can be looked up in one of several ways:

type Product @key(fields: "upc") @key(fields: "sku") {
  upc: String!
  sku: String!
  price: String
}

Keys may be complex and include nested fields, as when a user's ID is only unique within its organization:

type User @key(fields: "id organization { id }") {
  id: ID!
  organization: Organization!
}

type Organization {
  id: ID!
}
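The key rules above can be sketched as a lookup function. This is a minimal illustration in plain Python, assuming the service receives a representation dict holding __typename plus the fields of one declared key; PRODUCTS, PRODUCT_KEYS, and resolve_product_reference are hypothetical names, and the data is made up.

```python
# Hypothetical in-memory product table for the sketch.
PRODUCTS = [
    {"upc": "1", "sku": "TABLE-1", "name": "Table", "price": 899},
    {"upc": "2", "sku": "COUCH-1", "name": "Couch", "price": 1299},
]

# Each entry is one declared key: a tuple of field names that identify a Product.
PRODUCT_KEYS = [("upc",), ("sku",)]

def resolve_product_reference(representation):
    """Return the product matching the first key fully present in the representation."""
    for key in PRODUCT_KEYS:
        if all(name in representation for name in key):
            for product in PRODUCTS:
                if all(product[name] == representation[name] for name in key):
                    return product
            return None  # a key was supplied but nothing matched
    raise ValueError("representation does not contain any declared key")

by_upc = resolve_product_reference({"__typename": "Product", "upc": "2"})
by_sku = resolve_product_reference({"__typename": "Product", "sku": "TABLE-1"})
```

Because each key is a tuple of field names, a key made of several fields could be expressed by adding more names to the tuple; nested keys would need a richer comparison than this sketch provides.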

Referencing external types

Once an entity is part of the graph, other services can begin to reference that type from their own types.

# in the reviews service
type Review {
  product: Product
}

extend type Product @key(fields: "upc") {
  upc: String! @external
}

Root queries and mutations

Since Query and Mutation are regular types in GraphQL, we use the same extend type pattern to define root queries. To implement a root query, such as topProducts, we simply extend the Query type:

extend type Query {
  topProducts(first: Int = 5): [Product]
}
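On the resolver side, a root field like topProducts could look roughly like the sketch below. It follows the (obj, info) resolver shape described in the Ariadne section and mirrors the schema default of 5 as a Python default; PRODUCTS and resolve_top_products are hypothetical names with made-up data.

```python
# Hypothetical catalogue, assumed to be sorted from most to least popular.
PRODUCTS = [{"upc": str(n), "name": "Product %d" % n} for n in range(1, 9)]

def resolve_top_products(_, info, first=5):
    """Root resolver: the parent is unused and `first` defaults to 5, as in the schema."""
    return PRODUCTS[:first]

default_page = resolve_top_products(None, None)
short_page = resolve_top_products(None, None, first=2)
```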

Value Types

A natural overlap among identical types between services is not uncommon. Rather than having a single service "own" those types, all services that use them are expected to share ownership. This form of type "duplication" across services is supported for Scalars, Objects, Interfaces, Enums, Unions, and Inputs. The rule of thumb for any of these value types is that the types must be identical in name and contents.

Objects, Interfaces, and Inputs

For types with field definitions, all fields and their types must be identical.

Scalars

For Scalar values, it's important that services share the same serialization and parsing logic, since federation tooling cannot validate that logic at the schema level.

Enums

For Enum types, all values must match across services. Even if a service doesn't use all values in an Enum, they still must be defined in the schema. Failure to include all enum values in all services that use the Enum will result in a validation error when building the federated schema.

Unions

Union types must share the same types in the union, even if not all types are used by a service.

In the following example, the Product and User services both use the same ProductCategory enum, Date scalar, Error type, and ProductOrError union.

# Product Service
scalar Date

union ProductOrError = Product | Error

type Error {
  code: Int!
  message: String!
}

type Product @key(fields: "sku"){
  sku: ID!
  category: ProductCategory
  dateCreated: Date
}

enum ProductCategory {
  FURNITURE
  BOOK
  DIGITAL_DOWNLOAD
}

# User Service
scalar Date

union ProductOrError = Product | Error

type Error {
  code: Int!
  message: String!
}

type User @key(fields: "id"){
  id: ID!
  dateCreated: Date
  favoriteCategory: ProductCategory
  favoriteProducts: [Product!]
}

enum ProductCategory {
  FURNITURE
  BOOK
  DIGITAL_DOWNLOAD
}

extend type Product @key(fields: "sku"){
  sku: ID! @external
}
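The "identical in name and contents" rule can be illustrated for enums with a small check. This is only a sketch of the idea, not how the federation tooling validates schemas; enum_mismatches and the dict layout it expects are assumptions made for the example.

```python
def enum_mismatches(schemas):
    """Compare enum definitions across services.

    `schemas` maps a service name to {enum name: set of values}.
    Returns {enum name: {service name: values}} for enums whose values differ.
    """
    seen = {}
    for service, enums in schemas.items():
        for enum_name, values in enums.items():
            seen.setdefault(enum_name, {})[service] = frozenset(values)
    return {
        name: by_service
        for name, by_service in seen.items()
        if len(set(by_service.values())) > 1
    }

# Both services declare the same values, so nothing is reported.
ok = enum_mismatches({
    "product": {"ProductCategory": {"FURNITURE", "BOOK", "DIGITAL_DOWNLOAD"}},
    "user": {"ProductCategory": {"FURNITURE", "BOOK", "DIGITAL_DOWNLOAD"}},
})

# The user service omits DIGITAL_DOWNLOAD, so the enum is flagged.
bad = enum_mismatches({
    "product": {"ProductCategory": {"FURNITURE", "BOOK", "DIGITAL_DOWNLOAD"}},
    "user": {"ProductCategory": {"FURNITURE", "BOOK"}},
})
```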

Computed fields

In many cases, what you need to resolve an extension field is a foreign key, which you specify through the @key directive on the type extension. With the @requires directive, however, you can require any additional combination of fields (including subfields) from the base type that you may need in your resolver. For example, you may need access to a product's size and weight to calculate a shipping estimate:

extend type Product @key(fields: "sku") {
  sku: ID! @external
  size: Int @external
  weight: Int @external
  shippingEstimate: String @requires(fields: "size weight")
}

If a client requests shippingEstimate, the query planner will now request size and weight from the base Product type and pass them through to your service, so you can access them directly from your resolver in exactly the same way you would if Product were contained within a single service.
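A resolver for such a computed field might look like the following sketch. It assumes the service receives a representation dict carrying the key (sku) together with the fields named in @requires; resolve_shipping_estimate and the pricing rule are hypothetical and chosen only to make the example concrete.

```python
def resolve_shipping_estimate(representation):
    """Compute a shipping estimate from fields supplied by the base service.

    The representation is assumed to carry the key (`sku`) plus the fields
    named in @requires (`size` and `weight`).
    """
    # Hypothetical pricing rule: 2 per unit of size plus 5 per unit of weight.
    cost = representation["size"] * 2 + representation["weight"] * 5
    return "$%d" % cost

estimate = resolve_shipping_estimate(
    {"__typename": "Product", "sku": "TABLE-1", "size": 10, "weight": 4}
)
```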

Using denormalized data

In some cases, a service will be able to provide additional fields, even if these are not part of a key. For example, our review system may store the user's name in addition to the id, so we don't have to perform a separate fetch to the accounts service to get it. We can indicate which additional fields can be queried on the referenced type using the @provides directive:

type Review {
  author: User @provides(fields: "username")
}

extend type User @key(fields: "id") {
  id: ID! @external
  username: String @external
}
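The effect of @provides can be sketched as a simple set comparison: if every requested field is either part of the key or provided locally, no extra fetch is needed. This illustrates the idea only and is not the query planner's real algorithm; needs_accounts_fetch and the email field are hypothetical.

```python
def needs_accounts_fetch(requested_fields, key_fields, provided_fields):
    """Return True if any requested User field is not already available locally."""
    available = set(key_fields) | set(provided_fields)
    return not set(requested_fields) <= available

# Review.author declares @provides(fields: "username"); the User key is "id".
username_only = needs_accounts_fetch({"username"}, {"id"}, {"username"})

# `email` is a hypothetical field that the reviews service does not provide.
with_email = needs_accounts_fetch({"username", "email"}, {"id"}, {"username"})
```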

Implementing a federated graph

Apollo Federation is made up of two parts:

  • Federated services, which are standalone parts of the graph
  • A gateway which composes the overall schema and executes federated queries

To be part of a federated graph, a microservice implements the Apollo Federation spec which exposes its capabilities to tooling and the gateway. Collectively, federated services form a composed graph. This composition is done by a gateway which knows how to take an incoming operation and turn it into a plan of fetches to downstream services.
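The idea of turning one operation into a plan of fetches can be sketched with in-memory stand-ins for the three services from the earlier example. This is a hand-written plan for one fixed query, not the gateway's planner; every function name and data value below is hypothetical.

```python
# Hypothetical in-memory stand-ins for the three services.
ACCOUNTS = {"1": {"id": "1", "username": "ada"}}
REVIEWS = [{"body": "Sturdy.", "author_id": "1", "product_upc": "1"}]
PRODUCTS = {"1": {"upc": "1", "name": "Table"}}

def accounts_me():
    """accounts service: resolve the `me` root field."""
    return dict(ACCOUNTS["1"])

def reviews_for_user(representation):
    """reviews service: extend a User reference with its reviews."""
    return [
        {
            "body": r["body"],
            "product": {"__typename": "Product", "upc": r["product_upc"]},
        }
        for r in REVIEWS
        if r["author_id"] == representation["id"]
    ]

def products_by_reference(representation):
    """products service: turn a Product reference into name and upc."""
    product = PRODUCTS[representation["upc"]]
    return {"name": product["name"], "upc": product["upc"]}

def run_plan():
    """Run the fetches for `{ me { username reviews { body product { name upc } } } }`."""
    me = accounts_me()  # fetch 1: accounts
    reviews = reviews_for_user({"__typename": "User", "id": me["id"]})  # fetch 2: reviews
    for review in reviews:  # fetch 3: products, one reference per review
        review["product"] = products_by_reference(review["product"])
    return {"me": {"username": me["username"], "reviews": reviews}}

result = run_plan()
```

Each step passes only a type name and key fields to the next service, which is why entities need a @key.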

Pagination in GraphQL

Relay is a framework for retrieving and caching data from a GraphQL server for a React application. It handles pagination in a really interesting way by allowing the client to paginate forward and backwards from any item returned in the collection.

Relay enables this pagination by defining a specification, called Relay Cursor Connections, for how a GraphQL server should expose lists of data.

The spec outlines the following components:

  • connection: a wrapper for details on a list of data you’re paginating through. A connection has two fields: edges and pageInfo.
  • edges: a list of edge types.
  • edge: a wrapper around the object being returned in the list. An edge type has two fields: node and cursor
  • node: this is the actual object, for example, user details.
  • cursor: this is a string field that is used to identify a specific edge.
  • pageInfo: contains the fields hasPreviousPage and hasNextPage, which can be used to determine whether you need to request another set of results.
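
The components listed above can be sketched for forward pagination over an in-memory list. This is a minimal illustration, assuming cursors are base64-encoded list positions; encode_cursor, decode_cursor, paginate, and the user names are hypothetical, and a real server would usually derive cursors from its data store.

```python
import base64

def encode_cursor(index):
    """Encode a list position as an opaque cursor string."""
    return base64.b64encode(("cursor:%d" % index).encode()).decode()

def decode_cursor(cursor):
    """Recover the list position from a cursor produced by encode_cursor."""
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])

def paginate(items, first, after=None):
    """Build a connection holding up to `first` items that follow the `after` cursor."""
    start = decode_cursor(after) + 1 if after is not None else 0
    end = min(start + first, len(items))
    edges = [
        {"node": items[i], "cursor": encode_cursor(i)}
        for i in range(start, end)
    ]
    return {
        "edges": edges,
        "pageInfo": {
            "hasPreviousPage": start > 0,
            "hasNextPage": end < len(items),
        },
    }

users = ["ada", "grace", "linus", "margaret", "ken"]
page1 = paginate(users, first=2)
# Ask for the next page by passing the cursor of the last edge received.
page2 = paginate(users, first=2, after=page1["edges"][-1]["cursor"])
```

The client never inspects a cursor; it only hands the last one back to the server to continue from that edge.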