
@4141done
Created September 11, 2018 19:21
ElixirConf 2018 Raw notes

ElixirConf 2018 GraphQl Training

Day one slides: http://slides.com/wbruce/production-ready-graphql-1#/ https://github.com/sabondano/elixirconf-2018

Entry points into graph

Root types: query, mutation, subscription

Introspection

  • Absinthe will barf out the introspection query. You can run this query directly by pasting it into the playground

Errors

  • Resolution errors are isolated between siblings, i.e.:
user {
  name # This will come back ok
  email # this can be an error and it will be ok
}
  • If the query itself is invalid you'll get only errors, no data:
user {
  name # This will come back ok
  foo # doesn't exist
}
  • HTTP response codes: you will still get a 200 even if there are resolution errors. If the query itself is malformed you'll get a 400. This is an example of how REST is tightly coupled to HTTP while GraphQL is not. We need to get used to not working off HTTP status codes for errors

  • In a list_of, the error will give the index in the list of the resource with the error

Resolvers

  • Resolvers are run once for each parent
  • A resolver is never called if its field is not asked for in the query
  • They have a lot in common with controllers (controllers do HTTP, validation, and data fetching; resolvers are just data fetching, but still similar)
  • Absinthe.Resolution.project(resolution) gives the requested child fields
  • The third argument to a resolver is the resolution struct
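A resolver is just a three-arity function of (parent, args, resolution) that returns {:ok, value} or {:error, reason}. A minimal sketch (module, field, and data names here are made up):

```elixir
defmodule Resolvers.Accounts do
  # Resolver: only data fetching, no HTTP or validation concerns.
  def user(_parent, %{id: id}, _resolution) do
    case fetch_user(id) do
      nil -> {:error, "user #{id} not found"}
      user -> {:ok, user}
    end
  end

  # Stand-in for a real data fetch.
  defp fetch_user(1), do: %{id: 1, name: "Ada"}
  defp fetch_user(_), do: nil
end
```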

Arguments

  • Lots of options. Can do default values, required, etc
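In Absinthe's schema DSL this looks roughly like the following (field, type, and resolver names are hypothetical; this fragment isn't runnable outside an Absinthe schema module):

```elixir
field :users, list_of(:user) do
  arg :limit, :integer, default_value: 10   # optional, with a default
  arg :role, non_null(:string)              # required
  resolve &Resolvers.Accounts.list_users/3
end
```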

Mutations

  • Why input_object AND object?
    • An input object does not allow cycles
    • Maybe a bit more like an Ecto changeset
    • Allows separation of things like flags that aren't actually supposed to be persisted
  • Get your head out of weird RESTy names. You're not constrained by that anymore
  • Logger won't filter out sensitive values that are inlined in the query document rather than passed as variables
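A sketch of the input_object pattern in Absinthe's schema DSL (type and resolver names are made up; the :send_welcome_email flag is the kind of non-persisted field mentioned above):

```elixir
input_object :create_user_input do
  field :name, non_null(:string)
  # Drives behavior but is never persisted
  field :send_welcome_email, :boolean
end

mutation do
  field :create_user, :user do
    arg :input, non_null(:create_user_input)
    resolve &Resolvers.Accounts.create_user/3
  end
end
```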

Subscriptions

  • Subscription storage is not pluggable
  • You can manually trigger subscriptions from resolver code. Good for when the data change is not being triggered by a graphql mutation
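Manually triggering a subscription from outside a mutation might look like this (the endpoint, record, and topic names are assumptions):

```elixir
# Publish `order` to everyone subscribed to the :order_created field
# on the "orders" topic:
Absinthe.Subscription.publish(MyAppWeb.Endpoint, order, order_created: "orders")
```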

Auth

How do we communicate the token? HTTP headers are a very HTTP-specific concern

  • Split out any kind of db access, etc and try to write pure functions to check things

Middleware

  • Every level of the query has a list of middleware. For most it is just the Absinthe.Resolution middleware
  • You can conditionally add middleware by pattern matching on the field or the object
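The conditional-middleware idea can be sketched with a schema's middleware/3 callback, which Absinthe invokes per field. Shown here as a plain function so it runs standalone (MyApp.Middleware.Authorize is a hypothetical middleware module):

```elixir
defmodule MyApp.Schema do
  # Pattern match on the field and object identifiers to prepend
  # middleware only where it's needed.
  def middleware(middleware, %{identifier: :email}, %{identifier: :user}) do
    [MyApp.Middleware.Authorize | middleware]
  end

  # Every other field keeps its default middleware list.
  def middleware(middleware, _field, _object), do: middleware
end
```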

Things to look into more

Auth on subscriptions

Pagination

  • Using "relay style" you can pack arbitrary extra data into each edge
  • edge is a wrapper for an item and some metadata
  • node is the item itself

Performance

  • Complexity analysis is totally static (not based on data). It's all about potential complexity rather than what data is actually there
  • Think of complexity limits as a guide for real, well intentioned users
  • To annotate complexity, just use the "complexity" macro in a field do block
  • The code for the async middleware is an instructive read
  • Use batch to use an arbitrary function to return data to each resolver
  • Async will do async but will not batch
  • Batching works by key and so we can batch queries for like things across the query. This is the best way to do service aggregation
  • Batches are all done concurrently
  • Recommend wrapping batching in more meaningful function names
  • Dataloader aims to solve the simple case of batches by id
  • Dataloader predates absinthe
  • We probably almost never want data loader
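A complexity annotation is essentially a function of the field's arguments and the child complexity. Standalone, the idea looks like this (the limit * child formula is one common choice for paginated lists, not the only one; in Absinthe you'd pass such a function to the `complexity` macro inside a field block):

```elixir
# Complexity is static: computed from the query's arguments,
# never from the actual data.
complexity_fun = fn %{limit: limit}, child_complexity ->
  limit * child_complexity
end

complexity_fun.(%{limit: 10}, 5)  # => 50
```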

General tidbits

  • The comparison is maybe more applicable to SQL than to REST
  • Third-party integrations with folks who don't know GraphQL are not as hard as you'd think
  • Think of the relational DB layer as a "serialization" of your graph
  • SQL schema for ease of your business logic, GraphQL for ease of your users
  • GraphQL itself is transport agnostic
  • Resolvers are per-field, so it's easy to create N+1 queries
  • Not easy to do the right thing
  • Adds ~100ms latency
  • It almost always makes sense to scope things through permissions. These are things you should pass into the business logic of your data layer
  • Instead of making every node ask if the current user is allowed to do something, you want to block an unauthorized user from walking past a certain point on the graph. It may therefore be better to place privileged fields further down the graph

General elixir

  • Good practice to list as a dependency any library which is actually called

Deep Dive into Hex

https://www.youtube.com/watch?v=cbCnTKVLuu8

Packages: Hex CLI, Hexpm (website/API), Hexdocs (static docs site), Hex Core

You don’t need to open a browser to find and add mix packages to your project

mix hex.search <term> # Gives you a list of packages matching in the name or description
mix hex.info <package_name> # This includes the most recent tuple to put in your mix.exs file
mix hex.docs online <package_name> # Load the hex docs in a browser for the selected package
mix hex.docs fetch # Load local docs for all of your dependencies for offline viewing
mix hex.docs fetch <package_name> # Same but just for one.
mix hex.docs offline <package_name> # open one of the previously fetched packages.  Will fetch if you are online and don't already have it

mix hex.outdated # Show dependencies with newer versions available
mix hex.audit    # Check for retired dependencies

SLIs at PagerDuty

https://www.youtube.com/watch?v=bL7DNGhQ5js

SLI - Service level indicator

The problem

Incoming record -> Enqueuer -> Cassandra poll -> pre-processor -> REST API -> Processor

How do we know if the processing speed is not fast enough?

The original solution

In this flow, the data store (a distributed Cassandra DB) is not really a queue:

Incoming record -> Enqueuer -> Cassandra poll -> pre-processor -> REST API -> Processor
      |                            |                                   |
      +--------------> Metrics processor <-----------------------------+

Every time the metrics processor runs, it queries Cassandra and the processor to see how many events are being processed. It ran every few seconds, polling Cassandra and then polling the processor. Problems with this design:

  • Polling Cassandra queue. Same db as customer data is being written to
  • Polling the processor database. Put too much load on the processor
  • Polling made the performance problem worse. In other words, SLI monitoring made the service level worse

In between

Replaced Cassandra with Kafka.

Enter Elixir

ETS + streams.

What is event sourcing?

  • all of the records in question here are events
  • Processed event record was an incoming event with some additional metadata
  • All events were streams (Kafka distributed logs)

The solution is to diff the two streams: one stream is the initial data, the other is processed data. Store these in ETS. If things crashed, state could be replayed from the Kafka partitions
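The stream-diff idea can be sketched with two ETS tables, one per stream; the table and event shapes here are made up for the demo:

```elixir
# One table per stream.
incoming  = :ets.new(:incoming, [:set])
processed = :ets.new(:processed, [:set])

# Simulate 5 events seen and 3 of them processed.
for id <- 1..5, do: :ets.insert(incoming, {id, :event})
for id <- 1..3, do: :ets.insert(processed, {id, :event})

# Lag = events seen but not yet processed.
lag = :ets.info(incoming, :size) - :ets.info(processed, :size)
# lag == 2
```

Because both tables are rebuilt from the Kafka partitions on restart, a crash only costs the replay time, not the data.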

Kafka-based solutions

Streams, etc. Allows windowing

General notes

ETS data can be paged out to disk if not configured properly. Use Kafka consumer monitoring to check for consumer lag

Using Elixir in Production for Three years

https://www.youtube.com/watch?v=oC2ZahbrCco

Started using Elixir when Phoenix and Ecto were 0.xx

The Rails Way

We followed Rails conventions and started doing the same in Elixir. "Phoenix is not your application" changed this: going from (MVC) to (app) -> (web interface). The question now arises: how do I structure my applications? This led to Phoenix contexts. See Saša Jurić's "Solid Ground" talk

Because we have the ability to do certain things, we may do them without fully understanding the theory behind them.

We should look at other similar systems and understand their problems and solutions outside the normal influences of Ruby and Rails: Akka, actors in .NET.

Elixir at a Walking Pace

https://www.youtube.com/watch?v=1IOobarmwQg&t=1s

Warehouse management system

Problem: Multiple representations of the same item across the system. Basically race conditions

Solution: have a single representation of each item across all systems. Enforce order

Process mailboxes allow us to process messages in order (like partitioning on a Kafka topic)

Moving parts

  • State representations

    • This is just a struct
  • Data persistence

    • For the sake of the demo, using an ETS table
  • Logic

  • GenServer

    • Get the state from the db on init
    • Do logic, then immediately persist back to the db
    • Start link with ID
    • In init, return {:ok, state, {:continue, :init}}
    • You should not put DB calls in the GenServer init because init blocks the caller of start_link
    • Using handle_continue ensures the :continue callback runs before any other message
    • Do the fetch from the DB in the handle_continue callback
    • Do logic, and write to db in return
  • Representing the state machine

    • In rails, they used the state machine gem
    • Here just use modules
    • Don't tie behavior to state transitions
    • Empirical data wins (i.e. if the garment is scanned by a barcode reader, that is the new state regardless of what happened)
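The handle_continue pattern above, as a minimal runnable sketch (the :in_stock value stands in for a real DB fetch):

```elixir
defmodule Item do
  use GenServer

  def start_link(id), do: GenServer.start_link(__MODULE__, id)

  @impl true
  def init(id) do
    # No DB calls here: init blocks whoever called start_link.
    {:ok, %{id: id, status: nil}, {:continue, :load}}
  end

  @impl true
  def handle_continue(:load, item) do
    # Guaranteed to run before any other message is processed.
    {:noreply, %{item | status: load_from_db(item.id)}}
  end

  @impl true
  def handle_call(:status, _from, item), do: {:reply, item.status, item}

  # Stand-in for the real database fetch.
  defp load_from_db(_id), do: :in_stock
end
```

Usage: `{:ok, pid} = Item.start_link(42); GenServer.call(pid, :status)` returns the loaded status, because the continue callback runs before the call is handled.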

Problems

  • Addressability
    • How do we send messages to exactly the right gen server process?
    • A: Each process gets a unique name
    • A: Use a module to map name to the PID
    • A: Send the name instead of the PID
    • via tuples provide a way to name each process; Registry maps name to PID
    • {:via, Registry, {Registry.Items, 2312}}
    • Registry is only local to a node
  • Cold start problem
    • How do we start all these processes in a timely way when there are 100,000 or more processes
    • Send to a process name optimistically. Lazily create process
    • Genserver.whereis/1 is great
    • Replace Genserver.call with a custom one that does the above PID resolution for any call
  • Picking a supervision strategy
    • Dynamic supervisor
  • Reducing memory consumption
    • We can hibernate processes
    • {:reply, :ok, stuff, :hibernate} -> compacts memory, but causes extra garbage collection. Hibernating processes are still running
    • We can set the GenServer timeout
    • If the process doesn't receive another message in X milliseconds, it kills itself
    • Set the restart strategy to transient (otherwise your process will get restarted if you killed it via a timeout). Transient will restart on crashes but not things like timeouts
  • High availability
    • Don't use a global registry
    • Requires a strategy for healing state when nodes lose communication and then regain it
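The via-tuple addressing plus lazy, optimistic process start can be sketched as follows, using an Agent for brevity (module and registry names are invented):

```elixir
# Registry is local to a node; keys: :unique gives one process per name.
{:ok, _} = Registry.start_link(keys: :unique, name: Registry.Items)

defmodule Items do
  def via(id), do: {:via, Registry, {Registry.Items, id}}

  def get(id) do
    # Optimistic: resolve the name first; start the process lazily
    # only if nothing is registered under it yet.
    if GenServer.whereis(via(id)) == nil do
      {:ok, _pid} = Agent.start_link(fn -> %{id: id} end, name: via(id))
    end

    Agent.get(via(id), & &1)
  end
end

Items.get(2312)  # => %{id: 2312}
```

In production you'd start the process under a DynamicSupervisor rather than linking it to the caller.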

General notes

You can create a guard clause (defguard) in one module, then import that module into another module
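A minimal sketch of the pattern (module and guard names are invented):

```elixir
defmodule ItemGuards do
  # defguard makes the check usable in `when` clauses.
  defguard is_terminal(state) when state in [:shipped, :cancelled]
end

defmodule Workflow do
  import ItemGuards

  def advance(state) when is_terminal(state), do: {:error, :finished}
  def advance(_state), do: :ok
end
```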

  • Look into OTP handle_continue