@DasWolke
Last active March 23, 2024 16:41
Microservice bots

Microservice Bots

What they are and why you should use them

Introduction

More and more chatbots have appeared recently; the overall chatbot market is growing, and so are the platforms that host them. In this whitepaper we take a close look at the benefits of building a microservice chatbot on Discord (a communication platform mainly targeted at gamers).

The concepts and ideas explained in this whitepaper are geared towards bots with a larger user base, where the limits of the usual bot style are felt more strongly.

Information about Discord itself

(If you are already proficient with the Discord API and the way a normal bot works, you may skip ahead to The Concept)

To get a first idea of how a bot on Discord works, we need to look at the two integral parts of Discord that a bot uses:

The gateway and the REST API

Gateway

The gateway is a source of events triggered by users on Discord, and also a way to execute a small set of actions. It serves a second purpose as well: it provides you with initial data when you first connect, such as the guilds (servers) the bot is in and their properties. Clients connect to the gateway via WebSocket, using JSON or ETF (Erlang term format) encoding. In addition, the gateway can compress packets sent to the client, via zlib or zlib-stream. When creating a new connection to the gateway, the client is supposed to IDENTIFY itself. This IDENTIFY call creates a new session. To provide some level of stability, clients can RESUME a session when the underlying connection was dropped.

If you are unsure about the inner workings of the gateway, you can read more about them here: https://discordapp.com/developers/docs/topics/gateway#connecting
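
As a rough illustration of the IDENTIFY/RESUME choice described above, here is a minimal sketch that builds the first payload a client sends after connecting. The opcodes (2 for IDENTIFY, 6 for RESUME) and the field names follow Discord's gateway documentation; `buildHandshake`, the `session` record and the `properties` values are hypothetical names chosen for this example.

```javascript
// Gateway opcodes as listed in Discord's gateway documentation.
const OP_IDENTIFY = 2;
const OP_RESUME = 6;

// Build the first payload to send after the WebSocket connection is open.
// `session` is our own (hypothetical) record of the previous session:
// { id, seq }, or null when there is nothing to resume.
function buildHandshake(token, session, shard, intents = 0) {
  if (session && session.id) {
    // Resuming continues the old session instead of creating a new one.
    return {
      op: OP_RESUME,
      d: { token, session_id: session.id, seq: session.seq },
    };
  }
  // No known session: IDENTIFY, which creates a new session.
  return {
    op: OP_IDENTIFY,
    d: {
      token,
      intents,
      // Placeholder values; a library would put its own name here.
      properties: { os: 'linux', browser: 'example', device: 'example' },
      shard: [shard.id, shard.count],
    },
  };
}
```

Because only IDENTIFY creates a new session, a client that keeps track of its session id and last sequence number can often reconnect without spending one of its limited IDENTIFY calls.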

Gateway limits

Discord imposes limits on the gateway, both on how often you may connect and on the number of new connections within 24 hours:

  • You may only call IDENTIFY 1000 times within 24 hours (or 2000 times if your bot is on more than 100K servers/guilds). If you hit the limit, your token is reset, forcing you to update the token of the bot and do a cold restart.
  • You may only call IDENTIFY once every 5 seconds (account-wide).

Gateway sharding

The gateway has a method of spreading load over a pool of connections via a process called sharding.

Sharding lets you open multiple gateway connections, each handling a subset of the overall available data.

Discord requires you to use sharding once your bot is on 2500 or more servers. One shard may only have up to 2500 servers/guilds available to it, which means you would need at least 4-5 shards when your bot is on ~10K servers/guilds.

Discord itself usually recommends around 1000 servers/guilds per shard. We will therefore use the recommended number of shards in the examples below.
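
The shard numbers above can be checked with a short sketch. The two constants are the figures quoted in this section; the guild-to-shard formula `(guild_id >> 22) % num_shards` is taken from Discord's sharding documentation. The function names are made up for this example.

```javascript
const MAX_GUILDS_PER_SHARD = 2500;         // hard limit quoted above
const RECOMMENDED_GUILDS_PER_SHARD = 1000; // recommendation quoted above

// Minimum and recommended shard counts for a given number of guilds.
function shardCounts(guildCount) {
  return {
    minimum: Math.max(1, Math.ceil(guildCount / MAX_GUILDS_PER_SHARD)),
    recommended: Math.max(1, Math.ceil(guildCount / RECOMMENDED_GUILDS_PER_SHARD)),
  };
}

// Which shard receives the events of a guild.
// Guild ids are 64-bit snowflakes, so BigInt is used to avoid precision loss.
function shardIdForGuild(guildId, shardCount) {
  return Number((BigInt(guildId) >> 22n) % BigInt(shardCount));
}

console.log(shardCounts(10000)); // { minimum: 4, recommended: 10 }
```

The minimum is a theoretical lower bound: guilds are unlikely to be spread perfectly evenly over the shards, which is one reason to stay well below 2500 per shard.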

Generally you may view the gateway as a faucet which provides you with a constant stream of events/data.

REST API

The REST API is used as a way of executing actions like sending a message, updating properties of a channel, etc. Usually the REST API is called in response to a previously received event from the gateway, like a message from a user containing a command for the bot.

REST Ratelimits

Discord implements ratelimits in its REST API, which allow clients to request REST endpoints at a reasonable rate while preventing overload and abuse of the service. Those ratelimits are usually applied per route path, e.g. /channels, /guilds, /webhooks, but there are also ratelimits that apply to the whole account rather than to a route path. Discord's ratelimits are dynamic, so clients are not supposed to hardcode them, but should instead set them dynamically from the received HTTP header values.

The REST API can be viewed as a sort of intake for updating the data of objects within Discord, like users, channels, etc., but also for distributing events to every client that may see them.
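
To make "set them dynamically from the received HTTP header values" more concrete, here is a minimal sketch of tracking one ratelimit bucket. The `X-RateLimit-Limit`, `X-RateLimit-Remaining` and `X-RateLimit-Reset-After` headers are documented by Discord; passing the headers as a plain object with lower-case keys, and the names `updateBucket` and `waitTime`, are assumptions of this example.

```javascript
// Update our view of a bucket from the headers of the latest response.
// `headers` is assumed to be a plain object with lower-case header names.
function updateBucket(bucket, headers, now = Date.now()) {
  if (headers['x-ratelimit-remaining'] === undefined) {
    return bucket; // response carried no ratelimit information
  }
  return {
    limit: Number(headers['x-ratelimit-limit']),
    remaining: Number(headers['x-ratelimit-remaining']),
    // Reset-After is given in seconds and may be fractional.
    resetAt: now + Number(headers['x-ratelimit-reset-after']) * 1000,
  };
}

// Milliseconds to wait before the next request on this bucket may be sent.
function waitTime(bucket, now = Date.now()) {
  if (!bucket || bucket.remaining > 0 || now >= bucket.resetAt) {
    return 0;
  }
  return bucket.resetAt - now;
}
```

In a microservice setup, this state would live in the one component that talks to the REST API, so that every other component shares the same view of the limits.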

Cache

Most of a client's state is provided right after the initial connection has been established successfully. As objects are created, updated or deleted later on, further events are sent to notify the client of these changes and to provide the new or updated data. To avoid excessive API calls, Discord expects clients to cache as many object states as possible locally, and to update them as gateway events are received. The cached data consists of objects that exist on Discord, like:

  • Users
  • Servers/Guilds
  • Channels
  • Server/Guild Members
  • Metadata and additional properties of the above
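
A minimal sketch of such a cache is shown below. The event names and the `t`/`d` fields follow Discord's gateway dispatch format; the cache layout and the function names are assumptions made for this example, and only guilds and channels are handled.

```javascript
// Create an empty cache with one Map per object type.
function createCache() {
  return { guilds: new Map(), channels: new Map() };
}

// Apply a single gateway dispatch event ({ t: name, d: data }) to the cache.
function applyEvent(cache, event) {
  const data = event.d;
  switch (event.t) {
    case 'GUILD_CREATE':
    case 'GUILD_UPDATE':
      // Merge, so that a partial update keeps previously known fields.
      cache.guilds.set(data.id, { ...cache.guilds.get(data.id), ...data });
      break;
    case 'GUILD_DELETE':
      cache.guilds.delete(data.id);
      break;
    case 'CHANNEL_CREATE':
    case 'CHANNEL_UPDATE':
      cache.channels.set(data.id, { ...cache.channels.get(data.id), ...data });
      break;
    case 'CHANNEL_DELETE':
      cache.channels.delete(data.id);
      break;
    default:
      // Events this sketch does not cache are ignored.
      break;
  }
  return cache;
}
```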

Style of normal Bots

Currently almost all bots on Discord follow a simple concept: the gateway, the cache, REST and the actual bot all live in one process. This style is illustrated in the following diagram, where the blue box symbolizes a process with x threads, the green box the component that connects to the Discord gateway, the orange box the component responsible for storing and accessing cached data, the yellow box the component for communicating with the REST API, and the red box the code you wrote.

[Diagram: a single process (blue) containing the gateway (green), cache (orange), REST (yellow) and bot code (red) components]

This style has some benefits:

  • All data that is sent by Discord is available to the process
  • Since the data is available as a part of the process, access to it is fast
  • Since almost all current libraries follow this style, setting up a bot is easy for beginners

But there are also issues with it, which become harder to solve the more your bot grows:

  • If you do not use hot-reloading, you can reconnect a single shard at most 1000 times a day (or 2000 times if your bot is on more than 100K servers/guilds). Discord counts each successful IDENTIFY as one connection, and once you hit the limit, your token is reset.
  • If you want to access data received from Discord anywhere else, you will either have to add a way of accessing it to your bot, or use OAuth2 if that is possible for the use case you intend to use the data for.
  • It is hard to try out new code in a targeted way, for example as a canary deployment for a selected group of users (e.g. an opt-in beta bot).
  • It's impossible to dynamically scale your bot across multiple processes or even servers depending on current load.
  • With almost any of the current libraries you are unable to use individual components on their own, since they depend on the availability of the others.
  • The more your bot grows, the harder it will be to apply updates without either moving most of the code out to external APIs that can be restarted independently, or cold starting the entire bot.
  • Since each shard has its own cache, you will inevitably have duplicate data in your cache, which wastes memory that your bot could use.
  • Doing a cold start of the bot takes a long time, depending on the size of the bot, since each shard needs to IDENTIFY, which may only be done once every 5 seconds. As an example, take a bot that is on 50K servers/guilds: with the recommended number of shards, that is 50 shards. It would take about 255 seconds (5.1 seconds per connection) until the bot is fully connected again. Keeping the limit on IDENTIFY calls in mind, you could restart your bot only 20 times per day.

255 seconds does not sound like much, right?

Keep in mind that this goes up by about 5 seconds for every shard you add:

  • 50 shards: 255 seconds (~4 minutes of downtime), 20 restarts per day (40 for bots on more than 100K servers)
  • 100 shards: 510 seconds (~8.5 minutes of downtime), 10/20 restarts per day
  • 200 shards: 1020 seconds (~17 minutes of downtime), 5/10 restarts per day
  • 400 shards: 2040 seconds (~34 minutes of downtime), 2.5/5 restarts per day

This is without taking into account any failures Discord may experience.
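
The figures in this list follow from two numbers used above: 5.1 seconds per IDENTIFY and 1000 (or 2000) IDENTIFY calls per day. A small sketch that reproduces them, with `coldStart` being a name made up for this example:

```javascript
const IDENTIFY_INTERVAL_MS = 5100;   // 5.1 seconds per connection, as used above
const IDENTIFY_LIMIT_PER_DAY = 1000; // 2000 for bots on more than 100K guilds

// Time until all shards are connected again, and how many full restarts
// fit into the daily IDENTIFY budget.
function coldStart(shards, identifyLimit = IDENTIFY_LIMIT_PER_DAY) {
  return {
    seconds: (shards * IDENTIFY_INTERVAL_MS) / 1000,
    restartsPerDay: identifyLimit / shards,
  };
}

for (const shards of [50, 100, 200, 400]) {
  const { seconds, restartsPerDay } = coldStart(shards);
  console.log(`${shards} shards: ${seconds} s, ${restartsPerDay} restarts per day`);
}
```

The sketch assumes shards are identified strictly one after another, which is the assumption the list above makes as well.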

The Concept

Now that we know how a conventional Bot is built, we can take a look at a different way of doing it: (In the diagram below each box symbolizes one component that could be one or multiple processes.)

[Diagram: the gateway, cache, REST and bot components as separate, loosely coupled processes]

As we can see, our previous monolith is gone and we now have a distributed set of components, which are loosely coupled.

As with the previously shown style, this approach has some benefits and limits:

Benefits:

  • You have a central place for any data Discord may send you, which is easily accessible even from components outside of your bot.
  • There are no restrictions on the number of restarts your bot may do within 24 hours, since the bot itself isn't directly connected to the Discord gateway but instead gets its events from a proxy.
  • You are able to call methods of the Discord gateway or the Discord REST API from anywhere without fearing a ratelimit collision.
  • Most libraries automatically load everything related to an event (e.g. guild, channels, members, etc.) when you receive it. With this approach you can load only what you really need.
  • Depending on your implementation, you can implement per server/guild A/B testing (e.g. via opt-in) and have much finer control over which users/servers get to test new updates and which don't.
  • You are able to spread the received events over a group of bots/workers and allow for automatic up/down scaling based on the number of events you receive. (You could even spread events to different kinds of workers based on their type, but that's up to you to decide.)
  • You are not bound to a single language for the bot itself, which means you can implement the various components in a variety of languages, like using Python for the gateway, Go for the cache, Kotlin for the bot itself and JavaScript/Node.js for the REST API.
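
To illustrate the worker distribution and the opt-in testing mentioned in the list above, here is a minimal in-process sketch. Workers are plain functions here; in a real deployment they would be separate processes behind a message queue. `EventRouter` and all names in it are hypothetical; `guild_id` is the field Discord uses on guild-related events such as MESSAGE_CREATE.

```javascript
// Routes events to a "canary" pool (guilds that opted in to new code)
// or to the "stable" pool, using round-robin within each pool.
class EventRouter {
  constructor(stableWorkers, canaryWorkers, canaryGuildIds) {
    this.pools = { stable: stableWorkers, canary: canaryWorkers };
    this.canaryGuilds = new Set(canaryGuildIds);
    this.next = { stable: 0, canary: 0 };
  }

  route(event) {
    const guildId = event.d && event.d.guild_id;
    const useCanary = this.canaryGuilds.has(guildId) && this.pools.canary.length > 0;
    const poolName = useCanary ? 'canary' : 'stable';
    const pool = this.pools[poolName];
    if (pool.length === 0) {
      throw new Error('no workers available');
    }
    // Round-robin: the pool arrays can grow or shrink while running.
    const worker = pool[this.next[poolName] % pool.length];
    this.next[poolName] += 1;
    worker(event);
    return poolName;
  }
}

const handled = [];
const router = new EventRouter(
  [(e) => handled.push(`stable-1:${e.t}`), (e) => handled.push(`stable-2:${e.t}`)],
  [(e) => handled.push(`canary-1:${e.t}`)],
  ['123'], // guilds that opted in to the beta
);
router.route({ t: 'MESSAGE_CREATE', d: { guild_id: '123' } }); // goes to canary-1
router.route({ t: 'MESSAGE_CREATE', d: { guild_id: '456' } }); // goes to stable-1
```

Because the gateway connection lives in a different component, restarting or replacing any of these workers does not cost an IDENTIFY call.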

We also have limits with this approach, although they are solvable (in contrast to the hard limits enforced by Discord or by a library):

  • Network latency: Depending on how your bot is deployed across different networks, latency in the communication between its components may become an issue, although it should be unnoticeable within private networks.
  • Event transport layer: You'll need a way to transport events from A to B. This transport layer has to be fault tolerant and as highly available as possible, since your bot will be unresponsive when it fails.

Summary

As illustrated in this whitepaper, microservice bots offer many new opportunities and generally simplify access to the data and events you receive from Discord and need in your day-to-day work.

The WeatherStack is a Discord library consisting of a set of components that follow the concept shown above, so you might want to use it if you plan on implementing a microservice bot yourself.

It's a set of components made to cover the areas listed above, which are:

If you have feedback about this whitepaper, or think that something was stated incorrectly, feel free to leave a comment below and we can discuss it.

Thank you for reading.

Credits:

  • meew0 for giving lots of valuable feedback which was incorporated into this paper
  • Kodehawa for confirming the limits of the normal bot style listed above

@dijour

dijour commented Sep 22, 2021

Hi there! Has this been successfully implemented before? I came across this solution because I was struggling with serverless functions on G-Cloud constantly cold-starting my Discord bot on every invocation. I'm trying to find a way to keep my bot "alive" so I can short-cut any cold starts and immediately perform functions with the bot (i.e. make posts, add roles, etc.). I noticed that my cloud functions could take nearly 5 minutes to execute after having to cold start the bot. Is this solution still feasible / is there a known method of implementing it with Google Cloud?

@DasWolke
Author

Hi there! Sorry for not noticing your comment.
For a deployment in a serverless environment, you would have to run at least the gateway 24/7. This gateway could then trigger serverless functions (your bot), a caching worker, etc. There's also the concept of "warm" functions, where you always have at least one instance of your function online, which may help with your use case.

@H01001000

@DasWolke I think it would be better to provide a full example of how to set up the bot, with actual code linking all the parts (it could all go in one file, with an array standing in for the message queue). Maybe also a client library that decodes events from CloudStorm into something like on("message"), and an adapter between your code and SnowTransfer, so that you can still use a syntax like client.channel.createMessage('channel id', 'hi there'), which is then encoded into an event (JSON object), for example:

{
  object: 'channel',
  objectId: channelId,
  method: 'createMessage',
  methodData: { message: message }
}

That event can be sent to SnowTransfer through a message queue or something similar; the adapter then decodes the event into an actual action and executes it:

client.on('event', (event) => {
    const action = adapter.decodeEvent(event);
    adapter.runAction(action);
})

So two adapters: one to connect the gateway to your code, and one to connect your code to the REST API?
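
One possible reading of this suggestion, as a minimal sketch: an array stands in for the message queue, the bot-side client only enqueues JSON-serializable actions, and the REST-side adapter decodes and runs them. Nothing here is the CloudStorm or SnowTransfer API; `createClient`, `drain` and the `handlers` table are hypothetical names.

```javascript
// Bot side: looks like a normal client call, but only enqueues an action.
function createClient(queue) {
  return {
    channel: {
      createMessage(channelId, message) {
        queue.push({
          object: 'channel',
          objectId: channelId,
          method: 'createMessage',
          methodData: { message },
        });
      },
    },
  };
}

// REST side: take actions off the queue and run the matching handler.
// A real handler would call the REST component instead of returning a string.
function drain(queue, handlers) {
  const results = [];
  while (queue.length > 0) {
    const action = queue.shift();
    const handler = handlers[action.object] && handlers[action.object][action.method];
    if (!handler) {
      throw new Error(`unsupported action ${action.object}.${action.method}`);
    }
    results.push(handler(action.objectId, action.methodData));
  }
  return results;
}

const queue = []; // the array pretending to be a message queue
const client = createClient(queue);
client.channel.createMessage('channel id', 'hi there');
const sent = drain(queue, {
  channel: { createMessage: (id, data) => `POST to ${id}: ${data.message}` },
});
console.log(sent); // [ 'POST to channel id: hi there' ]
```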

@Quintenvw

This whitepaper doesn't take the increase of max_concurrency into account, which makes the 400 shards a little deceptive, but it is well written and really addresses a major issue with the current Discord libraries.

@Malix-off

  • Comment Reply from @Quintenvw :

    a major issue with the current Discord libraries.

    Could you specify which problem you are talking about?

@Malix-off

Very nice whitepaper!
What would the concept diagram look like for a common Discord API wrapper library?
