Building a Serverless SlackBot with Bolt on Vercel - Things to Know

Some gotchas from my recent experience of building a serverless Next.JS + Bolt.JS Slack App on Vercel.

Note that if you're building an app you want to distribute to other workspaces, AFAIK you need to build a public API. So, Next.JS is used here to provide that public API. The alternative to an API is using "socket mode".
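
For comparison, here is a minimal sketch of what the Socket Mode alternative looks like with Bolt (the command name and env var names are placeholders, not from this project). Socket Mode holds a long-lived WebSocket open to Slack, so it needs a persistent process rather than a Vercel serverless function, which is why this gist sticks with the public API approach.

// Minimal Socket Mode sketch, for comparison only (not what this gist uses)
import { App } from '@slack/bolt'

const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  appToken: process.env.SLACK_APP_TOKEN, // xapp- app-level token with connections:write
  socketMode: true, // Slack pushes events over a WebSocket; no public endpoint needed
})

app.command('/my-command', async ({ ack, say }) => {
  await ack()
  await say('Handled over Socket Mode')
})

;(async () => {
  await app.start()
})()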

Slack API with Bolt must use /slack/events endpoint

  • When building out the API, Bolt ONLY uses the /slack/events endpoint. The Slack config settings will suggest you provide a different endpoint per feature, like /slack/commands for Slash Commands. That would work if you weren't going through Bolt (for example, with the Python API), but the Node Bolt framework routes EVERYTHING through /slack/events. You can still use Bolt functions like app.command() and similar, just remember to put the /slack/events endpoint in the Slack config.
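
To make that concrete, here's a minimal sketch (the command name and env var names are placeholders): Bolt's built-in receiver exposes a single POST endpoint at /slack/events, and every feature, including Slash Commands, is registered in code against that one endpoint.

// Minimal sketch: everything in Bolt goes through /slack/events
import { App } from '@slack/bolt'

// Bolt's built-in receiver listens on a single endpoint: POST /slack/events
const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
})

// Slash Commands are still registered with app.command(),
// but the Request URL in the Slack app config must point at .../slack/events
app.command('/my-command', async ({ ack, say }) => {
  await ack()
  await say('Handled via /slack/events')
})

;(async () => {
  await app.start(Number(process.env.PORT) || 3000)
})()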

Serverless is not supported with Bolt

  • Serverless is not officially supported by Slack with the Bolt API. It is possible if you take a look at this Vercel serverless Next.JS + Bolt app. However, beware that the project is only good for an app that always responds IMMEDIATELY. If your app uses any 3rd party endpoints or does anything that takes a second or two, Slack will throw an "operation_timeout" error.
  • Serverless with Bolt puts you in a bind
    • If you respond immediately to an event (to avoid Slack timing out after 3 seconds), any time-consuming code you have will get prematurely terminated when the serverless function ends. This happens because Next + Bolt sees that the endpoint gave a response, assumes the work is done, and terminates anything still running in the callback. Also note that Bolt with processBeforeResponse: true will purposefully delay the ack() until the entire callback is done. On the one hand, this ensures your function does not terminate early; on the other hand, ack() may not get sent within Slack's 3-second timeout window (see the sketch after this list).
    • If you avoid responding immediately to give your serverless function time to finish, Slack will time out after 3 seconds and show the user an error. Strangely, with Vercel you might still be able to run everything you wanted to in the callback and post a new message, but Slack will show the user an error all the same.
  • Fixes for the serverless issue above (with long-running tasks)
    • Send time-consuming work to a queueing system (e.g. AWS SQS), but that adds a lot of complexity to Bolt, a framework that was supposed to make things simple!
      • I ultimately ended up using a variation of this method, as described at the bottom of this document: send the long-running job to a separate Next.JS endpoint.
    • Ditch Slack's Bolt framework. Slack has no plans to fix this long-running task issue with serverless Bolt. However, Slack's Python API does have a feature that makes serverless with long-running tasks actually work.
      • Alternatively, Vercel is currently working on an example Slack App that runs serverlessly on Vercel and has all the required auth functions that Bolt has.
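
Here's the sketch mentioned above: a rough illustration of the processBeforeResponse trade-off, assuming a custom ExpressReceiver. The command name, env var names, and the 5-second delay standing in for a slow 3rd party call are all illustrative, not from this project.

// Sketch of the processBeforeResponse trade-off (names and timings are illustrative)
import { App, ExpressReceiver } from '@slack/bolt'

const receiver = new ExpressReceiver({
  signingSecret: process.env.SLACK_SIGNING_SECRET!,
  // true  -> Bolt holds the HTTP response until the whole listener finishes,
  //          so the function isn't killed early, but ack() may miss Slack's 3s window.
  // false -> ack() responds right away, but the serverless function can be
  //          terminated before slow work in the listener completes.
  processBeforeResponse: true,
})

const app = new App({ token: process.env.SLACK_BOT_TOKEN, receiver })

app.command('/slow-command', async ({ ack, say }) => {
  await ack() // with processBeforeResponse: true, this response is held back
  await slowThirdPartyCall() // anything pushing past ~3 seconds risks operation_timeout
  await say('Done!')
})

// Stand-in for any slow 3rd party request
async function slowThirdPartyCall() {
  await new Promise((resolve) => setTimeout(resolve, 5000))
}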

Helpful Links

StackOverflow - "How to avoid slack command timeout error?"

StackOverflow - ack() does not send immediately, waits for entire workflow to finish before sending

GitHub Issues - Preventing AWS Lambdas from self-terminating when an ack() is sent

Helpful Example Projects

Vercel SlackBot WITHOUT using Bolt. Beware that this is a simple example and does not do Slack Install (OAuth) for you; Bolt would handle this automatically. Vercel says they're working on another example Slack app that would also do OAuth for you.

Modified Bolt.JS for web frameworks. Not sure how well this works with serverless, but it seems to be built to work better with Next.JS and similar frameworks. Built by someone who works on Bolt.

Next.JS + Bolt boilerplate. I linked this further above. The Next.JS integration seems to work nicely, but it works with serverless only if your app responds instantly to events. Note that this is slightly different from the "Modified Bolt.JS" project in that it does use a "custom receiver"; the modified Bolt.JS project purposefully avoids a custom receiver.

The (Very) Hacky Way I Did It

I did manage to get long-running serverless tasks to work using the "Next.JS + Bolt boilerplate" linked above. The approach is basically: hand the long-running work off to a separate job so the initial function can return quickly, and use a second Next.JS endpoint as that job to keep the infrastructure simple.

  • My project lives in Vercel world; while Vercel does use AWS Lambda behind the scenes, I was not interested in setting up extra AWS infrastructure to run AWS SQS jobs.
  • The work-around in Vercel was to create another Next.JS endpoint, which runs as a separate Vercel function. This separate "worker" function still uses Bolt, but only for sending events to Slack, not listening for them.
  • The initial API function sends a network request to this second "worker" function with all the data from Slack about the event. It does NOT await the axios.post, because awaiting the post would mean waiting for the entire "worker" function to finish, defeating the whole point. Instead it "fires and forgets" the request.
    • Something important to note: the axios.post request can actually get interrupted by the Lambda terminating before the request has been sent. So, the hacky fix is to use await new Promise(resolve => setTimeout(resolve, 500)) after axios.post to ensure the request gets sent off.
  • For the "worker" function I used res.end() to end the function. Don't just call res.status(200) without ending the response; it would just hang since the initial function was already terminated, and the worker function would end up timing out after 30 or 60 seconds.
  • Don't forget that this worker function is also a public endpoint, so validation should be treated the same for both endpoints.
    • In the example below I crudely used an arbitrary INTERNAL_WORKER_TOKEN to only accept requests coming from the internal function. There's probably a more robust way to do this.

Sample Code for the initial /api/slack/events function

A portion of the file at /pages/api/[[...route]].ts

// Slack Slash Command for /command-a
app.command('/command-a', async ({ ack, body, context, say }) => {
  // Let the user know we're working on it
  const workingOnItMessage = await say({
    text: `:building_construction: Working on this long running task.`
  })

  // Fire-and-forget: POST the event data (as JSON) to the /api/worker endpoint.
  // Intentionally NOT awaited, so this handler doesn't wait for the worker to finish.
  axios.post(
    'https://your-app-url-here.vercel.app/api/worker',
    {
      command: '/command-a',
      body,
      context,
      workingOnItMessage,
      internalWorkerToken: process.env.INTERNAL_WORKER_TOKEN,
    },
    {
      headers: {
        'Content-Type': 'application/json',
      },
    }
  );

  await ack()
  // HACK - Ensure that the axios.post request gets sent out
  await new Promise(resolve => setTimeout(resolve, 500))
})

Sample Code for the worker /api/worker function

A portion of the (same) file at /pages/api/[[...route]].ts

router.post('/api/worker', async (req: NextApiRequest, res: NextApiResponse) => {
  if (req.method === 'POST') {
    // Check that the request is coming from an internal serverless function
    if (!req.body.internalWorkerToken || req.body.internalWorkerToken !== process.env.INTERNAL_WORKER_TOKEN) {
      return res.end()
    }
    
    let command: string = req.body.command
    let workingOnItMessage: any = req.body.workingOnItMessage
    let slackReqBody: any = req.body.body
    let context: any = req.body.context

    if (command === '/command-a') {
      await runCommandA({ body: slackReqBody, context, workingOnItMessage })
    }

    // Force this worker function to terminate now
    res.end()
  }
})
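
For completeness, here's a hypothetical sketch of what a runCommandA helper could look like; the original gist does not show this function, so the structure below is my assumption. The idea is that the worker only sends to Slack via the Web API, using the say() result forwarded from the first function to update the placeholder message.

// Hypothetical sketch of a runCommandA helper (not part of the original gist).
// It only *sends* to Slack via the Web API; it never listens for events.
import { WebClient } from '@slack/web-api'

interface RunCommandAArgs {
  body: any               // slash-command payload forwarded by the initial function
  context: any            // Bolt context, which includes the bot token for this install
  workingOnItMessage: any // the say() result, so the placeholder message can be updated
}

async function runCommandA({ body, context, workingOnItMessage }: RunCommandAArgs) {
  const client = new WebClient(context.botToken)

  // ...the actual long-running work goes here (3rd party calls, DB queries, etc.)...
  await new Promise((resolve) => setTimeout(resolve, 5000))

  // Replace the ":building_construction: Working on this..." placeholder with the result
  await client.chat.update({
    channel: workingOnItMessage.channel,
    ts: workingOnItMessage.ts,
    text: `Finished the long-running task for: ${body.text}`,
  })
}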

Installation Store Database

For the installation store database, Upstash seemed like the quickest and easiest to set up with Vercel. It has a Vercel integration that worked nicely, and the free plan looks perfect for small Slack apps at 10K commands a day. I paired it with ioredis for the fetchInstallation, storeInstallation, and deleteInstallation functions. One issue I ran into: Bolt fetching the installation when the function starts up was not a timeout problem, but fetching the installation again later re-runs the network request to Upstash, which pushed the app into Slack timeout territory. The crude solution was to cache the installation locally on the serverless function, so that between Bolt auto-running fetchInstallation and an event handler like app.command running, the installation object is still handy without another network request to Upstash.
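
Here's a rough sketch of that caching approach. The env var name, key scheme, and loose typing are my assumptions for illustration, not code from the original app.

// Sketch of an installationStore backed by Upstash via ioredis, with a local cache.
import Redis from 'ioredis'

const redis = new Redis(process.env.UPSTASH_REDIS_URL!)

// Lives as long as the warm serverless function, so an event handler
// doesn't have to re-run the network request to Upstash
let cachedInstallation: any

export const installationStore = {
  storeInstallation: async (installation: any) => {
    const id = installation.team?.id ?? installation.enterprise?.id
    await redis.set(`installation:${id}`, JSON.stringify(installation))
  },
  fetchInstallation: async (query: any) => {
    if (cachedInstallation) return cachedInstallation // skip the Upstash round-trip
    const raw = await redis.get(`installation:${query.teamId ?? query.enterpriseId}`)
    if (!raw) throw new Error('Installation not found')
    cachedInstallation = JSON.parse(raw)
    return cachedInstallation
  },
  deleteInstallation: async (query: any) => {
    await redis.del(`installation:${query.teamId ?? query.enterpriseId}`)
    cachedInstallation = undefined
  },
}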

A more reliable way to do jobs in Vercel

Using a Next.JS endpoint isn't the most robust way to run a job. ServerlessQ looks pretty nice for serverless with a good Vercel integration. AWS SQS looks like overkill, so if I transition away from a Next.JS endpoint, ServerlessQ is the way I'm leaning.

@Christopher-Hayes (Author) commented Aug 17, 2023

An update - I just noticed Vercel has an official blog post / example project on this now, and it seems pretty capable. I trust Vercel has optimized it for performance. It doesn't use Slack's Bolt framework, but that's kind of the situation we're in with trying to use the Slack API in a Node + serverless environment.

This gist mentions that Vercel's example doesn't handle OAuth for you; the example below appears to be the "new example" Vercel was working on, and it now handles OAuth with some custom code.

https://upstash.com/blog/vercel-note-taker-slackbot

@AlexIsMaking commented Aug 17, 2023

Great, thanks for sharing that.

Another option that I've discovered recently - Inngest lets you send a response from Vercel immediately and then process the request in the background - https://www.inngest.com/docs/guides/enqueueing-future-jobs. I've only just started using it but it's looking like a must-have tool when working in serverless environments in general.

@Christopher-Hayes (Author) replied:

Very cool, thanks! Wasn't aware of them for bg jobs.

@enesakar commented Jan 8, 2024

qstash would be another option: https://upstash.com/docs/qstash/overall/getstarted

@leerob commented Feb 16, 2024

If you don't need Bolt specifically, we recently built a Slackbot using their Rest API https://vercel.com/templates/other/openai-gpt-slackbot-vercel-functions!
