He Slacks You When You’re Sleeping, He Slacks When You’re Awake
Using LLMs to drive your runbook in Slack
Within organizations, Slack is a universal touch point for information and coordination. The industry is starting to carve out ways for LLMs to interact with disparate systems, but Slack integrations already allow us to communicate with many different tools, all via natural language. And LLMs are great at natural language.
We’ve already built a system for converting business documents into fully-functioning APIs. Can we leverage that same system to translate a business document into a Slack agent that gives actionable guidance and steps?
Example
You’re on-call for a very important factory in the North Pole, and you receive a high-priority alert about a clog in the Magic Dust Distribution Network (MDDN). You open up your runbook and spend critical minutes looking for guidance on how to debug what’s going on. You eventually stumble on the following advice to de-ice the pipes:
sudo /opt/mddn/tools/deice-pipes.sh --force
But that didn’t fix it.
You keep reading and decide to activate the backup dust reserves:
/opt/mddn/tools/engage-backup-dust.sh --duration=60m
You know this is just a temporary fix, and you don’t know what to do next, so you decide to escalate. Once again, you keep reading and see that you’ve got to:
Spin up a new Slack channel
Invite the appropriate service’s on-calls
Start a Zoom meeting
Create a JIRA ticket for the retrospective document afterwards
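The escalation checklist above can be sketched as code. This is a minimal sketch, not a definitive implementation: `client` is anything exposing the Slack Web API methods used here (e.g. `slack_sdk.WebClient`), and the Zoom and JIRA helpers are hypothetical placeholders for whatever integrations your team uses.

```python
def escalate_incident(client, incident_id, oncall_user_ids):
    """Run the runbook's escalation checklist via the Slack API."""
    # 1. Spin up a new Slack channel for the incident
    channel = client.conversations_create(name=f"inc-{incident_id}")
    channel_id = channel["channel"]["id"]

    # 2. Invite the appropriate service's on-calls
    client.conversations_invite(channel=channel_id, users=",".join(oncall_user_ids))

    # 3. Start a Zoom meeting (hypothetical helper -- swap in your video tool)
    # zoom_url = create_zoom_meeting(topic=f"Incident {incident_id}")

    # 4. Create a JIRA ticket for the retrospective (hypothetical helper)
    # create_jira_ticket(summary=f"Retro: incident {incident_id}")

    return channel_id
```

Because the function only calls two Slack Web API methods by name, it can be exercised against a stub client in tests and wired to a real `WebClient` in production.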
Are you missing any steps? In the middle of an emergency, wouldn’t it be nice if you didn’t have to flip through the runbook finding answers to your questions? Or figuring out how to adapt the commands in the doc to your specific situation? It’d be great if someone was there to walk you through exactly what to do.
Something like:
Once it’s up and running, you can even just ask general questions about systems and the agent will happily help you out, like “I just started working here and have no idea what ROD stands for”:
Runbook
The runbook is pretty straightforward. One interesting observation is that even though runbooks are written for human consumption, many of their components map directly onto the components of a good LLM prompt.
System Overview
It covers the purpose and scope of the doc, which is equivalent to an LLM’s persona.
Critical Systems
Quick overview of the major systems at play. Providing good context is important for both LLMs and humans.
Common Incidents
More context covering commonly seen problems and solutions. This serves a double purpose: a) it gives readers quick, direct solutions to recurring problems, and b) it gives readers examples of strategies, tools, and techniques that can be applied to new incidents.
Incident Response Procedures
Standardized response formats are important for making sure we don’t accidentally skip an important component of an update when providing one.
Command Reference
This is where we list all of the commands and tools that the team has access to. We also include important context for using those tools, such as the available workshop rooms and their sizes, so we know which rooms to reserve for incidents.
Example Incident Response
An in-depth example with explicit inputs and outputs provided. This is valuable for showing concretely how some of the more general guidance in the document gets applied in practice. It’s a more specific variant of the “Common Incidents” section.
Our sample doc here has one example, but when it comes to examples, generally the more the merrier for both humans and LLMs.
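The section-to-prompt mapping described above can be made concrete with a small prompt-assembly step. This is a sketch under our own assumptions: the section titles come from the runbook structure above, but the section bodies here are illustrative stubs, and the assembly function is ours, not part of any library.

```python
# Each runbook section slots into a familiar prompt role (noted in comments).
# The bodies below are illustrative stubs, not the real runbook text.
RUNBOOK_SECTIONS = {
    "System Overview": "You support the MDDN on-call rotation...",     # persona
    "Critical Systems": "The MDDN distributes magic dust via...",      # context
    "Common Incidents": "Clogged pipes: de-ice, then check flow...",   # known problems + fixes
    "Incident Response Procedures": "Every update must include...",    # output format
    "Command Reference": "deice-pipes.sh --force: de-ices pipes...",   # available tools
    "Example Incident Response": "Input: clog alert. Output: ...",     # worked example
}

def build_system_prompt(sections: dict[str, str]) -> str:
    """Concatenate runbook sections into a single system prompt."""
    parts = ["You are an on-call assistant. Ground every answer in the runbook below."]
    for title, body in sections.items():
        parts.append(f"## {title}\n{body}")
    return "\n\n".join(parts)
```

Keeping the runbook as the single source of truth and assembling the prompt mechanically means doc edits flow into the agent with no separate prompt to maintain.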
Agent
Unlike most Slackbots, which need to be @mentioned or triggered by keywords, our agent implementation passively listens to all communications in channels it has access to. It’s important that, from your perspective, the agent feels like just another team member and doesn’t need any special treatment.
It should just chime in when it feels it has something to contribute.
As a result, we send every message (along with recent history) to the LLM agent. There will be many messages that it doesn’t need to respond to, so the first thing we have our agent generate is a shouldRespond bit to make sure that it doesn’t chime in when people are talking about unrelated things, like having lunch.
It’s able to discriminate quite well:
LLMs are amazing translation tools – not just between different languages, but between different modes of operation. Combining this with Slack as a business tool, and its many preexisting integrations, allows us to streamline a wide variety of workflows.
We are just starting to scratch the surface of having business documents live as both functioning APIs and as agents you can interact with, but our early results are very promising.
We’re planning on making LOGIC, Inc.’s Slack integration generally available in early 2025. If you’re interested, sign up for our beta and we’ll keep you posted.
Learn more about LOGIC, Inc. at https://logic.inc