Prometheus Alertmanager: Dynamic Slack Config

Are you using Prometheus Alertmanager with Slack integration? It’s very likely that you need some alerts to go to one channel, and other alerts to go elsewhere. To accomplish this, you might configure a receiver for each team and create a route that matches on the alertname or label. You’ll end up with a configuration that looks something like this:

alertmanager:
  config:
    global:
      slack_api_url: "https://hooks.slack.com/services/XYZ"
    route:
      receiver: 'default-receiver'
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      group_by: [alertname, job]
      routes:
      - receiver: 'billing-team'
        match:
          alertname: BillingAlert
      - receiver: 'database-team'
        match:
          alertname: DatabaseAlert
      - receiver: 'frontend-team'
        match:
          alertname: FrontendAlert                    
    receivers:
    - name: 'billing-team'
      slack_configs:
      - send_resolved: true
        username: "alertmanager"
        channel: "billing-alerts"
    - name: 'database-team'
      slack_configs:
      - send_resolved: true
        username: "alertmanager"
        channel: "database-alerts"
    - name: 'frontend-team'
      slack_configs:
      - send_resolved: true
        username: "alertmanager"
        channel: "frontend-alerts"
    ## more team configs ...       

With this setup, as your teams grow and change, so does your Alertmanager configuration. For me, this was bothersome, so I came up with what I thought was a really good idea: set the Slack channel dynamically based on a label. Sounds good, but how? According to the documentation, the channel field takes a template rather than a plain string. From there, I needed to figure out how to pull a specific label and use its value as the channel name. My solution:

channel: '{{ index ((index .Alerts 0).Labels) "slack_channel" }}'

Nice! Let me explain it. Alerts are fired as a "group" by alertname and job; this way, every alert group contains alerts only for a specific job (or team). Because of this, every alert in the group will carry the same slack_channel label, so I can just take the first alert and go from there. With the alert at index 0, I select my desired label, slack_channel, from its label set. Here is what the entire config looks like:

alertmanager:
  config:
    global:
      slack_api_url: "https://hooks.slack.com/services/XYZ"
    route:
      receiver: 'default-receiver'
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      group_by: [alertname, job]
      routes:
      - receiver: 'team-receiver'
        match_re:
          slack_channel: "^[@#a-z0-9][a-z0-9._-]*$"
    receivers:
    - name: 'default-receiver'
      slack_configs:
      - send_resolved: true
        username: "alertmanager"
        channel: "prom-alerts"
    - name: 'team-receiver'
      slack_configs:
      - send_resolved: true
        username: "alertmanager"
        channel: '{{ index ((index .Alerts 0).Labels) "slack_channel" }}'

We now have a receiver, named team-receiver, that dynamically picks the team's Slack channel from a label. Teams just need to set the slack_channel label in their alert rules to get notifications wherever they want. One more important note: the route uses a regex match to check whether the slack_channel label is present. If the label is missing, the alert falls through to the default-receiver instead.
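
If the template expression looks opaque, here is a minimal, standalone Go sketch that evaluates the same expression. The Alert and Data structs are simplified stand-ins I made up for the data Alertmanager passes to notification templates, not Alertmanager's actual types; the point is only to show how the index expression resolves.

package main

import (
    "os"
    "text/template"
)

// Alert and Data are simplified stand-ins for the notification payload;
// just enough structure to evaluate the channel expression.
type Alert struct {
    Labels map[string]string
}

type Data struct {
    Alerts []Alert
}

func main() {
    // The same expression used for the receiver's channel field.
    tmpl := template.Must(template.New("channel").Parse(
        `{{ index ((index .Alerts 0).Labels) "slack_channel" }}`))

    // Two alerts from the same group. Because the route groups by alertname
    // and job, both carry the same slack_channel label, so alert 0 is a
    // safe representative.
    data := Data{Alerts: []Alert{
        {Labels: map[string]string{"alertname": "InstanceDown", "job": "billing", "slack_channel": "billing"}},
        {Labels: map[string]string{"alertname": "InstanceDown", "job": "billing", "slack_channel": "billing"}},
    }}

    // Prints: billing
    if err := tmpl.Execute(os.Stdout, data); err != nil {
        panic(err)
    }
}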

Alert Rule Configuration

Now that we can receive team alerts dynamically, let's set up a team's alert rule and pass the slack_channel label. Here is a simple rule that checks that a job is up and sends an alert to the billing channel if it goes down.

      groups:
      - name: billing-job
        rules:
        - alert: InstanceDown
          expr: up{job="billing"} == 0
          for: 5m
          labels:
            severity: critical
            slack_channel: billing
          annotations:
            summary: "Instance {{ "{{$labels.service}}:{{$labels.instance}}" }} down"
            description: "{{ "{{$labels.instance}}" }} of job {{ "{{$labels.job}}" }} has been down for more than 5 minutes."
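
Onboarding another team is then just another rule with a different slack_channel value; the Alertmanager config itself never changes. For example, a hypothetical frontend team's rule (the job and channel names here are placeholders) might look like:

      groups:
      - name: frontend-job
        rules:
        - alert: InstanceDown
          expr: up{job="frontend"} == 0
          for: 5m
          labels:
            severity: critical
            slack_channel: frontend-alerts
          annotations:
            summary: "Instance {{ $labels.instance }} down"
            description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."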

Summary

That was fun: we've just set up a single receiver to use across multiple teams. This was all made possible because the channel configuration takes a template rather than a simple string. With this information, there are likely other cool tricks you can do. I hope you found this helpful and easy to follow.
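
One such trick: if you'd rather send everything through a single catch-all receiver, the template's built-in or function can supply a fallback channel whenever the slack_channel label is missing (a reader sketches the same idea in the comments below). A sketch, reusing the prom-alerts channel from the default receiver above:

channel: '{{ or (index ((index .Alerts 0).Labels) "slack_channel") "prom-alerts" }}'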

2 Comments

  1. Hi! Thanks for the article. I was wondering whether, in the first hard-coded example, you wouldn't need to prefix the channel name with a hash, e.g.
    channel: "#frontend-alerts"
    instead of:
    channel: "frontend-alerts"
    Thanks!

  2. Nice post. I think I added a 'fallback/default' value to the Slack channel config:
    '{{ or (index ((index .Alerts 0).Labels) "SlackCodeownerChannel") "SLACK_DEFAULT_CHANNEL" }}'
    But I am admittedly bad at Go templating and Alertmanager.
