website/content/blog/webhook-notifications-on-systemd-service-failure.md
2024-09-04 21:42:09 -07:00

---
title: "Webhook notifications on systemd service failure"
date: 2024-09-04T21:03:38-07:00
draft: false
tags: []
math: false
medium_enabled: false
---

Every morning, like every good system administrator, I log onto all my machines and type the following command:

systemctl --failed

This gives me a list of all my systemd services that have failed. I pray that it's empty:

  UNIT LOAD ACTIVE SUB DESCRIPTION
0 loaded units listed.

Except, I don't.

Instead, I have it set up so that I receive a webhook notification via Zulip whenever a service fails. With the right infrastructure in place, it's as simple as adding an OnFailure line to all the services you want to monitor.

Step 1: Setting up the webhook.

On Zulip, I use the Slack incoming webhook integration. (Note the URL specification)

As you might guess, this style of webhook works on Slack and on Discord as well.

For our notification script, we'll need two environment variables:

| Name | Description |
| --- | --- |
| SERVICE | The name of the systemd service. We will populate this automatically. |
| WEBHOOK_URL | The URL to send the webhook to. This is specific to the chat application. |

We'll also need the following CLI applications installed:

| Name | Description |
| --- | --- |
| curl | Sends the POST request. |
| jq | Sanitizes the log output before sending it to curl. |

The script /bin/webhook-notify.sh

#!/bin/bash

if [ -z "$SERVICE" ]; then
    echo "SERVICE variable not set or empty"
    exit 1
fi

if [ -z "$WEBHOOK_URL" ]; then
    echo "WEBHOOK_URL variable not set or empty"
    exit 1
fi

if ! command -v jq &> /dev/null; then
    echo "jq is not installed"
    exit 1
fi

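# Capture the unit's status and encode it as a single JSON string
LOG_CONTENTS=$(systemctl status --full --no-pager "${SERVICE}" | jq -Rsa .)

# Quote the URL so that characters such as ? and & are passed through untouched
curl -X POST --data-urlencode "payload={\"text\": $LOG_CONTENTS}" "${WEBHOOK_URL}"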
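
To see what the jq step contributes, here is a minimal sketch that isolates it. The function names `json_quote` and `build_payload` and the sample status text are made up for illustration; only the `jq -Rsa .` invocation comes from the script above.

```shell
#!/bin/bash

# Turn arbitrary text on stdin into one JSON string literal.
#   -R  read the input as raw text instead of JSON
#   -s  slurp every line into a single string
#   -a  escape non-ASCII characters in the output
json_quote() {
    jq -Rsa .
}

# Wrap an already-quoted JSON string in the Slack-style body.
build_payload() {
    printf '{"text": %s}\n' "$1"
}

# Demo with made-up status text containing a newline and double quotes.
if command -v jq > /dev/null 2>&1; then
    quoted=$(printf 'Active: failed\nResult: "exit-code"' | json_quote)
    build_payload "$quoted"
fi
```

The newline and the inner double quotes come out escaped, so the log text cannot break out of the surrounding JSON object.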

Make the script executable:

chmod u+x /bin/webhook-notify.sh

At this point, you should be able to test the script and make sure you get notifications. Set the two environment variables and run the script.

Example:

export WEBHOOK_URL="https://INSERT-NAMESPACE.zulipchat.com/api/v1/external/slack_incoming?api_key=INSERT-API-KEY&stream=INSERT-STREAM-ID&topic=Systemd"
export SERVICE=NetworkManager
/bin/webhook-notify.sh

Step 2: Set Up the Systemd Service

When a systemd unit fails, we are able to call another systemd service. The service that we'll call will run our script from the last step.

In /etc/systemd/system/webhook-notify@.service

[Unit]
Description=Send Systemd Notifications via Webhook

[Service]
Type=oneshot
Environment=WEBHOOK_URL="INSERT-WEBHOOK-URL-HERE"
Environment=SERVICE=%i
ExecStart=/bin/webhook-notify.sh

[Install]
WantedBy=multi-user.target

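Note the @ in the filename. This is important because it makes the unit a template: the service runs with the failed unit's name as the argument that appears after the @. Within the unit file, this argument is available as the %i specifier, which we pass to the script through the SERVICE variable.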
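
As a rough illustration of how the name after the @ is read, the sketch below pulls the instance out of a unit name with plain shell parameter expansion. The helper `unit_instance` is hypothetical and only mimics the naming rule; systemd does this internally, and the sketch assumes the name contains an @.

```shell
#!/bin/bash

# Print the instance part of a template unit name: the text between the
# first "@" and the type suffix. This mirrors what %i expands to.
unit_instance() {
    local name="${1%.*}"          # drop the type suffix (.service)
    printf '%s\n' "${name#*@}"    # drop everything up to the first "@"
}

unit_instance "webhook-notify@NetworkManager.service"
```

For the test command in this step, that instance is NetworkManager, which is the value the script receives as SERVICE.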

Example test:

sudo systemctl start webhook-notify@NetworkManager

Step 3: Add OnFailure to all the services we want to monitor

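Within the [Unit] section of each systemd service you want to monitor, add the following:

OnFailure=webhook-notify@%n.service

Here %n expands to the full name of the failing unit. The %i specifier is empty for ordinary units, so it would only work if the monitored unit were itself a template instance.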
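
If you would rather not edit a packaged unit file directly, one option is a drop-in file. The sketch below assumes the standard drop-in layout (a `<unit>.d` directory containing `.conf` files); the function name `add_onfailure_dropin` and the file name `onfailure.conf` are made up for illustration. It uses the `%n` specifier, which expands to the full name of the unit the drop-in applies to.

```shell
#!/bin/bash

# Write a drop-in that adds OnFailure= to an existing unit without editing it.
# Usage: add_onfailure_dropin UNIT [BASE_DIR]
add_onfailure_dropin() {
    local unit="$1"
    local base="${2:-/etc/systemd/system}"
    local dir="${base}/${unit}.d"

    mkdir -p "$dir" || return 1

    # Quoted heredoc delimiter: the % specifier must reach systemd untouched.
    cat > "${dir}/onfailure.conf" <<'EOF'
[Unit]
OnFailure=webhook-notify@%n.service
EOF
}
```

Run as root, `add_onfailure_dropin nginx.service` would write /etc/systemd/system/nginx.service.d/onfailure.conf (nginx.service is only an example name). Follow it with `systemctl daemon-reload` so systemd picks up the change.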