This guide walks you through setting up alerts for manifest deployment failures with Slack notifications. You’ll create a Slack webhook and configure an alert rule to monitor specific clusters.

What You’ll Build

By the end of this guide, you’ll have:
  • A Slack webhook integration connected to Ankra
  • An alert rule that triggers when manifest jobs fail
  • Notifications sent to your Slack channel when deployments fail
  • Automatic AI-powered root cause analysis for every failure

Prerequisites

Before you begin, ensure you have:
  • An Ankra account with access to your organization’s alerts
  • Admin access to a Slack workspace (to create an app)
  • At least one cluster with manifests deployed

Part 1: Create a Slack Webhook

Step 1: Create a Slack App

  1. Go to api.slack.com/apps
  2. Click Create New App
  3. Choose From scratch
  4. Enter the app details:
    • App Name: Ankra Alerts
    • Workspace: Select your workspace
  5. Click Create App

Step 2: Enable Incoming Webhooks

  1. In your app settings, click Incoming Webhooks in the left sidebar
  2. Toggle Activate Incoming Webhooks to On
  3. Click Add New Webhook to Workspace
  4. Select the channel for alerts (e.g., #deployments or #alerts)
  5. Click Allow

Step 3: Copy the Webhook URL

After authorization, copy your webhook URL. It will look like:
https://hooks.slack.com/services/TXXXXX/BXXXXX/your-webhook-token
Keep this URL secret. Anyone with it can post to your channel.
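If you want to confirm the webhook works before wiring it into Ankra, a quick sketch like the one below is enough. It assumes only the requests library and the placeholder URL shown above; keep the real URL out of source control.

```python
import requests

# Replace with the webhook URL you copied from Slack
webhook_url = "https://hooks.slack.com/services/TXXXXX/BXXXXX/your-webhook-token"

# Incoming webhooks accept a simple JSON body with a "text" field
resp = requests.post(webhook_url, json={"text": "Test message from Ankra alert setup"})
resp.raise_for_status()   # Slack responds with HTTP 200 and "ok" on success
```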

Step 4: Create the Webhook in Ankra

  1. In Ankra, navigate to Alerts in the sidebar
  2. Click the Integrations tab
  3. Click Create Webhook
  4. Fill in the details:
    • Name: Slack Deployment Alerts
    • URL: Paste your Slack webhook URL
    • Description: Notifications for manifest failures
    • Template: Select Slack
  5. Click Create

Step 5: Test the Webhook

  1. On the webhook detail page, click Send Test
  2. Check your Slack channel for the test message
  3. If it appears, your webhook is ready!
The default Slack template includes alert name, severity, resource info, and a link to Ankra. You can customize it later if needed.

Part 2: Create the Alert Rule

Now create an alert rule that monitors manifest deployments on specific clusters.

Step 1: Start Creating an Alert

  1. Go to Alerts in Ankra
  2. Click the Alert Rules tab
  3. Click Create Alert

Step 2: Configure Basic Details

In the Details step:
  • Name: Manifest Deployment Failed
  • Description: Alert when manifest jobs fail on production clusters
  • Enabled: Toggle On
Click Next to continue.

Step 3: Select Resource Type

In the Resource step:
  1. Resource Type: Select Cluster Resource
  2. Cluster Resource Kind: Select Manifest
This tells Ankra to monitor manifest resources specifically.

Step 4: Select Specific Clusters

Under Clusters, select the clusters you want to monitor:
  1. Click the cluster dropdown
  2. Check the boxes for your target clusters (e.g., production-cluster, staging-cluster)
Only the selected clusters will trigger this alert.
Leave all clusters unchecked to monitor manifests across your entire organization.

Step 5: Configure the Condition

In the Conditions section, set up the failure condition.

Condition 1:
  • Condition Type: Job Status
  • Operator: Equals
  • Value: Failed
This condition triggers when any manifest job enters a Failed state.
You can add multiple conditions with AND/OR logic.

Example: Alert on both failures and timeouts
  • Condition 1: Job Status = Failed
  • OR
  • Condition 2: Job Status = Timeout

Example: Alert only when stuck for a while
  • Condition 1: Job Status = Failed
  • AND
  • Condition 2: Stuck in State > 5 minutes
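If it helps to picture how a rule combines its conditions, here is a rough, illustrative sketch. The field names, operators, and evaluator are assumptions made for the example, not Ankra's actual rule engine.

```python
# Illustrative only: hypothetical field names and a simplified evaluator.
from dataclasses import dataclass

@dataclass
class Condition:
    field: str        # e.g. "job_status" or "stuck_minutes" (hypothetical)
    operator: str     # "equals" or "greater_than"
    value: object

def matches(cond, resource):
    actual = resource.get(cond.field)
    if cond.operator == "equals":
        return actual == cond.value
    if cond.operator == "greater_than":
        return actual is not None and actual > cond.value
    raise ValueError("unknown operator: %s" % cond.operator)

def evaluate(conditions, logic, resource):
    """Combine the per-condition results with a single AND/OR."""
    results = [matches(c, resource) for c in conditions]
    return all(results) if logic == "AND" else any(results)

# "Alert only when stuck for a while": Failed AND stuck in state > 5 minutes
rule = [Condition("job_status", "equals", "Failed"),
        Condition("stuck_minutes", "greater_than", 5)]
print(evaluate(rule, "AND", {"job_status": "Failed", "stuck_minutes": 12}))  # True
print(evaluate(rule, "AND", {"job_status": "Failed", "stuck_minutes": 2}))   # False
```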
Click Next to continue.

Step 6: Set Severity and Cooldown

In the Severity step:
  • Severity: critical (high priority for deployment failures)
  • Cooldown: 15 minutes (prevents repeated alerts for the same issue)
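The cooldown behaves roughly like the sketch below: once a notification fires for a resource, further matches for that resource are suppressed until the window expires. This is only an illustration of the idea, not Ankra's implementation.

```python
import time

COOLDOWN_SECONDS = 15 * 60      # the 15-minute cooldown configured above
last_fired = {}                 # resource id -> timestamp of the last notification

def should_notify(resource_id):
    """Fire at most one notification per resource per cooldown window."""
    now = time.time()
    last = last_fired.get(resource_id)
    if last is not None and now - last < COOLDOWN_SECONDS:
        return False            # still cooling down: suppress the duplicate
    last_fired[resource_id] = now
    return True

# Repeated failures of the same manifest within 15 minutes produce one alert
print(should_notify("production-cluster/nginx-config"))  # True  -> notify
print(should_notify("production-cluster/nginx-config"))  # False -> suppressed
```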
Click Next to continue.

Step 7: Attach the Slack Webhook

In the Notifications step:
  1. Under Webhooks, find your Slack Deployment Alerts webhook
  2. Check the box to enable it
  3. The alert will now send notifications to Slack when triggered
Click Create to save the alert rule.

What Happens When a Manifest Fails

When a manifest deployment fails on one of your selected clusters:

Step-by-Step Breakdown

  1. Ankra detects the failure - Job status changes to Failed
  2. Alert rule evaluates - Condition matches (Job Status = Failed)
  3. Cooldown check - If no recent alert for this resource, proceed
  4. Webhook fires - Slack receives the notification immediately
  5. AI Analysis Resource created - Ankra creates a tracking resource for the analysis
  6. AI Analysis Job scheduled - A background job starts collecting data
  7. Data collection - The job gathers the following (see the sketch after this list):
    • Pod status and container states
    • Kubernetes events (warnings, errors)
    • Container logs (last 50 lines)
    • Job results and error messages
  8. AI Incident generated - Claude AI analyzes the data and produces:
    • Root cause identification
    • Severity assessment
    • Recommended actions
    • Links to affected resources
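For a sense of what that data collection looks like in practice, here is a rough equivalent using the official Kubernetes Python client. The namespace and label selector are placeholders, and Ankra's own collector may gather more or different detail.

```python
from kubernetes import client, config

config.load_kube_config()          # requires a kubeconfig for the target cluster
v1 = client.CoreV1Api()

namespace = "default"              # placeholder
selector = "app=nginx-config"      # hypothetical label for the failed manifest's pods

pods = v1.list_namespaced_pod(namespace, label_selector=selector)
for pod in pods.items:
    # Pod status and container states
    print(pod.metadata.name, pod.status.phase)
    for cs in pod.status.container_statuses or []:
        print("  container:", cs.name, "ready:", cs.ready)

    # Kubernetes warning events for this pod
    events = v1.list_namespaced_event(
        namespace,
        field_selector=f"involvedObject.name={pod.metadata.name},type=Warning",
    )
    for ev in events.items:
        print("  event:", ev.reason, ev.message)

    # Last 50 log lines, mirroring what the analysis collects
    print(v1.read_namespaced_pod_log(pod.metadata.name, namespace, tail_lines=50))
```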
View AI Incidents: Navigate to Alerts → AI Incidents tab to see all generated analyses. Each incident includes the AI’s findings, affected resources, and suggested next steps. You can also ask follow-up questions directly in the incident view.

Example Slack Notification

Your Slack channel will receive a message like:
🔴 Alert: Manifest Deployment Failed
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Severity: critical
Resource: nginx-config (Manifest)
Cluster: production-cluster

Triggered at: Dec 15, 2024, 2:34 PM

🔗 View in Ankra
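Ankra's built-in Slack template builds this payload for you. If you later customize the template, the message above maps onto Slack's incoming-webhook format roughly like this; the "View in Ankra" link target below is a placeholder.

```python
import requests

# Hand-rolled approximation of the message above, using Slack mrkdwn formatting
payload = {
    "text": (
        ":red_circle: *Alert: Manifest Deployment Failed*\n"
        "*Severity:* critical\n"
        "*Resource:* nginx-config (Manifest)\n"
        "*Cluster:* production-cluster\n"
        "*Triggered at:* Dec 15, 2024, 2:34 PM\n"
        "<https://app.ankra.example/alerts/123|View in Ankra>"  # placeholder link
    )
}
requests.post("https://hooks.slack.com/services/TXXXXX/BXXXXX/your-webhook-token",
              json=payload)
```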

Verifying Your Setup

Test your configuration end-to-end:
  1. Create a test manifest with an intentional error, such as invalid YAML (see the sketch after this list)
  2. Deploy the manifest to one of your monitored clusters
  3. Watch for the failure in the Operations page
  4. Check Slack for the alert notification
  5. Review the AI Incident for root cause analysis
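One easy way to produce such a failure is a manifest that is not valid YAML. The sketch below, which assumes PyYAML is installed, shows a deliberately broken Deployment and confirms locally that it will not parse; deploying something like it should produce the failing job this alert watches for.

```python
import yaml   # PyYAML

# A Deployment with an unterminated quote: deliberately invalid YAML
broken_manifest = """
apiVersion: apps/v1
kind: Deployment
metadata:
  name: broken-test
spec:
  replicas: "one
"""

try:
    yaml.safe_load(broken_manifest)
    print("unexpectedly parsed")
except yaml.YAMLError as err:
    print("invalid YAML, as intended:", err)
```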

Additional Alert Scenarios

To alert when manifests enter an unhealthy state (not just job failures):
  • Condition Type: Resource State
  • Operator: Equals
  • Value: Down
This triggers when a manifest’s health status becomes “Down”.
To alert when manifests are stuck in a deploying state:
  • Condition Type: Stuck in State (duration)
  • Operator: Greater than
  • Value: 10 (minutes)
This triggers when a manifest has been in a transitional state for too long.
To alert only after multiple failures:
  • Condition Type: Failed Job Count
  • Operator: Greater than or equal
  • Value: 3
This triggers after 3 or more failed jobs on the same manifest.

Troubleshooting

Alert not triggering

  • Verify the alert rule is Enabled
  • Check that the correct clusters are selected
  • Ensure the manifest is in a cluster you’re monitoring
  • Confirm the condition matches (Job Status = Failed)

Slack message not appearing

  • Test the webhook using the Send Test button
  • Verify the webhook URL is correct
  • Check that the Slack app still has channel access
  • Look for errors in the webhook’s delivery history

Too many alerts

  • Increase the cooldown period (e.g., 30 minutes)
  • Add more specific conditions to filter alerts
  • Consider monitoring only production clusters

Related Documentation

  • Alerts - Full alert rule configuration reference
  • Webhooks - Template variables and custom payloads
  • AI Incidents - Automatic root cause analysis
  • Operations - View job history and logs

Still have questions? Join our Slack community and we’ll help out.