This guide walks you through setting up alerts for manifest deployment failures with Slack notifications. You’ll create a Slack webhook and configure an alert rule to monitor specific clusters.
What You’ll Build
By the end of this guide, you’ll have:
- A Slack webhook integration connected to Ankra
- An alert rule that triggers when manifest jobs fail
- Notifications sent to your Slack channel when deployments fail
- Automatic AI-powered root cause analysis for every failure
Prerequisites
Before you begin, ensure you have:
- An Ankra account with access to your organization’s alerts
- Admin access to a Slack workspace (to create an app)
- At least one cluster with manifests deployed
Part 1: Create a Slack Webhook
Step 1: Create a Slack App
- Go to api.slack.com/apps
- Click Create New App
- Choose From scratch
- Enter the app details:
  - App Name: Ankra Alerts
  - Workspace: Select your workspace
- Click Create App

Step 2: Enable Incoming Webhooks
- In your app settings, click Incoming Webhooks in the left sidebar
- Toggle Activate Incoming Webhooks to On
- Click Add New Webhook to Workspace
- Select the channel for alerts (e.g., `#deployments` or `#alerts`)
- Click Allow
Step 3: Copy the Webhook URL
After authorization, copy your webhook URL. It will look like:
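```
https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
```
The identifiers above are placeholders. Treat the real URL as a secret: anyone who has it can post messages to your channel.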
Step 4: Create the Webhook in Ankra
- In Ankra, navigate to Alerts in the sidebar
- Click the Integrations tab
- Click Create Webhook
- Fill in the details:
| Field | Value |
|---|---|
| Name | Slack Deployment Alerts |
| URL | Paste your Slack webhook URL |
| Description | Notifications for manifest failures |
| Template | Select Slack |
- Click Create
Step 5: Test the Webhook
- On the webhook detail page, click Send Test
- Check your Slack channel for the test message
- If it appears, your webhook is ready!
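If you want to verify the webhook outside Ankra (for example, to rule out the webhook itself while troubleshooting), you can post to it directly. A minimal sketch in Python, assuming you replace the placeholder URL with the one you copied in Step 3:

```python
# Post a test message straight to the Slack incoming webhook.
# Replace the placeholder URL with the webhook URL from Step 3.
import json
import urllib.request

WEBHOOK_URL = "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"

payload = {"text": "Test alert from Ankra webhook setup"}
request = urllib.request.Request(
    WEBHOOK_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    # Slack answers an accepted message with a plain-text "ok" body.
    print(response.status, response.read().decode())
```

Slack replies with a 200 status and an `ok` body when the message is accepted; anything else usually points at a wrong URL or revoked channel access.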
Part 2: Create the Alert Rule
Now create an alert rule that monitors manifest deployments on specific clusters.
Step 1: Start Creating an Alert
- Go to Alerts in Ankra
- Click the Alert Rules tab
- Click Create Alert
Step 2: Configure Basic Details
In the Details step:
| Field | Value |
|---|---|
| Name | Manifest Deployment Failed |
| Description | Alert when manifest jobs fail on production clusters |
| Enabled | Toggle On |
Step 3: Select Resource Type
In the Resource step:
- Resource Type: Select Cluster Resource
- Cluster Resource Kind: Select Manifest
Step 4: Select Specific Clusters
Under Clusters, select the clusters you want to monitor:
- Click the cluster dropdown
- Check the boxes for your target clusters (e.g., `production-cluster`, `staging-cluster`)
- Only selected clusters will trigger this alert
Leave all clusters unchecked to monitor manifests across your entire organization.
Step 5: Configure the Condition
In the Conditions section, set up the failure condition:
Condition 1:
| Field | Value |
|---|---|
| Condition Type | Job Status |
| Operator | Equals |
| Value | Failed |
This condition matches whenever a manifest job enters the Failed state.
Optional: Add additional conditions
You can add multiple conditions with AND/OR logic:
Example: Alert on both failures and timeouts
- Condition 1: Job Status = Failed
- OR
- Condition 2: Job Status = Timeout
Example: Alert on failures that have been stuck for more than 5 minutes
- Condition 1: Job Status = Failed
- AND
- Condition 2: Stuck in State > 5 minutes
Step 6: Set Severity and Cooldown
In the Severity step:
| Field | Value | Description |
|---|---|---|
| Severity | critical | High priority for deployment failures |
| Cooldown | 15 minutes | Prevent repeated alerts for the same issue |
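Conceptually, the cooldown suppresses repeat notifications for the same resource until the window has passed. A small sketch of that idea (an illustration only, not Ankra's implementation):

```python
# Illustration of cooldown semantics only - not Ankra's actual implementation.
from datetime import datetime, timedelta

COOLDOWN = timedelta(minutes=15)
_last_fired: dict[str, datetime] = {}  # resource id -> time the last alert fired

def should_notify(resource_id: str, now: datetime) -> bool:
    """Return True when no alert has fired for this resource inside the cooldown window."""
    last = _last_fired.get(resource_id)
    if last is not None and now - last < COOLDOWN:
        return False  # still cooling down, suppress the duplicate
    _last_fired[resource_id] = now
    return True
```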
Step 7: Attach the Slack Webhook
In the Notifications step:
- Under Webhooks, find your `Slack Deployment Alerts` webhook
- Check the box to enable it
- The alert will now send notifications to Slack when triggered
What Happens When a Manifest Fails
When a manifest deployment fails on one of your selected clusters, Ankra runs through the sequence below.
Step-by-Step Breakdown
- Ankra detects the failure - Job status changes to Failed
- Alert rule evaluates - Condition matches (Job Status = Failed)
- Cooldown check - If no recent alert for this resource, proceed
- Webhook fires - Slack receives the notification immediately
- AI Analysis Resource created - Ankra creates a tracking resource for the analysis
- AI Analysis Job scheduled - A background job starts collecting data
- Data collection - The job gathers (a manual equivalent is sketched after this breakdown):
- Pod status and container states
- Kubernetes events (warnings, errors)
- Container logs (last 50 lines)
- Job results and error messages
- AI Incident generated - Claude AI analyzes the data and produces:
- Root cause identification
- Severity assessment
- Recommended actions
- Links to affected resources
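The collection and analysis happen inside Ankra, but if you want to pull roughly the same evidence by hand while reviewing an incident, a sketch along these lines works; the namespace and label selector are placeholders for the workload behind your failing manifest:

```python
# Gather roughly the same evidence Ankra's analysis job collects, using kubectl.
# NAMESPACE and LABEL_SELECTOR are placeholders - point them at your workload.
import subprocess

NAMESPACE = "my-app"
LABEL_SELECTOR = "app=my-app"

def kubectl(*args: str) -> str:
    """Run a kubectl command and return its stdout."""
    return subprocess.run(
        ["kubectl", *args], capture_output=True, text=True, check=False
    ).stdout

# Pod status and container states
print(kubectl("get", "pods", "-n", NAMESPACE, "-l", LABEL_SELECTOR, "-o", "wide"))

# Kubernetes warning events
print(kubectl("get", "events", "-n", NAMESPACE, "--field-selector", "type=Warning"))

# Last 50 lines of logs from every container in the matching pods
pods = kubectl(
    "get", "pods", "-n", NAMESPACE, "-l", LABEL_SELECTOR,
    "-o", "jsonpath={.items[*].metadata.name}",
).split()
for pod in pods:
    print(kubectl("logs", pod, "-n", NAMESPACE, "--tail=50", "--all-containers"))
```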
View AI Incidents: Navigate to Alerts → AI Incidents tab to see all generated analyses. Each incident includes the AI’s findings, affected resources, and suggested next steps. You can also ask follow-up questions directly in the incident view.
Example Slack Notification
Your Slack channel will receive a message like:
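The exact formatting comes from the Slack template you selected, so treat this as a rough illustration (the values are placeholders drawn from the rule configured above):

```
Manifest Deployment Failed
Severity: critical
Cluster: production-cluster
Condition: Job Status = Failed
Alert when manifest jobs fail on production clusters
```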
Verifying Your Setup
Test your configuration end-to-end:
- Create a test manifest with an intentional error (e.g., invalid YAML; see the example after this list)
- Deploy the manifest to one of your monitored clusters
- Watch for the failure in the Operations page
- Check Slack for the alert notification
- Review the AI Incident for root cause analysis
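A minimal sketch of such a test manifest, assuming a simple Deployment is safe to apply in the monitored cluster; the name is hypothetical and the `replicas` value is deliberately invalid so the apply job fails:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alert-test            # hypothetical name - remove the manifest after the test
spec:
  replicas: "three"           # deliberately invalid: replicas must be an integer
  selector:
    matchLabels:
      app: alert-test
  template:
    metadata:
      labels:
        app: alert-test
    spec:
      containers:
        - name: app
          image: nginx:1.27
```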
Additional Alert Scenarios
Alert on manifest state changes
To alert when manifests enter an unhealthy state (not just job failures):
| Field | Value |
|---|---|
| Condition Type | Resource State |
| Operator | Equals |
| Value | Down |
This triggers when a manifest’s health status becomes “Down”.
Alert on stuck deployments
To alert when manifests are stuck in a deploying state:
| Field | Value |
|---|---|
| Condition Type | Stuck in State (duration) |
| Operator | Greater than |
| Value | 10 (minutes) |
This triggers when a manifest has remained in a transitional state for more than 10 minutes.
Alert on repeated failures
To alert only after multiple failures:
| Field | Value |
|---|---|
| Condition Type | Failed Job Count |
| Operator | Greater than or equal |
| Value | 3 |
This triggers after 3 or more failed jobs on the same manifest.
Troubleshooting
Alert not triggering
- Verify the alert rule is Enabled
- Check that the correct clusters are selected
- Ensure the manifest is in a cluster you’re monitoring
- Confirm the condition matches (Job Status = Failed)
Slack message not appearing
- Test the webhook using the Send Test button
- Verify the webhook URL is correct
- Check that the Slack app still has channel access
- Look for errors in the webhook’s delivery history
Too many alerts
- Increase the cooldown period (e.g., 30 minutes)
- Add more specific conditions to filter alerts
- Consider monitoring only production clusters
Related
- Alerts - Full alert rule configuration reference
- Webhooks - Template variables and custom payloads
- AI Incidents - Automatic root cause analysis
- Operations - View job history and logs
Still have questions? Join our Slack community and we’ll help out.