Alerts
Configure and manage alert rules for your containers.
KubeWatch alerts watch your metrics continuously and notify you when something goes wrong. You define the condition; KubeWatch evaluates it against incoming data and fires the alert when the condition is met for a specified duration.
Alert states
| State | Color | Meaning |
|---|---|---|
| Firing | Red | The condition is currently true and has been for the full duration |
| Acknowledged | Yellow | A team member has seen the alert but it hasn't resolved yet |
| Resolved | Green | The condition returned to normal |
Creating an alert rule
Navigate to Alerts → New Rule and fill in:
| Field | Description | Example |
|---|---|---|
| Name | Human-readable rule name | High CPU on prod-web |
| Metric | The metric to evaluate | cpu_percent |
| Operator | Comparison operator | >, <, ==, >=, <= |
| Threshold | The value to compare against | 90 |
| Duration | How long the condition must hold before the alert fires | 5m |
| Scope | All containers, or a specific container/agent | agent: prod-k8s |
| Channels | Where to send notifications | Slack, Email, PagerDuty, Webhook |
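To make the fields above concrete, here is a minimal Python sketch of a rule as a data structure. The `AlertRule` class and its attribute names are hypothetical and chosen for illustration; they are not KubeWatch's actual schema or API. Only the operator set and the example values come from the table.

```python
import operator
from dataclasses import dataclass, field

# Maps the operator symbols from the table to Python comparison functions.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
}


@dataclass
class AlertRule:
    """Hypothetical in-memory shape of a rule (not KubeWatch's real schema)."""
    name: str
    metric: str
    op: str
    threshold: float
    duration: str = "5m"
    scope: str = "all"
    channels: list[str] = field(default_factory=list)

    def condition_met(self, value: float) -> bool:
        """Return True if a single metric sample satisfies the comparison."""
        return OPERATORS[self.op](value, self.threshold)


rule = AlertRule(
    name="High CPU on prod-web",
    metric="cpu_percent",
    op=">",
    threshold=90,
    duration="5m",
    scope="agent: prod-k8s",
    channels=["Slack", "Email"],
)
print(rule.condition_met(95.0))  # True
print(rule.condition_met(90.0))  # False: 90 is not strictly greater than 90
```

Note that `condition_met` checks a single sample only; the Duration field adds the requirement that the comparison stays true over time.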
Example rules
cpu_percent > 90 for 5m → High CPU usage
memory_percent > 80 for 5m → Memory pressure
restart_count > 5 → Crash-looping container
network_rx_bytes < 100 for 10m → Suspected network partition
postgres.active_connections > 100 → DB connection pool exhausted
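The "for 5m" part of a rule means the comparison must hold continuously for that long. The Python sketch below illustrates one plausible reading of that behavior. It parses only the condition part of a rule (the text before the arrow) and checks a list of timestamped samples. The functions `parse_rule` and `should_fire`, the sample format, and the choice to treat a missing duration as zero seconds are all assumptions for illustration, not KubeWatch's actual evaluator.

```python
import operator
import re

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq,
       ">=": operator.ge, "<=": operator.le}
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}

# Matches e.g. "cpu_percent > 90 for 5m" or "restart_count > 5".
RULE_RE = re.compile(
    r"^\s*(?P<metric>[\w.]+)\s*(?P<op>>=|<=|==|>|<)\s*"
    r"(?P<threshold>-?\d+(?:\.\d+)?)"
    r"(?:\s+for\s+(?P<n>\d+)(?P<unit>[smh]))?\s*$"
)


def parse_rule(text):
    """Return (metric, op, threshold, duration_seconds) for a condition."""
    m = RULE_RE.match(text)
    if m is None:
        raise ValueError(f"cannot parse rule: {text!r}")
    # Assumption: a rule without "for ..." has a duration of 0 seconds.
    duration = int(m["n"]) * UNIT_SECONDS[m["unit"]] if m["n"] else 0
    return m["metric"], m["op"], float(m["threshold"]), duration


def should_fire(text, samples):
    """samples: list of (timestamp_seconds, value), oldest first."""
    _, op, threshold, duration = parse_rule(text)
    true_since = None  # start of the current unbroken run of matches
    for ts, value in samples:
        if OPS[op](value, threshold):
            if true_since is None:
                true_since = ts
        else:
            true_since = None  # any non-matching sample resets the timer
    if true_since is None:
        return False
    return samples[-1][0] - true_since >= duration


steady = [(t, 95) for t in range(0, 301, 60)]   # above 90 for 5 minutes
dipped = [(0, 95), (60, 50), (300, 95)]         # dropped below 90 at t=60
print(should_fire("cpu_percent > 90 for 5m", steady))  # True
print(should_fire("cpu_percent > 90 for 5m", dipped))  # False
```

In this sketch a single sample below the threshold resets the timer, which is why the second series does not fire even though its latest value is above 90.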
Notification channels
Before an alert can notify you, you need to configure at least one notification channel in Integrations:
- Email: sends to one or more addresses, using your configured SMTP settings or, on the hosted plan, KubeWatch's mail relay
- Slack: posts a message to a channel via a Slack webhook; the message includes the container name, the metric value, and a link to the dashboard
- PagerDuty: creates and resolves incidents via the PagerDuty Events API v2
- Webhook: sends a POST request with the alert payload as JSON to any URL
You can assign multiple channels to a single alert rule.
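If you use the Webhook channel, the receiving side only needs to accept a POST request and decode the JSON body. The Python sketch below uses the standard library to do that. The payload field names (`rule`, `state`, `value`) are assumptions for illustration; this page does not specify the payload schema, so inspect a real payload before relying on any field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alert(body: bytes) -> str:
    """Turn a JSON alert payload into a one-line summary.

    The keys "rule", "state", and "value" are hypothetical field names.
    """
    payload = json.loads(body)
    rule = payload.get("rule", "unknown rule")
    state = payload.get("state", "unknown state")
    value = payload.get("value", "n/a")
    return f"{rule}: {state} (value={value})"


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print(summarize_alert(body))
            self.send_response(204)  # accepted, nothing to return
        except json.JSONDecodeError:
            self.send_response(400)  # body was not valid JSON
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Block and handle incoming webhook POSTs until interrupted."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the listener; point the Webhook channel at the host and port where it runs.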
Acknowledging an alert
When an alert fires, open the Alerts page, find the firing alert, and click Acknowledge. This moves it to the Acknowledged state and suppresses repeat notifications for 4 hours (configurable).
Acknowledging an alert does not resolve it. The alert remains in the Acknowledged state until the underlying condition returns to normal.
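The suppression window can be pictured as a simple time comparison. The Python sketch below is an illustration, not KubeWatch's implementation: the function `should_send_repeat` and the lowercase state strings are hypothetical, and it assumes that repeat notifications resume once the window has elapsed, which this page does not state explicitly.

```python
from datetime import datetime, timedelta

# Default from the text above; the window is configurable.
SUPPRESS_WINDOW = timedelta(hours=4)


def should_send_repeat(state, acknowledged_at, now, window=SUPPRESS_WINDOW):
    """Decide whether a repeat notification should go out at `now`."""
    if state == "resolved":
        return False
    if state == "acknowledged" and acknowledged_at is not None:
        # Assumption: repeats resume after the suppression window ends.
        return now - acknowledged_at >= window
    return state == "firing"


ack_time = datetime(2024, 1, 1, 12, 0)
print(should_send_repeat("acknowledged", ack_time, ack_time + timedelta(hours=1)))  # False
print(should_send_repeat("acknowledged", ack_time, ack_time + timedelta(hours=4)))  # True
```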
Alert history
Every state transition (pending → firing → acknowledged → resolved) is recorded in the alert history. View the history for a specific rule by clicking the rule name → History tab. The history shows timestamp, previous state, new state, and the metric value that triggered the transition.
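As a rough mental model, each history row can be thought of as a record with the four fields listed above. The Python sketch below is hypothetical: the `HistoryEntry` class, the `record_transition` helper, and the table of allowed transitions are assumptions for illustration. In particular, it assumes a firing alert can resolve directly without being acknowledged first.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class HistoryEntry:
    timestamp: datetime
    previous_state: str
    new_state: str
    metric_value: float


# Assumed transition table based on the sequence described above.
ALLOWED = {
    "pending": {"firing"},
    "firing": {"acknowledged", "resolved"},
    "acknowledged": {"resolved"},
    "resolved": set(),
}


def record_transition(history, previous_state, new_state, metric_value,
                      timestamp=None):
    """Append one transition to `history`, rejecting unexpected ones."""
    if new_state not in ALLOWED.get(previous_state, set()):
        raise ValueError(f"unexpected transition {previous_state} -> {new_state}")
    entry = HistoryEntry(
        timestamp or datetime.now(timezone.utc),
        previous_state,
        new_state,
        metric_value,
    )
    history.append(entry)
    return entry


history = []
record_transition(history, "pending", "firing", 93.0)
record_transition(history, "firing", "acknowledged", 94.5)
record_transition(history, "acknowledged", "resolved", 41.2)
print([e.new_state for e in history])  # ['firing', 'acknowledged', 'resolved']
```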
Muting alerts
You can mute an alert rule for a set period, for example during planned maintenance:
- Open the rule
- Click Mute
- Set the mute duration (up to 72 hours)
Muted rules will not fire or send notifications. The mute expiry time is shown on the rule card.
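The mute behavior amounts to computing an expiry time and checking it before firing. The Python sketch below illustrates this under stated assumptions: the helper names `mute_until` and `is_muted` are hypothetical, and only the 72-hour limit comes from this page.

```python
from datetime import datetime, timedelta

MAX_MUTE = timedelta(hours=72)  # upper limit stated above


def mute_until(now, duration):
    """Return the mute expiry time, enforcing the 72-hour limit."""
    if duration <= timedelta(0) or duration > MAX_MUTE:
        raise ValueError("mute duration must be between 0 and 72 hours")
    return now + duration


def is_muted(expiry, now):
    """A rule is muted while an expiry is set and has not yet passed."""
    return expiry is not None and now < expiry


start = datetime(2024, 1, 1, 22, 0)
expiry = mute_until(start, timedelta(hours=2))
print(expiry)                                          # 2024-01-02 00:00:00
print(is_muted(expiry, start + timedelta(hours=1)))    # True
print(is_muted(expiry, start + timedelta(hours=3)))    # False
```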