Alerts

Configure and manage alert rules for your containers.

KubeWatch alerts watch your metrics continuously and notify you when something goes wrong. You define the condition; KubeWatch evaluates it against incoming data and fires the alert when the condition is met for a specified duration.

Alert states

State | Color | Meaning
--- | --- | ---
Firing | Red | The condition is currently true and has been for the full duration
Acknowledged | Yellow | A team member has seen the alert, but it has not resolved yet
Resolved | Green | The condition returned to normal

Creating an alert rule

Navigate to Alerts → New Rule and fill in:

Field | Description | Example
--- | --- | ---
Name | Human-readable rule name | High CPU on prod-web
Metric | The metric to evaluate | cpu_percent
Operator | Comparison operator | >, <, ==, >=, <=
Threshold | The value to compare against | 90
Duration | How long the condition must be true before firing | 5m
Scope | All containers, or a specific container/agent | agent: prod-k8s
Channels | Where to send notifications | Slack, Email, PagerDuty, Webhook
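
To make the interaction between Operator, Threshold, and Duration concrete, here is a minimal Python sketch of how such a rule might be evaluated. The `AlertRule` class, its field names, and the `should_fire` logic are illustrative assumptions that mirror the form fields above; they are not KubeWatch's actual implementation or API.

```python
import operator
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# The comparison operators listed in the Operator field.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
}


@dataclass
class AlertRule:
    # Hypothetical model: field names mirror the form, not a real KubeWatch API.
    name: str
    metric: str
    op: str
    threshold: float
    duration: timedelta
    scope: str = "all"
    channels: list = field(default_factory=list)

    def condition_met(self, value):
        """Compare a single metric value against the threshold."""
        return OPERATORS[self.op](value, self.threshold)

    def should_fire(self, samples, now):
        """Return True if the newest unbroken run of matching samples
        started at least `duration` before `now`.

        `samples` is a list of (timestamp, value) pairs in any order.
        """
        run_start = None
        for timestamp, value in sorted(samples, reverse=True):
            if not self.condition_met(value):
                break  # the run of matching samples ends here
            run_start = timestamp
        return run_start is not None and now - run_start >= self.duration


rule = AlertRule(
    name="High CPU on prod-web",
    metric="cpu_percent",
    op=">",
    threshold=90,
    duration=timedelta(minutes=5),
    scope="agent: prod-k8s",
    channels=["Slack"],
)
start = datetime(2024, 1, 1, 12, 0)
samples = [(start + timedelta(minutes=i), 95.0) for i in range(6)]
print(rule.should_fire(samples, now=start + timedelta(minutes=5)))
```

In this sketch a single sample below the threshold restarts the clock, which is one plausible reading of "true for the full duration"; the real evaluator may treat gaps or missing samples differently.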

Example rules

cpu_percent > 90 for 5m           → High CPU usage
memory_percent > 80 for 5m        → Memory pressure
restart_count > 5                 → Crash-looping container
network_rx_bytes < 100 for 10m    → Suspected network partition
postgres.active_connections > 100 → DB connection pool exhausted
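
If you keep rules like these in scripts or notes, the shorthand can be broken into the same fields as the New Rule form. The Python sketch below assumes the shorthand follows the pattern `<metric> <operator> <threshold> [for <N><s|m|h>]`; that grammar, the `parse_rule` function, and the choice to treat a missing duration as zero seconds are all assumptions made for illustration, not a documented KubeWatch syntax.

```python
import re

# Assumed shorthand grammar: "<metric> <op> <threshold> [for <N><s|m|h>]".
RULE_RE = re.compile(
    r"^\s*(?P<metric>[\w.]+)\s*(?P<op>>=|<=|==|>|<)\s*"
    r"(?P<threshold>-?\d+(?:\.\d+)?)"
    r"(?:\s+for\s+(?P<amount>\d+)(?P<unit>[smh]))?\s*$"
)
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_rule(text):
    """Split a shorthand rule into metric, operator, threshold, and duration."""
    match = RULE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized rule: {text!r}")
    duration = 0  # assumption: no "for ..." clause means no waiting period
    if match["amount"] is not None:
        duration = int(match["amount"]) * UNIT_SECONDS[match["unit"]]
    return {
        "metric": match["metric"],
        "op": match["op"],
        "threshold": float(match["threshold"]),
        "duration_seconds": duration,
    }


for line in (
    "cpu_percent > 90 for 5m",
    "restart_count > 5",
    "postgres.active_connections > 100",
):
    print(parse_rule(line))
```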

Notification channels

Before an alert can notify you, you need to configure at least one notification channel in Integrations:

  • Email: sends to one or more addresses, using your configured SMTP settings or, on the hosted plan, KubeWatch's mail relay
  • Slack: posts a message to a channel via a Slack webhook; the message includes the container name, metric value, and a link to the dashboard
  • PagerDuty: creates and resolves incidents via the PagerDuty Events API v2
  • Webhook: sends a POST request with the alert payload as JSON to any URL

You can assign multiple channels to a single alert rule.
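
To test a Webhook endpoint before wiring it to a rule, you can send it a request of the same general shape yourself. The Python sketch below builds a JSON POST with the standard library. The payload field names (`rule`, `state`, `metric`, `value`, `container`) and the example URL are placeholders chosen for illustration; the actual payload schema is not specified on this page.

```python
import json
import urllib.request


def build_webhook_request(url, alert):
    """Build a JSON POST request carrying the alert payload."""
    body = json.dumps(alert).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload: these field names are assumptions, not the real schema.
alert = {
    "rule": "High CPU on prod-web",
    "state": "firing",
    "metric": "cpu_percent",
    "value": 93.5,
    "container": "prod-web-1",
}
request = build_webhook_request("https://example.com/hooks/kubewatch", alert)

# To actually send it, uncomment the next line:
# urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it keeps the example safe to run offline and makes the method, headers, and body easy to inspect.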

Acknowledging an alert

When an alert fires, open the Alerts page, find the firing alert, and click Acknowledge. This moves it to the Acknowledged state and suppresses repeat notifications for 4 hours (configurable).

Acknowledging an alert does not resolve it. The alert remains in the Acknowledged state until the underlying condition returns to normal.
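
The suppression window can be pictured as a simple time check. The Python sketch below is a guess at the behavior, not KubeWatch's code: it assumes repeat notifications resume once the window has elapsed and the alert is still unresolved, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

ACK_SUPPRESSION = timedelta(hours=4)  # default from the text; configurable


def should_send_repeat(state, acknowledged_at, now, suppression=ACK_SUPPRESSION):
    """Decide whether a repeat notification should go out for an alert."""
    if state == "acknowledged" and acknowledged_at is not None:
        # Assumption: repeats stay quiet only while the window is open.
        return now - acknowledged_at >= suppression
    # Unacknowledged firing alerts keep notifying; resolved alerts do not.
    return state == "firing"


acked = datetime(2024, 1, 1, 9, 0)
print(should_send_repeat("acknowledged", acked, acked + timedelta(hours=1)))
print(should_send_repeat("acknowledged", acked, acked + timedelta(hours=5)))
```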

Alert history

Every state transition (pending → firing → acknowledged → resolved) is recorded in the alert history. To view the history for a specific rule, click the rule name and open the History tab. Each entry shows the timestamp, the previous state, the new state, and the metric value that triggered the transition.
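
As a mental model, each history entry is a small record with those four fields, appended whenever the state changes. The Python sketch below illustrates this. The `HistoryEntry` and `record_transition` names are hypothetical, and the transition table is an assumption: it includes a direct firing → resolved step for alerts that clear before anyone acknowledges them, which the text does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed transition table based on the sequence described above.
ALLOWED_TRANSITIONS = {
    "pending": {"firing"},
    "firing": {"acknowledged", "resolved"},  # firing -> resolved is an assumption
    "acknowledged": {"resolved"},
}


@dataclass(frozen=True)
class HistoryEntry:
    timestamp: datetime
    previous_state: str
    new_state: str
    metric_value: float


def record_transition(history, previous_state, new_state, metric_value, timestamp=None):
    """Validate a state change and append it to the history list."""
    if new_state not in ALLOWED_TRANSITIONS.get(previous_state, set()):
        raise ValueError(f"invalid transition: {previous_state} -> {new_state}")
    entry = HistoryEntry(
        timestamp or datetime.now(timezone.utc),
        previous_state,
        new_state,
        metric_value,
    )
    history.append(entry)
    return entry


history = []
record_transition(history, "pending", "firing", 93.5)
record_transition(history, "firing", "acknowledged", 94.1)
record_transition(history, "acknowledged", "resolved", 41.0)
for entry in history:
    print(entry.previous_state, "->", entry.new_state, entry.metric_value)
```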

Muting alerts

You can mute an alert rule for a fixed period, for example during planned maintenance:

  1. Open the rule
  2. Click Mute
  3. Set the mute duration (up to 72 hours)

Muted rules will not fire or send notifications. The mute expiry time is shown on the rule card.
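
The mute behavior described above amounts to an expiry timestamp plus a cap on the duration. A minimal Python sketch, assuming the 72-hour limit is enforced when the mute is set; the function names are hypothetical and do not correspond to a KubeWatch API.

```python
from datetime import datetime, timedelta

MAX_MUTE = timedelta(hours=72)  # upper limit stated in the steps above


def mute_until(now, duration):
    """Return the expiry time for a mute, rejecting out-of-range durations."""
    if duration <= timedelta(0) or duration > MAX_MUTE:
        raise ValueError("mute duration must be more than 0 and at most 72 hours")
    return now + duration


def is_muted(expiry, now):
    """A rule is muted while an expiry is set and has not yet passed."""
    return expiry is not None and now < expiry


now = datetime(2024, 1, 1, 22, 0)
expiry = mute_until(now, timedelta(hours=2))  # e.g. a two-hour maintenance window
print(is_muted(expiry, now + timedelta(hours=1)))
print(is_muted(expiry, now + timedelta(hours=3)))
```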