Scaling API

Manage autoscaling policies, read the decision log, and roll back actions.

The Scaling API manages per‑workload autoscaling policies and exposes the decision log. See the Auto Scaling guide for concepts.

Base URL: https://YOUR_KUBEWATCH_URL (your KubeWatch instance, shown in your dashboard)

All tenant requests require either an Authorization: Bearer <token> header or an X-API-Key: <key> header. The agent‑facing endpoints (see Agent endpoints) authenticate with the agent's own bearer token instead.

List policies

GET /api/v1/scaling/policies

Response 200:

[
  {
    "id": "scp_1718000000000000000",
    "name": "checkout-api autoscaler",
    "runtime": "kubernetes",
    "targetRef": "prod/checkout-api",
    "strategy": "pods",
    "enabled": true,
    "dryRun": false,
    "minReplicas": 2,
    "maxReplicas": 10,
    "stepSize": 1,
    "cooldownScaleUpSeconds": 30,
    "cooldownScaleDownSeconds": 300,
    "cpuTargetPercent": 70,
    "memoryTargetPercent": null,
    "nodePressureTargetPercent": null,
    "maxReplicasPerHost": null,
    "approvalRequired": false,
    "currentReplicas": 4,
    "desiredReplicas": 5,
    "applyHealth": "applied",
    "lbPoolRef": null,
    "lastEvaluatedAt": "2026-06-30T08:12:00Z"
  }
]
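
As a minimal sketch of calling this endpoint, the Python below builds an authenticated request using only the standard library. The helper names `build_request` and `list_policies` are hypothetical and not part of any KubeWatch client; `BASE_URL` is the placeholder from this page and must be replaced with your instance URL.

```python
import json
import urllib.request

BASE_URL = "https://YOUR_KUBEWATCH_URL"  # placeholder from this page; replace it


def build_request(path, token=None, api_key=None, method="GET", body=None):
    """Build an authenticated tenant request without sending it (hypothetical helper)."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    elif api_key:
        headers["X-API-Key"] = api_key
    else:
        raise ValueError("either token or api_key is required")
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def list_policies(token):
    """Fetch all policies; this performs a real HTTP call when invoked."""
    req = build_request("/api/v1/scaling/policies", token=token)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the authentication logic easy to check without a live instance.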

Create a policy

POST /api/v1/scaling/policies

strategy is derived from runtime and cannot be set. At least one of cpuTargetPercent or memoryTargetPercent is required. Policies are created in dry‑run mode unless the request passes "dryRun": false.

{
  "name": "checkout-api autoscaler",
  "runtime": "kubernetes",
  "targetRef": "prod/checkout-api",
  "cpuTargetPercent": 70,
  "minReplicas": 2,
  "maxReplicas": 10,
  "stepSize": 1,
  "cooldownScaleUpSeconds": 30,
  "cooldownScaleDownSeconds": 300,
  "approvalRequired": false
}

Response 201: { "id": "scp_..." }

For Docker policies, set "runtime": "docker", give targetRef in host-pool/group form, and optionally set maxReplicasPerHost.
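
The creation rules above can be checked client-side before sending the request. The sketch below is one possible way to do that in Python; `build_policy_payload` is a hypothetical helper, and the `minReplicas <= maxReplicas` check is an assumption about what the server would accept, not documented behavior.

```python
def build_policy_payload(name, runtime, target_ref, min_replicas, max_replicas,
                         cpu_target_percent=None, memory_target_percent=None,
                         dry_run=True, **optional):
    """Assemble a POST /api/v1/scaling/policies body (hypothetical helper)."""
    # Documented rule: at least one target metric is required.
    if cpu_target_percent is None and memory_target_percent is None:
        raise ValueError("set cpuTargetPercent or memoryTargetPercent")
    # Documented rule: strategy is derived from runtime and is not settable.
    if "strategy" in optional:
        raise ValueError("strategy is derived from runtime and cannot be set")
    # Assumption: an inverted replica range would be rejected by the server.
    if min_replicas > max_replicas:
        raise ValueError("minReplicas must not exceed maxReplicas")

    payload = {
        "name": name,
        "runtime": runtime,
        "targetRef": target_ref,
        "minReplicas": min_replicas,
        "maxReplicas": max_replicas,
        "dryRun": dry_run,  # mirrors the documented dry-run default
    }
    if cpu_target_percent is not None:
        payload["cpuTargetPercent"] = cpu_target_percent
    if memory_target_percent is not None:
        payload["memoryTargetPercent"] = memory_target_percent
    payload.update(optional)  # e.g. stepSize, maxReplicasPerHost
    return payload
```

For a Docker policy, a call might look like `build_policy_payload("web autoscaler", "docker", "host-pool/web", 1, 4, cpu_target_percent=60, maxReplicasPerHost=2)`, where the name and targetRef values are made up for illustration.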

Update a policy

PUT /api/v1/scaling/policies/{id}

Send only the fields to change (others are left untouched). runtime and strategy are immutable. Common uses: take a policy live ({"dryRun": false}), pause it ({"enabled": false}), or retune thresholds.
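
Because only changed fields need to be sent, a client can compute the request body as a diff against the policy it last read. The following is a small sketch of that idea; `update_body` is a hypothetical helper, not part of the API.

```python
IMMUTABLE_FIELDS = {"runtime", "strategy"}  # documented as immutable


def update_body(current, desired):
    """Return only the fields whose values differ from the current policy."""
    changed = {k: v for k, v in desired.items() if current.get(k) != v}
    blocked = IMMUTABLE_FIELDS.intersection(changed)
    if blocked:
        raise ValueError(f"immutable fields cannot change: {sorted(blocked)}")
    return changed


# Taking a policy live: only dryRun differs, so only dryRun is sent.
current = {"runtime": "kubernetes", "dryRun": True, "enabled": True}
body = update_body(current, {"dryRun": False, "enabled": True})
```

Here `body` contains just the `dryRun` field, which matches the "take a policy live" use described above.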

Delete a policy

DELETE /api/v1/scaling/policies/{id}

Response 204. The decision history is retained.

Roll back

POST /api/v1/scaling/policies/{id}/rollback

Restores the previous state. On Kubernetes this re‑applies the prior HPA/NodePool spec; on Docker it enqueues a docker.scale command back to the previous replica count (rolling back a scale‑down starts fresh containers). Returns 400 when there is nothing to roll back.
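
A caller usually wants to treat the 400 case as "nothing to do" rather than as a failure. The sketch below shows one way to handle that with the Python standard library. The `rollback` function and its `opener` parameter are hypothetical, and sending an empty request body is an assumption, since this page does not document a payload for the endpoint.

```python
import urllib.error
import urllib.request

BASE_URL = "https://YOUR_KUBEWATCH_URL"  # placeholder from this page; replace it


def rollback(policy_id, token, opener=urllib.request.urlopen):
    """Return True if the rollback was accepted, False if there was nothing to roll back."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/v1/scaling/policies/{policy_id}/rollback",
        data=b"",  # assumption: the endpoint needs no request body
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
    try:
        with opener(req):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 400:  # documented: nothing to roll back
            return False
        raise  # any other status is a genuine error
```

The `opener` argument exists only so the function can be exercised without a network connection.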

Decision log

GET /api/v1/scaling/decisions?policyId={id}&limit=100

An append‑only record of every scaling action. The action field is one of dry_run, apply, rollback, or apply_failed.

[
  {
    "id": 412,
    "policyId": "scp_1718000000000000000",
    "ts": "2026-06-30T08:12:04Z",
    "runtime": "kubernetes",
    "triggerMetric": "cpu_percent",
    "triggerValue": 88.4,
    "threshold": 70,
    "direction": "up",
    "fromReplicas": 4,
    "toReplicas": 5,
    "action": "apply",
    "dryRun": false,
    "renderedManifest": "apiVersion: autoscaling/v2\nkind: HorizontalPodAutoscaler\n...",
    "outcome": "Applying HPA spec: 2 to 10 replicas, CPU 70% target."
  }
]
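
To illustrate how the log might be consumed, the sketch below builds the query URL and extracts the replica changes that were actually applied. Both function names are hypothetical, and the field names are taken from the sample response above.

```python
from urllib.parse import urlencode


def decisions_url(base_url, policy_id, limit=100):
    """Build the decision-log URL for one policy."""
    query = urlencode({"policyId": policy_id, "limit": limit})
    return f"{base_url}/api/v1/scaling/decisions?{query}"


def applied_changes(decisions):
    """Return (ts, fromReplicas, toReplicas) for entries that were really applied."""
    return [
        (d["ts"], d["fromReplicas"], d["toReplicas"])
        for d in decisions
        if d["action"] == "apply" and not d["dryRun"]
    ]


sample = [
    {"ts": "2026-06-30T08:12:04Z", "fromReplicas": 4, "toReplicas": 5,
     "action": "apply", "dryRun": False},
    {"ts": "2026-06-30T08:20:00Z", "fromReplicas": 5, "toReplicas": 6,
     "action": "dry_run", "dryRun": True},
]
changes = applied_changes(sample)
```

With this sample, `changes` holds only the first entry, because the second was a dry run.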

Agent endpoints

These power the command channel (Docker) and declarative sync (Kubernetes). They authenticate with the agent bearer token. The agent always initiates the connection, so no inbound port is opened.

| Method & path | Purpose |
| --- | --- |
| `GET /api/v1/agents/{id}/commands?wait=25` | Long‑poll for a pending command (returns `204` after the wait window) |
| `POST /api/v1/agents/{id}/commands/{cid}/result` | Report a command's outcome (`succeeded` / `failed`) |
| `GET /api/v1/agents/{id}/scaling/desired` | Pull the live K8s manifests this agent should server‑side‑apply |
| `POST /api/v1/agents/{id}/scaling/status` | Report applied‑object status (mirrored to the policy) |

Commands carry a server‑generated command_id and a 2‑minute expires_at; delivery is at‑least‑once and the agent's executor is idempotent on command_id.
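
At‑least‑once delivery means the same command can arrive twice, so an executor has to remember which command_id values it has already handled. The sketch below shows one way such an executor could work; it is not the agent's actual implementation. It assumes a command is a dict with `command_id` and `expires_at` keys, that `expires_at` is an ISO 8601 UTC timestamp, and that an expired command is reported as `failed`. None of these details are confirmed by this page.

```python
from datetime import datetime, timezone


class CommandExecutor:
    """Runs each command at most once, keyed on command_id (illustrative only)."""

    def __init__(self, handler):
        self._handler = handler  # callable that performs the actual work
        self._results = {}       # command_id -> "succeeded" / "failed"

    def execute(self, command, now=None):
        now = now or datetime.now(timezone.utc)
        cid = command["command_id"]
        if cid in self._results:
            # Redelivery: return the recorded outcome without re-running.
            return self._results[cid]
        # Assumption: expires_at looks like "2026-06-30T08:14:00Z".
        expires = datetime.fromisoformat(command["expires_at"].replace("Z", "+00:00"))
        if now >= expires:
            result = "failed"  # assumption: expired commands are not run
        else:
            try:
                self._handler(command)
                result = "succeeded"
            except Exception:
                result = "failed"
        self._results[cid] = result
        return result
```

The returned string is what the agent would post to the result endpoint. Note that this sketch also caches failures, so a failed command is never retried under the same command_id; a real agent might choose differently.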