# Prometheus Integration

### Overview

[Prometheus](https://prometheus.io/) is an open-source systems monitoring and alerting toolkit that collects metrics from configured targets, evaluates alert rules, and routes firing alerts to an external Alertmanager. When a rule condition is breached, Alertmanager groups and delivers a structured webhook payload to ITOC360.

This integration supports automatic alert creation on firing events and automatic resolution when Alertmanager sends a resolved notification.

### Integration Flow

1. Prometheus scrapes metrics from configured targets at a defined interval.
2. Prometheus evaluates alert rules continuously. When a condition is met, it sends the alert to Alertmanager.
3. Alertmanager groups the alerts and delivers a webhook POST request to the ITOC360 endpoint.
4. When the alert condition clears, Alertmanager sends a `resolved` notification and ITOC360 automatically closes the alert.

### Webhook Payload Schema

The payload delivered to ITOC360 follows the standard Prometheus Alertmanager webhook format (version 4).

```json
{
  "receiver": "string",
  "status": "firing | resolved",
  "alerts": [
    {
      "status": "firing | resolved",
      "labels": {
        "alertname": "string",
        "severity": "string",
        "env": "string"
      },
      "annotations": {
        "summary": "string",
        "description": "string"
      },
      "startsAt": "ISO8601 timestamp",
      "endsAt": "ISO8601 timestamp",
      "generatorURL": "string",
      "fingerprint": "string"
    }
  ],
  "groupLabels": {},
  "commonLabels": {
    "alertname": "string",
    "severity": "string"
  },
  "commonAnnotations": {
    "summary": "string",
    "description": "string"
  },
  "externalURL": "string",
  "version": "4",
  "groupKey": "string",
  "truncatedAlerts": 0
}
```
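
To sanity-check a captured payload against this shape, a small validator can help. The following is a minimal sketch, not ITOC360's own validation: the function name `find_payload_problems` and the choice of which fields count as "required" are assumptions made for illustration, based only on fields that appear in every payload shown on this page.

```python
from typing import Any

# Assumption: these "required" sets are illustrative only. They list fields
# that appear in every payload on this page; ITOC360 may enforce other rules.
REQUIRED_TOP_LEVEL = ("receiver", "status", "alerts", "version")
REQUIRED_ALERT_FIELDS = ("status", "labels", "startsAt", "fingerprint")
VALID_STATUSES = ("firing", "resolved")


def find_payload_problems(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shape looks plausible."""
    problems: list[str] = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in payload:
            problems.append(f"missing top-level field: {key}")
    if payload.get("status") not in VALID_STATUSES:
        problems.append("top-level status must be 'firing' or 'resolved'")

    alerts = payload.get("alerts")
    if not isinstance(alerts, list):
        if "alerts" in payload:
            problems.append("alerts must be a list")
        return problems  # nothing further to inspect

    for index, alert in enumerate(alerts):
        if not isinstance(alert, dict):
            problems.append(f"alerts[{index}] is not an object")
            continue
        for key in REQUIRED_ALERT_FIELDS:
            if key not in alert:
                problems.append(f"alerts[{index}] missing field: {key}")
        if alert.get("status") not in VALID_STATUSES:
            problems.append(f"alerts[{index}] status must be 'firing' or 'resolved'")
    return problems
```

Calling `find_payload_problems(json.loads(body))` on a body copied from the Alertmanager logs returns an empty list when the basic structure is intact.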

***

### Setup

#### Step 1 — Create an Alert Source on the ITOC360

1. Navigate to **Sources** → **Add Source**.
2. Search for **Prometheus** and select it.
3. Give the source a name and click **Save**.
4. Copy the ITOC360 **URL** and **Token**. You will need them in Step 2.

#### Step 2 — Install and Configure Alertmanager

Prometheus does not deliver alerts directly to external systems. You must run a Prometheus Alertmanager instance and point Prometheus at it.

Install Alertmanager using your preferred method (binary, Docker, Helm). Then configure it to forward alerts to ITOC360:

**`alertmanager.yml`**

```yaml
global:
  resolve_timeout: 5m

route:
  receiver: itoc360-webhook
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 4h

receivers:
  - name: itoc360-webhook
    webhook_configs:
      - url: "https://api.itoc360.app/functions/v1/events?token=<x-itoc360-token>"
        send_resolved: true
```

> `send_resolved: true` is required for automatic alert resolution in ITOC360.

#### Step 3 — Configure Prometheus

Point Prometheus to your Alertmanager and define your rule files in `prometheus.yml`:

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - <alertmanager-host>:9093

rule_files:
  - /etc/prometheus/rules/*.yaml
```

#### Step 4 — Create Alert Rules

Create rule files in the configured `rule_files` directory. Each file defines one or more alert groups.

**Example: `rules/production.yaml`**

```yaml
groups:
  - name: production-critical
    interval: 1m
    rules:
      - alert: HighCPUUsage
        expr: |
          100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected"
          description: "Instance {{ $labels.instance }} CPU usage is above 90% for more than 5 minutes."
```

> ITOC360 uses the `severity` label under `labels` for priority mapping (see the Priority Mapping section).
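
The arithmetic behind the `HighCPUUsage` expression can be reproduced by hand. In the rule, `rate(node_cpu_seconds_total{mode="idle"}[5m])` yields, per CPU core, the fraction of each second spent idle; `avg by (instance)` averages those fractions, and subtracting from 100 gives the busy percentage. The sketch below mirrors that calculation in Python; the function names and the sample idle fractions are invented for illustration.

```python
def cpu_usage_percent(idle_fractions: list[float]) -> float:
    """Mirror `100 - (avg(idle rate) * 100)` for a single instance.

    Each entry is one core's idle fraction (0.0 = fully busy, 1.0 = fully idle).
    """
    average_idle = sum(idle_fractions) / len(idle_fractions)
    return 100 - average_idle * 100


def breaches_threshold(idle_fractions: list[float], threshold: float = 90.0) -> bool:
    """True when the computed usage is strictly above the rule threshold."""
    return cpu_usage_percent(idle_fractions) > threshold


# Hypothetical four-core host: average idle is about 0.06, so usage is about 94%.
busy_host = [0.05, 0.07, 0.04, 0.08]
# Hypothetical mostly idle host: average idle is 0.5, so usage is 50%.
calm_host = [0.4, 0.6, 0.5, 0.5]
```

With these inputs, `breaches_threshold(busy_host)` is true and `breaches_threshold(calm_host)` is false. In Prometheus the condition must additionally hold for the full `for: 5m` window before the alert moves from pending to firing, which this sketch does not model.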

#### Step 5 — Verify the Integration

After starting Prometheus and Alertmanager:

1. Open the Prometheus UI at `http://<prometheus-host>:9090/alerts`. Active alerts should appear there first.
2. Open the Alertmanager UI at `http://<alertmanager-host>:9093` and confirm that the alert is routed to the webhook receiver.
3. Confirm that the alert appears in ITOC360 under the source you created.
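
If no real alert is firing yet, the webhook path can also be exercised with a hand-built payload. The sketch below is an assumption-laden illustration rather than an official test tool: the helper names, the `IntegrationTest` alert name, and the placeholder fingerprint are invented, and whether ITOC360 accepts a synthetic payload exactly like this one is not confirmed by this page.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_request(url: str, status: str = "firing") -> urllib.request.Request:
    """Build a POST request carrying a minimal Alertmanager-style payload."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    payload = {
        "receiver": "itoc360-webhook",
        "status": status,
        "alerts": [
            {
                "status": status,
                # Hypothetical test labels; use values that suit your setup.
                "labels": {"alertname": "IntegrationTest", "severity": "info"},
                "annotations": {"summary": "Synthetic test alert"},
                "startsAt": now,
                # Active alerts carry the zero timestamp, as in the samples.
                "endsAt": now if status == "resolved" else "0001-01-01T00:00:00Z",
                "fingerprint": "0000000000000001",  # placeholder value
            }
        ],
        "version": "4",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `send(build_test_request(url))` with the URL from Step 1, followed by `send(build_test_request(url, status="resolved"))`, should open and then close a test alert if the source behaves as described in this guide.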

### Sample Payload

The following payloads were captured during integration testing.

**ALERT (firing):**

```json
{
  "receiver": "itoc360-webhook",
  "status": "firing",
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "HighCPUUsage",
        "instance": "server-01:9100",
        "severity": "critical"
      },
      "annotations": {
        "summary": "High CPU usage detected",
        "description": "Instance server-01:9100 CPU usage is above 90% for more than 5 minutes."
      },
      "startsAt": "2026-03-10T09:46:06.815Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://prometheus:9090/graph?g0.expr=...",
      "fingerprint": "34e164e9af873ac1"
    }
  ],
  "groupLabels": {},
  "commonLabels": {
    "alertname": "HighCPUUsage",
    "severity": "critical"
  },
  "commonAnnotations": {
    "summary": "High CPU usage detected"
  },
  "externalURL": "http://alertmanager:9093",
  "version": "4",
  "groupKey": "{}:{}",
  "truncatedAlerts": 0
}
```

**RESOLVE (resolved):**

```json
{
  "receiver": "itoc360-webhook",
  "status": "resolved",
  "alerts": [
    {
      "status": "resolved",
      "labels": {
        "alertname": "HighCPUUsage",
        "severity": "critical"
      },
      "annotations": {
        "summary": "High CPU usage detected"
      },
      "startsAt": "2026-03-10T09:46:06.815Z",
      "endsAt": "2026-03-10T10:01:00.000Z",
      "fingerprint": "34e164e9af873ac1"
    }
  ],
  "version": "4"
}
```

### Field Mapping Reference

| Payload Field                       | Description                                                           |
| ----------------------------------- | --------------------------------------------------------------------- |
| `status`                            | Top-level event type: `firing` → ALERT, `resolved` → RESOLVE          |
| `alerts[0].fingerprint`             | Unique identifier per alert label set; used to match RESOLVE to ALERT |
| `alerts[0].labels.alertname`        | Name of the alert rule that fired                                     |
| `alerts[0].labels.severity`         | Severity label from the rule definition — used for priority mapping   |
| `alerts[0].annotations.summary`     | Short human-readable alert title                                      |
| `alerts[0].annotations.description` | Detailed description of the alert condition                           |
| `alerts[0].startsAt`                | ISO 8601 timestamp when the alert started firing                      |
| `alerts[0].endsAt`                  | ISO 8601 timestamp when resolved (`0001-...` means still active)      |
| `commonLabels`                      | Labels shared across all alerts in this group                         |
| `commonAnnotations`                 | Annotations shared across all alerts in this group                    |
| `groupKey`                          | Alertmanager grouping key for the delivered alert batch               |
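
The table above can be read as an extraction routine. The following sketch shows one way to pull those fields out of a payload; the `ParsedAlert` class and `parse_first_alert` function are hypothetical names used for illustration and do not describe ITOC360's internal implementation.

```python
from dataclasses import dataclass
from typing import Optional

EVENT_TYPES = {"firing": "ALERT", "resolved": "RESOLVE"}


@dataclass
class ParsedAlert:
    event_type: str
    fingerprint: str
    name: str
    severity: Optional[str]
    summary: str
    description: str
    starts_at: str
    ends_at: Optional[str]  # None while the alert is still active


def parse_first_alert(payload: dict) -> ParsedAlert:
    """Extract the mapped fields from the top level and from alerts[0]."""
    alert = payload["alerts"][0]
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})

    ends_at = alert.get("endsAt")
    # A timestamp starting with year 0001 marks an alert that has not ended.
    if ends_at is not None and ends_at.startswith("0001-"):
        ends_at = None

    return ParsedAlert(
        event_type=EVENT_TYPES[payload["status"]],
        fingerprint=alert["fingerprint"],
        name=labels.get("alertname", ""),
        severity=labels.get("severity"),
        summary=annotations.get("summary", ""),
        description=annotations.get("description", ""),
        starts_at=alert["startsAt"],
        ends_at=ends_at,
    )
```

Applied to the firing sample above, this yields `event_type == "ALERT"` and `ends_at is None`; applied to the resolved sample, it yields `event_type == "RESOLVE"` with the resolution timestamp in `ends_at`.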

### Priority Mapping

ITOC360 maps the `severity` label from the alert rule to an internal priority level.

| Prometheus `severity` Label | ITOC360 Priority |
| --------------------------- | ---------------- |
| `critical`                  | CRITICAL         |
| `error`                     | HIGH             |
| `warning`                   | MEDIUM           |
| `info`                      | LOW              |
| *(not set)*                 | MEDIUM (default) |

> You control the `severity` label in your alert rule definitions. Use consistent values across your rule files for predictable priority routing.
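
Expressed as code, the priority table is a lookup with a default. The sketch below is illustrative only. The table specifies the default for a missing label; treating an unrecognized value (for example `page`) the same way, and matching case-sensitively, are assumptions that this page does not confirm.

```python
PRIORITY_BY_SEVERITY = {
    "critical": "CRITICAL",
    "error": "HIGH",
    "warning": "MEDIUM",
    "info": "LOW",
}
DEFAULT_PRIORITY = "MEDIUM"  # documented default when severity is not set


def priority_for(labels: dict) -> str:
    """Map an alert's labels to a priority, falling back to the default.

    Assumption: unrecognized severity values also fall back to the default.
    """
    severity = labels.get("severity")
    return PRIORITY_BY_SEVERITY.get(severity, DEFAULT_PRIORITY)
```

A quick check such as `priority_for({"severity": "error"})` returning `"HIGH"` can be run over the labels in your rule files to spot values that would silently fall through to the default.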

### RESOLVE Detection

ITOC360 automatically resolves an alert when Alertmanager sends a payload with `"status": "resolved"`. This requires `send_resolved: true` in your Alertmanager webhook configuration (set in Step 2).

The resolved event is matched to the original alert using the `fingerprint` field, which Alertmanager generates deterministically from the alert's label set. As long as the labels do not change between firing and resolution, the fingerprint will match and the alert will be closed.
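
The matching described above amounts to keying open alerts by fingerprint. The following is a minimal in-memory sketch of that idea, not ITOC360's actual storage or deduplication logic; the function name `apply_alert` and the outcome strings are invented for illustration.

```python
def apply_alert(open_alerts: dict, alert: dict) -> str:
    """Update the open-alert store for one entry of the `alerts` array.

    Returns "opened", "updated", "closed", or "unmatched".
    """
    fingerprint = alert["fingerprint"]
    status = alert["status"]

    if status == "firing":
        # Repeat notifications for the same label set reuse the fingerprint.
        outcome = "updated" if fingerprint in open_alerts else "opened"
        open_alerts[fingerprint] = alert
        return outcome

    if status == "resolved":
        if fingerprint in open_alerts:
            del open_alerts[fingerprint]
            return "closed"
        # No open alert carries this fingerprint, e.g. the labels changed.
        return "unmatched"

    raise ValueError(f"unexpected status: {status!r}")


store: dict = {}
first = apply_alert(store, {"status": "firing", "fingerprint": "34e164e9af873ac1"})
second = apply_alert(store, {"status": "resolved", "fingerprint": "34e164e9af873ac1"})
```

Here `first` is `"opened"` and `second` is `"closed"`. A resolved event whose fingerprint differs from every open alert would return `"unmatched"`, which is the situation that arises when labels change between firing and resolution.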

### Security

The webhook URL contains a source token generated by ITOC360. Keep this token secret and do not commit it to public repositories. If the token is exposed, rotate it from the ITOC360 source settings.
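
One way to keep the token out of version control is to assemble the webhook URL at deploy time from an environment variable, for example when templating `alertmanager.yml`, and to redact it before logging. The sketch below assumes a variable named `ITOC360_TOKEN`; that name and both helper functions are hypothetical.

```python
import os
from typing import Mapping
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://api.itoc360.app/functions/v1/events"


def webhook_url_from_env(
    var_name: str = "ITOC360_TOKEN",  # hypothetical variable name
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Build the webhook URL from a token held in the environment."""
    token = environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return f"{BASE_URL}?{urlencode({'token': token})}"


def redact_token(url: str) -> str:
    """Return the URL with the token value masked, safe for log output."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key == "token" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Logging `redact_token(url)` instead of the raw URL keeps the token out of deployment logs while still showing which endpoint was configured.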

### Troubleshooting

* If alerts appear in Prometheus but not in ITOC360, open the Alertmanager UI and confirm that the alert is routed to the `itoc360-webhook` receiver.
* If alerts are created but not resolved automatically, verify that `send_resolved: true` is set in `alertmanager.yml`.
* If priority mapping does not work as expected, confirm that the alert rule includes a supported `severity` label.
* Check Alertmanager logs for webhook delivery errors such as invalid URL, timeout, or authentication failure.

