# MCP analytics

Every authenticated MCP request the Zuplo MCP Gateway handles produces a set of
structured analytics events. The events power the MCP tab on the Zuplo Portal's
**Analytics** page and feed the same data into Zuplo's standard log and metrics
pipelines. This page explains why each event exists, the dimensions that scope
the data, and the operational questions the dashboard is built to answer.

## What the analytics are for

A platform team running an MCP Gateway usually wants to answer a small number of
recurring questions:

- Is the gateway healthy right now? What's the success rate, and where are the
  failures coming from — the gateway, the upstream, or the client?
- Which capabilities (tools, prompts, resources) are users actually exercising,
  and which are slow or error-prone?
- Who is using the gateway and how heavily?
- Did the upstream OAuth flow finish for the user who just complained, or did
  they hit a connect-required state nobody resolved?
- When latency went up, was it the gateway or the upstream?

The analytics event taxonomy is shaped so that an operator can answer each of
those questions without leaving the dashboard.

## The three event families

Every MCP analytics event belongs to one of three families. The split matters
because each family answers a different kind of question.

- **`mcp_request`** events fire at the route boundary. They record the
  acceptance or rejection of an inbound MCP request before any JSON-RPC routing
  happens — what authentication and authorization decisions the gateway made and
  why. These are the events that tell you "the gateway rejected this request"
  versus "the gateway accepted it and something downstream went wrong."
- **`capability_invocation`** events fire on every parsed JSON-RPC call. They
  record what the client asked for (the `mcpMethod` and `capabilityName`) and
  what happened — success, error, latency. This family feeds the
  top-capabilities tables and the per-tool error-rate views.
- **`auth_event`** events record the OAuth lifecycle: tokens issued and
  validated, consent approvals, upstream connections established, and token
  revocations. This family answers the "did the user actually finish OAuth"
  question.

Together the three families let an operator pivot from a failed tool call to the
OAuth event that issued the token, to the request boundary that accepted the
request, without leaving the analytics surface.
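
As a rough illustration of how the three families relate, the sketch below models them as a TypeScript discriminated union and buckets one user's events by family. The `family` and `eventType` field names and the `trailForSubject` helper are assumptions made for this example, not the gateway's actual event schema; the remaining fields are the ones named on this page.

```typescript
// Hypothetical event shapes for illustration only. `family` and `eventType`
// are assumed field names; they are not confirmed by this page.
type EventFamily = "mcp_request" | "capability_invocation" | "auth_event";

interface BaseEvent {
  eventType: string; // e.g. "mcp_request_rejected"
  subjectId?: string; // absent when the user is not yet known
}

interface McpRequestEvent extends BaseEvent {
  family: "mcp_request";
  reasonCode?: string; // set on rejection
}

interface CapabilityInvocationEvent extends BaseEvent {
  family: "capability_invocation";
  mcpMethod: string;
  capabilityName: string;
}

interface AuthEvent extends BaseEvent {
  family: "auth_event";
  authProfileId?: string;
}

type McpAnalyticsEvent =
  | McpRequestEvent
  | CapabilityInvocationEvent
  | AuthEvent;

// Collect one subject's events, bucketed by family, so a failed tool call can
// be read next to the auth and boundary events for the same user.
function trailForSubject(
  events: McpAnalyticsEvent[],
  subjectId: string,
): Record<EventFamily, McpAnalyticsEvent[]> {
  const trail: Record<EventFamily, McpAnalyticsEvent[]> = {
    mcp_request: [],
    capability_invocation: [],
    auth_event: [],
  };
  for (const e of events) {
    if (e.subjectId === subjectId) {
      trail[e.family].push(e);
    }
  }
  return trail;
}
```

Keying the union on a single literal field keeps family-specific fields such as `capabilityName` type-checked wherever the events are consumed.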

## Outcomes drive the chart colors

Every event carries an `outcome` value in one of seven classes: `success`,
`failure`, `denied`, `application_error`, `connect_required`, `partial`, and
`cancelled`. Outcome class drives chart colors and the success-rate KPI.
Failures break down further by `failureOrigin` (gateway, upstream, client) for
the failure-origin chart and KPI.

The split between `denied`, `application_error`, and `failure` matters: a 401 on
the route is a `denied`, an upstream returning an MCP-level error inside a 200
response is an `application_error`, and an actual operational failure (timeout,
network error, malformed response) is a `failure`. The operator sees the same
red chart slice in all three cases but can pivot to the right next question by
clicking the slice.
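
The three-way split described above can be sketched as a small classifier. This is an illustrative approximation, not the gateway's implementation: the `Observation` shape and its field names are hypothetical, and the remaining outcome classes (`connect_required`, `partial`, `cancelled`) are left out because this page does not describe when they apply.

```typescript
type Outcome =
  | "success"
  | "failure"
  | "denied"
  | "application_error"
  | "connect_required"
  | "partial"
  | "cancelled";

// Hypothetical summary of what was observed for one request.
interface Observation {
  rejectedAtRoute: boolean; // e.g. a 401 at the route boundary
  operationalFailure: boolean; // timeout, network error, malformed response
  mcpErrorIn200: boolean; // upstream returned an MCP-level error inside a 200
}

// Map an observation to one of the outcome classes discussed above.
function classifyOutcome(o: Observation): Outcome {
  if (o.rejectedAtRoute) return "denied";
  if (o.operationalFailure) return "failure";
  if (o.mcpErrorIn200) return "application_error";
  return "success";
}

// The three classes that the text above says share the red chart slice.
const RED_OUTCOMES: ReadonlySet<Outcome> = new Set<Outcome>([
  "denied",
  "application_error",
  "failure",
]);

function isRedSlice(outcome: Outcome): boolean {
  return RED_OUTCOMES.has(outcome);
}
```

Checking the route rejection first mirrors the order in which a request moves through the gateway: a request rejected at the boundary never reaches the upstream.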

## Dimensions you'll filter by

Each event carries the route and identity fields that scope it:

- `operationId` (surfaced as `virtualServerName`) — the route's identity
- `upstreamServerId` (surfaced as `upstreamServerName`) — the upstream's id
- `subjectId` — the authenticated user
- `authProfileId` and `upstreamAuthMode` — which OAuth surface produced the call
- `httpMethod`, `transport`, `mcpMethod`, `clientName` — protocol shape
- `latencyMs`, with the gateway and upstream slices when both halves are
  measured
- `reasonCode` and `errorType` on failure events — stable programmer-friendly
  strings like `missing_token`, `invalid_audience`, `connect_required`,
  `upstream_timeout`

Reason codes appear in both analytics events and the structured-log
counterparts, which lets a single string cross-reference the two data sources
when an operator is debugging.

The dashboard's drill-in model uses these dimensions. Clicking any value in a
breakdown table (a user, an upstream, a capability) scopes the entire dashboard
to that value; clicking again toggles the filter off. Multiple drill-ins compose
with AND semantics — clicking a user, then an upstream, then a capability type
narrows the view to that combination.
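
The drill-in behavior can be modeled as a filter map that is toggled per dimension and combined with AND semantics. The sketch below is a minimal approximation of that model, not the Portal's code. The `toggleFilter` and `matchesFilters` helpers are hypothetical, and replacing the value when a different value in the same dimension is clicked is an assumption.

```typescript
// A subset of the dimensions listed above, for illustration.
type Dimension = "subjectId" | "upstreamServerId" | "capabilityName";
type Filters = Partial<Record<Dimension, string>>;

// Clicking a value sets the filter; clicking the same value again clears it.
// Assumption: clicking a different value in the same dimension replaces it.
function toggleFilter(
  filters: Filters,
  dimension: Dimension,
  value: string,
): Filters {
  const next: Filters = { ...filters };
  if (next[dimension] === value) {
    delete next[dimension];
  } else {
    next[dimension] = value;
  }
  return next;
}

// AND semantics: an event is shown only if it matches every active filter.
function matchesFilters(
  event: Partial<Record<Dimension, string>>,
  filters: Filters,
): boolean {
  return (Object.keys(filters) as Dimension[]).every(
    (dimension) => event[dimension] === filters[dimension],
  );
}
```

With no active filters, `matchesFilters` returns `true` for every event, which corresponds to the unscoped dashboard.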

## How the dashboard answers each question

The Portal renders the MCP analytics in a fixed order so the layout doesn't
reshape when filters or time ranges change. The order maps to the recurring
operator questions, top to bottom:

| Panel                                                                    | Question it answers                                 | Key dimension                                  |
| ------------------------------------------------------------------------ | --------------------------------------------------- | ---------------------------------------------- |
| Headline cards — total events, success rate, p95 latency, failure origin | Is the gateway healthy right now?                   | `outcome`, `failureOrigin`, `latencyMs`        |
| Events Over Time                                                         | When did volume or errors change?                   | event family × `outcome` over time             |
| Top Capabilities (Most Calls / Most Errors / Slowest) + type filter      | What are users doing, and what's broken?            | `capabilityName`, capability type, `latencyMs` |
| Top Users                                                                | Who is using the gateway, and whose calls fail?     | `subjectId`                                    |
| Top MCP Routes + Top Upstream Servers                                    | Which route or upstream carries the traffic?        | `operationId`, `upstreamServerId`              |
| MCP Methods, Top Clients, Transport                                      | What protocol shape is flowing?                     | `mcpMethod`, `clientName`, `transport`         |
| JSON-RPC Error Codes + Failure Origins                                   | Is a failure ours, the upstream's, or the client's? | JSON-RPC error code, `failureOrigin`           |
| Top Reason Codes                                                         | What's the single most direct path into the logs?   | `reasonCode`                                   |

A few panels carry detail worth calling out:

- The **p95 latency** card splits gateway and upstream slices beneath the
  headline number — the fastest way to tell "the gateway is slow" from "the
  upstream is slow."
- **Events Over Time** always renders failure outcomes in red, so an error spike
  has a characteristic shape (a red bar in a previously-green window) that's
  easy to spot and click into.
- **Top Users** renders email-style subjects as the email — so
  `auth0|google-apps|alex@example.com` shows as `alex@example.com` — and shows
  other subject formats as-is.
- **Top Reason Codes** uses the same `reasonCode` values as the structured
  logs, so a code copied here cross-references directly into a log query.
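
The subject rendering in **Top Users** could be approximated as follows. The exact rule the Portal applies is not documented here, so this sketch assumes that the last `|`-separated segment is shown when it looks like an email address; the `displaySubject` helper and its pattern are hypothetical.

```typescript
// Loose email shape check: something@something.something, with no spaces
// or pipe characters. An assumption for this sketch, not a full validator.
const EMAIL_SHAPE = /^[^\s@|]+@[^\s@|]+\.[^\s@|]+$/;

// Show the trailing email segment when present; otherwise show the subject
// unchanged.
function displaySubject(subjectId: string): string {
  const lastSegment = subjectId.split("|").pop() ?? subjectId;
  return EMAIL_SHAPE.test(lastSegment) ? lastSegment : subjectId;
}

console.log(displaySubject("auth0|google-apps|alex@example.com")); // alex@example.com
console.log(displaySubject("auth0|12345")); // auth0|12345
```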

## Where to find it

The MCP analytics dashboard is a tab on the Zuplo Portal's **Analytics** page.
At the account scope,
[Analytics → MCP](https://portal.zuplo.com/+/account/analytics) aggregates
across every project on the account that has MCP routes. At the project scope,
[Analytics → MCP](https://portal.zuplo.com/+/account/project/analytics) shows
that project's events only.

The MCP tab appears automatically once any MCP request has been recorded for the
project. New projects show the empty state until the first MCP request lands.

## Reference: event types

The dashboard is built from a fixed set of event types. New types may be added
over time, but the families and outcome classes above stay stable.

### `mcp_request`

Boundary events at the MCP route. Examples include `mcp_request_accepted` and
`mcp_request_rejected`. Carries `operationId`, `subjectId` (when known),
`httpMethod`, `transport`, the `reasonCode` on rejection, and the `latencyMs`
spent at the boundary.

### `capability_invocation`

Per-capability events emitted by
[`McpProxyHandler`](../code-config/mcp-proxy-handler.mdx). Each invoked call
emits two events: an `mcp_capability_invoked` event before the upstream fetch
(carrying the parsed `mcpMethod` and `capabilityName`), and an
`mcp_capability_completed` event afterward (carrying `outcome`, `mcpStatus`,
`latencyMs`, and any JSON-RPC error details).
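
As an illustration of how completed events could roll up into the Top Capabilities views, the sketch below aggregates call counts, error counts, and total latency per `capabilityName`. It is an assumption-laden approximation: the `CapabilityCompleted` shape and `summarizeCapabilities` helper are hypothetical, and counting every non-`success` outcome as an error is a simplification that may not match the dashboard's definition.

```typescript
// Hypothetical projection of an `mcp_capability_completed` event.
interface CapabilityCompleted {
  capabilityName: string;
  outcome: string;
  latencyMs: number;
}

interface CapabilityStats {
  calls: number;
  errors: number;
  totalLatencyMs: number;
}

// Aggregate completed events per capability.
// Assumption: any outcome other than "success" counts as an error.
function summarizeCapabilities(
  events: CapabilityCompleted[],
): Record<string, CapabilityStats> {
  const stats: Record<string, CapabilityStats> = {};
  for (const e of events) {
    const entry = stats[e.capabilityName] ?? {
      calls: 0,
      errors: 0,
      totalLatencyMs: 0,
    };
    entry.calls += 1;
    if (e.outcome !== "success") {
      entry.errors += 1;
    }
    entry.totalLatencyMs += e.latencyMs;
    stats[e.capabilityName] = entry;
  }
  return stats;
}
```

Sorting the result by `calls`, by `errors`, or by `totalLatencyMs / calls` gives rough equivalents of the Most Calls, Most Errors, and Slowest orderings.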

### `auth_event`

OAuth and upstream-auth lifecycle events. Examples include
`mcp_auth_downstream_token_issued`, `mcp_auth_downstream_token_validated`,
`mcp_auth_upstream_connection_established`, and `mcp_auth_consent_approved`.
Carries the same identity fields as the other families when applicable, plus
`authProfileId` and `upstreamAuthMode`.

## Forwarding the underlying data

The same events that back the dashboard also flow through Zuplo's standard log
and metrics pipelines. Every event corresponds to a structured log entry; see
[Logging](./logging.mdx) for the MCP-specific log fields. Supported log
destinations include Datadog, AWS CloudWatch, Google Cloud Logging, Splunk, Sumo
Logic, New Relic, Loki, Dynatrace, and VMware Log Insight; see the
[Logging](../../articles/logging.mdx) article for the full list of destinations
and how to enable them.

For metrics, see the built-in
[metrics plugins](../../articles/metrics-plugins.mdx) (Datadog, Dynatrace, New
Relic, OpenTelemetry). The OpenTelemetry plugin specifically exports traces and
logs for the MCP request, every inbound policy, the handler, and the upstream
fetch.

## Related

- [Logging](./logging.mdx) — the structured-log counterpart, including the field
  model and OpenTelemetry export.
- [`McpProxyHandler` reference](../code-config/mcp-proxy-handler.mdx) — the
  handler whose capability instrumentation drives the dashboard's top-capability
  views.
- [Troubleshooting](../troubleshooting.mdx) — the operator playbook when an
  analytics chart shows something concerning.
