MCP Evals (Beta)

We are working on a GUI for MCP evals. The feature is a work in progress and still incomplete. For now, we strongly recommend using the MCPJam Evals CLI.

MCP Evals CLI

Run MCP evals from the command line. We are maturing the CLI before building the GUI.

MCPJam Evals CLI

Start testing your MCP servers immediately

How MCP E2E Testing Works

E2E testing simulates real user workflows by testing complete chains of interactions. For MCP servers, this means testing how they work when used by actual LLMs and agents in real-world scenarios.

Why Test MCP Servers Differently?

APIs are consumed by other APIs or web clients
MCP servers are consumed by LLMs and agents (like Claude Desktop, Cursor)
Tests therefore need to simulate the actual user environment in which MCP servers operate

How It Works

Setup: Connect your MCP server to a testing agent
Simulate: Configure the agent to behave like a real user (e.g., Claude Desktop)
Test: Have the agent run realistic user queries
Verify: Check the agent’s trace to confirm correct tool usage
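
The four steps above can be sketched as a small harness. This is only an illustration and not how the MCPJam Evals CLI is implemented: `ToolCall`, `EvalCase`, `run_eval`, and `stub_agent` are hypothetical names, and the stub stands in for a real LLM-backed agent connected to your MCP server.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToolCall:
    """One entry in the agent's trace (hypothetical shape)."""
    name: str
    arguments: dict = field(default_factory=dict)


@dataclass
class EvalCase:
    """A user query plus the tools we expect the agent to call."""
    query: str
    expected_tools: List[str]


# An agent is modelled as a function from a query to a trace of tool calls.
Agent = Callable[[str], List[ToolCall]]


def run_eval(agent: Agent, case: EvalCase) -> bool:
    """Test: run the query. Verify: every expected tool appears in the trace."""
    trace = agent(case.query)
    called = {call.name for call in trace}
    return all(tool in called for tool in case.expected_tools)


def stub_agent(query: str) -> List[ToolCall]:
    """Stand-in for a real agent; always calls a made-up `search` tool."""
    return [ToolCall(name="search", arguments={"q": query})]


# The stub calls `search`, so the first case passes and the second fails.
passed = run_eval(stub_agent, EvalCase("Find docs on refunds", ["search"]))
failed = run_eval(stub_agent, EvalCase("Find docs on refunds", ["create_refund"]))
```

Modelling the agent as a plain function keeps the verification step independent of whichever LLM or client is being simulated.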

Example: PayPal MCP Test

1. Connect PayPal MCP server to testing agent
2. Ask: "Create a refund for order ID 412"
3. Agent runs the query using MCP tools
4. Verify: Did it call create_refund tool successfully?

An LLM judge can analyze the agent’s trace to determine if the test passed.
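
For a check as simple as step 4, a deterministic assertion on the trace may be enough before involving an LLM judge. The sketch below assumes a hypothetical trace format (a list of dicts with `tool`, `arguments`, and `error` keys); the real trace produced by an agent or by the MCPJam Evals CLI may look different.

```python
from typing import Any, Dict, List

Trace = List[Dict[str, Any]]

# Hypothetical trace for the query "Create a refund for order ID 412".
# The field names are assumptions made for this illustration.
ok_trace: Trace = [
    {"tool": "create_refund", "arguments": {"order_id": "412"}, "error": None},
]

# Same call, but the tool reported an error.
bad_trace: Trace = [
    {"tool": "create_refund", "arguments": {"order_id": "412"}, "error": "timeout"},
]


def tool_called_successfully(trace: Trace, tool_name: str) -> bool:
    """Return True if `tool_name` was called at least once without an error."""
    return any(
        step["tool"] == tool_name and step["error"] is None
        for step in trace
    )


ok_result = tool_called_successfully(ok_trace, "create_refund")
bad_result = tool_called_successfully(bad_trace, "create_refund")
```

A rule like this only covers exact tool names and error status; judging whether the agent's overall behavior was reasonable is where an LLM judge would come in.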
