- Discover workflows that are breaking your server and get actionable ways to resolve them.
- Benchmark your server’s performance and catch regressions in future changes.
- Programmatically test queries on an MCP server with a single command. No more doing QA one query at a time.
E2E testing (beta)
We built a CLI that performs MCP evals and end-to-end (E2E) testing. The CLI simulates an end user’s environment and tests popular user flows. An example E2E test for the PayPal MCP server:
- Connect the PayPal MCP server to a testing agent. To simulate Claude Desktop, we can configure the agent to use a Claude model with a default system prompt.
- Give the agent a typical user query, such as “Create a refund for order ID 412”.
- Let the testing agent run the query.
- Check the testing agent’s trace and make sure that it called the `create_refund` tool and successfully created a refund.
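
The final check in the flow above can be sketched in a few lines of Python. This is an illustration only, not the CLI's actual implementation or API: the `ToolCall` and `E2ECase` types, the `check_trace` helper, and the `get_order` tool name are all hypothetical, and the trace is written by hand rather than produced by a real agent run.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCall:
    """One tool invocation recorded in the agent's trace (hypothetical shape)."""
    name: str
    arguments: dict[str, Any]
    is_error: bool = False


@dataclass
class E2ECase:
    """A single end-to-end expectation: a user query and the tool it should trigger."""
    query: str
    expected_tool: str


def check_trace(case: E2ECase, trace: list[ToolCall]) -> tuple[bool, str]:
    """Pass if the expected tool was called at least once without an error."""
    calls = [c for c in trace if c.name == case.expected_tool]
    if not calls:
        called = ", ".join(c.name for c in trace) or "no tools"
        return False, f"expected {case.expected_tool}, agent called {called}"
    if all(c.is_error for c in calls):
        return False, f"{case.expected_tool} was called but every call failed"
    return True, "ok"


# Stand-in for a real agent run: a hand-written trace, for illustration only.
case = E2ECase(query="Create a refund for order ID 412", expected_tool="create_refund")
trace = [
    ToolCall("get_order", {"order_id": "412"}),      # hypothetical lookup tool
    ToolCall("create_refund", {"order_id": "412"}),  # the call the test expects
]
passed, reason = check_trace(case, trace)
print(passed, reason)
```

Checking the trace rather than the agent's final message keeps the assertion tied to what the server was actually asked to do, so a reply that merely claims a refund was created would not pass.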