This page covers tests from the formal specification. See Output Schema for how generated schemas are used.
Purpose
- Learning instrument — a developer or AI agent should be able to understand the tool’s capabilities from the tests alone
- Output schema source — tests provide parameter values for real API calls, and captured responses are fed into the OutputSchemaGenerator
Tool Test Format
Tests are defined as an array inside each tool definition, alongside method, path, description, and parameters:
The tests array is part of the main tool block and must be JSON-serializable.
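A minimal sketch of a tool definition carrying a tests array. The tool name, path, and parameter keys (chainId, contractAddress) are hypothetical; the values follow the page’s own guidance (public USDC contract addresses, standard chain IDs 1 and 137):

```javascript
// Hypothetical tool definition — only method, path, description,
// parameters, and tests come from the spec; the rest is illustrative.
const getTokenBalance = {
    method: 'GET',
    path: '/v1/token/balance',
    description: 'Fetch the balance of a token contract',
    parameters: [ /* parameter definitions elided */ ],
    tests: [
        {
            _description: 'Balance of the USDC contract on Ethereum mainnet',
            chainId: 1,
            contractAddress: '0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48'
        },
        {
            _description: 'Same query on Polygon to demonstrate another chain',
            chainId: 137,
            contractAddress: '0x2791bca1f2de4661ed88a30c99a7a9449aa84174'
        }
    ]
}
```

Note that every test object survives a JSON round trip, which the TST005 rule below requires.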
Resource Query Tests
Resources can also have tests. Since resources use SQL queries instead of HTTP requests, test values correspond to query parameters:
Test Fields
| Field | Type | Required | Description |
|---|---|---|---|
| _description | string | Yes | What this specific test demonstrates |
| {paramKey} | matches parameter type | Yes (per required param) | Value for each {{USER_PARAM}} parameter (tools) or query parameter (resources) |
Writing Good Descriptions
Parameter Values
For tools: each {{USER_PARAM}} parameter’s position.key becomes a key in the test object. The value must pass the parameter’s z validation. Optional parameters may be omitted. Fixed and server parameters are never included.
For resources: each query parameter’s key becomes a key in the test object. The value must match the declared type.
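A sketch of a resource test under these rules. The resource shape, the SQL, and the parameter keys (fromBlock, toBlock) are assumptions for illustration; the one constraint taken from the spec is that each query parameter’s key appears as a key in the test object, alongside _description:

```javascript
// Hypothetical resource definition — query text and field names are illustrative.
const blocksByRange = {
    query: 'SELECT number, hash FROM blocks WHERE number BETWEEN :fromBlock AND :toBlock',
    parameters: { fromBlock: 'number', toBlock: 'number' },
    tests: [
        {
            _description: 'Small historical block range so results stay reproducible',
            fromBlock: 17000000,
            toBlock: 17000010
        }
    ]
}
```

A fixed historical range also follows the reproducibility principle below: the result set does not change as the chain grows.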
Design Principles
1. Express the Breadth
Tests should cover the range of what is possible:
2. Teach Through Examples
Each test should teach one capability or variation:
3. No Personal Data
| Allowed | Not Allowed |
|---|---|
| Public smart contract addresses | Private wallet addresses |
| Well-known token contracts (USDC, WETH) | Personal wallet addresses |
| Public blockchain data (block numbers, tx hashes) | Email addresses, names |
| Standard chain IDs (1, 137, 42161) | API keys, tokens, passwords |
4. Reproducible Results
Prefer well-established tokens/contracts over newly deployed ones, and historical data queries over latest-block queries when possible. The response structure should remain stable.
Test Count Guidelines
| Scenario | Minimum | Recommended |
|---|---|---|
| No parameters | 1 | 1 |
| 1-2 parameters | 1 | 2-3 |
| Enum/chain parameters | 1 | 2-4 (different enum values) |
| Multiple optional parameters | 1 | 2-3 (with/without optionals) |
| Resource queries | 1 | 1-2 |
Response Capture Lifecycle
Execute API Call
Construct the full request and execute. A delay between calls (default: 1s) prevents rate limiting.
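The sequential-execution-with-delay step above can be sketched as follows. The runner shape and the `executeTest` callback are assumptions; the 1s default delay comes from this page:

```javascript
// Sketch: run a tool's tests sequentially, pausing between calls
// (default 1000 ms) to avoid rate limiting. `executeTest` stands in
// for the real request construction and execution.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

async function runTests({ tests, executeTest, delayMs = 1000 }) {
    const responses = []
    for (const test of tests) {
        responses.push(await executeTest(test)) // full request for this test
        await sleep(delayMs)                    // delay before the next call
    }
    return responses
}
```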
Record Response
Store full response with metadata (namespace, toolName, testIndex, timestamp, responseTime).
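A sketch of the stored record. The metadata field names (namespace, toolName, testIndex, timestamp, responseTime) come from this page; the function shape and exact layout are assumptions:

```javascript
// Build one captured-response record with the metadata fields listed above.
// Exact on-disk layout is an assumption; field names follow the spec page.
function captureRecord({ namespace, toolName, testIndex, response, responseTime }) {
    return {
        namespace,
        toolName,
        testIndex,
        timestamp: new Date().toISOString(), // when the call was executed
        responseTime,                        // duration of this call
        response                             // full response, later analyzed for the schema
    }
}
```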
Generate Output Schema
The OutputSchemaGenerator analyzes the response.data structure and produces a schema definition.
Captured Response Format
Capture File Structure
Test Execution Modes
| Mode | Description | Use Case |
|---|---|---|
| Capture | Execute against real API, store responses | Schema development, output schema generation |
| Validation | Execute and compare against declared output.schema | Verify schemas remain accurate over time |
| Dry-Run | Validate test definitions without API calls | During flowmcp validate |
Validation Rules
| Code | Severity | Rule |
|---|---|---|
| TST001 | error | Each tool must have at least 1 test |
| TST002 | error | Each test must have a _description field of type string |
| TST003 | error | Each test must provide values for all required {{USER_PARAM}} parameters |
| TST004 | error | Test parameter values must pass the corresponding z validation |
| TST005 | error | Test objects must be JSON-serializable |
| TST006 | error | Test objects must only contain keys matching {{USER_PARAM}} parameter keys or _description |
| TST007 | warning | Tools with enum parameters should test multiple enum values |
| TST008 | info | Consider adding tests that demonstrate optional parameter usage |
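The rules above can be sketched as a validator. This covers only TST001, TST002, and TST005; the function shape and findings format are assumptions, while the codes and severities follow the table:

```javascript
// Minimal sketch of a test-definition validator for TST001, TST002, TST005.
// Finding shape is illustrative; codes and severities come from the table above.
function validateTests({ toolName, tests }) {
    const findings = []
    if (!Array.isArray(tests) || tests.length === 0) {
        // TST001: each tool must have at least one test
        findings.push({ code: 'TST001', severity: 'error', toolName })
        return findings
    }
    tests.forEach((test, index) => {
        if (typeof test._description !== 'string') {
            // TST002: _description must be a string
            findings.push({ code: 'TST002', severity: 'error', toolName, index })
        }
        try {
            // TST005: JSON.stringify throws on circular references and BigInt values
            JSON.stringify(test)
        } catch {
            findings.push({ code: 'TST005', severity: 'error', toolName, index })
        }
    })
    return findings
}
```

Note that `JSON.stringify` silently drops functions and `undefined` rather than throwing, so a stricter TST005 check would compare the object against its round-tripped form.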