
ml-evaluate

station__ml-pipeline__ml-evaluate · external (needs EXECUTION_BACKEND_URL) · ml-pipeline · non-pv · Constructa · Configa

Evaluate a trained ML model on held-out test data. Returns AUC-ROC, precision, recall, F1 score, accuracy, and the confusion matrix (TN/FP/FN/TP).
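As a rough illustration of how the threshold-based metrics relate to the confusion matrix, the sketch below applies the standard textbook definitions to the four counts. The function name `metrics_from_confusion` and the choice to return 0.0 when a denominator is zero are assumptions made for this example; the tool's own implementation may handle these cases differently. AUC-ROC is not included because it depends on the model's scores, not on a single confusion matrix.

```python
def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration only; zero denominators map to 0.0
    (an assumption, not documented behavior of ml-evaluate).
    """
    total = tn + fp + fn + tp
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Accuracy: share of all samples classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Made-up counts, purely to exercise the function.
    print(metrics_from_confusion(tn=50, fp=10, fn=5, tp=35))
```

Recomputing these values from the returned TN/FP/FN/TP counts can serve as a quick consistency check on a response, provided the tool uses the same definitions.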

Taxonomy

Linnaean classification joined from the algovigilance taxonomy index via the parent config's rank.

Rank	Value
domain	`Substrata`
kingdom	`Constructa`
phylum	`Configa`
class	`station-config`
order	`ml`
family	`mcp-tool-config`

Characteristics:

substrate: config
domain: pv
lifecycle: continuous
authority: read
compounding: producer
io: agent-request → tool-response

Input schema

model_id (string, required): Model ID to evaluate.
test_samples (array, required): Test samples with labels.

Example call

POST /api/mcp
Content-Type: application/json

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "station__ml-pipeline__ml-evaluate",
    "arguments": {
      "model_id": "",
      "test_samples": []
    }
  }
}

How to invoke from a client

From any MCP-aware client, add https://algovigilance.com/api/mcp as an MCP server, then call this tool by name. From a raw HTTP client, send the JSON-RPC body above to /api/mcp.
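For the raw HTTP route, a minimal Python sketch using only the standard library might look like the following. The helper names (`build_payload`, `build_request`, `call_tool`) are hypothetical, and the sketch assumes the endpoint accepts a plain JSON POST with no extra headers, authentication, or session setup, none of which is documented on this page. The structure of each `test_samples` item and of the response is likewise not specified here, so both are passed through untouched.

```python
import json
import urllib.request

# URL and tool name are taken from this page.
MCP_URL = "https://algovigilance.com/api/mcp"
TOOL_NAME = "station__ml-pipeline__ml-evaluate"


def build_payload(model_id: str, test_samples: list, request_id: int = 1) -> dict:
    """Build the JSON-RPC 2.0 body for a tools/call request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": TOOL_NAME,
            "arguments": {
                "model_id": model_id,
                # Item shape is not documented on this page; passed as-is.
                "test_samples": test_samples,
            },
        },
    }


def build_request(payload: dict, url: str = MCP_URL) -> urllib.request.Request:
    """Wrap the payload in a POST request with a JSON content type."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(payload: dict, url: str = MCP_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON reply.

    Assumes the server replies with a single JSON document; the response
    schema is not described on this page.
    """
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `call_tool(build_payload("<your model id>", samples))` would then send the request; keeping payload construction separate from sending makes it easy to inspect or log the body first.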

Agent-friendly formats

Working inside an AI assistant? Use the Copy for AI button at the top of this page (or view the raw Markdown) to paste a clean, token-budgeted version of this tool's contract into your conversation.

All tools (3059 live)
/api/mcp — endpoint
/AGENTS.md — agent guide
/tools/ml-pipeline__ml-evaluate/raw.md — this page's Markdown twin