TOOL-001 — API-doc → Sales-play Translator

Tier 3 Specialist Tool · Stateless · Reads API documentation, produces sales play candidates for technical buyers · Closes Domain 3 gap from v26 architecture eval
Tier 3 · Tool · Specced · v29 Domain 3 · API/Dev GTM · Sonnet
Purpose

Reads a product's API documentation and produces 1–3 candidate sales play definitions for technical buyer personas. The output lands in SalesPlayLibrary with state draft; humans pick up from there to co-define and approve. The tool's job is to translate API surface area into buyer-relevant language — not to invent positioning, just to surface the use cases the API enables and frame them in a way a sales team can actually run with.

Closes the Domain 3 gap (20% coverage in v26 eval). Today no service or agent can read API docs and produce plays for technical buyers; this is an LLM-shaped task that no spec-driven function can encode.
Input schema
{ "api_doc_input": { "type": "openapi_spec" | "markdown_url" | "raw_markdown", "content": "...", // OpenAPI JSON, URL, or raw markdown "doc_version": "string", // e.g., "v2.4.0" "doc_publication_date": "ISO 8601" }, "context": { "product_family": "string", // e.g., "background-checks", "kyc", "verifications" "current_icp_summary": "string", // 2-3 sentence summary of current ICP from AGT-201 "current_active_plays": [ // current play context to avoid duplication { "play_id": "uuid", "name": "string", "segment": "string" } ], "target_buyer_persona_hint": "string" // optional — "developer", "platform_team", "compliance_engineer", etc. }, "constraints": { "max_plays_to_propose": 3, // hard cap; tool never returns more "include_segment": "string" // optional — restrict proposals to one segment } }
Input is validated by the calling agent before invocation. Malformed input results in tool rejection with a structured error response — the tool never silently coerces.
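A minimal sketch of that caller-side contract, assuming Pydantic for validation; the field names mirror the input schema above, while the class names and the rejection shape are illustrative.

```python
# Hypothetical caller-side validation sketch: field names mirror the input
# schema above; class names and the rejection shape are illustrative.
from typing import Literal, Optional
from pydantic import BaseModel, Field, ValidationError

class ToolRejection(Exception):
    """Structured rejection returned to the calling agent -- never a silent coercion."""
    def __init__(self, detail: dict):
        self.detail = detail
        super().__init__(detail["error"])

class ApiDocInput(BaseModel):
    type: Literal["openapi_spec", "markdown_url", "raw_markdown"]
    content: str
    doc_version: str            # e.g., "v2.4.0"
    doc_publication_date: str   # ISO 8601

class PlayRef(BaseModel):
    play_id: str
    name: str
    segment: str

class Context(BaseModel):
    product_family: str
    current_icp_summary: str
    current_active_plays: list[PlayRef]
    target_buyer_persona_hint: Optional[str] = None

class Constraints(BaseModel):
    max_plays_to_propose: int = Field(default=3, ge=1, le=3)  # hard cap
    include_segment: Optional[str] = None

class Tool001Input(BaseModel):
    api_doc_input: ApiDocInput
    context: Context
    constraints: Constraints

def validate_or_reject(payload: dict) -> Tool001Input:
    """Validate a raw payload; reject malformed input with a structured error."""
    try:
        return Tool001Input.model_validate(payload)
    except ValidationError as e:
        raise ToolRejection({"error": "invalid_input", "details": e.errors()})
```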
Output schema
{ "tool_call_id": "uuid", "candidate_plays": [ { "name": "string", // human-readable play name "hypothesis": "string", // 2-3 sentence thesis "target_buyer_persona": "string", // dev / platform / compliance / etc. "api_capabilities_referenced": [ // which endpoints/capabilities the play depends on { "capability": "string", "doc_section": "string" } ], "target_definition": { "icp_signals": ["string"], // observable signals: tech stack, job postings, etc. "lifecycle_stage_fit": "string" // pre-trial, post-trial-stalled, mid-implementation, etc. }, "suggested_cadence_outline": { // not finished cadence — outline only "channel_mix": ["email", "linkedin", "developer_event"], "touch_count_estimate": 6, "notable_assets_needed": ["string"] }, "success_criteria_outline": { "primary_metric": "string", // e.g., "API key activation within 30 days" "qualifying_signal": "string" // what would tell us the play is working }, "confidence_self_rating": "high" | "medium" | "exploratory", "ungrounded_assumptions": ["string"] // explicit list of assumptions NOT backed by API docs } ], "input_doc_summary": "string", // 2-sentence summary of what the docs cover "capabilities_not_translated": ["string"], // capabilities the tool noticed but didn't propose plays for, with reasons "tool_metadata": { "model": "claude-sonnet-4-6", "input_tokens": 0, "output_tokens": 0, "cost_usd_estimate": 0.0, "latency_ms": 0 } }
Hard rule: ungrounded_assumptions must be populated for every candidate play. The tool cannot claim a play is grounded in API docs when its real basis is general market knowledge. This separation is what makes the output usable downstream — humans can trust the API-grounded parts and treat the assumption parts as starting hypotheses.
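A possible enforcement point for that rule, reusing the ToolRejection shape from the validation sketch above; treating a missing or non-list field as a schema violation is the assumption here.

```python
# Hypothetical post-generation guard: every candidate play must carry an
# explicit ungrounded_assumptions list before the output leaves the tool.
def enforce_assumption_disclosure(output: dict) -> dict:
    for play in output.get("candidate_plays", []):
        if not isinstance(play.get("ungrounded_assumptions"), list):
            raise ToolRejection({
                "error": "schema_violation",
                "details": f"play {play.get('name')!r} is missing ungrounded_assumptions",
            })
    return output
```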
Called by
| Caller | Invocation context |
| --- | --- |
| AGT-901 Pipeline Brain | "What plays does the new product API enable?" Usually after a product launch or major API surface change. Brain calls TOOL-001, then optionally chains to TOOL-003 (Sales Play Composer) to refine the most promising candidate into a fully structured SalesPlayLibrary draft. |
| AGT-902 Account Brain | "For Account X (uses our API heavily, just upgraded their tech stack), what API-anchored plays could land?" Account-specific variant. Less common than AGT-901 invocation but supported. |
| RevOps direct (workspace UI) | RevOps drops a new product API doc into the workspace, calls TOOL-001 with constraints, reviews the candidates. Direct invocation is supported and logged the same way as agent-mediated calls. |
Prompt design principles
  1. Ground in the docs, separate the assumptions. The tool's prompt explicitly instructs the model to separate API-grounded claims from market/positioning assumptions. The output's ungrounded_assumptions field is non-negotiable.
  2. Buyer language, not feature language. Output describes what the buyer can do, not what the API endpoint does. "Verify identities for high-volume gig platform onboarding" not "POST /verifications endpoint accepts batch input."
  3. Refuse if docs are too thin. If the input API documentation doesn't contain enough material to ground at least one play, the tool returns 0 candidates with a structured "insufficient_input_signal" reason rather than fabricating plays.
  4. Don't propose what already exists. The tool checks its proposals against the current_active_plays input, flags overlapping candidates, and de-duplicates against the existing active set (see the sketch after this list).
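A rough sketch of that overlap check; fuzzy name matching with difflib is a stand-in, and a production pass might also compare hypothesis text or referenced capabilities.

```python
# Illustrative de-duplication pass for principle 4 -- name similarity alone
# is a stand-in for a richer overlap signal.
from difflib import SequenceMatcher

def overlaps(candidate_name: str, active_name: str, threshold: float = 0.85) -> bool:
    """Fuzzy match a proposed play name against an existing active play."""
    ratio = SequenceMatcher(None, candidate_name.lower(), active_name.lower()).ratio()
    return ratio >= threshold

def dedupe_candidates(candidates: list[dict],
                      current_active_plays: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split proposals into kept plays and overlaps flagged against the active set."""
    kept, flagged = [], []
    for c in candidates:
        if any(overlaps(c["name"], a["name"]) for a in current_active_plays):
            flagged.append(c)  # surfaced to the caller, not silently dropped
        else:
            kept.append(c)
    return kept, flagged
```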
Cost ceiling
| Constraint | Value |
| --- | --- |
| Per-call input budget | 50K tokens (API doc may be substantial; OpenAPI specs can be 30K+ tokens) |
| Per-call output budget | 5K tokens (candidate plays + metadata) |
| Default model | Sonnet. Synthesis-heavy task; Haiku tested but quality below acceptable. |
| Per-call cost estimate | ~$0.20–$0.30 per call at Sonnet pricing |
| Monthly cap (default) | $300/mo, bounding usage to ~1,000 calls/month (see the guard sketch below) |
| Frequency expectation | Low. Product launches and major API changes are infrequent; most months will see < 50 calls. |
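The cap arithmetic is direct: at the upper per-call estimate of $0.30, $300/mo bounds usage to roughly 1,000 calls. A pre-call guard along these lines (names hypothetical) would hold that line:

```python
# Hypothetical monthly budget guard -- $300 / $0.30 per call ~= 1,000 calls.
MONTHLY_CAP_USD = 300.00
PER_CALL_ESTIMATE_USD = 0.30  # upper end of the Sonnet estimate above

def can_invoke(month_to_date_spend_usd: float) -> bool:
    """Block the call if its estimated cost would breach the monthly cap."""
    return month_to_date_spend_usd + PER_CALL_ESTIMATE_USD <= MONTHLY_CAP_USD
```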
Eval criteria
| Criterion | Measurement | Pass threshold |
| --- | --- | --- |
| Schema compliance | Output validates against the output schema | 100% (hard) |
| API grounding | For each candidate play, % of api_capabilities_referenced items that map to a real endpoint/capability in the input docs (manual reviewer check) | ≥ 95% |
| Assumption disclosure | % of plays where ungrounded_assumptions is non-empty when the play extends beyond API docs (manual reviewer check) | 100% (hard) when extension exists |
| Hallucinated capability rate | % of plays referencing API capabilities the docs don't actually have | 0% (hard); any hallucinated capability is an automatic eval fail |
| Promotion rate (operational) | % of TOOL-001-generated drafts that survive co-definition to active | ≥ 25% |
| P95 latency | End-to-end tool call | ≤ 8s |
The eval suite comprises 8 retrospective scenarios: 4 historical product launches where we know which plays worked, 2 API documentation samples with known capability gaps, and 2 edge cases (very thin docs, very deep docs). It is scored alongside the brain harness on the same cadence.
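The API-grounding and hallucinated-capability criteria are scored by manual reviewer check; an automated pre-screen such as the following is an assumption, with doc_capabilities standing in for whatever identifiers a parser extracts from the input spec.

```python
# Hypothetical pre-screen for the grounding criteria -- field names follow
# the output schema; doc_capabilities comes from parsing the input docs.
def score_grounding(candidate_plays: list[dict], doc_capabilities: set[str]) -> dict:
    total_refs = grounded_refs = hallucinating_plays = 0
    for play in candidate_plays:
        refs = play.get("api_capabilities_referenced", [])
        hits = sum(1 for r in refs if r["capability"] in doc_capabilities)
        total_refs += len(refs)
        grounded_refs += hits
        if hits < len(refs):           # any reference the docs can't back
            hallucinating_plays += 1   # counts against the 0% hard threshold
    return {
        "api_grounding_pct": 100.0 * grounded_refs / total_refs if total_refs else 0.0,
        "hallucinated_play_rate_pct": (
            100.0 * hallucinating_plays / len(candidate_plays) if candidate_plays else 0.0
        ),
    }
```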
Failure modes
| Symptom | Cause | Action |
| --- | --- | --- |
| Tool fabricates a capability the API doesn't have | Model hallucination on thin docs, or general market-knowledge bleed | Hard fail in eval. Tighten the prompt to explicitly require capability citations against doc sections. If chronic, hold on Sonnet; Opus tested as fallback. |
| Tool returns 0 plays consistently | Refusing too aggressively, or input docs systematically too thin | Audit input docs against refusal reasons. If the refusal is correct, the gap is in product docs, not the tool. Otherwise tune the refusal threshold. |
| Output plays look plausible but never get promoted | Plays grounded in capabilities but disconnected from real buyer pain | Tune the context.current_icp_summary input to give the tool stronger ICP grounding. Refresh the prompt with examples of plays that did get promoted. |
| P95 latency creep | Input docs growing in size; tool processing larger contexts | Implement an input chunking strategy: summarize large OpenAPI specs down to relevant subsections before invocation. The budget cap blocks runaway calls. |
| Cost spiking | Operator running it repeatedly during exploratory sessions, no caching | Enable prompt caching at the workspace level for the system prompt + ICP context, so operator iteration on the same product family hits the cache (see the caching sketch below). |
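For the cost-spiking row, one shape the caching fix could take using the Anthropic SDK's cache_control blocks; system_prompt, icp_context, and api_doc_text are placeholders, and the model string is taken from tool_metadata above.

```python
# Sketch of prompt caching for repeated operator runs -- the stable system
# prompt and per-product-family ICP context carry cache_control markers, so
# iterating on the same product family reuses the cached prefix.
import anthropic

client = anthropic.Anthropic()

def call_tool_001(system_prompt: str, icp_context: str, api_doc_text: str):
    return client.messages.create(
        model="claude-sonnet-4-6",   # model string from tool_metadata above
        max_tokens=4096,
        system=[
            {"type": "text", "text": system_prompt,
             "cache_control": {"type": "ephemeral"}},   # stable across calls
            {"type": "text", "text": icp_context,
             "cache_control": {"type": "ephemeral"}},   # stable per product family
        ],
        messages=[{"role": "user", "content": api_doc_text}],  # varies per call
    )
```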
Source-trace integration

When TOOL-001 is called by AGT-901 or AGT-902, the calling agent's BrainAnalysisLog row captures the tool invocation: tool_call_id, input doc reference, output candidate count, cost. The candidate plays drafted into SalesPlayLibrary inherit the brain's proposal_id for cohort retrospective lineage. Operator-direct calls (no brain) write a workspace audit record but no SalesPlayLibrary draft until the operator explicitly accepts.
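A sketch of the lineage write for an agent-mediated call; the table and column names mirror the prose above but are not a confirmed schema.

```python
# Illustrative BrainAnalysisLog write -- column names mirror the prose above.
from typing import Any, Optional

def record_tool_invocation(brain_log: Any, *, tool_call_id: str, input_doc_ref: str,
                           candidate_count: int, cost_usd: float,
                           proposal_id: Optional[str]) -> None:
    """Capture the tool invocation on the calling brain's analysis log row."""
    brain_log.insert({
        "tool": "TOOL-001",
        "tool_call_id": tool_call_id,
        "input_doc_reference": input_doc_ref,
        "output_candidate_count": candidate_count,
        "cost_usd": cost_usd,
        "proposal_id": proposal_id,  # inherited by SalesPlayLibrary drafts; None for operator-direct calls
    })
```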