TOOL-003 — Sales Play Composer

Tier 3 Specialist Tool · Stateless · Composes structured sales play definitions from a hypothesis + Tier 1 brain-ready views · Closes Domain 1's "create plays, not just execute" gap
Tier 3 · Tool Specced · v29 Domain 1 · Pipeline Generation Sonnet
Purpose

Given a hypothesis (e.g., "MM healthcare under-covered, vertical-specific play needed") and the relevant Tier 1 brain-ready views, TOOL-003 composes a fully structured play definition: target_definition, suggested_cadence, success_criteria. Output lands in SalesPlayLibrary with state draft, ready for human pickup into the co-definition workflow. This is the lever that turns a brain proposal from "we should do something here" into "here's the candidate play, refine and approve."

Closes the Domain 1 gap from v26 eval ("create plays, not just execute"). The Brain Agents identify gaps and frame hypotheses; TOOL-003 turns hypotheses into structured drafts that match SalesPlayLibrary's schema and AGT-302's execution contract.
Input schema
{ "hypothesis": "string", // 2-3 sentence thesis from the calling brain "scope": "segment" | "account_specific", "segment": "string", // required when scope=segment "account_id": "uuid", // required when scope=account_specific "context_views": { // brain-ready view snapshots from the caller "icp_profile_summary": "string", // current ICP profile for the segment "win_loss_pattern": "string", // trailing pattern from WinLossLog "account_priority_distribution": "string", "current_active_plays": [ // existing plays to avoid duplication { "play_id": "uuid", "name": "string", "hypothesis": "string" } ], "voc_themes": ["string"] // optional VoC themes if relevant }, "constraints": { "must_use_existing_lever_only": true, // hard: no inventing new agent capabilities "max_touch_count": 10, // hard: cadence cannot exceed AGT-302's existing limits "anchor_to_tool_001_output": "uuid" // optional — chain from a TOOL-001 candidate }, "originating_proposal_id": "uuid" // from calling brain's BrainAnalysisLog }
The must_use_existing_lever_only = true constraint is non-negotiable. TOOL-003 never proposes a play that requires new agent capabilities, new schema, or new approval flows. Plays must compose from existing AGT-201 / 203 / 301 / 302 / 303 / 305 levers.
Output schema
{ "tool_call_id": "uuid", "draft_play": { "name": "string", // human-readable play name (under 80 chars) "hypothesis": "string", // refined from input, may be edited "scope": "segment" | "account_specific", "segment": "string", // populated per scope "account_id": "uuid", // populated per scope "target_definition": { "icp_signals": ["string"], "lifecycle_stage_fit": "string", "exclusion_criteria": ["string"] // who's out of scope despite matching primary }, "suggested_cadence": { "cadence_type": "abm" | "standard" | "nurture" | "vertical-specific", "channel_mix": ["email","linkedin","phone","event","content"], "touch_count": 0, // bounded by max_touch_count "step_outline": [ // ordered cadence outline { "step": 1, "channel": "string", "trigger": "string", "asset_needed": "string" } ], "agt_302_template_reference": "string" // existing template to extend, if applicable }, "success_criteria": { "primary_metric": "string", // single primary success metric "primary_metric_target": "string", "secondary_metrics": ["string"], "qualifying_signal": "string", // what tells us early it's working "retire_signal": "string" // what would cause us to retire it }, "originating_proposal_id": "uuid" // inherited from input }, "lever_grounding": [ // every play element must trace to existing levers { "play_element": "string", "agent_lever": "string", "spec_section": "string" } ], "duplication_check": { "checked_against_count": 0, // active plays compared "overlap_flagged": false, "overlapping_play_id": "uuid", // if overlap_flagged "differentiation_rationale": "string" // if overlap exists, why this is still distinct }, "ungrounded_assumptions": ["string"], // explicit, parallels TOOL-001 and TOOL-002 "self_assessed_promotion_likelihood": "high" | "medium" | "low", "tool_metadata": { "model": "claude-sonnet-4-6", "input_tokens": 0, "output_tokens": 0, "cost_usd_estimate": 0.0, "latency_ms": 0 } }
Hard rule: lever_grounding must cover every element of the play. No element of the suggested cadence, target definition, or success criteria can exist without a citation to an existing agent's spec section. If the tool can't ground an element, it omits the element rather than fabricating it.
Called by
CallerInvocation context
AGT-901 Pipeline BrainMost common caller. Brain identifies a coverage gap or play hypothesis, calls TOOL-003 to compose the structured play definition, draft lands in SalesPlayLibrary inheriting the brain's proposal_id.
AGT-902 Account BrainFor account-specific plays (scope=account_specific). Brain identifies an account-level expansion or retention play, calls TOOL-003 to compose the structured definition.
Chained from TOOL-001When a brain runs the full chain: TOOL-001 reads API docs and proposes 1-3 candidate plays at outline level → brain selects the most promising → calls TOOL-003 with the candidate as anchor_to_tool_001_output to fully compose it.
RevOps direct (workspace UI)RevOps writes a hypothesis manually, calls TOOL-003 with explicit context views. Used when leadership wants to pressure-test a hypothesis they generated, not the brain.
Design principles
  1. Compose, don't invent. Every cadence step, every target signal, every success metric must exist in the existing service vocabulary. If the play requires a capability AGT-302 doesn't support, the tool either omits that element or returns the play with explicit "requires_new_capability" flag (which causes the calling brain to route to recommend_human_query instead of drafting).
  2. Honor the volume cap upstream. SalesPlayLibrary enforces the volume cap on activation, not draft. TOOL-003 may produce drafts that would exceed the cap if all activated; that's by design — the workspace lets humans pick the best subset.
  3. Refuse over-specification. The output is a draft for co-definition, not a finished play. The tool intentionally leaves step_outline at outline-level rather than full-asset content; humans add the polish in the workspace.
  4. Self-assessed promotion likelihood is honest, not optimistic. The tool flags drafts as "low" likelihood when the hypothesis is exploratory or the lever grounding is thin. Calibrating the brain to draft fewer-but-better plays is preferable to draft inflation.
  5. De-duplicate at draft time. The current_active_plays input is checked; overlapping drafts are flagged with a differentiation rationale so RevOps can choose to retire the original or take the new one as a replacement.
Cost ceiling
ConstraintValue
Per-call input budget30K tokens (hypothesis + multiple brain-ready view summaries + active plays context)
Per-call output budget4K tokens (structured play + lever grounding + metadata)
Default modelSonnet — composition task with structural constraints; Haiku tested but lever-grounding accuracy below threshold
Per-call cost estimate~$0.15 per call at Sonnet pricing
Monthly cap (default)$300/mo — bounds usage to ~2,000 calls/month
Frequency expectationModerate — brain-driven invocations during planning cycles + ad-hoc draft compositions during operational queries.
Eval criteria
CriterionMeasurementPass threshold
Schema complianceOutput validates against SalesPlayLibrary draft schema100% (hard)
Lever grounding completeness% of play elements with valid lever_grounding citation100% (hard) — ungrounded elements should be omitted, not fabricated
Citation validity% of lever_grounding citations that point to a real agent spec section (manual reviewer check against current spec set)≥ 95%
Promotion rate (operational)% of TOOL-003-generated drafts that survive co-definition to active in SalesPlayLibrary≥ 30% (matches the AGT-901 eval threshold; TOOL-003 is the mechanism by which AGT-901's threshold gets met)
Promotion likelihood calibrationDrafts the tool self-rated "high" vs "medium" vs "low" should differ in actual promotion rate by > 20pp between tiersCalibration check; if ratings don't predict outcomes, ratings are noise
Duplication detection% of overlapping drafts correctly flagged in duplication_check when active plays cover similar ground≥ 90%
P95 latencyEnd-to-end tool call≤ 6s
Failure modes
SymptomCauseAction
Tool drafts plays that require capabilities AGT-302 doesn't supportLever-grounding constraint not enforced strongly enoughHard fail in eval. Tighten prompt with explicit list of supported AGT-302 capabilities. Tool should return play with "requires_new_capability" flag rather than fabricate.
Drafts uniformly self-rated "high" likelihood, but actual promotion rate is much lowerOptimism bias in self-assessmentRecalibrate via eval. Add few-shot examples of plays that the brain rated "high" but failed co-definition, so the tool learns to be more discriminating.
High volume of drafts, low promotion rateBrain (caller) producing too many hypotheses; tool doing its job by drafting them allIssue is upstream, not in TOOL-003. AGT-901 calibration should produce fewer hypotheses; tool will draft fewer plays as a result.
Drafts duplicate active plays despite duplication checkActive play context not being passed in inputCaller responsibility. Check current_active_plays is populated at invocation; if the field is empty, tool can't deduplicate.
Cost spikeBrain invoking TOOL-003 many times during a single planning sessionEnable prompt caching at workspace level. Per-call cap blocks runaway. Audit BrainAnalysisLog for invocation patterns.
Interaction with SalesPlayLibrary

TOOL-003 output is mapped into the SalesPlayLibrary schema by the calling brain. The brain writes the draft row with state=draft; the brain does not write directly through TOOL-003. This preserves the architectural invariant that the tool is stateless. The brain owns the write because the brain owns the BrainAnalysisLog row that lineage-anchors the draft.

Promotion gate is unchanged: draftunder_review requires human pickup, under_reviewactive requires SLM + RevOps joint approval, hard volume cap enforced at activation. TOOL-003 has zero role in promotion — it only produces the initial draft.