In the v26 architecture eval, the recommendation was three tiers: Tier 1 deterministic services, Tier 2 LLM-native brain agents, Tier 3 specialist tools. The first two get layers (L1–L8 and L9 respectively); Tier 3 does not, because tools sit outside the layered DAG.
Every Tier 3 tool spec must define:
| Element | What it captures |
|---|---|
| Purpose | What the tool does in one sentence. If the tool needs more than one sentence to describe, it should probably be split. |
| Input schema | Strict JSON-shaped input. Validated by the calling agent before invocation; tool rejects malformed input. |
| Output schema | Strict JSON-shaped output. Calling agent depends on the contract; schema changes are breaking changes that require coordinated deployment with callers. |
| Model tier | Haiku (narrow scope, fast) / Sonnet (synthesis-heavy) / Opus (rare, only when measurably necessary). Default is Haiku unless the tool's nature demands otherwise. |
| Called by | Explicit list of which Tier 1 services and Tier 2 brains may call the tool. Out-of-list callers should be a code-review concern. |
| Cost ceiling | Per-call token budget + monthly invocation budget. Hard limits. |
| Eval criteria | Tool-specific eval; runs alongside the brain harness or independently. Tools have a lower bar than brains because their scope is narrower. |
| Failure mode | What happens when the tool returns a bad result. Calling agent's responsibility to handle, but the tool spec must declare its known failure modes. |
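The spec elements above can be carried as a plain data record. This is a sketch, not the actual registry format; the `ToolSpec` field names and the TOOL-004 values beyond its documented budget are illustrative:

```python
from dataclasses import dataclass

# Hypothetical spec record mirroring the table above.
@dataclass
class ToolSpec:
    tool_id: str
    purpose: str                 # one sentence; more suggests a split
    input_schema: dict           # JSON-shaped; caller validates before invocation
    output_schema: dict          # breaking changes need coordinated deployment
    model_tier: str              # "haiku" | "sonnet" | "opus"
    called_by: list[str]         # explicit allow-list of callers
    per_call_tokens: int         # hard per-call budget
    monthly_budget_usd: float    # hard monthly cap
    failure_modes: list[str]     # declared known failure modes

    def allows_caller(self, agent_id: str) -> bool:
        # Out-of-list callers are a code-review concern, but a runtime
        # check makes violations visible.
        return agent_id in self.called_by

spec = ToolSpec(
    tool_id="TOOL-004",
    purpose="Forecast consumption overage timing for a UBB account.",
    input_schema={"type": "object", "required": ["account_id"]},
    output_schema={"type": "object", "required": ["overage_date", "confidence"]},
    model_tier="haiku",
    called_by=["AGT-902", "AGT-402"],   # illustrative allow-list
    per_call_tokens=17_000,             # 15K input + 2K output
    monthly_budget_usd=400.0,
    failure_modes=["insufficient usage history", "stale metering data"],
)
```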
First wave (v29, 4 tools): closed the biggest cognition gaps — Domain 3 (API/dev-led GTM) twice, Domain 1 ("create plays, not just execute") once, UBB consumption forecasting once. Second wave (v31, 4 tools): closed Domain 4 (outbound deliverability), Domain 2 (in-call guidance + competitive narrative), and the post-sales feature-level adoption gap. Third wave (v32, 4 tools): TTV/timing analysis, champion movement detection, pricing sensitivity, onboarding health prediction. Fourth wave (v37, 2 tools): cohort retention forecaster + segment-LTV decomposer, specced alongside AGT-903 Strategy Brain to support multi-quarter portfolio reasoning. Fourteen tools total — further additions remain case-by-case based on observed gaps.
With four waves shipped (14 tools), the Tier 3 catalogue is broadly complete relative to the v26 architecture eval gaps. The ideas below are tracked as candidates for when an observed operational gap (not theoretical coverage) warrants adding them:
| Candidate | Domain | Trigger to add |
|---|---|---|
| Procurement Negotiation Pattern Recognizer | Domain 1 / Domain 2 | If post-launch eval of TOOL-011 shows procurement-stage analysis is a distinct cognition need. |
| Multi-thread Quality Scorer | Domain 2 | Score deal-level multi-threading quality (stakeholder coverage, persona breadth) beyond what AGT-401 deal health scoring captures. Add if AGT-901 surfaces thin multi-threading as a recurring win-rate driver. |
| QBR Action-Item Outcome Tracker | Post-sales | Track QBR commitments through to outcomes, surfacing accountability patterns. Add after AGT-603/AGT-704 adoption shows signal. |
| Renewal Negotiation Risk Profiler | Post-sales | Renewal-specific equivalent of TOOL-011 Pricing Sensitivity. Sharper renewal-stage focus. Add if renewal motion produces distinct sensitivity patterns from new-logo motion. |
| Industry-Specific Play Refiner | Domain 1 | Vertical-specific narrative refinement for plays exiting SalesPlayLibrary. Add when active plays grow past 30 across verticals and a vertical-specific quality lift is demonstrably needed. |
Tier 3 tools are invoked in four patterns:

| Pattern | Caller | Tool | Example |
|---|---|---|---|
| Brain calls tool during query | AGT-901 / AGT-902 | Any | AGT-902 reading per-account view, calls TOOL-004 to forecast overage timing for the account |
| Service calls tool as enrichment | AGT-201, AGT-503, AGT-402 | TOOL-002, TOOL-004 | AGT-201 calls TOOL-002 on account update events to enrich dev-persona signal before ICP rescore |
| Operator calls tool directly | RevOps via workspace UI | TOOL-001, TOOL-003 | RevOps drops a new product API doc into a workspace input; TOOL-001 generates 3 candidate plays for review |
| Brain chains tools in sequence | AGT-901 | TOOL-001 → TOOL-003 | AGT-901 reads API docs (TOOL-001 produces candidate plays), then refines the most promising into a structured play definition (TOOL-003) |
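The tool-chain pattern amounts to sequential calls with the calling brain validating each hop's output contract before the next. A minimal sketch; `call_tool`, the payload shapes, and the stubbed results are hypothetical stand-ins, not the real invocation API:

```python
# Sketch of the AGT-901 -> TOOL-001 -> TOOL-003 chain.

def call_tool(tool_id: str, payload: dict) -> dict:
    # Stub: pretend each tool returns a schema-conforming result.
    if tool_id == "TOOL-001":
        return {"candidate_plays": [
            {"name": "webhook-first onboarding", "score": 0.8},
            {"name": "rate-limit upsell", "score": 0.6},
        ]}
    if tool_id == "TOOL-003":
        return {"play": {"name": payload["play"]["name"], "status": "draft"}}
    raise ValueError(f"unknown tool {tool_id}")

def run_play_chain(api_doc: str) -> dict:
    # Hop 1: translate API docs into candidate plays.
    candidates = call_tool("TOOL-001", {"api_doc": api_doc})["candidate_plays"]
    # Calling brain validates the output contract before chaining.
    assert all("name" in c and "score" in c for c in candidates)
    # Hop 2: refine the most promising candidate into a structured play.
    best = max(candidates, key=lambda c: c["score"])
    return call_tool("TOOL-003", {"play": best})["play"]

play = run_play_chain("GET /v1/usage ...")
```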
Each tool has its own per-call and monthly budget. Aggregate Tier 3 spend is monitored separately from Tier 2 brain spend.
| Tool | Default model | Per-call budget | Monthly cap (default) |
|---|---|---|---|
| TOOL-001 API-doc translator | Sonnet | 50K input + 5K output | $300/mo (low frequency, high context) |
| TOOL-002 Dev-persona enricher | Haiku | 10K input + 1K output | $200/mo (high frequency, narrow scope) |
| TOOL-003 Sales play composer | Sonnet | 30K input + 4K output | $300/mo (moderate frequency) |
| TOOL-004 Consumption forecasting | Haiku | 15K input + 2K output | $400/mo |
| TOOL-005 Outbound deliverability | Haiku | 8K input + 2K output | $150/mo |
| TOOL-006 Real-time call guidance | Sonnet | 30K input + 1.5K output | $2,000/mo (~600 customer calls at ~$3/call) |
| TOOL-007 Competitive narrative writer | Sonnet | 15K input + 2K output | $200/mo |
| TOOL-008 Adoption pattern recognizer | Haiku | 10K input + 1.5K output | $300/mo (high frequency — daily batch from AGT-501) |
| TOOL-009 Activation/TTV analyzer | Haiku | 8K input + 1.5K output | $200/mo |
| TOOL-010 Champion movement detector | Haiku | 10K input + 2K output | $250/mo |
| TOOL-011 Pricing sensitivity analyzer | Sonnet | 20K input + 2.5K output | $300/mo |
| TOOL-012 Onboarding health predictor | Haiku | 10K input + 2K output | $200/mo |
| TOOL-013 Cohort retention forecaster | Sonnet | 40K input + 4K output | $200/mo (low frequency, called by AGT-903 only) |
| TOOL-014 Segment-LTV decomposer | Sonnet | 30K input + 4K output | $150/mo (low frequency, called by AGT-903 only) |
Tools have eval criteria parallel to the brains', but lighter. Tool eval is its own harness, smaller than the brain harness (typically 10–15 questions per tool), and results land in BrainEvalLog with a flag distinguishing tool runs from brain runs.
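A tool eval run and its log entry can be sketched as below. The `run_question` stub, the entry field names, and the grading logic are assumptions; only the small harness size and the tool/brain flag come from the text:

```python
def run_question(tool_id: str, question: dict) -> bool:
    # Stub grader: a real harness would invoke the tool against the
    # question's input and score its output against the expectation.
    return question["expected"] == question["expected"]

def run_tool_eval(tool_id: str, questions: list[dict]) -> dict:
    passed = sum(run_question(tool_id, q) for q in questions)
    # Shape of a BrainEvalLog entry is assumed; is_tool distinguishes
    # tool runs from brain runs.
    return {
        "subject_id": tool_id,
        "is_tool": True,
        "questions": len(questions),
        "passed": passed,
        "pass_rate": passed / len(questions),
    }

# Tool harnesses are small: typically 10-15 questions per tool.
questions = [{"id": i, "expected": "ok"} for i in range(12)]
entry = run_tool_eval("TOOL-005", questions)
```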