TOOL-006 — Real-time Call Guidance

Tier 3 Specialist Tool · Stateless · Live-call assist for AE/CSM — surfaces objection handling, MEDDPICC gaps, next-step suggestions during the call · Closes Domain 2 (in-call) gap from v26 eval
Tier 3 · Tool Specced · v31 Domain 2 · Win-rate · Sonnet · Higher infra complexity than other tools
Purpose

Live-call assist. During a sales call, the tool takes streaming transcript chunks plus the full deal/account context (from the AGT-902 composite view) and produces real-time guidance: objection-handling hints, MEDDPICC gap callouts, next-step suggestions, and competitive intel relevant to what was just said. Output appears in a sidecar UI that the rep watches during the call; the rep decides whether to use any given suggestion.

Closes the Domain 2 (in-call) gap from the v26 eval. Today AGT-407 Conversation Intelligence is retrospective only: it analyzes the transcript after the call. TOOL-006 operates during the call. The two complement each other: in-call guidance plus retrospective coaching.
Higher infrastructure complexity than other Tier 3 tools. Requires live audio integration with the recording platform (Gong / Zoom / Chorus) for streaming transcript access, plus a sidecar UI surface in the rep's call window. Tool spec is here; integration is engineering work that depends on the recording-platform partnerships and is not gated by other Tier 3 tools.
Input schema
{
  "call_session_id": "uuid",                 // recording-platform session
  "rep_user_id": "uuid",
  "rep_role": "AE" | "SDR" | "AM" | "CSM" | "SE",
  "opportunity_id": "uuid | null",           // null for pre-deal calls
  "account_id": "uuid",
  "call_context": {
    "call_type": "discovery" | "demo" | "evaluation" | "negotiation" | "renewal" | "qbr" | "general",
    "scheduled_duration_min": 0,
    "elapsed_seconds": 0
  },
  "account_context": { ... },                // full per-account brain-ready composite (per AGT-902 composite spec)
  "deal_context": {                          // populated when opportunity_id != null
    "current_stage": "string",
    "deal_health_score": 0,
    "competitor_detected": "string | null",
    "meddpicc_state": {
      "metrics_documented": true | false,
      "economic_buyer_identified": true | false,
      "decision_criteria_documented": true | false,
      "decision_process_mapped": true | false,
      "paper_process_understood": true | false,
      "identified_pain": true | false,
      "champion_qualified": true | false,
      "competition_known": true | false
    }
  },
  "transcript_window": {
    "last_60_seconds_transcript": "string",  // rolling window, redacted of PII
    "speaker_segments": [
      { "speaker": "rep" | "prospect", "text": "string", "elapsed_at": 0 }
    ]
  },
  "suggestion_history_in_session": [         // what's already been suggested this call
    { "suggested_at_elapsed": 0, "suggestion_type": "string", "was_used": null | true | false }
  ]
}
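As an illustration of how the sidecar service might assemble the transcript_window field, the sketch below keeps only the speaker segments from the last 60 seconds and joins them into one string. The function name build_transcript_window and the "speaker: text" line format are assumptions for illustration; the spec does not define how last_60_seconds_transcript is laid out, and PII redaction is assumed to have happened upstream.

```python
from typing import List, Literal, TypedDict


class SpeakerSegment(TypedDict):
    speaker: Literal["rep", "prospect"]
    text: str
    elapsed_at: int  # seconds since call start


class TranscriptWindow(TypedDict):
    last_60_seconds_transcript: str
    speaker_segments: List[SpeakerSegment]


def build_transcript_window(
    segments: List[SpeakerSegment],
    elapsed_seconds: int,
    window_seconds: int = 60,
) -> TranscriptWindow:
    """Hypothetical helper: keep segments inside the rolling window.

    A segment is kept when it falls in (elapsed - window, elapsed].
    Segments are assumed to be PII-redacted already.
    """
    cutoff = elapsed_seconds - window_seconds
    recent = sorted(
        (s for s in segments if cutoff < s["elapsed_at"] <= elapsed_seconds),
        key=lambda s: s["elapsed_at"],
    )
    # Assumed layout: one "speaker: text" line per segment.
    text = "\n".join(f'{s["speaker"]}: {s["text"]}' for s in recent)
    return {"last_60_seconds_transcript": text, "speaker_segments": recent}
```

Called at elapsed_seconds=120, this keeps segments stamped later than 60 and no later than 120 seconds, in chronological order.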
Output schema
{
  "tool_call_id": "uuid",
  "guidance": [
    {
      "guidance_type": "objection_response" | "meddpicc_gap" | "competitor_callout" | "next_step_prompt" | "discovery_question" | "champion_validation",
      "priority": "high" | "medium" | "low",
      "trigger": "string",              // what in the recent transcript triggered this
      "suggestion_text": "string",      // 1-2 sentence guidance the rep can use verbatim
      "rationale": "string",            // 1 sentence why this matters now
      "supporting_context": "string"    // where this came from in account_context
    }
  ],
  "suppress_until_next_window": true | false,  // when no useful guidance, stay quiet
  "tool_metadata": {
    "model": "claude-sonnet-4-6",
    "input_tokens": 0,
    "output_tokens": 0,
    "cost_usd_estimate": 0.0,
    "latency_ms": 0
  }
}
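A schema check on the tool's output could look roughly like the sketch below. It covers only the guidance array and the suppress flag, using the enum values listed in the output schema; validate_output and its error strings are hypothetical names, not part of the spec.

```python
from typing import Any, Dict, List

GUIDANCE_TYPES = {
    "objection_response", "meddpicc_gap", "competitor_callout",
    "next_step_prompt", "discovery_question", "champion_validation",
}
PRIORITIES = {"high", "medium", "low"}
TEXT_FIELDS = ("trigger", "suggestion_text", "rationale", "supporting_context")


def validate_output(output: Dict[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means the check passed."""
    guidance = output.get("guidance")
    if not isinstance(guidance, list):
        return ["guidance must be a list"]

    errors: List[str] = []
    if not isinstance(output.get("suppress_until_next_window"), bool):
        errors.append("suppress_until_next_window must be a boolean")

    for i, item in enumerate(guidance):
        if item.get("guidance_type") not in GUIDANCE_TYPES:
            errors.append(f"guidance[{i}].guidance_type is not a known value")
        if item.get("priority") not in PRIORITIES:
            errors.append(f"guidance[{i}].priority is not a known value")
        for field in TEXT_FIELDS:
            value = item.get(field)
            if not isinstance(value, str) or not value:
                errors.append(f"guidance[{i}].{field} is missing or empty")
    return errors
```

A check like this addresses schema shape only; whether supporting_context actually points at a real field in account_context needs a separate grounding review.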
Hard rule: If no high-quality guidance is available for the current window, return an empty guidance array with suppress_until_next_window = TRUE. Silence is better than noise: a rep watching a sidecar full of low-value suggestions stops watching it. Quality matters more than volume.
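One way to picture the hard rule is as a final filter applied before the response is returned. The sketch below assumes each candidate carries an internal confidence score between 0 and 1; that score, the 0.7 threshold, and the name finalize_guidance are all hypothetical, since the spec does not say how "high-quality" is measured. It also assumes the suppress flag is false whenever guidance is returned.

```python
from typing import Any, Dict, List


def finalize_guidance(
    candidates: List[Dict[str, Any]],
    min_confidence: float = 0.7,  # assumed threshold, not from the spec
) -> Dict[str, Any]:
    """Drop weak candidates; if nothing survives, return silence."""
    kept = [c for c in candidates if c.get("confidence", 0.0) >= min_confidence]
    # The internal score is stripped so the result matches the output schema.
    guidance = [{k: v for k, v in c.items() if k != "confidence"} for c in kept]
    return {
        "guidance": guidance,
        "suppress_until_next_window": not guidance,
    }
```

With this shape, a window in which every candidate scores below the threshold produces an empty guidance array and suppress_until_next_window set to true, rather than a list of mediocre suggestions.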
Called by
| Caller | Invocation pattern |
|---|---|
| Live-call sidecar UI (recording platform integration) | Triggered every 30–60 seconds during an active call. Recording platform pushes transcript chunks; sidecar service composes the input, calls TOOL-006, displays guidance to the rep. |
| Rep can pause/resume guidance | Rep-side control. Invocations stop when paused. Used in customer-trust-sensitive moments (off-the-record discussions). |
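The invocation pattern above can be sketched as a small gate that the sidecar service consults before each call to TOOL-006. The class SidecarSession and its fields are hypothetical; the sketch assumes the first invocation waits until one full cadence interval of transcript exists.

```python
from dataclasses import dataclass


@dataclass
class SidecarSession:
    """Hypothetical per-call state held by the sidecar service."""
    cadence_seconds: int = 60   # spec allows 30-60 seconds
    paused: bool = False        # toggled by the rep
    last_invoked_at: int = 0    # elapsed seconds at the last tool call

    def should_invoke(self, elapsed_seconds: int) -> bool:
        # No invocations while the rep has guidance paused.
        if self.paused:
            return False
        return elapsed_seconds - self.last_invoked_at >= self.cadence_seconds

    def mark_invoked(self, elapsed_seconds: int) -> None:
        self.last_invoked_at = elapsed_seconds
```

Because the pause flag is checked first, resuming after a long off-the-record stretch triggers the next invocation immediately, using whatever is in the rolling window at that point.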
Design principles
  1. Quality over volume. Mediocre suggestions during a live customer call distract the rep, harm the relationship, and erode trust in the tool. The tool returns silence (suppress_until_next_window=TRUE) when the best available guidance is mediocre.
  2. Suggestion debouncing. The tool reads suggestion_history_in_session and avoids repeating the same suggestion within a session. If a suggestion was offered and not used, the tool downgrades that suggestion type's priority for the rest of the call.
  3. Hard latency budget. P95 ≤ 3 seconds end-to-end. A 5-second suggestion is useless — the call has moved on. Latency is the operational quality bar.
  4. Privacy first. Transcripts arrive PII-redacted from the recording platform. Tool does not read raw call audio. Tool output is logged for retrospective coaching but the live transcript is not retained beyond the rolling window.
  5. Rep autonomy preserved. Tool suggests; rep decides. No auto-action. Rep flags was_used post-call (or via in-call quick-mark UI) for calibration.
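The priority downgrade in principle 2 can be sketched as a post-processing step over candidate guidance. The sketch assumes that suggestion_type in the history uses the same values as guidance_type in the output, and that only an explicit was_used = false (not null) counts as "not used"; the function name is hypothetical. Since the history schema carries only the suggestion type, detecting a verbatim repeat would need information the schema does not include, so that part is left out.

```python
from typing import Any, Dict, List

PRIORITY_ORDER = ["high", "medium", "low"]


def downgrade_unused_types(
    candidates: List[Dict[str, Any]],
    history: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Lower priority one step for types the rep already declined this call."""
    declined = {h["suggestion_type"] for h in history if h["was_used"] is False}
    adjusted = []
    for candidate in candidates:
        item = dict(candidate)  # copy so the caller's data is not mutated
        if item["guidance_type"] in declined:
            step = PRIORITY_ORDER.index(item["priority"]) + 1
            item["priority"] = PRIORITY_ORDER[min(step, len(PRIORITY_ORDER) - 1)]
        adjusted.append(item)
    return adjusted
```

A declined "meddpicc_gap" suggestion, for example, would turn a later high-priority "meddpicc_gap" candidate into a medium one, while types with a null or true was_used are left alone.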
Cost ceiling
| Constraint | Value |
|---|---|
| Per-call (one tool call) input budget | 30K tokens (account context is the bulk; transcript window is small) |
| Per-call output budget | 1.5K tokens (compact guidance) |
| Default model | Sonnet; needs to reason across account context + recent transcript + MEDDPICC state in real time |
| Per-call cost estimate | ~$0.10 per tool invocation |
| Per-customer-call cost estimate | ~$3 (one customer call ≈ 30 tool invocations at 60s cadence over 30 min) |
| Monthly cap (default) | $2,000/mo; supports ~600 customer calls/month with full guidance |
| Frequency expectation | Highest cost per use of any tool, though invocations are bounded by call volume. Prompt caching is essential: caching the account context at session start saves 50%+ on subsequent windows. |
Per-customer-call cost is the right unit, not per-tool-call. RevOps configures opt-in: which call types get live guidance (typically discovery, evaluation, negotiation, renewal — not internal calls or first-touch outreach). Volume constraints prevent cost runaway.
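The per-customer-call arithmetic can be reproduced with the figures stated in the table (~$0.10 per invocation, 60-second cadence, 30-minute call, $2,000 monthly cap). The sketch works in integer cents to avoid floating-point drift; the function names are illustrative only, and the effect of prompt caching is not modeled.

```python
def invocations_per_call(call_minutes: int, cadence_seconds: int) -> int:
    """Number of tool invocations in one customer call at a fixed cadence."""
    return (call_minutes * 60) // cadence_seconds


def cost_per_customer_call_cents(
    call_minutes: int,
    cadence_seconds: int,
    cents_per_invocation: int = 10,  # ~$0.10 per invocation, from the table
) -> int:
    return invocations_per_call(call_minutes, cadence_seconds) * cents_per_invocation


def calls_within_cap(monthly_cap_usd: int, per_call_cents: int) -> int:
    """How many customer calls fit under the monthly cap."""
    return (monthly_cap_usd * 100) // per_call_cents


per_call = cost_per_customer_call_cents(call_minutes=30, cadence_seconds=60)
print(per_call / 100, calls_within_cap(2000, per_call))
```

At the 60-second cadence this gives 30 invocations and $3 per customer call, so the $2,000 cap covers at most 666 calls, in line with the ~600 figure once some headroom is left. At a 30-second cadence the same call costs $6 and the cap covers 333 calls, which is why cadence and the opt-in call-type list are the main cost levers.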
Eval criteria
| Criterion | Pass threshold |
|---|---|
| Schema compliance | 100% (hard) |
| P95 latency | ≤ 3s (hard); missed latency = unusable tool |
| Suggestion grounding | Every suggestion's supporting_context traces to a real field in account_context or recent transcript: 100% (hard) |
| Suggestion adoption rate | % of suggestions reps mark was_used = TRUE. Calibration signal, ≥ 25% target. Below that, suggestions are noise. |
| Suppression discipline | % of windows where output is empty (suppress_until_next_window = TRUE) when no high-quality guidance exists. Eval reviewers verify the tool stays quiet appropriately. ≥ 30% suppression rate expected (most call windows don't need active guidance). |
| Privacy compliance | 0 instances of guidance referencing PII redacted from transcript (hard) |
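The three numeric criteria (P95 latency, adoption rate, suppression rate) could be computed from logged tool calls along these lines. This is a sketch under assumptions: P95 uses the nearest-rank method, adoption counts unmarked (null) suggestions in the denominator, and the function names are hypothetical. The spec does not fix any of these choices.

```python
from typing import Any, Dict, List


def p95_nearest_rank(latencies_ms: List[int]) -> int:
    """Nearest-rank 95th percentile; rank is ceil(0.95 * n) via integer math."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def adoption_rate(history: List[Dict[str, Any]]) -> float:
    """Share of all suggestions the rep marked was_used = true."""
    used = sum(1 for h in history if h["was_used"] is True)
    return used / len(history)


def suppression_rate(outputs: List[Dict[str, Any]]) -> float:
    """Share of windows where the tool returned silence."""
    quiet = sum(
        1 for o in outputs
        if not o["guidance"] and o["suppress_until_next_window"]
    )
    return quiet / len(outputs)


def meets_targets(p95_ms: int, adoption: float, suppression: float) -> bool:
    # Thresholds taken from the eval criteria table.
    return p95_ms <= 3000 and adoption >= 0.25 and suppression >= 0.30
```

Both rate functions assume a non-empty input list; a production version would need to handle sessions with no suggestions or no windows.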
Failure modes
| Symptom | Action |
|---|---|
| Latency drift > 3s | Hard fail. Switch to Haiku for low-stakes call types as fallback. Sonnet preserved for high-stakes (negotiation, executive). |
| Reps stop checking sidecar (low adoption) | Suggestion quality issue, not tool issue. Recalibrate via post-call surveys + adoption rate by suggestion_type. |
| Tool suggests something that violates customer privacy or contract | Hard incident. Audit prompt for context bleed. Tighten input filtering on PII redaction. |
| Cost spike on high-volume call days | Per-call session cap (e.g., 50 invocations max per session); prompt caching enforced; throttle via the per-call-type opt-in list. |
| Recording platform integration drops mid-call | Tool stops being called; sidecar shows offline state. Rep continues without guidance; existing rep behavior unchanged. |
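Two of the actions above, the Haiku fallback on latency drift and the per-session invocation cap, can be sketched as simple guards. The function names and the "sonnet"/"haiku" tier labels are placeholders rather than real model identifiers. Only "negotiation" is treated as high-stakes by default, because "executive" has no matching value in the call_type enum of the input schema; the set is a parameter so it can be adjusted.

```python
from typing import FrozenSet


def choose_model_tier(
    call_type: str,
    recent_p95_ms: int,
    budget_ms: int = 3000,
    high_stakes: FrozenSet[str] = frozenset({"negotiation"}),
) -> str:
    """Fall back to the faster tier only for low-stakes calls over budget."""
    if recent_p95_ms > budget_ms and call_type not in high_stakes:
        return "haiku"   # placeholder tier label
    return "sonnet"      # placeholder tier label


def within_session_cap(invocations_so_far: int, cap: int = 50) -> bool:
    """True while another invocation is still allowed in this session."""
    return invocations_so_far < cap
```

For instance, a discovery call observing a 3.5-second P95 would drop to the fallback tier, while a negotiation call at the same latency would stay on Sonnet and surface the latency breach as a hard fail instead.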
Interaction with AGT-407 Conversation Intelligence

TOOL-006 and AGT-407 are complementary: TOOL-006 operates during the call (live, ephemeral), AGT-407 operates after the call (retrospective, persisted in ConvIntelligence). TOOL-006 output is logged in a separate CallGuidanceLog with foreign keys to ConvIntelligence.conv_intelligence_id — AGT-701 (Rep Coaching) reads both for the full picture: what guidance was offered live, what was used, what AGT-407 saw retrospectively.