Reads per-account LTV roll-ups bucketed by segment / vertical / ICP-tier (sourced from AGT-201 icp_outcome_brain_view + AGT-501 cohort_brain_view + AGT-503 expansion_strategy_brain_view + AGT-105 capacity_strategy_brain_view) and decomposes observed LTV deltas across buckets into driver components: initial ACV, tenure-driven retention, expansion realization, CAC, and segment-mix shift. Output is per-bucket LTV with confidence-banded driver attribution and a comparative ranking. Used by AGT-903 Strategy Brain in icp_retrospective ("which dimensions of the rubric correlate with realized LTV?"), capacity_reallocation ("if I deploy 10 reps, where's the highest LTV-per-rep payoff?"), and vertical_entry ("what does opportunistic-fintech LTV look like vs. core-vertical LTV?").
n_accounts < 25, fewer than 2 buckets passed (decomposition is comparative by definition), or cac_per_logo_mean = null for buckets when "cac" is requested as a driver. Refusal is structured; the tool returns refusal_reason rather than fabricating attribution on thin data.load_bearing flag is mechanically derived: it is true only when the driver's contribution band excludes zero and p10/p90 share sign. The LLM cannot mark a driver load-bearing on narrative grounds — the bands have to support it. Drivers with bands crossing zero are reported with load_bearing = false regardless of point estimate magnitude.| Driver | What it captures | Caveat |
|---|---|---|
| initial_acv | Difference in average initial ACV at signup. Mechanical — first-period contract value on the bucket. | Often the largest single contributor. Easy to over-weight in narrative; the executive question is usually "and then what?" — tenure/expansion drivers carry the rest. |
| tenure | Difference in observed tenure (months retained), holding initial ACV constant via stratification. | Tenure for buckets with short observed windows is right-censored — tool reports observed tenure, not extrapolated lifetime. AGT-903 caller may layer TOOL-013 projections for forward tenure if needed. |
| expansion | Difference in expansion realization (NRR > 100% multiplier on initial ACV). | Volatile in small buckets — bands typically wide. Tool surfaces band; brain should not narrate as "expansion is the driver" when band crosses zero. |
| cac | Difference in CAC per logo. Subtractive on LTV (higher CAC reduces LTV). | CAC data is bucket-conditional. If CAC is unavailable for a bucket, driver is excluded with note in data_quality. Tool does not estimate missing CAC. |
| segment_mix | Compositional effect: shift in sub-segment distribution within the bucket explains some/all of the gap. | Most-overlooked driver. When a "high-LTV bucket" is actually a different mix of sub-segments, attribution to ICP fit or sales motion is misleading. Tool computes segment_mix as a residual decomposition step; if it's load-bearing the brain must say so. |
| Caller | Invocation context |
|---|---|
| AGT-903 Strategy Brain | Primary caller. Used in icp_retrospective (which ICP dimensions correlate with realized LTV?), capacity_reallocation (where's the highest LTV-per-rep payoff?), vertical_entry (what's opportunistic-vertical LTV vs core-vertical LTV?), nrr_durability (is concentration in a few high-LTV accounts driving the headline number?). Brain assembles bucket roll-ups from multiple Tier 1 strategy_brain_views, calls TOOL-014 with the assembled bucket set, integrates output into the StrategyRecommendationLog memo. |
| AGT-704 Business Review Orchestrator | Indirect — only via AGT-903 narrative jobs (annual planning, mid-year review). AGT-704 does not call TOOL-014 directly. |
| RevOps direct (workspace UI) | For ad-hoc LTV-decomposition investigations — e.g., "what's driving the 30% LTV gap between MM-T1 and MM-T2?" Bypasses AGT-903 only when the question is descriptive (not strategic-recommendation-shaped). |
| Constraint | Value |
|---|---|
| Per-call input budget | 30K tokens (up to 12 buckets × multi-driver components × bootstrap inputs; bounded at 30K) |
| Per-call output budget | 4K tokens (per-bucket rank + driver attribution per comparison + interpretation) |
| Default model | Sonnet — multi-bucket attribution and operational interpretation justify the step up from Haiku. Numerical decomposition runs in code. |
| Per-call cost estimate | ~$0.20 per call at Sonnet pricing |
| Monthly cap (default) | $150/mo — bounds usage to ~750 calls/month, well above expected AGT-903 query volume |
| Frequency expectation | Similar to TOOL-013: AGT-903 invokes 10–30 queries/month, of which roughly half call TOOL-014 (often paired with TOOL-013 in the same query). Annual planning and capacity-reallocation cycles concentrate usage. |
| Criterion | Measurement | Pass threshold |
|---|---|---|
| Schema compliance | Output validates against output schema | 100% (hard) |
| Decomposition closure | For each comparison: |ltv_gap_observed − sum(driver_contributions_p50) − unattributed_residual_p50| < 1% of |ltv_gap_observed| | 100% (hard) — decomposition must add up |
| Load-bearing flag honesty | For 20 cases with deliberately inserted band-crossing-zero drivers, % where load_bearing = false on those drivers | 100% (hard) |
| Driver attribution calibration | For 20 retrospective cases with known true drivers (synthetic + simulated), % where the load-bearing driver flagged by the tool matches the synthetic ground truth | ≥ 75% |
| Refusal correctness | For deliberately undersized buckets (< 25) or missing CAC when requested, % where tool refuses with structured refusal_reason | 100% (hard) |
| Mandatory alternative | % of non-refusal outputs with non-empty credible_alternative_finding and non-empty what_would_change_the_finding | 100% (hard) |
| Segment-mix surfacing | For 10 cases where segment_mix is constructed to be load-bearing, % where output identifies segment_mix in load_bearing = true drivers and mentions it in primary_finding | ≥ 90% |
| P95 latency | End-to-end (decomposition + bootstrap + LLM interpretation) | ≤ 6s |
| Symptom | Cause | Action |
|---|---|---|
| Output marks expansion as load-bearing when contribution band crosses zero | LLM overriding the mechanical load_bearing rule | Hard fail. Eval enforces; rule output is source of truth for the flag. |
| Decomposition residual is > 5% of observed LTV gap | Bug in decomposition math or missing driver | Hard fail. Decomposition must close. If a driver is missing, it should be added or the gap should be classified as unattributed_residual. |
| Tool reports CAC contribution when bucket CAC was null on input | Tool fabricating CAC where data was absent | Hard fail. Drivers without supporting data are excluded with note, never estimated. |
| Output ignores segment_mix when bucket dimensions clearly span sub-segments | System prompt not enforcing segment_mix evaluation | Eval includes 10 segment-mix-load-bearing fixtures; ≥ 90% surfacing required. |
| Output produces a single-narrative interpretation without an alternative | System prompt not enforcing the mandatory credible-alternative field | Hard fail. Eval enforces non-empty credible_alternative_finding. |
| AGT-903 cost spike traceable to TOOL-014 being called per-bucket-pair instead of per-comparison | Brain calling tool many times for related comparisons in one session | Caller responsibility. AGT-903 should pass full bucket sets and use comparison_mode to do all comparisons in one call. |
Layered with TOOL-013. TOOL-014 reports observed LTV with right-censoring honesty; for forward-projected lifetime LTV, AGT-903 is expected to layer TOOL-013 cohort retention forecasts on top. The two tools are not fungible: TOOL-013 projects retention curves forward, TOOL-014 attributes observed LTV gaps. AGT-903 calls one or both depending on whether the strategic question is retrospective (TOOL-014 alone often sufficient) or forward-looking (both).
Single-threaded audit trail. Same as TOOL-013. AGT-903 calls TOOL-014 via Anthropic tool-use; the call ID lands in tool_calls_made on BrainAnalysisLog; driver-attribution numbers cited in StrategyRecommendationLog memos trace back via that ID. AGT-903's source-citation discipline (≥ 95% of numerical claims must cite) treats TOOL-014 outputs as first-class citation targets.
Refusal contagion. When TOOL-014 refuses (thin buckets, missing CAC), AGT-903 either surfaces the gap and declines or proceeds without TOOL-014 input and marks affected claims as speculation confidence. Silently dropping the refusal and producing a confident driver narrative is a hard fail for AGT-903 (Sev-2 incident treatment per its spec).
Options-discipline propagation. When TOOL-014 surfaces multiple plausible load-bearing drivers (e.g., both expansion and segment_mix have bands excluding zero), AGT-903's options-enumeration in the StrategyRecommendationLog memo should reflect both readings. The mandatory credible_alternative_finding field is the explicit handoff the brain uses to populate the alternative option.