TOOL-012 — Onboarding Health Predictor

Tier 3 Specialist Tool · Stateless · Predicts which onboarding accounts will hit sustained adoption vs. stall, from early activity signals · Predictive complement to TOOL-008 (descriptive) and TOOL-009 (timing analysis)
Tier 3 · Tool Specced · v32 Post-sales · Onboarding Haiku
Purpose

From the first 30–60 days of telemetry on a newly onboarded account, predicts whether the account will reach sustained adoption (renewable, expandable) or stall (likely to end up surface_only or churn). Output is a probability classification plus early-warning indicators and intervention recommendations sized to onboarding stage. Distinct from TOOL-008 (which describes the current adoption pattern) and TOOL-009 (which measures timing against milestones): TOOL-012 is predictive, projecting forward from early signals to the outcome at month 6–12.

Closes the predictive gap in the post-sales activation toolkit. The AGT-601 onboarding orchestrator coordinates milestones; TOOL-008/009 describe and time the current state; TOOL-012 forecasts where the account is heading. The intervention window during onboarding is the highest-leverage moment in the customer lifecycle: getting the prediction right early matters more than getting any other post-sales signal right.
Input schema
```jsonc
{
  "account_id": "uuid",
  "contract_start_date": "ISO 8601",
  "days_since_contract_start": 0,                 // typically 14-90 for invocation
  "early_activity_signals": {
    "first_value_event_at": "ISO 8601 | null",    // first meaningful product action
    "active_users_trailing_7d": 0,
    "active_users_trailing_30d": 0,
    "feature_breadth_trailing_30d": 0,
    "integration_setup_status": "not_started" | "in_progress" | "complete",
    "kickoff_call_attended": true | false,
    "kickoff_attendance_quality": "full_team" | "partial" | "minimal" | null
  },
  "onboarding_state": {                           // from AGT-601 OnboardingLog
    "current_milestone": "string",
    "milestones_completed_count": 0,
    "milestones_overdue_count": 0,
    "csm_engagement_count_trailing_30d": 0,
    "ie_engagement_count_trailing_30d": 0,        // implementation engineer
    "map_status": "active" | "stalled" | "abandoned" | null
  },
  "comparison_cohort": {                          // similar accounts, similar onboarding age
    "cohort_definition": "string",
    "cohort_size": 0,
    "cohort_outcome_distribution": {              // retrospective outcomes for the cohort
      "sustained_adoption_pct": 0.0,
      "surface_only_pct": 0.0,
      "early_churn_pct": 0.0
    },
    "cohort_typical_first_value_days": 0,
    "cohort_typical_30d_active_users_pct": 0.0
  }
}
```
Output schema
```jsonc
{
  "tool_call_id": "uuid",
  "trajectory_classification": "strong_activation" | "on_track_activation" | "concerning_trajectory" | "early_stall_risk" | "active_stall",
  "projected_6_month_outcome": {
    "sustained_adoption_probability": 0.0,
    "surface_only_probability": 0.0,
    "churn_probability": 0.0,
    "projection_confidence": "high" | "medium" | "low"
  },
  "early_warning_indicators": [
    {
      "indicator_type": "no_first_value_event" | "active_users_below_threshold" | "kickoff_attendance_weak" | "integration_stalled" | "milestone_overdue" | "champion_disengaged" | "csm_engagement_imbalance",
      "severity": "low" | "medium" | "high" | "critical",
      "evidence": "string",
      "cohort_comparison": "string"
    }
  ],
  "intervention_recommendations": [
    {
      "intervention_type": "exec_sponsor_intro" | "ie_intensification" | "csm_check_in" | "champion_re_engagement" | "kickoff_redo" | "milestone_reset" | "executive_business_review_pull_in" | "no_action_needed",
      "urgency": "high" | "medium" | "low",
      "intervention_window_remaining_days": 0,
      "rationale": "string"
    }
  ],
  "key_observations": [
    { "observation": "string", "supporting_metric": "string" }
  ],
  "ungrounded_assumptions": ["string"],
  "data_completeness": "high" | "medium" | "low",
  "tool_metadata": {
    "model": "claude-haiku-4-5",
    "input_tokens": 0,
    "output_tokens": 0,
    "cost_usd_estimate": 0.0,
    "latency_ms": 0
  }
}
```
Trajectory taxonomy
| Trajectory | Indicators | Implication |
| --- | --- | --- |
| strong_activation | First-value event < cohort p25 days + active users > cohort typical + kickoff full team + milestones on track + CSM/IE engagement balanced | High probability of sustained adoption. Standard onboarding cadence sufficient. Document for cohort baselining. |
| on_track_activation | Cohort-typical signals across early indicators. Some weak signals balanced by strong ones. | Standard onboarding cadence sufficient. Monitor; no special intervention. |
| concerning_trajectory | One or two indicators below cohort p25 (e.g., kickoff attendance weak OR first-value delayed) but other signals normal. | Targeted intervention warranted on the specific weak dimension. Often a single-conversation fix: CSM check-in, integration unblock. |
| early_stall_risk | Multiple weak indicators clustered. First-value not yet hit by day 30+ AND active users low AND/or champion disengaged. | Multi-pronged intervention. Exec sponsor outreach + IE intensification + champion re-engagement. Highest-leverage moment to bend the curve. |
| active_stall | Onboarding effectively stopped: no activity, no engagement, no milestone progress for 21+ days. | Escalate to SLM. Diagnose: champion left, scope mismatch, internal customer crisis. May require contract-level intervention (scope reduction, term reset, or qualify-out). |
Called by
| Caller | Invocation context |
| --- | --- |
| AGT-601 Onboarding Orchestrator | Weekly batch on accounts within their first 90 days. Most common caller. Output drives intervention queueing for CSM and IE. |
| AGT-501 Customer Health Monitor | For accounts in the onboarding window, the prediction augments the early health score with a forward-looking signal distinct from the current behavioral score. |
| AGT-902 Account Brain | For "is this onboarding going well?" queries during executive business reviews or when escalation is being considered. |
| AGT-704 Business Review Orchestrator | Aggregate cohort-level onboarding-trajectory summary for the retention health section of MBR/QBR. |
Design principles
  1. Predictive, not descriptive. Trajectory classification projects forward to month 6+ outcome. Distinct from TOOL-008 (current state) and TOOL-009 (timing). All three can be called for the same account; they answer different questions.
  2. Cohort-anchored prediction. Probabilities should reflect actual cohort outcome distribution, not abstract priors. Tool consumes cohort_outcome_distribution input and adjusts based on this account's deviation from cohort signals. Without cohort baseline, projection_confidence drops to medium or low.
  3. Intervention-window-aware. The same weak signal at day 14 vs. day 60 has different intervention implications. Tool factors in days_since_contract_start: very early stall signals get a longer intervention-window flag than late-onboarding stall signals.
  4. Numerical work in code, prediction characterization in LLM. Same hybrid pattern as TOOL-004/008/009. Probability estimation uses cohort comparison + signal weighting in code; LLM produces the trajectory label, observations, and intervention rationale.
  5. Honest about projection uncertainty. Predicting 6-month outcome from 30-day signal is noisy. Tool reports projection_confidence honestly; AGT-601 weights interventions by confidence so low-confidence predictions don't trigger heavy interventions.
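Principles 2, 4, and 5 together imply a small code-side estimator that anchors on the cohort's retrospective outcome distribution and degrades projection_confidence when the cohort is thin. A minimal sketch, assuming a bounded 25pp adjustment cap and cohort-size cutoffs of 8/20 accounts (both invented placeholders, not spec'd values):

```python
def project_outcome(cohort_dist: dict[str, float],
                    signal_score: float,
                    cohort_size: int) -> dict:
    """
    Shift the cohort's retrospective outcome distribution toward or away from
    sustained adoption based on this account's deviation from cohort signals.
    signal_score in [-1, 1]: 0 = cohort-typical, +1 = strongly above typical.
    """
    base = cohort_dist["sustained_adoption_pct"]
    # bounded adjustment: a strong signal moves the cohort prior by at most 25pp
    adjusted = min(1.0, max(0.0, base + 0.25 * signal_score))
    remainder = 1.0 - adjusted
    # split the remaining mass proportionally to the cohort's stall/churn ratio
    stall_share = cohort_dist["surface_only_pct"]
    churn_share = cohort_dist["early_churn_pct"]
    denom = (stall_share + churn_share) or 1.0
    # thin cohorts cap the confidence the LLM layer may report
    confidence = "high" if cohort_size >= 20 else ("medium" if cohort_size >= 8 else "low")
    return {
        "sustained_adoption_probability": round(adjusted, 3),
        "surface_only_probability": round(remainder * stall_share / denom, 3),
        "churn_probability": round(remainder * churn_share / denom, 3),
        "projection_confidence": confidence,
    }
```

With a cohort-typical account (signal_score = 0) the output simply reproduces the cohort distribution, which is the "cohort-anchored, no abstract priors" behavior principle 2 asks for; the LLM layer then labels the trajectory and writes the intervention rationale on top of these numbers.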
Cost ceiling
| Constraint | Value |
| --- | --- |
| Per-call input budget | 10K tokens |
| Per-call output budget | 2K tokens |
| Default model | Haiku |
| Per-call cost | ~$0.015 |
| Monthly cap | $200/mo (~13,000 calls) |
| Frequency | Weekly batch from AGT-601 across the active onboarding cohort. Bounded by accounts in the onboarding window (typically < 10% of total customers at any time). |
Eval criteria
| Criterion | Pass threshold |
| --- | --- |
| Schema compliance | 100% (hard) |
| Trajectory classification accuracy | 20 historical accounts with known 6-month outcomes; ≥ 70% match between the TOOL-012 prediction at day 30–60 and the retrospective outcome. |
| Probability calibration | For accounts predicted in the 0.7–0.9 sustained_adoption_probability band: actual sustained-adoption rate falls within ±10pp of the band midpoint. Calibration check across 30+ predictions. |
| Early-stall detection lead time | For accounts that retrospectively stalled by month 4: ≥ 65% flagged as early_stall_risk or active_stall by day 60. |
| False positive rate on strong onboardings | For accounts that retrospectively achieved sustained adoption: ≤ 15% incorrectly flagged as concerning_trajectory or worse at day 60. |
| Cohort-baseline degradation | For 5 cases without a cohort baseline: tool returns projection_confidence = low and does not produce inflated probability point estimates. 100% (hard). |
| Intervention window honesty | For 5 cases at day 80+ of onboarding: intervention_window_remaining_days reflects the shrinking window, not unbounded optimism. 100% (hard). |
| P95 latency | ≤ 2s |
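The probability-calibration criterion can be verified with a short script over historical predictions. A sketch, assuming the eval harness supplies (predicted probability, actually-sustained) pairs and treating the 30+ sample-size requirement as a hard gate; function and field names are illustrative:

```python
def check_calibration(predictions: list[tuple[float, bool]],
                      band: tuple[float, float] = (0.7, 0.9),
                      tolerance_pp: float = 10.0) -> dict:
    """
    predictions: (sustained_adoption_probability, actually_sustained) pairs.
    Passes when the realized sustained-adoption rate among predictions in the
    band falls within ±tolerance_pp of the band midpoint.
    """
    lo, hi = band
    in_band = [hit for p, hit in predictions if lo <= p <= hi]
    if len(in_band) < 30:                      # criterion requires 30+ predictions
        return {"pass": False, "reason": "insufficient_sample", "n": len(in_band)}
    actual_rate = sum(in_band) / len(in_band)
    midpoint = (lo + hi) / 2
    gap_pp = abs(actual_rate - midpoint) * 100
    return {"pass": gap_pp <= tolerance_pp,
            "actual_rate": round(actual_rate, 3),
            "gap_pp": round(gap_pp, 1),
            "n": len(in_band)}
```

Running this per eval cycle on the retrospective set gives a pass/fail plus the realized rate, which is enough to track calibration drift between tool versions.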