UsageMeteringLog — Production Schema

Layer 8: Revenue Operations · Canonical product telemetry source for ASC 606 recognition, consumption billing, and usage-driven GTM signals · Single-writer: ingestion service · Multi-reader
L8 · Schema Production · v26 Owns: UsageMeteringLog Finance + Data Eng (jointly owned)
Purpose

UsageMeteringLog is the single canonical metering source for the OS. Product telemetry (consumption units, seat utilization) lands here, and every downstream agent that needs usage data reads from this table — never from the product system directly. This concentrates the external integration risk in one place (the ingestion service that writes UsageMeteringLog), keeps consumers source-agnostic, and gives Finance and the auditor exactly one place to validate ASC 606 source data.

Single-writer invariant. The ingestion service is the only writer. AGT-804, AGT-802, AGT-503, AGT-501 are read-only. No agent writes back to this table.
Why this is production-critical
DependentRead patternWhat breaks if data is wrong
AGT-804 Revenue RecognitionMonthly batch on period close, plus intra-period for consumption SKUsASC 606 recognized revenue is wrong → Magic Number wrong → board reporting wrong → audit finding
AGT-802 Billing & InvoicingEnd-of-period batch for consumption invoice generationCustomer is over- or under-billed → credit memo storm, customer trust loss
AGT-503 Expansion TriggerEvent-driven on consumption-overage and seat-util ≥ 80% threshold breachExpansion plays fire late or incorrectly → missed renewal/expansion revenue
AGT-501 Customer HealthDaily batch on per-account max(seat_utilization_pct)Health scores miscalibrated → AGT-502 churn risk wrong → AM intervention wrong
Because four agents depend on this table for materially different decisions (recognition vs. billing vs. expansion vs. health), data quality on UsageMeteringLog is the single highest-leverage data quality investment in L8.
Schema definition
FieldTypeReqNotes
usage_idUUIDREQPrimary key. Generated at ingestion.
account_idUUIDREQFK to Accounts. Indexed.
sku_idVARCHAR(64)REQFK to PricingConfig.skus. Indexed.
sku_typeENUMREQconsumption / seat / hybrid. Drives downstream branching.
period_startTIMESTAMPTZREQFirst instant of metering period (inclusive).
period_endTIMESTAMPTZREQLast instant of metering period (exclusive). Partition key.
period_granularityENUMREQhourly / daily / weekly / monthly. Most consumption SKUs are monthly; AGT-503 may need daily for near-real-time overage.
units_consumedNUMERIC(20,4)REQ≥ 0. Units consumed in the period (API calls, GB-hours, background checks, etc.). Hard reject on negative.
commit_unitsNUMERIC(20,4)optCommitted amount per OrderLineItems for this SKU/period. NULL if no commit (pure pay-as-you-go).
overage_unitsNUMERIC(20,4)REQDEFAULT 0. MAX(0, units_consumed - commit_units) when commit_units present, else 0. CHECK constraint enforces.
unit_price_usdNUMERIC(12,4)optIn-commit unit price at metering time. Captured for billing/recognition reproducibility — audit-critical.
overage_unit_price_usdNUMERIC(12,4)optOverage tier price (often differs from in-commit price). Audit-critical.
overage_amount_usdNUMERIC(14,4)optoverage_units × overage_unit_price_usd. Computed at ingest, persisted for audit reproducibility.
active_seatsINTEGERoptFor sku_type IN (seat, hybrid). Distinct active users in the period.
licensed_seatsINTEGERoptFor sku_type IN (seat, hybrid). Licensed quantity per OrderLineItems.
seat_utilization_pctNUMERIC(5,2)optComputed: active_seats / licensed_seats × 100. Range [0, 100]. CHECK constraint.
source_systemVARCHAR(64)REQProduct system identifier (e.g., product-prod-us-east-1). Enables multi-region telemetry without ambiguity.
ingest_event_idVARCHAR(128)REQIdempotency key from source. UPSERT key. Re-ingest = no-op.
ingest_batch_idUUIDREQBatch identifier for the ingestion run. Enables batch-level rollback.
received_atTIMESTAMPTZREQWhen row landed in UsageMeteringLog.
effective_atTIMESTAMPTZREQWhen the measurement was taken at source. Distinct from period_end — effective_at is the wall clock of measurement, period_end is the measurement window boundary.
record_versionINTEGERREQDEFAULT 1. Increments on correction. Consumers read MAX(record_version) per natural key.
prior_version_idUUIDoptPoints to the row this one supersedes. NULL on initial insert.
correction_reasonTEXToptRequired (NOT NULL) when record_version > 1. Free text from source or Finance.
audit_statusENUMREQDEFAULT pending_recon. Values: pending_recon / verified / disputed. Verified after monthly reconciliation pass.
created_atTIMESTAMPTZREQDEFAULT NOW().
updated_atTIMESTAMPTZREQDEFAULT NOW(). Updated only on audit_status transitions; usage data itself is append-only via record_version model.
Keys, constraints, partitioning
ElementDefinitionWhy
Primary keyusage_idSurrogate. Stable across corrections.
Unique constraint(account_id, sku_id, period_start, period_end, source_system, ingest_event_id, record_version)Enforces idempotent ingest + correction lineage.
Idempotency UPSERT key(source_system, ingest_event_id)Source replay cannot create duplicates.
PartitionBy period_end, monthlyMost reads filter by period range. Partition pruning is large.
Index 1(account_id, period_end DESC)AGT-501/503 lookups by account.
Index 2(sku_id, period_end DESC)AGT-804 batch processing by SKU.
Index 3(audit_status, period_end)Reconciliation jobs scan unverified rows.
Index 4(received_at)Ingestion lag monitoring.
Foreign keysaccount_id → Accounts; sku_id → PricingConfig.skus; prior_version_id → UsageMeteringLog.usage_id (self)Referential integrity.
CHECK: units_consumedunits_consumed ≥ 0Negative consumption is a source-system bug. Reject.
CHECK: period boundsperiod_end > period_startInverted period is an upstream bug. Reject.
CHECK: overage mathoverage_units = GREATEST(0, units_consumed - COALESCE(commit_units, units_consumed))Overage cannot drift from definition.
CHECK: seat utilseat_utilization_pct BETWEEN 0 AND 100 when not NULLOut-of-range util is a source bug.
CHECK: correction lineage(record_version = 1 AND prior_version_id IS NULL) OR (record_version > 1 AND prior_version_id IS NOT NULL AND correction_reason IS NOT NULL)Corrections must cite their predecessor and a reason.
Idempotency & correction model

UsageMeteringLog is append-only at the row level. Initial inserts and corrections both write new rows. The data is never destructively updated — the record_version + prior_version_id chain is the audit trail.

ScenarioBehavior
Source emits same event twice (replay)UPSERT on (source_system, ingest_event_id) matches existing row. No new row written. No double-count.
Source emits correction for a prior periodNew row written with record_version = prior + 1, prior_version_id = original_usage_id, correction_reason = '...'. Original row preserved.
Consumers reading historical dataAlways filter to (MAX(record_version)) per (account_id, sku_id, period_start, period_end, source_system). View UsageMeteringLog_current exposes this filter as a default read path.
Late-arriving data (data for July arrives in August)Row inserted with received_at > period_end + threshold. Monitoring alert at >24h late. Consumers tolerant; AGT-804 may need to amend a closed period → emits ASC 606 prior-period adjustment.
Out-of-order replay (correction arrives before original)Hard reject. record_version must be monotonic. If the source system has weak ordering, the ingestion service must buffer until the original arrives.
ASC 606 implication: prior-period corrections are reportable. AGT-804 must distinguish current-period activity from prior-period adjustments in RevenueRecognitionLog. The correction flow here exists specifically so AGT-804 can do that cleanly.
Quality checks at ingestion
RuleSeverityAction
units_consumed ≥ 0HARDReject to DLQ. Page Finance + DataEng on first occurrence.
period_end > period_startHARDReject to DLQ. Source-system bug.
period_end ≤ NOW() + 1 hourHARDReject. Future-dated data is a clock skew or source bug.
ingest_event_id NOT NULL AND length > 0HARDReject. Without an idempotency key, replay safety is gone.
account_id, sku_id resolve to active rowsHARDReject. Orphan FK is a config drift.
seat_utilization_pct BETWEEN 0 AND 100 when not NULLHARDReject. Out-of-range value.
record_version monotonic per natural keyHARDReject. Buffer or fail loudly.
received_at ≤ period_end + 24hSOFTAccept. Emit late-arrival warning to monitoring.
effective_at within [period_start, period_end + 1h]SOFTAccept. Emit drift warning.
units_consumed within 10x of trailing 30d median for (account_id, sku_id)SOFTAccept. Emit anomaly warning. Likely customer activity, but may be a source system bug worth a human look.
HARD failures land in UsageMeteringDLQ with full payload + rejection reason. DLQ has its own monitoring; entries pending >1 hour escalate to oncall. SOFT warnings emit to a monitoring topic; they do not block ingestion.
Reconciliation against source-of-truth

Ingestion can drop or duplicate rows for any number of reasons (network failure, deployment race, source-system event-emission gap). The defense is a periodic reconciliation pass that compares UsageMeteringLog totals to a separate source-of-truth query on the product database.

CadenceWhat's comparedTolerance
DailyPer-account-per-SKU sum(units_consumed) for prior day, UsageMeteringLog vs product DB± 0.1% or ≤ 1 unit, whichever is greater
MonthlyPer-account-per-SKU sum(units_consumed) for closed month± 0.01%. Stricter because feeds AGT-804 recognition.
QuarterlyFull UsageMeteringLog vs source-of-truth, all SKUs, all accounts± 0.001%. Audit-grade.
If monthly reconciliation exceeds tolerance, AGT-804's monthly recognition close holds until reconciliation passes. AGT-704's MBR pulls AGT-702 metrics that derive from AGT-804 — monthly close gate cascades into business review staleness.

On reconciliation pass, all in-window rows transition audit_status: pending_recon → verified. On reconciliation failure, RevOps + Finance + DataEng triage; rows may transition to disputed pending resolution.

Retention & cold storage
TierWindowAccess pattern
Hot (operational)0 to 13 monthsAll consumers (AGT-501/503/802/804) read freely. SLA-bound queries.
Warm (recent history)13 to 36 monthsReporting and analyst queries. AGT-702 trend lookback. Slower SLA.
Cold (audit retention)36 months to 7 yearsAudit queries only. Restored on demand. ASC 606 + tax retention requirement.
Purge> 7 yearsHard delete subject to legal hold checks. Documented in retention runbook.
Tier transitions are automated. Tier transitions never modify usage data — they move bytes, not values. record_version chains preserved across tiers.
Write path
  1. Source emit — product system emits a metering event to the ingestion queue (Kafka / Kinesis / SQS, whichever is operational). Event includes natural key (account_id, sku_id, period_start, period_end, source_system, ingest_event_id) and measured values.
  2. Ingestion service — consumes from queue, runs hard quality checks, performs UPSERT on idempotency key. On hard failure, writes to UsageMeteringDLQ and emits alert.
  3. Soft-warning emit — soft failures persisted but emitted to a monitoring topic for human review.
  4. Threshold subscription — the ingestion service additionally publishes to a UsageThresholdEvents topic when consumption-overage or seat-utilization thresholds are crossed. AGT-503 subscribes here for near-real-time expansion triggers (bypassing scheduled cycles per spec v25).
  5. Reconciliation jobs — daily/monthly/quarterly reconciliation jobs read UsageMeteringLog + source DB, write reconciliation results, transition audit_status.
  6. Tier transitions — nightly job moves rows older than 13/36 months to warm/cold tiers.
Consumer registry
ConsumerRead pathRefreshWhat they extract
AGT-804 Revenue RecognitionFilter by record_version = MAX per natural key. Filter by audit_status IN (verified, pending_recon).Monthly batch on period close. Intra-period for consumption SKUs.units_consumed, overage_units, overage_amount_usd, prior-period corrections.
AGT-802 Billing & InvoicingSame view. audit_status = verified preferred; pending_recon acceptable for non-final invoices.End-of-period batch.units_consumed, overage_amount_usd for invoice line generation.
AGT-503 Expansion TriggerSubscribes to UsageThresholdEvents. Falls back to scheduled poll for daily aggregates.Event-driven on threshold breach. Daily aggregate poll for trailing pattern.overage_units > 0 → +40 expansion pts. seat_utilization_pct ≥ 80 for 2+ periods → +30 expansion pts.
AGT-501 Customer HealthDaily aggregate per account: max(seat_utilization_pct) over trailing 30 days.Daily batch.Seat utilization dimension (18 pts max in health model).
All consumers read from the same canonical row set. There is no per-consumer projection table by design — that would create reconciliation surface area. Materialized views per consumer are acceptable as performance optimizations but they must be rebuildable from the canonical table.
Failure modes & runbook hints
SymptomLikely causeAction
DLQ depth growingSource system schema drift, or quality rule too strictInspect rejected payloads. If schema drift, ingestion service needs version handling. If rule false-positive, tune.
Reconciliation tolerance breachedDropped events in queue, source DB lag, or genuine source bugRe-run reconciliation against source DB at multiple offsets. If reproducible, hold AGT-804 monthly close + notify Finance.
Duplicate rows for same natural keyUPSERT bypass — either ingestion service bug or two services writingStop secondary writer. Data fix: delete duplicates keeping highest record_version.
Late-arrival warnings spikeSource system event-emission lagInvestigate source. If chronic, adjust soft-threshold; document in runbook.
AGT-503 not firing on overageUsageThresholdEvents publisher offline OR threshold rule wrongCheck publisher health first. Then validate threshold rule. AGT-503 daily fallback should still catch within 24h.
AGT-804 month close holdingReconciliation incomplete or disputed rows presentTriage disputed rows. If timing only, override gate with documented Finance approval. If data-integrity, do not override.
Go-live checklist

Schema is "production" only when all of these are true. Until then, this document describes the target state.

Until every box is ticked, the brain layer (Tier 2 in the architecture plan) reads from this table at its own risk — and should default to "telemetry not yet trusted" guardrails when used in operational decisions.