Margin Guards: Profitability Governance
Infrastructure insurance for the Agentic Era. Protect unit economics with sub-5ms circuit breakers at the gateway edge.
Configuration#
Margin Guards are defined as YAML configuration that lives alongside your infrastructure-as-code. Each guard targets a specific metric, sets a minimum acceptable margin, and defines escalating intervention levels. No application code changes required.
The Margin Formula#
Aforo computes margin in real time using the following formula, evaluated on every request against the tenant's current billing period:
Revenue is pulled from the active subscription's rate plan. COGS is aggregated from provider cost events ingested through the metering engine. The margin percentage is cached in Redis at the gateway edge and recalculated every time a new cost event arrives.
Level 1 Warning#
When the margin for a tenant-metric pair falls below the Level 1 threshold, Aforo fires a webhook notification to the configured channels (Slack, PagerDuty, email). The API request proceeds normally and the customer experiences no degradation. This is the early-warning system that gives your Finance and Engineering teams time to investigate before the situation escalates. Common causes: unexpected spike in reasoning tokens, provider price increase, customer exploiting an underpriced tier.
Level 2 Throttle#
When the margin breaches the Level 2 threshold, Aforo injects an X-Aforo-Throttle: true header and rate-limits the tenant to the configured throttle rate (e.g., 5 req/sec). Your application reads this header and can optionally switch to a lower-cost compute path (e.g., GPT-4o to GPT-4o-mini, or high-fidelity search to approximate search). The customer still receives responses but at reduced throughput. This buys time for the account team to renegotiate the contract or adjust the rate plan.
Level 3 Block#
When the margin falls to the Level 3 threshold, Aforo rejects the request at the gateway with a 429 Margin Limit Exceeded response. The request never reaches your backend, so no compute cost is incurred. This is the financial kill switch: unprofitable traffic is stopped before it generates provider charges. The customer receives a clear error with a Retry-After header indicating when the next billing period begins.
Gateway Enforcement#
Margin Guards are enforced at the gateway edge by the same plugins that handle entitlement checks. Kong, Apigee, AWS API Gateway, Azure APIM, and MuleSoft plugins all automatically read the margin signal from the local Redis cache and apply the configured action. No backend code changes are required.
Best Practices#
Rollout Strategy
Always start with Level 1 in Production to gather baseline data. Only enable Level 3 after 30 days of margin data. Set Level 2 throttle rates based on your P95 traffic.