Agentic APIs: Session & Tool-Level Billing

Bill AI agents per session, per tool invocation, and per token. Full lifecycle metering for autonomous agents that consume your APIs.

Updated 2026-06-15

What are Agentic APIs?

An Agentic API is any API endpoint that AI agents consume autonomously — without a human clicking a button for each request. Where a standard API might see a developer call POST /search once per user query, an agentic workflow might invoke that same endpoint forty times during a single planning loop, across a session that runs for twelve minutes.

Standard request-count billing breaks down here. Aforo's Agentic API product type introduces three billing surfaces purpose-built for the agentic runtime:

Per Session: A flat or tiered charge for starting an agent session, regardless of how many tools are invoked.
Per Tool Call: A per-invocation charge for each successful tool execution, with optional per-tool price differentiation.
Per Token: Pass-through LLM costs billed as metered usage, with input, output, and reasoning tokens tracked separately.
INFO
Agentic APIs use the CLIENT_CREDENTIALS credential type instead of a static API key. An agent authenticates once with a client_id and client_secret, receives a short-lived access token, and sends that token as a Bearer token on every subsequent tool call. This lets Aforo attribute usage to a specific registered agent, not just a generic API key.

Agent Identity

Before an agent can consume your API, it must be registered in the Aforo customer portal. Registration creates a persistent agent record linked to a customer account, a team (department), and a subscription. Each registered agent gets a unique agent_id.

Registration via the Customer Portal

Customers navigate to Developer Hub → AI Agents → Register Agent in the storefront portal. They provide a display name, agent type (Claude, GPT-4, LangChain, CrewAI, AutoGen, or Custom), and assign it to a team. On save, Aforo generates and returns a client_id and client_secret. The secret is shown exactly once.

agent-credentials.json
{
  "agent_id": "agt_01HXK3M9BVWZ4P6R8SNGF2Y7D",
  "agent_name": "Research Assistant v2",
  "agent_type": "CLAUDE",
  "team": "engineering",
  "credential_type": "CLIENT_CREDENTIALS",
  "client_id": "ca_live_9f2a3b4c5d6e7f8a",
  "client_secret": "cs_live_••••••••••••••••••••••••••••••••",
  "created_at": "2026-04-18T09:22:00Z"
}

Token Exchange

At runtime, the agent exchanges its credentials for a short-lived JWT (default 1-hour TTL) via the standard OAuth2 client credentials flow. The agent caches this token and renews it before expiry, so your application code never has to handle token renewal itself.

terminal
# Exchange client credentials for an access token
curl -X POST https://api.aforo.ai/oauth/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=ca_live_9f2a3b4c5d6e7f8a" \
  -d "client_secret=cs_live_..." \
  --data-urlencode "scope=api:read tool:invoke"   # URL-encodes the space between scopes

# Response
{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "Bearer",
  "expires_in": 3600,
  "scope": "api:read tool:invoke"
}
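
The caching and early renewal described above can be sketched in a few lines of Node.js (18+ for the built-in fetch). This is a minimal illustration, not the SDK's implementation: needsRenewal, createTokenCache, requestToken, and the 60-second renewal margin are all assumptions made for the example. Only the endpoint, form fields, and response shape come from the curl call above.

```javascript
// Hypothetical helper: true when there is no cached token or the cached
// token is within `marginMs` of its expiry time.
function needsRenewal(cached, nowMs, marginMs = 60_000) {
  return !cached || nowMs >= cached.expiresAtMs - marginMs;
}

// Hypothetical cache around any async function that resolves to the token
// response shown above ({ access_token, expires_in, ... }).
function createTokenCache(fetchToken, now = () => Date.now()) {
  let cached = null;
  return async function getToken() {
    if (needsRenewal(cached, now())) {
      const res = await fetchToken();
      cached = {
        accessToken: res.access_token,
        expiresAtMs: now() + res.expires_in * 1000,
      };
    }
    return cached.accessToken;
  };
}

// The same request as the curl example, expressed with fetch.
// URLSearchParams takes care of form-encoding the scope value.
async function requestToken(clientId, clientSecret) {
  const body = new URLSearchParams({
    grant_type: 'client_credentials',
    client_id: clientId,
    client_secret: clientSecret,
    scope: 'api:read tool:invoke',
  });
  const res = await fetch('https://api.aforo.ai/oauth/token', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body,
  });
  if (!res.ok) throw new Error(`Token request failed: ${res.status}`);
  return res.json();
}
```

A caller would build the cache once, for example createTokenCache(() => requestToken(id, secret)), and call the returned function before each request. Injecting the clock keeps the renewal logic testable without waiting an hour.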

Session Metering

A session begins when an agent makes its first authenticated API call and ends either explicitly (the agent calls a session-end endpoint) or implicitly (60 minutes of inactivity). Sessions are the billing unit for time-based or flat-rate agentic pricing.

| Event | Trigger | Billing Effect |
| --- | --- | --- |
| session.started | First tool call with a new session_id | Session fee charged (if configured) |
| tool.invoked | Successful tool execution returns 2xx | Per-tool-call fee accrued |
| token.consumed | LLM response includes usage.total_tokens | Token usage metered async |
| session.ended | Explicit end call or 60-min idle timeout | Session closed; usage finalized |
PRO TIP
Pass a stable X-Session-Id header on every request within an agent run. Aforo groups all usage under that session. If you omit the header, Aforo assigns a new session per request — which defeats session-based pricing and inflates session counts.
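
The effect of the header on session counts can be modeled with a short sketch. It is an illustrative model of the grouping rule stated in the tip, not Aforo's server-side logic; countSessions, withSession, and the placeholder token are hypothetical names introduced for the example.

```javascript
// Illustrative model only: counts the sessions a list of requests would
// produce if usage is grouped by X-Session-Id and every request without
// the header opens its own session.
function countSessions(requests) {
  const seen = new Set();
  let anonymous = 0;
  for (const req of requests) {
    const id = req.headers['X-Session-Id'];
    if (id) seen.add(id);
    else anonymous += 1;
  }
  return seen.size + anonymous;
}

// Hypothetical helper that attaches the same session ID to every request.
function withSession(sessionId, accessToken) {
  return {
    Authorization: `Bearer ${accessToken}`,
    'X-Session-Id': sessionId,
  };
}

const headers = withSession('ses_01HXKRM3ABCDE7F8G9HNPQRST', 'example-token');
const withHeader = Array.from({ length: 40 }, () => ({ headers }));
const withoutHeader = Array.from({ length: 40 }, () => ({ headers: {} }));

console.log(countSessions(withHeader));    // 1
console.log(countSessions(withoutHeader)); // 40
```

Under a per-session fee, the second case would be charged forty session fees for what is logically one agent run.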

Tool-Level Billing

Not all tools are priced equally. A lightweight search.query call costs far less to serve than a code.execute call that spins up an isolated sandbox. Aforo's dimension pricing lets you set a unique rate per tool name within a single rate plan.

Configuring Per-Tool Rates

In the Rate Plan Wizard, select Agentic API as the product type, then open the Dimension Pricing tab. Each tool name in your tool registry appears as a billable dimension. Set a per-invocation rate for each. Tools without an explicit rate fall back to the plan's base per-call price.

dimension-pricing-config.json
{
  "rate_plan": "agentic-professional",
  "base_per_call": 0.002,
  "dimension_pricing": {
    "search.query":        0.001,
    "search.semantic":     0.003,
    "code.execute":        0.025,
    "code.analyze":        0.010,
    "data.read":           0.001,
    "data.write":          0.005,
    "web.browse":          0.008,
    "image.describe":      0.006
  },
  "token_metering": {
    "metric_id": "llm_tokens",
    "input_rate":    0.000003,
    "output_rate":   0.000015,
    "reasoning_rate": 0.000020
  }
}
WARNING
Tool names are matched case-sensitively against the tool_name field in the usage event. If your MCP server returns Search.Query but your dimension config specifies search.query, the call falls back to the base rate. Normalize tool names to lowercase before emitting usage events.
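
As a worked example, the sketch below estimates a charge from the rates in dimension-pricing-config.json, including the fallback to the base rate and the lowercase normalization recommended above. It rests on several assumptions: the token rates are treated as per-token prices, the result is in whatever currency unit the plan uses, and normalizeToolName, rateFor, and estimateCost are hypothetical helpers rather than part of any Aforo SDK.

```javascript
// Rates copied from dimension-pricing-config.json above.
const basePerCall = 0.002;
const dimensionPricing = new Map([
  ['search.query', 0.001],
  ['search.semantic', 0.003],
  ['code.execute', 0.025],
  ['code.analyze', 0.010],
  ['data.read', 0.001],
  ['data.write', 0.005],
  ['web.browse', 0.008],
  ['image.describe', 0.006],
]);
const tokenRates = { input: 0.000003, output: 0.000015, reasoning: 0.000020 };

// Hypothetical helper: lowercase names so "Search.Query" and
// "search.query" resolve to the same dimension.
function normalizeToolName(name) {
  return name.trim().toLowerCase();
}

// Tools without an explicit rate fall back to the base per-call price.
function rateFor(toolName) {
  const rate = dimensionPricing.get(normalizeToolName(toolName));
  return rate !== undefined ? rate : basePerCall;
}

// Sums per-call and per-token charges, rounded to 6 decimal places to
// hide floating-point noise (assumes rates are per single token).
function estimateCost(toolCalls, tokens) {
  const toolCost = toolCalls.reduce((sum, name) => sum + rateFor(name), 0);
  const tokenCost =
    tokens.input * tokenRates.input +
    tokens.output * tokenRates.output +
    tokens.reasoning * tokenRates.reasoning;
  return Math.round((toolCost + tokenCost) * 1e6) / 1e6;
}

const total = estimateCost(
  ['search.query', 'Search.Query', 'code.execute', 'pdf.parse'],
  { input: 1240, output: 380, reasoning: 520 }
);
console.log(total); // 0.04882
```

Here the two search calls cost 0.001 each, code.execute costs 0.025, and the unlisted pdf.parse falls back to 0.002, giving 0.029 for tools. Tokens add 0.00372 + 0.0057 + 0.0104 = 0.01982.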

Example Integration

The following Node.js example shows the complete agentic integration lifecycle: agent authentication, session start, tool invocation with metering, and explicit session close. It uses the @aforo/mcp-metering SDK, which handles token exchange, event buffering, and flushing automatically.

terminal
npm install @aforo/mcp-metering

agent-runner.js
import { AforoAgentClient } from '@aforo/mcp-metering';

// 1. Authenticate the agent (client credentials flow)
const aforo = new AforoAgentClient({
  clientId:     process.env.AFORO_CLIENT_ID,
  clientSecret: process.env.AFORO_CLIENT_SECRET,
  tenantId:     process.env.AFORO_TENANT_ID,
  productId:    'prod_research_assistant_v2',
});

await aforo.authenticate(); // exchanges client_id/secret → JWT; auto-renews

// 2. Start a named session (returns a stable session_id)
const session = await aforo.startSession({
  agentId:   'agt_01HXK3M9BVWZ4P6R8SNGF2Y7D',
  sessionTag: 'research-run-2026-04-18',
});

console.log('Session started:', session.sessionId);
// → ses_01HXKRM3ABCDE7F8G9HNPQRST

// 3. Execute tool calls — metering is automatic via the wrapper
const results = await aforo.wrapToolHandler(
  'search.semantic',
  async (args) => {
    // Your actual tool logic here
    const response = await fetch('https://api.yourservice.com/search', {
      method:  'POST',
      headers: {
        Authorization:  `Bearer ${session.accessToken}`,
        'Content-Type': 'application/json', // the body below is JSON
      },
      body:    JSON.stringify(args),
    });
    return response.json();
  }
)({ query: 'latest RLHF research papers', limit: 20 });

// 4. Meter LLM tokens separately if your agent calls an LLM
await aforo.meterTokens({
  sessionId:      session.sessionId,
  inputTokens:    1240,
  outputTokens:   380,
  reasoningTokens: 520,
  model:          'claude-opus-4',
});

// 5. End the session explicitly (optional — idle timeout closes it too)
const summary = await aforo.endSession(session.sessionId);
console.log('Session summary:', summary);
// → { toolCalls: 1, totalTokens: 2140, durationMs: 47230 }

Usage Events Emitted

The SDK emits three types of usage events to Aforo's ingestion endpoint on your behalf, so you never construct raw event payloads manually. Two of them, the tool call event and the token event, are shown below.

usage-event-examples.json
// Tool call event (emitted per wrapToolHandler invocation)
{
  "event_type": "tool_invoked",
  "metric_id":  "tool_invocations",
  "quantity":   1,
  "agent_id":   "agt_01HXK3M9BVWZ4P6R8SNGF2Y7D",
  "session_id": "ses_01HXKRM3ABCDE7F8G9HNPQRST",
  "tool_name":  "search.semantic",
  "execution_status": "SUCCESS",
  "execution_duration_ms": 312
}

// Token event (emitted per meterTokens call)
{
  "event_type":  "tokens_consumed",
  "metric_id":   "llm_tokens",
  "quantity":    2140,
  "agent_id":    "agt_01HXK3M9BVWZ4P6R8SNGF2Y7D",
  "session_id":  "ses_01HXKRM3ABCDE7F8G9HNPQRST",
  "input_tokens":     1240,
  "output_tokens":    380,
  "reasoning_tokens": 520,
  "model":       "claude-opus-4"
}
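
To make the relationship between a wrapped handler and the tool call event concrete, here is a conceptual sketch of a metering wrapper. It is not the source of wrapToolHandler: wrapWithMetering, the emit callback, and the context object are hypothetical, and skipping the event when the handler throws is an assumption based on the statement that only successful executions accrue a fee.

```javascript
// Conceptual sketch, not the SDK's source: wraps an async tool handler,
// times it, and passes one usage event to `emit` per successful call.
function wrapWithMetering(toolName, handler, emit, context) {
  return async function meteredHandler(args) {
    const startedAt = Date.now();
    // If the handler throws, the error propagates and no event is emitted
    // (assumption: failed calls are not billable).
    const result = await handler(args);
    emit({
      event_type: 'tool_invoked',
      metric_id: 'tool_invocations',
      quantity: 1,
      agent_id: context.agentId,
      session_id: context.sessionId,
      tool_name: toolName,
      execution_status: 'SUCCESS',
      execution_duration_ms: Date.now() - startedAt,
    });
    return result;
  };
}

// Usage: collect events in an array instead of sending them anywhere.
const emitted = [];
const search = wrapWithMetering(
  'search.semantic',
  async (args) => ({ hits: [], query: args.query }),
  (event) => emitted.push(event),
  { agentId: 'agt_01HXK3M9BVWZ4P6R8SNGF2Y7D', sessionId: 'ses_01HXKRM3ABCDE7F8G9HNPQRST' }
);
```

Calling await search({ query: '...' }) returns the handler's result and appends one event shaped like the tool call example above.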
PRO TIP
The SDK batches events in memory and flushes every 5 seconds or 100 events, whichever comes first. Events still in the buffer are lost if the process exits before the next flush, so call await aforo.flush() in your shutdown handler to ensure no usage data is lost.
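
The size-or-interval batching described in the tip can be modeled as follows. This is an illustrative sketch of the general technique, not the SDK's internals: the EventBuffer class and its send callback are hypothetical, and only the 5-second and 100-event thresholds come from the text above.

```javascript
// Illustrative model of size-or-interval batching.
class EventBuffer {
  constructor(send, { maxEvents = 100, intervalMs = 5000 } = {}) {
    this.send = send;
    this.maxEvents = maxEvents;
    this.events = [];
    // Flush on a timer; unref() stops the timer from keeping Node alive.
    this.timer = setInterval(() => this.flush(), intervalMs);
    this.timer.unref();
  }

  // Buffer one event and flush immediately once the size limit is hit.
  add(event) {
    this.events.push(event);
    if (this.events.length >= this.maxEvents) this.flush();
  }

  // Hand the current batch to `send` and start a fresh buffer.
  flush() {
    if (this.events.length === 0) return;
    const batch = this.events;
    this.events = [];
    this.send(batch);
  }

  // Stop the timer and push out whatever is still buffered.
  close() {
    clearInterval(this.timer);
    this.flush();
  }
}

const sent = [];
const buffer = new EventBuffer((batch) => sent.push(batch.length));
for (let i = 0; i < 103; i++) buffer.add({ n: i });
buffer.close(); // the final 3 events would be lost without this call
console.log(sent); // [ 100, 3 ]
```

The last three events only reach the sender because close() runs, which is the same reason a real integration flushes in its shutdown handler.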