LLM Inference with Receipt
Record LLM Inference
Record an LLM inference call within an orchestration for cost tracking and provenance auditing.
POST
Records an individual LLM inference call that occurred within a running orchestration. Each inference record captures the model, provider, token counts, latency, cost, and content hashes — enabling fine-grained cost attribution and provenance auditing of every AI decision.
This endpoint is called by agent runtimes after each LLM API call completes. The inference record is linked to the parent orchestration and included in the final transparency-log receipt.
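The call pattern described above can be sketched as a thin wrapper around the provider call. This is a minimal illustration, not the platform's SDK: `call_llm` and `record_inference` are hypothetical callables supplied by the agent runtime, and the dictionary keys are placeholder names rather than documented field names.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical signature: takes a prompt and returns
# (output_text, input_tokens, output_tokens).
LlmCall = Callable[[str], Tuple[str, int, int]]


def infer_and_record(
    prompt: str,
    call_llm: LlmCall,
    record_inference: Callable[[Dict[str, object]], None],
) -> str:
    """Run one LLM call, then hand its measurements to a recorder."""
    start = time.perf_counter()
    output, input_tokens, output_tokens = call_llm(prompt)
    # End-to-end latency in whole milliseconds.
    latency_ms = int((time.perf_counter() - start) * 1000)

    # Record only after the provider call has completed.
    record_inference(
        {
            "input_tokens": input_tokens,    # placeholder key
            "output_tokens": output_tokens,  # placeholder key
            "latency_ms": latency_ms,        # placeholder key
        }
    )
    return output
```

In a real runtime, `record_inference` would issue the POST request for the parent orchestration. Keeping it as an injected callable makes the wrapper easy to test without network access.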
Authentication
API key with orchestrations:write scope. Alternatively, pass a Bearer JWT token in the Authorization header.
Tenant identifier for multi-tenant isolation.
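A minimal sketch of how a client might assemble headers for either credential type. The `Authorization: Bearer` form comes from the description above. The header names `X-API-Key` and `X-Tenant-ID` are assumptions made for illustration, because this page does not name the headers that carry the API key or the tenant identifier.

```python
from typing import Dict, Optional


def build_headers(
    tenant_id: str,
    api_key: Optional[str] = None,
    jwt: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers for exactly one of the two credential types."""
    if (api_key is None) == (jwt is None):
        raise ValueError("provide exactly one of api_key or jwt")

    headers = {
        "Content-Type": "application/json",
        "X-Tenant-ID": tenant_id,  # assumed header name
    }
    if jwt is not None:
        headers["Authorization"] = f"Bearer {jwt}"
    else:
        headers["X-API-Key"] = str(api_key)  # assumed header name
    return headers
```

Rejecting the case where both or neither credential is supplied is a design choice of this sketch. The service itself may accept other combinations.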
Path Parameters
Parent orchestration identifier (maip-orch:ULID). Must be in running status.
Request
LLM model identifier (e.g. claude-sonnet-4-20250514, gpt-4o, amazon.titan-text-express).
LLM provider. Accepted values: openai, anthropic, bedrock, custom.
Number of input tokens consumed by the inference call.
Number of output tokens generated by the inference call.
End-to-end latency of the inference call in milliseconds.
Cost of the inference call in US cents. Calculated by the agent runtime based
on provider pricing.
SHA-256 hex digest of the prompt input. Enables verification that the exact
input can be reproduced.
SHA-256 hex digest of the model output. Enables verification that the recorded
output matches what was returned.
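The request fields above could be assembled along these lines. This is a sketch under stated assumptions: every key in the returned dictionary is a placeholder, since the field names are not shown on this page. Hashing the UTF-8 encoding of the raw text is also an assumption, and the per-million-token prices are supplied by the caller rather than taken from any provider's price list.

```python
import hashlib
from typing import Dict, Union

# Provider values listed in the request description above.
ACCEPTED_PROVIDERS = {"openai", "anthropic", "bedrock", "custom"}


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of text, encoded as UTF-8 (an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cost_in_cents(
    input_tokens: int,
    output_tokens: int,
    input_cents_per_million: float,
    output_cents_per_million: float,
) -> float:
    """Compute cost in US cents from caller-supplied per-million-token prices."""
    total = (
        input_tokens * input_cents_per_million
        + output_tokens * output_cents_per_million
    )
    return total / 1_000_000


def build_inference_body(
    model: str,
    provider: str,
    prompt: str,
    output: str,
    input_tokens: int,
    output_tokens: int,
    latency_ms: int,
    cost_cents: float,
) -> Dict[str, Union[str, int, float]]:
    """Assemble a request body; every key below is a placeholder name."""
    if provider not in ACCEPTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    if min(input_tokens, output_tokens, latency_ms) < 0 or cost_cents < 0:
        raise ValueError("counts, latency, and cost must be non-negative")
    return {
        "model": model,
        "provider": provider,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "latency_ms": latency_ms,
        "cost_cents": cost_cents,
        "input_hash": sha256_hex(prompt),
        "output_hash": sha256_hex(output),
    }
```

Only the digests leave the runtime in this sketch, not the prompt or output text. For the digests to be reproducible, whoever verifies them later has to hash exactly the same bytes, so the serialization of the prompt needs to be fixed by convention.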
Response
Unique identifier for this inference record.
Parent orchestration identifier.
LLM model used.
LLM provider.
Input token count.
Output token count.
Inference cost in US cents.
ISO 8601 timestamp when the inference was recorded.
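A client might parse the response into a typed record as sketched below. The JSON keys (`inference_id`, `orchestration_id`, `cost_cents`, `recorded_at`, and so on) are hypothetical, since the response field names are not shown on this page. Only the meaning of each field is taken from the descriptions above.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class InferenceRecord:
    inference_id: str      # unique identifier for this inference record
    orchestration_id: str  # parent orchestration identifier
    model: str
    provider: str
    input_tokens: int
    output_tokens: int
    cost_cents: float      # inference cost in US cents
    recorded_at: datetime  # when the inference was recorded


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # Before Python 3.11, datetime.fromisoformat rejects the 'Z' suffix.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def parse_inference_record(raw: str) -> InferenceRecord:
    """Decode a JSON response body using the hypothetical keys above."""
    data = json.loads(raw)
    return InferenceRecord(
        inference_id=str(data["inference_id"]),
        orchestration_id=str(data["orchestration_id"]),
        model=str(data["model"]),
        provider=str(data["provider"]),
        input_tokens=int(data["input_tokens"]),
        output_tokens=int(data["output_tokens"]),
        cost_cents=float(data["cost_cents"]),
        recorded_at=parse_timestamp(data["recorded_at"]),
    )
```

A missing key raises `KeyError` here, which makes a mismatch between these assumed names and the real schema visible immediately.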
Authorizations
API key for machine-to-machine authentication
Body
application/json

