POST /v1/orchestrations/llm-inference

LLM Inference with Receipt
curl --request POST \
  --url https://api.truthlocks.com/v1/orchestrations/llm-inference \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "model": "claude-3-sonnet",
  "prompt": "Analyze this dataset for anomalies",
  "agent_id": "550e8400-e29b-41d4-a716-446655440000",
  "parameters": {
    "temperature": 0.3,
    "max_tokens": 2048
  }
}
'
{
  "response": "<string>",
  "receipt_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "tokens_used": 123,
  "cost": 123
}
Records an individual LLM inference call that occurred within a running orchestration. Each inference record captures the model, provider, token counts, latency, cost, and content hashes, enabling fine-grained cost attribution and provenance auditing of every AI decision.

This endpoint is called by agent runtimes after each LLM API call completes. The inference record is linked to the parent orchestration and included in the final transparency-log receipt.

Authentication

X-API-Key (string, required)
  API key with the orchestrations:write scope. Alternatively, pass a Bearer JWT in the Authorization header.

X-Tenant-ID (string, required)
  Tenant identifier for multi-tenant isolation.

Path Parameters

id (string, required)
  Parent orchestration identifier (maip-orch:ULID). The orchestration must be in running status.

Request

model (string, required)
  LLM model identifier (e.g. claude-sonnet-4-20250514, gpt-4o, amazon.titan-text-express).

provider (string, required)
  LLM provider. Accepted values: openai, anthropic, bedrock, custom.
prompt_tokens (integer)
  Number of input tokens consumed by the inference call.

completion_tokens (integer)
  Number of output tokens generated by the inference call.

latency_ms (integer)
  End-to-end latency of the inference call in milliseconds.

cost_cents (number)
  Cost of the inference call in US cents. Calculated by the agent runtime based on provider pricing.
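Since cost_cents is computed by the agent runtime rather than the API, a runtime needs its own price table. A sketch of that calculation, using hypothetical per-million-token prices (real provider pricing must be substituted):

```python
# Hypothetical price table in US cents per million tokens; these numbers
# are illustrative only, not actual provider pricing.
PRICES_CENTS_PER_MTOK = {
    "claude-3-sonnet": {"input": 300.0, "output": 1500.0},
}

def cost_cents(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Blend input and output token counts into a cost in US cents."""
    p = PRICES_CENTS_PER_MTOK[model]
    return (prompt_tokens * p["input"]
            + completion_tokens * p["output"]) / 1_000_000

cost = cost_cents("claude-3-sonnet", 1000, 500)  # 0.3 + 0.75 = 1.05 cents
```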
input_hash (string)
  SHA-256 hex digest of the prompt input. Enables verification that the exact input can be reproduced.

output_hash (string)
  SHA-256 hex digest of the model output. Enables verification that the recorded output matches what was returned.
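Both hash fields are plain SHA-256 hex digests, which a runtime can produce with the standard library. A minimal sketch (the UTF-8 encoding of the text before hashing is an assumption, since the docs do not state the canonicalization):

```python
import hashlib

def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of a UTF-8 string, in the form expected
    for input_hash / output_hash (64 lowercase hex characters)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

input_hash = sha256_hex("Analyze this dataset for anomalies")
```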

Response

inference_id (string)
  Unique identifier for this inference record.

orchestration_id (string)
  Parent orchestration identifier.

model (string)
  LLM model used.

provider (string)
  LLM provider.

prompt_tokens (integer)
  Input token count.

completion_tokens (integer)
  Output token count.

cost_cents (number)
  Inference cost in US cents.

recorded_at (string)
  ISO 8601 timestamp when the inference was recorded.

Authorizations

X-API-Key (string, header, required)
  API key for machine-to-machine authentication.

Body

application/json

model (string, required)
  Model identifier (e.g. gpt-4, claude-3).

prompt (string, required)
  Input prompt.

agent_id (string<uuid>, required)
  Invoking agent.

parameters (object)
  Model parameters (temperature, max_tokens, etc.).

Response

Inference result with receipt.

response (string)
receipt_id (string<uuid>)
tokens_used (integer)
cost (number<float>)
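A client consuming this response can apply basic shape checks before trusting the receipt. A sketch of such validation (the helper name and the checks are illustrative, not part of the API; the example values come from the sample response above):

```python
import uuid

def validate_receipt(resp: dict) -> uuid.UUID:
    """Basic client-side checks on the response fields listed above."""
    assert isinstance(resp["response"], str)
    assert isinstance(resp["tokens_used"], int)
    assert resp["cost"] >= 0
    # receipt_id must parse as a UUID; uuid.UUID raises ValueError otherwise.
    return uuid.UUID(resp["receipt_id"])

rid = validate_receipt({
    "response": "ok",
    "receipt_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "tokens_used": 123,
    "cost": 1.05,
})
```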