Once you have basic instrumentation in place, these patterns help you model complex AI workflows — from multi-step RAG pipelines to conversational agents with tool calling.
Span content: input and output
Every span’s content object carries two keys — input and output — both stringified JSON. These represent the raw payload going into the operation and the raw result coming back out.
While input and output accept any valid stringified JSON, you get the most value by sending the exact request payload you send to your AI provider as input and the full response as output. When you use a supported provider — OpenAI, Anthropic, Google, and others — Adaline automatically parses these raw objects to:
Calculate cost based on token counts and the model’s pricing
Extract token usage (prompt tokens, completion tokens, total)
Surface model metadata such as stop reason, tool calls, and function invocations
Power continuous evaluations with structured input/output pairs
The easiest way to achieve this is to build your request params as an object, pass it to the provider SDK, and stringify that same object as input:
const params = {
  model: "gpt-4",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing in simple terms." },
  ],
  temperature: 0.7,
};

const response = await openai.chat.completions.create(params);

span.update({
  status: "success",
  content: {
    type: "Model",
    provider: "openai",
    model: "gpt-4",
    input: JSON.stringify(params),
    output: JSON.stringify(response),
  },
});
params = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    "temperature": 0.7,
}

response = openai.chat.completions.create(**params)

span.update(
    status="success",
    content={
        "type": "Model",
        "provider": "openai",
        "model": "gpt-4",
        "input": json.dumps(params),
        "output": json.dumps(response.model_dump()),
    },
)
This pattern works the same way for any supported provider — Anthropic, Google, etc. The key is that the input and output values are the raw, unmodified payloads.
You can also set the input and output fields to use Adaline’s own content schema, although this is more advanced and requires maintaining custom transformations to convert provider payloads into the Adaline format.
If you’re using a custom or unsupported provider, you can still send any valid JSON string for input and output. Adaline will store and display them — automatic enrichment simply won’t apply.
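For example, here is a hedged sketch for a hypothetical in-house provider — the payload fields and the model name are invented for illustration; the only requirement is that input and output are valid JSON strings:

```python
import json

# Hypothetical request/response payloads for a custom, unsupported provider.
# Adaline stores and displays these strings as-is; automatic cost and token
# enrichment does not apply.
request_payload = {
    "prompt": "Explain quantum computing in simple terms.",
    "max_length": 256,
}
response_payload = {
    "text": "Quantum computing uses qubits instead of bits...",
    "latency_ms": 412,
}

content = {
    "type": "Model",
    "provider": "custom",          # illustrative provider name
    "model": "in-house-llm-v2",    # illustrative model name
    "input": json.dumps(request_payload),
    "output": json.dumps(response_payload),
}
```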
Group spans into a single trace
When your workflow makes multiple LLM calls (e.g., a RAG pipeline with an embedding call followed by a chat completion), group them under a single trace to see the full picture.
With the Proxy
Use the adaline-trace-reference-id header with the same value across all requests:
import uuid

trace_id = str(uuid.uuid4())

# Step 1: Embedding
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="User query",
    extra_headers={
        "adaline-api-key": os.getenv("ADALINE_API_KEY"),
        "adaline-project-id": os.getenv("ADALINE_PROJECT_ID"),
        "adaline-prompt-id": os.getenv("ADALINE_PROMPT_ID"),
        "adaline-trace-reference-id": trace_id,
        "adaline-trace-name": "rag-pipeline",
        "adaline-trace-status": "pending",
        "adaline-span-name": "query-embedding",
    },
)

# Step 2: Chat completion with context
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"Context: {retrieved_docs}"},
        {"role": "user", "content": "User query"},
    ],
    extra_headers={
        "adaline-api-key": os.getenv("ADALINE_API_KEY"),
        "adaline-project-id": os.getenv("ADALINE_PROJECT_ID"),
        "adaline-prompt-id": os.getenv("ADALINE_PROMPT_ID"),
        "adaline-trace-reference-id": trace_id,
        "adaline-trace-name": "rag-pipeline",
        "adaline-trace-status": "success",
        "adaline-span-name": "response-generation",
    },
)
With the SDK
The SDK groups spans naturally through the trace object:
const trace = monitor.logTrace({ name: "rag-pipeline" });

const embeddingSpan = trace.logSpan({ name: "query-embedding" });
// ... do embedding call ...
embeddingSpan.update({ status: "success", content: { type: "Embeddings", ... } });
embeddingSpan.end();

const retrievalSpan = trace.logSpan({ name: "vector-search" });
// ... do retrieval ...
retrievalSpan.update({ status: "success", content: { type: "Retrieval", ... } });
retrievalSpan.end();

const llmSpan = trace.logSpan({ name: "response-generation" });
// ... do LLM call ...
llmSpan.update({ status: "success", content: { type: "Model", ... } });
llmSpan.end();

trace.update({ status: "success" });
trace.end();
Session tracking
Group related traces by session to follow a user’s full conversation or multi-request workflow:
With the Proxy
session_id = "user-session-abc123"

response = client.chat.completions.create(
    model="gpt-4",
    messages=conversation_history,
    extra_headers={
        "adaline-api-key": os.getenv("ADALINE_API_KEY"),
        "adaline-project-id": os.getenv("ADALINE_PROJECT_ID"),
        "adaline-prompt-id": os.getenv("ADALINE_PROMPT_ID"),
        "adaline-trace-session-id": session_id,
        "adaline-trace-name": "chat-turn",
    },
)
With the SDK
const trace = monitor.logTrace({
  name: "Chat Turn",
  sessionId: "user-session-abc123",
});
Session IDs let you filter traces to see all activity within a single user session.
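Session IDs are plain strings you control. One possible approach (a convention of ours, not an Adaline requirement) is deriving a stable ID from your own user and conversation identifiers with uuid5, so every service computes the same value without coordination:

```python
import uuid

# Arbitrary fixed namespace; any stable UUID works.
SESSION_NAMESPACE = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

def session_id_for(user_id: str, conversation_id: str) -> str:
    # Deterministic: the same inputs always yield the same session ID,
    # so independent services agree without sharing state.
    return str(uuid.uuid5(SESSION_NAMESPACE, f"{user_id}:{conversation_id}"))
```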
Agent tool calling pattern
For agents that use tool calling, create nested spans that capture the decision -> tool execution -> response cycle:
const trace = monitor.logTrace({ name: "agent-request" });

// Step 1: Initial LLM call (may produce tool calls)
const decisionSpan = trace.logSpan({ name: "agent-decision" });
const llmResponse = await callLLM(messages);
decisionSpan.update({
  status: "success",
  content: { type: "Model", provider: "openai", model: "gpt-4", input: "...", output: "..." },
});
decisionSpan.end();

// Step 2: Execute tool calls
if (llmResponse.toolCalls) {
  for (const toolCall of llmResponse.toolCalls) {
    const toolSpan = trace.logSpan({
      name: `tool-${toolCall.function.name}`,
      tags: ["tool-call", toolCall.function.name],
    });
    const result = await executeToolCall(toolCall);
    toolSpan.update({
      status: "success",
      content: {
        type: "Tool",
        name: toolCall.function.name,
        input: toolCall.function.arguments,
        output: JSON.stringify(result),
      },
    });
    toolSpan.end();
  }

  // Step 3: Final response with tool results
  const responseSpan = trace.logSpan({ name: "final-response" });
  const finalResponse = await callLLM([...messages, ...toolResults]);
  responseSpan.update({
    status: "success",
    content: { type: "Model", provider: "openai", model: "gpt-4", input: "...", output: "..." },
  });
  responseSpan.end();
}

trace.update({ status: "success" });
trace.end();
Error handling pattern
Capture errors at both the span and trace level for effective debugging:
const trace = monitor.logTrace({ name: "workflow" });

try {
  const span = trace.logSpan({ name: "llm-call" });
  try {
    const response = await callLLM(messages);
    span.update({ status: "success", content: { type: "Model", ... } });
  } catch (error) {
    span.update({
      status: "failure",
      attributes: {
        error: error instanceof Error ? error.message : String(error),
        errorType: error instanceof Error ? error.name : "Unknown",
      },
    });
    throw error;
  } finally {
    span.end();
  }
  trace.update({ status: "success" });
} catch (error) {
  trace.update({
    status: "failure",
    attributes: { error: String(error) },
  });
} finally {
  trace.end();
}
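The attributes shape used above can be factored into a small helper. A sketch in Python — the helper name is ours; the error and errorType keys mirror the example:

```python
def error_attributes(error: BaseException) -> dict:
    # Builds the span attributes used in the error-handling example:
    # a human-readable message plus the exception class name.
    return {
        "error": str(error),
        "errorType": type(error).__name__,
    }
```

Reusing one helper keeps the attribute keys consistent across every span, which makes failures easy to group and filter later.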
Multi-provider pattern
When your workflow uses multiple AI providers (e.g., OpenAI for generation and Anthropic for review), capture each in separate spans within the same trace:
const trace = monitor.logTrace({ name: "content-pipeline" });

// OpenAI generation
const genSpan = trace.logSpan({ name: "content-generation" });
const draft = await openaiCall(messages);
genSpan.update({
  status: "success",
  content: { type: "Model", provider: "openai", model: "gpt-4", input: "...", output: draft },
});
genSpan.end();

// Anthropic review
const reviewSpan = trace.logSpan({ name: "content-review" });
const review = await anthropicCall(draft);
reviewSpan.update({
  status: "success",
  content: { type: "Model", provider: "anthropic", model: "claude-3-5-sonnet", input: draft, output: review },
});
reviewSpan.end();

trace.update({ status: "success" });
trace.end();
Passing variables for evaluation
When using continuous evaluations, include variable values so they can be used to build datasets:
With the Proxy
headers["adaline-span-variables"] = json.dumps({
    "user_question": {"modality": "text", "value": user_input},
    "context": {"modality": "text", "value": retrieved_context},
})
With the SDK
const span = trace.logSpan({
  name: "llm-call",
  variables: {
    user_question: { modality: "text", value: userInput },
    context: { modality: "text", value: retrievedContext },
  },
});
Override continuous evaluation sample rate
Continuous evaluations run on a configurable sample rate — not every span is evaluated. When you need to guarantee that a specific span is evaluated regardless of the sample rate, set runEvaluation: true on the span.
This is useful when you want to force evaluation on specific requests — high-value customers, flagged conversations, edge cases you’re debugging, or canary deployments where every response matters.
At span creation
const span = trace.logSpan({
  name: "chat-completion",
  runEvaluation: true,
  content: { type: "Model", provider: "openai", model: "gpt-4", input: JSON.stringify(params), output: "{}" },
});
span = trace.log_span(
    name="chat-completion",
    run_evaluation=True,
    content={"type": "Model", "provider": "openai", "model": "gpt-4", "input": json.dumps(params), "output": "{}"},
)
Via span update
You can also set it after creation — for example, based on a condition you only know after the LLM responds:
const response = await openai.chat.completions.create(params);
const shouldEvaluate = user.tier === "enterprise" || response.choices[0].finish_reason === "content_filter";

span.update({
  status: "success",
  runEvaluation: shouldEvaluate,
  content: {
    type: "Model",
    provider: "openai",
    model: "gpt-4",
    input: JSON.stringify(params),
    output: JSON.stringify(response),
  },
});
span.end();
response = openai.chat.completions.create(**params)
should_evaluate = user.tier == "enterprise" or response.choices[0].finish_reason == "content_filter"

span.update(
    status="success",
    run_evaluation=should_evaluate,
    content={
        "type": "Model",
        "provider": "openai",
        "model": "gpt-4",
        "input": json.dumps(params),
        "output": json.dumps(response.model_dump()),
    },
)
span.end()
With the Proxy
Set the adaline-span-run-evaluation header to "true":
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    extra_headers={
        "adaline-api-key": os.getenv("ADALINE_API_KEY"),
        "adaline-project-id": os.getenv("ADALINE_PROJECT_ID"),
        "adaline-span-run-evaluation": "true",
    },
)
Setting runEvaluation: true guarantees the span will be evaluated. It does not affect spans where the flag is omitted or false — those still follow the configured sample rate.
Parallel workflows
When your workflow runs operations concurrently — multiple tool calls at once, parallel retrieval from several sources, or fan-out to multiple LLM providers — create sibling spans that overlap in time. Adaline renders them correctly based on their startedAt and endedAt timestamps.
const trace = monitor.logTrace({ name: "parallel-workflow" });

// Create all spans before starting the work
const searchSpan = trace.logSpan({ name: "web-search", tags: ["parallel"] });
const dbSpan = trace.logSpan({ name: "db-lookup", tags: ["parallel"] });
const cacheSpan = trace.logSpan({ name: "cache-check", tags: ["parallel"] });

// Run all operations concurrently
const [searchResults, dbResults, cacheResults] = await Promise.all([
  webSearch(query).then((res) => {
    searchSpan.update({ status: "success", content: { type: "Retrieval", input: JSON.stringify({ query }), output: JSON.stringify(res) } });
    searchSpan.end();
    return res;
  }),
  dbLookup(query).then((res) => {
    dbSpan.update({ status: "success", content: { type: "Retrieval", input: JSON.stringify({ query }), output: JSON.stringify(res) } });
    dbSpan.end();
    return res;
  }),
  cacheCheck(query).then((res) => {
    cacheSpan.update({ status: "success", content: { type: "Function", input: JSON.stringify({ query }), output: JSON.stringify(res) } });
    cacheSpan.end();
    return res;
  }),
]);

// Merge results and generate final response
const llmSpan = trace.logSpan({ name: "generate-response" });
const response = await callLLM(mergeContext(searchResults, dbResults, cacheResults));
llmSpan.update({ status: "success", content: { type: "Model", provider: "openai", model: "gpt-4", input: JSON.stringify(params), output: JSON.stringify(response) } });
llmSpan.end();

trace.end();
trace = monitor.log_trace(name="parallel-workflow")

# Create all spans before starting the work
search_span = trace.log_span(name="web-search", tags=["parallel"])
db_span = trace.log_span(name="db-lookup", tags=["parallel"])
cache_span = trace.log_span(name="cache-check", tags=["parallel"])

# Run all operations concurrently
async def do_search():
    res = await web_search(query)
    search_span.update(status="success", content={"type": "Retrieval", "input": json.dumps({"query": query}), "output": json.dumps(res)})
    search_span.end()
    return res

async def do_db():
    res = await db_lookup(query)
    db_span.update(status="success", content={"type": "Retrieval", "input": json.dumps({"query": query}), "output": json.dumps(res)})
    db_span.end()
    return res

async def do_cache():
    res = await cache_check(query)
    cache_span.update(status="success", content={"type": "Function", "input": json.dumps({"query": query}), "output": json.dumps(res)})
    cache_span.end()
    return res

search_results, db_results, cache_results = await asyncio.gather(
    do_search(), do_db(), do_cache()
)

# Merge results and generate final response
llm_span = trace.log_span(name="generate-response")
response = await call_llm(merge_context(search_results, db_results, cache_results))
llm_span.update(status="success", content={"type": "Model", "provider": "openai", "model": "gpt-4", "input": json.dumps(params), "output": json.dumps(response)})
llm_span.end()
trace.end()
The same pattern works for parallel tool calls inside an agent loop — create a span per tool call and run them with Promise.all / asyncio.gather. Each span records its own timing independently.
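A minimal Python sketch of that fan-out, with the span bookkeeping reduced to comments and the tool functions stubbed out — every name here is illustrative:

```python
import asyncio

async def run_tool(name: str, arguments: dict) -> dict:
    # Placeholder for real tool execution. In practice, create a span
    # named f"tool-{name}" before the call, then update and end it here
    # so each tool records its own timing independently.
    await asyncio.sleep(0)  # stand-in for async tool work
    return {"tool": name, "args": arguments}

async def run_tools_in_parallel(tool_calls: list) -> list:
    # One concurrent task (and one span) per tool call.
    return await asyncio.gather(*(run_tool(n, a) for n, a in tool_calls))

results = asyncio.run(run_tools_in_parallel([
    ("web-search", {"query": "quantum computing"}),
    ("calculator", {"expression": "2+2"}),
]))
```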
Distributed logging
When a single user request flows through multiple services — an API gateway, a worker queue, a retrieval service — you can attach all spans to one trace using the REST API . The main service creates the trace and passes its referenceId to downstream services. Those services don’t create a new trace; they attach spans to the existing one using traceReferenceId, or update the trace using referenceId.
Step 1: Main service creates the trace
The orchestrator creates the trace, logs its own span, and passes the referenceId to downstream workers:
curl -X POST https://api.adaline.ai/v2/logs/trace \
  -H "Authorization: Bearer $ADALINE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"projectId": "your-project-id",
"trace": {
"name": "user-request",
"status": "unknown",
"referenceId": "req-abc-123",
"startedAt": 1700000000000,
"endedAt": 1700000001000,
"tags": ["api-gateway"]
},
"spans": [
{
"name": "route-and-dispatch",
"status": "success",
"referenceId": "span-gateway-001",
"startedAt": 1700000000000,
"endedAt": 1700000001000,
"content": { "type": "Function", "input": "{...}", "output": "{...}" }
}
]
}'
The response returns the Adaline-generated traceId, but downstream services don’t need it; they can reference the trace by traceReferenceId instead.
Step 2: Worker attaches a span
A downstream worker picks up the job, reads the shared referenceId, and attaches its span to the same trace:
curl -X POST https://api.adaline.ai/v2/logs/span \
  -H "Authorization: Bearer $ADALINE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"projectId": "your-project-id",
"traceReferenceId": "req-abc-123",
"span": {
"name": "retrieval-worker",
"status": "success",
"referenceId": "span-worker-001",
"startedAt": 1700000001500,
"endedAt": 1700000003000,
"content": { "type": "Retrieval", "input": "{...}", "output": "{...}" }
}
}'
Step 3: Another service attaches more spans
A second worker handles the LLM call, again using the same traceReferenceId:
curl -X POST https://api.adaline.ai/v2/logs/span \
  -H "Authorization: Bearer $ADALINE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"projectId": "your-project-id",
"traceReferenceId": "req-abc-123",
"span": {
"name": "llm-worker",
"status": "success",
"referenceId": "span-worker-002",
"startedAt": 1700000003000,
"endedAt": 1700000005000,
"content": { "type": "Model", "provider": "openai", "model": "gpt-4", "input": "{...}", "output": "{...}" }
}
}'
Step 4: Any service can update the trace
Once the full pipeline completes, any service can update the trace status or add metadata using the same referenceId:
curl -X PATCH https://api.adaline.ai/v2/logs/trace \
  -H "Authorization: Bearer $ADALINE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"projectId": "your-project-id",
"referenceId": "req-abc-123",
"logTrace": {
"status": "success",
"tags": ["completed"]
}
}'
The referenceId is any string you control — a request ID, a job ID, a correlation ID from your existing infrastructure. Pass it between services however you like: HTTP headers, message queue payloads, job metadata, or environment variables.
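As one hedged sketch, a downstream service might read the shared ID from an incoming HTTP header and build the span body shown above — the header name and helper function are illustrative; only the payload shape comes from the REST examples:

```python
import json
import os

# Illustrative header name for propagating the shared reference ID.
TRACE_REF_HEADER = "x-trace-reference-id"

def span_payload(trace_reference_id: str, name: str,
                 started_at: int, ended_at: int, content: dict) -> dict:
    # Matches the POST /v2/logs/span body shown in the curl examples.
    return {
        "projectId": os.environ.get("ADALINE_PROJECT_ID", "your-project-id"),
        "traceReferenceId": trace_reference_id,
        "span": {
            "name": name,
            "status": "success",
            "referenceId": f"span-{name}-001",
            "startedAt": started_at,
            "endedAt": ended_at,
            "content": content,
        },
    }

# Downstream service: read the shared ID from the incoming request headers,
# then POST the serialized body to /v2/logs/span.
incoming_headers = {TRACE_REF_HEADER: "req-abc-123"}
payload = span_payload(
    incoming_headers[TRACE_REF_HEADER],
    "retrieval-worker",
    1700000001500,
    1700000003000,
    {"type": "Retrieval", "input": "{}", "output": "{}"},
)
body = json.dumps(payload)
```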
Best practices
One trace per user request — Each user interaction (API call, chat message, workflow trigger) should be a single trace.
Descriptive span names — Use names that indicate the operation: "query-embedding", "vector-search", "response-generation" — not "step-1", "step-2".
Always set final status — Update trace and span status to "success" or "failure" before ending.
Use tags for filtering — Add tags like ["production", "v1.3"] to make it easy to filter in the Monitor.
Include variables — Pass variable values on spans so they can be captured into datasets for evaluation.
Next steps
Log User Feedback: Attach user feedback signals to traces.
Analyze Log Traces: View your traces in the Monitor pillar.