Adding production logs to a dataset captures the raw data — inputs, outputs, and metadata. But raw data alone is not enough for high-quality evaluations. You need human judgment: why a response failed, what the correct answer should have been, and which cases are high priority. Annotation columns and dataset filters let you build a structured review queue so your team can systematically annotate every row, turning a log dump into a curated evaluation suite.

Add annotation columns

When you set up a dataset, you can add columns beyond the prompt’s input and output variables. These extra columns hold human annotations and review metadata that live alongside the production data. Recommended annotation columns:
Column | Type | Purpose
annotation | Free text | Reviewer notes explaining why the response is wrong and what needs to change.
annotation_status | Label | Tracks whether the row has been reviewed: empty for pending, filled for completed.
feedback_category | Label | Structured classification such as correct, incorrect, hallucination, off-topic, or too-long.
expected_output | Free text | The ideal response a reviewer writes by hand, used as a reference for LLM-as-a-Judge evaluations.
priority | Label | Urgency flag (high, medium, low) so reviewers can triage effectively.
The annotation_status column is the key to building a review queue. Every row added from Monitor starts with annotation_status = empty, signaling that it needs human review. Once a reviewer fills in their annotations, they set the status to filled.
See Setup Dataset for how to add and configure columns. Column names are flexible — use whatever naming convention fits your team, as long as you stay consistent.
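The column set above can be sketched as a plain-Python schema. This is only an illustration of the recommended defaults: the real columns are configured in the dataset UI, and the `DatasetRow` class and its field names here are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of a dataset row with the recommended annotation
# columns; in the product these are configured when setting up the dataset.
@dataclass
class DatasetRow:
    input: str                               # the prompt's input variables
    output: str                              # the model's response from Monitor
    annotation: str = ""                     # free-text reviewer notes
    annotation_status: str = "empty"         # "empty" = pending, "filled" = reviewed
    feedback_category: Optional[str] = None  # e.g. "hallucination", "off-topic"
    expected_output: Optional[str] = None    # hand-written reference answer
    priority: Optional[str] = None           # "high" | "medium" | "low"

# Rows added from Monitor carry only input/output; everything else
# starts at its pending default, so the row lands in the review queue.
row = DatasetRow(input="What is our refund policy?", output="Refunds take 90 days.")
print(row.annotation_status)  # → empty
```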

Build a review queue

The review queue is simply your dataset filtered to show only rows where annotation columns are still empty. This gives reviewers a focused list of exactly the rows that need attention — no searching, no guesswork.

Filter for unannotated rows

Open your dataset and apply a filter on the annotation_status column:
  • annotation_status = empty — Shows all rows that have been added from Monitor but not yet reviewed by a human.
This filtered view is your annotation queue. Each row in the queue is a production case waiting for human judgment. As reviewers work through the queue and set annotation_status = filled, rows disappear from the filtered view automatically.

[Image: Dataset filtered to show unannotated rows]
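The filter's behavior amounts to selecting rows whose status column is still empty. A minimal sketch over plain-dict rows (the filter itself is applied in the dataset UI; `review_queue` is a hypothetical helper):

```python
# Rows as they might look after being added from Monitor and partially reviewed.
rows = [
    {"input": "q1", "output": "a1", "annotation_status": "empty"},
    {"input": "q2", "output": "a2", "annotation_status": "filled"},
    {"input": "q3", "output": "a3", "annotation_status": "empty"},
]

def review_queue(rows):
    """Return only the rows still awaiting human review."""
    return [r for r in rows if r["annotation_status"] == "empty"]

print(len(review_queue(rows)))  # → 2
```

Marking a row as filled removes it from the result on the next pass, which is exactly how the filtered view "shrinks" as reviewers work.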

Work through the queue

For each row in the queue:
  1. Read the input and output — Understand what the user asked and what the model produced.
  2. Classify the issue — Set feedback_category to a structured label (hallucination, incorrect, off-topic, etc.) so you can analyze failure patterns later.
  3. Write the annotation — Explain in annotation what went wrong and what the correct behavior should be.
  4. Write the expected output (optional) — If the row will be used for LLM-as-a-Judge evaluations, write the ideal response in expected_output.
  5. Set priority (optional) — Flag rows that need urgent prompt fixes.
  6. Mark as filled — Set annotation_status = filled to remove the row from the queue.
You can also filter by feedback_category or priority to focus on specific failure types or urgent cases. Combining filters lets different reviewers own different slices of the queue.
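The six steps above can be condensed into a single annotate-and-close action. This is a hypothetical sketch over a plain-dict row; in practice each field is edited directly in the dataset UI, and the `annotate` helper is an illustration only.

```python
def annotate(row, category, note, expected=None, priority=None):
    """Fill a row's annotation columns and remove it from the review queue."""
    # Steps 2-3: classify the failure and explain what went wrong.
    row["feedback_category"] = category
    row["annotation"] = note
    # Steps 4-5 (optional): reference answer for LLM-as-a-Judge, and triage flag.
    if expected is not None:
        row["expected_output"] = expected
    if priority is not None:
        row["priority"] = priority
    # Step 6: mark as filled so the row drops out of the filtered queue.
    row["annotation_status"] = "filled"
    return row

row = {"input": "q", "output": "a", "annotation_status": "empty"}
annotate(row, category="hallucination",
         note="Cites a policy clause that does not exist.",
         priority="high")
print(row["annotation_status"])  # → filled
```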

End-to-end workflow

Annotation works best as a recurring loop that connects monitoring, human review, and evaluation:
1. Filter logs in Monitor

Use filters to find the logs that matter — failures, low eval scores, negative user feedback, or edge cases.
2. Add to dataset

Add selected spans to a dataset. The span’s input variables and output are mapped to dataset columns automatically. New rows arrive with annotation_status = empty.
3. Review the queue

Open the dataset and filter for annotation_status = empty. This is your team’s review backlog. Work through it by filling annotation columns and marking rows as filled.
4. Run evaluations

Evaluate your prompt against the annotated dataset. Rows with expected_output filled in can be scored by LLM-as-a-Judge evaluators that compare the model’s response against your reference answer.
5. Fix and redeploy

Use evaluation results to iterate on your prompt. Once fixes pass evaluation, deploy the improved prompt. The annotated rows stay in the dataset as permanent regression tests.
Each cycle adds more annotated rows to your dataset, making your evaluations more comprehensive and your regression safety net stronger over time.
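One cycle of the loop can be sketched end to end. Everything here is a stand-in: the monitor logs and dataset are plain lists, and exact-match scoring substitutes for a real LLM-as-a-Judge evaluator; a real setup runs these steps through the platform's UI or SDK.

```python
# Step 1: logs from Monitor, filtered down to failures (low eval scores).
monitor_logs = [
    {"input": "q1", "output": "wrong answer", "eval_score": 0.2},
    {"input": "q2", "output": "fine answer", "eval_score": 0.9},
]

# Step 2: add failing spans to the dataset; new rows arrive pending review.
dataset = [
    {**log, "annotation_status": "empty", "expected_output": None}
    for log in monitor_logs if log["eval_score"] < 0.5
]

# Step 3: a reviewer works the queue, writing the reference answer.
for row in dataset:
    if row["annotation_status"] == "empty":
        row["expected_output"] = "right answer"
        row["annotation_status"] = "filled"

# Step 4: score annotated rows; exact match stands in for an LLM judge
# comparing the model's response against the reference answer.
scores = [
    row["output"] == row["expected_output"]
    for row in dataset if row["expected_output"] is not None
]
print(f"{sum(scores)}/{len(scores)} passing")

# Step 5: fix the prompt, re-run, and keep the rows as regression tests.
```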

Next steps

Build Datasets from Logs

Add production spans to datasets for evaluation.

Setup Dataset

Create and configure datasets with annotation columns.

Evaluate Prompts

Run evaluations against your annotated datasets.

Use Logs to Improve Prompts

Debug and fix issues found in production logs.