A dataset is only as useful as its column design. Columns decide what data a prompt receives, what evaluators compare against, and what reviewers can understand when a row fails.

Column types

Adaline datasets support three column patterns:
| Column type | What it stores or resolves | Use it for |
| --- | --- | --- |
| Static | A value stored directly in the dataset cell. | Prompt variables, expected outputs, labels, metadata, and notes. |
| Dynamic API | A value resolved from an API request. | Lookup data that should be fetched at evaluation time. |
| Dynamic Prompt | A value resolved by running another prompt. | Preprocessing, synthetic context, transformations, or chained prompt workflows. |

Dynamic columns can make evaluations realistic, but they also add cost, latency, and another failure point. Use static columns for release gates when repeatability matters most.

Name columns intentionally

Use names that match how the prompt and evaluators read the row. Good column names:
  • customer_question
  • expected_answer
  • retrieved_policy
  • account_tier
  • expected_tool
  • risk_label
  • release_tag
Avoid names such as input1, output_new, misc, or temp_notes in long-lived datasets.
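
The naming guidance above can be checked mechanically before a dataset is shared. The following is a minimal sketch, not an Adaline feature: the snake_case rule, the list of vague patterns, and the function name lint_column_names are all assumptions made for illustration.

```python
import re

# Assumed house rules (not defined by Adaline): snake_case, no placeholder names.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
VAGUE_PATTERNS = [
    re.compile(r"^(input|output|col|field)\d*$"),      # input1, output2
    re.compile(r"(^|_)(new|old|tmp|temp|misc)(_|$)"),  # output_new, temp_notes, misc
]


def lint_column_names(names):
    """Return {name: reason} for every column name that looks problematic."""
    problems = {}
    for name in names:
        if not SNAKE_CASE.match(name):
            problems[name] = "not snake_case"
        elif any(pattern.search(name) for pattern in VAGUE_PATTERNS):
            problems[name] = "vague or temporary name"
    return problems


if __name__ == "__main__":
    columns = ["customer_question", "expected_answer", "input1", "temp_notes"]
    for name, reason in lint_column_names(columns).items():
        print(f"{name}: {reason}")
```

A check like this only catches obvious placeholders. Whether a name matches how the prompt and evaluators read the row still needs a human review.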

Map columns to prompt variables

Prompt variables should resolve from dataset columns during evaluation. If a prompt expects {{customer_question}}, the dataset should have a customer_question column or an explicit mapping that provides it. Before running a dataset:
  • Confirm every required prompt variable has a value.
  • Confirm expected-output columns are not accidentally passed as user input.
  • Confirm optional variables have sensible blank values or defaults.
  • Confirm dynamic columns finish resolving before the prompt runs that depend on them.
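
The first three checks above can be approximated offline with a small script. This is a sketch under assumptions: it treats {{name}} as the only variable syntax, represents a row as a plain dict, and uses a hypothetical helper named check_row. It does not cover the ordering of dynamic columns, which depends on how the platform schedules runs.

```python
import re

# Matches {{variable_name}}, allowing optional spaces inside the braces.
VARIABLE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def check_row(prompt_template, row, expected_columns=(), optional=()):
    """Return a list of human-readable problems for one dataset row."""
    problems = []
    for var in sorted(set(VARIABLE.findall(prompt_template))):
        # An expected-output column should never be fed to the prompt.
        if var in expected_columns:
            problems.append(f"{var}: expected-output column is used as prompt input")
        value = row.get(var)
        is_blank = value is None or str(value).strip() == ""
        if is_blank and var not in optional:
            problems.append(f"{var}: required variable has no value")
    return problems


if __name__ == "__main__":
    template = "Tier: {{account_tier}}\nQuestion: {{customer_question}}"
    row = {"customer_question": "How do I reset my password?", "account_tier": ""}
    print(check_row(template, row, expected_columns={"expected_answer"}))
```

Passing optional={"account_tier"} in the example would accept the blank tier, which makes the decision about blanks explicit instead of accidental.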

Use metadata columns

Metadata columns help reviewers filter and interpret failures. Useful metadata:
  • source, such as production, manual, synthetic, csv, or support-ticket.
  • risk, such as low, medium, high, policy, safety, or revenue.
  • behavior_id or a behavior label.
  • trace_id when a row came from production.
  • release_tag or incident_id.
  • locale, region, segment, or workflow.
Do not store secrets or raw identifiers unless your data policy allows it.
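
To illustrate how metadata columns help with review, the sketch below groups failed rows by a metadata column and filters them by metadata values. The result shape (a list of dicts with a passed flag) and both helper names are assumptions for illustration, not an Adaline export format.

```python
from collections import Counter

# Hypothetical evaluation results: metadata columns plus a pass/fail flag.
results = [
    {"source": "production", "risk": "high", "trace_id": "t-101", "passed": False},
    {"source": "synthetic", "risk": "low", "passed": True},
    {"source": "production", "risk": "high", "trace_id": "t-102", "passed": False},
    {"source": "manual", "risk": "medium", "passed": False},
]


def failure_counts(rows, column):
    """Count failed rows per value of one metadata column."""
    return Counter(row.get(column, "unknown") for row in rows if not row["passed"])


def failed_rows(rows, **filters):
    """Return failed rows whose metadata matches every given filter."""
    return [
        row for row in rows
        if not row["passed"] and all(row.get(k) == v for k, v in filters.items())
    ]


if __name__ == "__main__":
    print(failure_counts(results, "risk"))
    for row in failed_rows(results, source="production", risk="high"):
        print(row["trace_id"])
```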

Dynamic API columns

Use dynamic API columns when the evaluation needs fresh data from another service. Before using dynamic API columns in a release gate:
  • Confirm the endpoint is stable.
  • Confirm authentication and headers are safe.
  • Set a timeout and decide what counts as an acceptable error response.
  • Decide whether a backend outage should fail the prompt release.
  • Track added cost and latency in evaluation results.
When possible, store a static fixture for critical gates and use dynamic API columns for exploratory or integration-style evaluations.
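
The trade-off between live lookups and static fixtures can be sketched as a resolver that tries the endpoint with a hard timeout, records latency, and falls back to a stored fixture when one exists. This is an illustration only: resolve_api_column, its return shape, and the fallback policy are assumptions, and it does not describe how Adaline resolves dynamic API columns internally.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout_s=5.0):
    """Fetch a JSON document with a hard timeout (standard library only)."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.load(resp)


def resolve_api_column(url, fixture=None, fetch=fetch_json):
    """Resolve one dynamic value, recording latency and how it was obtained."""
    start = time.perf_counter()
    try:
        value, status = fetch(url), "live"
    except Exception as exc:  # broad on purpose: timeout, DNS, HTTP error, bad JSON
        if fixture is None:
            value, status = None, f"error: {type(exc).__name__}"
        else:
            value, status = fixture, "fixture"
    latency_ms = (time.perf_counter() - start) * 1000
    return {"value": value, "status": status, "latency_ms": latency_ms}


if __name__ == "__main__":
    def simulated_outage(url):
        raise TimeoutError("simulated backend outage")

    # With a fixture, an outage degrades to stored data instead of failing.
    print(resolve_api_column("https://example.invalid/tier",
                             fixture={"tier": "free"}, fetch=simulated_outage))
```

Keeping the status field in the evaluation record makes it visible whether a row was judged against live or fixture data, which matters when deciding if an outage should block a release.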

Dynamic prompt columns

Dynamic prompt columns run another prompt to produce a dataset value. They are useful for preprocessing, summarization, labeling, extraction, and chained workflows. Use them carefully:
  • Keep the upstream prompt version stable.
  • Add evaluator coverage for the generated value if it matters.
  • Watch total cost and latency because linked prompt runs contribute to evaluation metrics.
  • Avoid using a dynamic prompt column when a static expected value would be clearer.
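
Two of the points above, pinning the upstream version and watching total cost and latency, can be sketched as follows. Everything here is hypothetical: PromptRun, resolve_prompt_column, and the run_prompt callable stand in for whatever client actually executes prompts, and summing latency assumes the linked runs execute one after another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRun:
    output: str
    cost_usd: float
    latency_ms: float


def resolve_prompt_column(run_prompt, prompt_id, version, variables):
    """Run a pinned upstream prompt version and return its run record."""
    if version in (None, "", "latest"):
        raise ValueError("pin an explicit upstream prompt version")
    return run_prompt(prompt_id, version, variables)


def total_metrics(runs):
    """Sum cost and latency across linked runs (assumes sequential execution)."""
    return {
        "cost_usd": sum(run.cost_usd for run in runs),
        "latency_ms": sum(run.latency_ms for run in runs),
    }


if __name__ == "__main__":
    def fake_run_prompt(prompt_id, version, variables):
        # Stand-in for a real prompt execution; values are made up.
        return PromptRun(output="summary of " + variables["ticket"],
                         cost_usd=0.25, latency_ms=400.0)

    upstream = resolve_prompt_column(fake_run_prompt, "summarize_ticket", "v3",
                                     {"ticket": "refund request"})
    main = PromptRun(output="final answer", cost_usd=0.5, latency_ms=600.0)
    print(total_metrics([upstream, main]))
```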

Row review checklist

Before a dataset becomes a release gate:
  • Every row has realistic input.
  • Expected outputs or labels are reviewed.
  • Edge cases are intentional.
  • Synthetic rows are marked as synthetic.
  • Production rows keep enough trace context to explain the issue.
  • Duplicates are removed.
  • Columns match prompt variables and evaluator needs.
  • Sensitive data has been removed or approved.
When a dataset starts to feel hard to review, split it by behavior, workflow, or risk level. Smaller named datasets are easier to trust than one giant mixed table.
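
Removing duplicates and splitting a large table are both easy to script once rows are exported. The sketch below assumes rows are plain dicts and that a duplicate means the same normalized values in a chosen set of key columns; dedupe and split_by are illustrative helper names, not part of any Adaline SDK.

```python
from collections import defaultdict


def dedupe(rows, key_columns):
    """Keep the first row for each distinct combination of key columns."""
    seen, unique = set(), []
    for row in rows:
        # Normalize case and surrounding whitespace before comparing.
        key = tuple(str(row.get(col, "")).strip().lower() for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def split_by(rows, column):
    """Group rows into smaller datasets keyed by one column's value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get(column, "unlabeled")].append(row)
    return dict(groups)


if __name__ == "__main__":
    rows = [
        {"customer_question": "Can I get a refund?", "risk": "revenue"},
        {"customer_question": "can i get a refund? ", "risk": "revenue"},
        {"customer_question": "Is this medication safe?", "risk": "safety"},
    ]
    for risk, subset in split_by(dedupe(rows, ["customer_question"]), "risk").items():
        print(risk, len(subset))
```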