Start an Improve cycle

Start an Improve cycle when you want Adaline to propose a prompt change from production evidence. The cycle runs in the background, writes progress to the cycle detail page, and moves to Pending review only after it has produced reviewable evidence.

Figure: The Improve wizard, with prompt, focus, Behavior targeting, thoroughness, and notification controls.

Open Improve

Open a project, then select Improve from the project navigation.

Choose the prompt

Click Start improvement and choose the prompt Adaline should improve. The prompt picker is restricted to prompts in the current project.

Add a focus

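Describe what should change. Good focus notes name the failure mode, audience, tone, safety boundary, format requirement, cost concern, latency concern, or evaluator you care about. For example, a note such as "Answers to refund questions are too long; keep them under three sentences without dropping the policy link" names both the failure mode and the format requirement.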

Target behaviors

Select one or more Behaviors when you want the cycle to focus on known production patterns. If you start from a Behavior row, the wizard can pre-fill the prompt and Behavior when the relationship is available.

Pick thoroughness

Choose Quick, Standard, or Thorough. Quick is the shortest run, Standard is the default, and Thorough gives Adaline more time to inspect evidence and explore candidates.

Start the cycle

Add teammates to notify when the candidate is ready, then click Start cycle.

Readiness checks

The Start cycle button is disabled when the selected prompt is not ready. The most common reasons are:

The prompt has not recorded production traces yet.
No clustered Behaviors are available for that prompt.
The clustering pipeline has not finished for the agent or prompt subject.

When this happens, send traffic to Adaline, confirm traces are arriving, and let Behaviors analyze the traffic before trying again.
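The three conditions above can be read as a simple gate. The following is a minimal sketch of that logic, not Adaline's implementation: `PromptStatus`, its fields, and `readiness_blockers` are hypothetical names introduced only for illustration, and the real checks run inside the app.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptStatus:
    """Hypothetical snapshot of a prompt's evidence (not an Adaline API type)."""
    trace_count: int           # production traces recorded for the prompt
    behavior_count: int        # clustered Behaviors available for the prompt
    clustering_finished: bool  # whether the clustering pipeline has completed


def readiness_blockers(status: PromptStatus) -> List[str]:
    """Return the reasons a cycle cannot start; an empty list means ready."""
    blockers = []
    if status.trace_count == 0:
        blockers.append("no production traces recorded")
    if status.behavior_count == 0:
        blockers.append("no clustered Behaviors available")
    if not status.clustering_finished:
        blockers.append("clustering pipeline has not finished")
    return blockers


# A brand-new prompt fails every check; a prompt with analyzed traffic passes.
new_prompt = PromptStatus(trace_count=0, behavior_count=0, clustering_finished=False)
ready_prompt = PromptStatus(trace_count=250, behavior_count=4, clustering_finished=True)
print(readiness_blockers(new_prompt))
print(readiness_blockers(ready_prompt))
```

Returning a list of reasons rather than a single boolean mirrors the guidance above: each blocker points to a different remedy, such as sending traffic or waiting for Behaviors to finish analyzing it.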

Thoroughness presets

Preset	Use it when	Typical tradeoff
Quick	You need a fast first pass on a narrow issue.	Fewer candidates and less exploration.
Standard	You want the default balance of speed and evidence.	Good first choice for most cycles.
Thorough	The issue is high impact, ambiguous, or safety-sensitive.	Longer run with broader candidate exploration.

The app may show approximate durations. Actual time depends on prompt size, number of traces, dataset size, evaluator cost, provider latency, and candidate count.

What makes a good cycle

Start with a prompt that has enough evidence to measure change. The strongest cycles combine:

Recent production traces that show the real behavior.
Behaviors that identify repeated patterns instead of one-off examples.
Evaluators that define what must improve and what must not regress.
Datasets that include normal, edge, and failure cases.
A narrow focus note that explains what a reviewer would accept.

If you do not select a Behavior, Adaline can use failure clusters automatically. Select specific Behaviors when you want to constrain the cycle to a known issue.

After starting

The cycle appears under In progress and opens its live detail page. You can leave the page while Adaline analyzes traces, inspects evaluators, prepares datasets, generates prompt candidates, and prepares the review. If the backend cannot produce a real diff, validation cases, and objective scores, the cycle is marked failed instead of showing a fabricated review. Open the failed cycle to read the failure reason and decide whether to add more evidence or start again with a narrower focus.
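The rule that a cycle reaches Pending review only with a real diff, validation cases, and objective scores can be sketched as follows. This is an illustrative model under stated assumptions, not the backend's code: `CycleOutput`, its fields, `final_status`, and the wording of the failure reason are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CycleOutput:
    """Hypothetical container for what a cycle produced (not an Adaline type)."""
    diff: Optional[str] = None
    validation_cases: List[str] = field(default_factory=list)
    objective_scores: Dict[str, float] = field(default_factory=dict)


def final_status(output: CycleOutput) -> Tuple[str, Optional[str]]:
    """Return (status, failure_reason); all three artifacts are required."""
    missing = []
    if not output.diff:
        missing.append("diff")
    if not output.validation_cases:
        missing.append("validation cases")
    if not output.objective_scores:
        missing.append("objective scores")
    if missing:
        # Fail loudly instead of presenting an incomplete review.
        return "Failed", "missing " + ", ".join(missing)
    return "Pending review", None


complete = CycleOutput(
    diff="- old line\n+ new line",
    validation_cases=["case-1"],
    objective_scores={"accuracy": 0.9},
)
print(final_status(complete))
print(final_status(CycleOutput(diff="- old line\n+ new line")))
```

The design point is that a partial result is treated as a failure with a stated reason, so a reviewer never sees a review assembled from incomplete evidence.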
