Prompt changes can regress quality in ways that are invisible without evaluation. A tweak that improves one flow can break another, or push cost per request over budget. CI/CD integration lets you run Adaline evaluations inside your existing pipeline — so every prompt deployment is validated against your dataset and evaluators before it reaches production.
How it works
Adaline’s deployment webhooks and evaluation API connect directly to your CI/CD platform. The flow:
1. A prompt is deployed to a staging environment in Adaline.
2. Adaline fires a webhook to your CI platform (e.g. GitHub Actions, GitLab CI, Jenkins).
3. Your pipeline calls the Adaline API to run evaluations against a dataset.
4. Evaluation scores are checked against thresholds you define.
5. If scores pass, the pipeline promotes the deployment or updates your app config. If scores fail, the pipeline blocks the release with a clear error.
This gives you the same quality gate you have in the Adaline Dashboard, but fully automated and embedded in your development workflow.
Triggering evaluations from CI
Webhook trigger
When you deploy a prompt, Adaline fires a create-deployment webhook to every configured endpoint. Use this to trigger your pipeline automatically:
```yaml
name: Prompt deployment gate
on:
  repository_dispatch:
    types: [adaline-prompt-deployed]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Deployment environment'
        required: true
        default: 'staging'
```
If webhooks are not configured, you can trigger the same workflow manually via workflow_dispatch or on a schedule.
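If you are bridging Adaline's webhook to GitHub yourself (for example via a small relay service), the trigger is a `repository_dispatch` call to the GitHub REST API. This is a hedged sketch: the relay, the `OWNER/REPO` placeholder, and the `client_payload` fields are assumptions, not a documented Adaline payload shape.

```shell
# Send a repository_dispatch event that matches the `types` filter in the
# workflow trigger. Whatever receives Adaline's webhook (e.g. a small
# serverless function) would run this call. GITHUB_TOKEN needs repo scope.
trigger_workflow() {
  # $1: "owner/repo" of the repository containing the workflow
  curl -sS -X POST \
    -H "Authorization: Bearer $GITHUB_TOKEN" \
    -H "Accept: application/vnd.github+json" \
    "https://api.github.com/repos/$1/dispatches" \
    -d '{"event_type": "adaline-prompt-deployed", "client_payload": {"environment": "staging"}}'
}

# Usage: trigger_workflow "your-org/your-repo"
```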
Fetching the deployment
Pull the deployed prompt from the Adaline API using your prompt ID and environment:
```shell
curl -H "Authorization: Bearer $ADALINE_API_KEY" \
  "https://api.adaline.ai/v2/deployments?promptId=${PROMPT_ID}&deploymentId=latest&deploymentEnvironmentId=${ENV_ID}"
```
See Get deployment for the full API reference.
Running evaluations
Trigger an evaluation run against a dataset connected to your prompt:
```shell
curl -X POST \
  -H "Authorization: Bearer $ADALINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"datasetId\": \"$DATASET_ID\"}" \
  "https://api.adaline.ai/v2/prompts/${PROMPT_ID}/evaluations"
```
Poll the evaluation status until complete, then fetch results:
```shell
curl -H "Authorization: Bearer $ADALINE_API_KEY" \
  "https://api.adaline.ai/v2/prompts/${PROMPT_ID}/evaluations/${EVAL_ID}/results"
```
See Create evaluation and Get evaluation results for full API references.
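The polling step can be sketched as a small loop. This is a hedged example: it assumes the evaluation object exposes a `status` field that becomes `completed`; verify the field names against the API reference.

```shell
# Poll an evaluation until it reports completion, then print the results
# JSON to stdout. The `status`/"completed" shape is an assumption.
poll_evaluation() {
  local prompt_id="$1" eval_id="$2" status=""
  while [ "$status" != "completed" ]; do
    sleep 5
    status=$(curl -sS -H "Authorization: Bearer $ADALINE_API_KEY" \
      "https://api.adaline.ai/v2/prompts/${prompt_id}/evaluations/${eval_id}" \
      | jq -r '.status')
  done
  curl -sS -H "Authorization: Bearer $ADALINE_API_KEY" \
    "https://api.adaline.ai/v2/prompts/${prompt_id}/evaluations/${eval_id}/results"
}

# Usage: poll_evaluation "$PROMPT_ID" "$EVAL_ID" > results.json
```

A timeout (e.g. a maximum iteration count) is worth adding so a stuck evaluation fails the job instead of hanging it.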
Gating deployments on evaluation results
The core of the CI/CD gate is a threshold check. Parse the evaluation results and compare each evaluator’s score against the minimum you require:
```yaml
- name: Check thresholds
  run: |
    HELPFULNESS="${{ steps.eval.outputs.EVAL_HELPFULNESS }}"
    TONE="${{ steps.eval.outputs.EVAL_TONE }}"
    if (( $(echo "$HELPFULNESS < 0.7" | bc -l) )) || \
       (( $(echo "$TONE < 0.8" | bc -l) )); then
      echo "::error::Eval failed. Helpfulness=$HELPFULNESS (min 0.7), Tone=$TONE (min 0.8)"
      exit 1
    fi
```
When the check passes, the pipeline continues — updating your app’s prompt config, committing the change, and triggering the release:
```yaml
- name: Update prompt config
  if: success()
  run: |
    jq --arg id "$DEPLOYMENT_ID" \
      '.response_generator.deployment_id = $id' config/prompts.json > tmp && mv tmp config/prompts.json

- name: Commit and push
  if: success()
  run: |
    git config user.name "github-actions"
    git config user.email "actions@github.com"
    git add config/prompts.json
    git commit -m "chore: promote prompt deployment (evals passed)"
    git push
```
When the check fails, the job exits with a non-zero code and the release steps are skipped. The error message surfaces in your CI dashboard so the team knows exactly which evaluator failed and by how much.
What you can gate on
Adaline evaluations support multiple evaluator types, all of which can serve as CI/CD gates:
| Evaluator type | Gate example |
| --- | --- |
| LLM-as-a-Judge | Block if helpfulness or factuality score drops below 0.7 |
| JavaScript | Block if output format validation fails |
| Text Matcher | Block if required phrases are missing or banned patterns appear |
| Cost | Block if average cost per request exceeds $0.005 |
| Latency | Block if p95 latency exceeds 3 seconds |
| Response Length | Block if responses consistently exceed or fall short of length targets |
Combine multiple evaluators to create a comprehensive quality gate. A prompt only ships if every evaluator meets its threshold.
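A combined gate can be expressed as one check over all evaluator scores. This is a hedged sketch: the results shape (`[{"name": ..., "score": ...}, ...]`) and the evaluator names are assumptions; adapt the jq paths and thresholds to the real Get evaluation results response.

```shell
# Fail if any listed evaluator scores below its minimum.
# Assumed input shape: a JSON array of {name, score} objects.
check_thresholds() {
  echo "$1" | jq -er '
    {helpfulness: 0.7, tone: 0.8} as $min
    | [ .[] | select($min[.name] != null and .score < $min[.name]) ]
    | if length == 0 then "all evaluators passed"
      else error("failed: " + ([.[] | .name] | join(", "))) end'
}

# Usage: check_thresholds "$(cat results.json)" || exit 1
```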
GitHub Actions
GitHub Actions is the most common integration pattern. Use repository_dispatch to receive Adaline webhooks and run evaluations in a workflow job.
Setup
Store the following as repository secrets and variables in your GitHub repo:
| Name | Type | Value |
| --- | --- | --- |
| ADALINE_API_KEY | Secret | Your Adaline API key |
| ADALINE_PROMPT_ID | Variable | The prompt ID to evaluate |
| ADALINE_DEPLOYMENT_ENV_ID | Variable | The deployment environment ID (e.g. staging) |
| ADALINE_DATASET_ID | Variable | The dataset ID to run evaluations against |
Complete workflow
Below is a complete workflow you can copy into .github/workflows/prompt-gate.yml. It receives the webhook (or runs manually), pulls the deployed prompt, runs evaluations, checks scores against thresholds, and only updates your prompt config if everything passes.
```yaml
name: Prompt deployment gate

on:
  repository_dispatch:
    types: [adaline-prompt-deployed]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Deployment environment (e.g. staging or production)'
        required: true
        default: 'staging'

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - name: Pull deployed prompt from Adaline API
        id: deployment
        env:
          ADALINE_API_KEY: ${{ secrets.ADALINE_API_KEY }}
          ADALINE_PROMPT_ID: ${{ vars.ADALINE_PROMPT_ID }}
          ADALINE_DEPLOYMENT_ENV_ID: ${{ vars.ADALINE_DEPLOYMENT_ENV_ID }}
        run: |
          RESP=$(curl -sS -H "Authorization: Bearer $ADALINE_API_KEY" \
            "https://api.adaline.ai/v2/deployments?promptId=${ADALINE_PROMPT_ID}&deploymentId=latest&deploymentEnvironmentId=${ADALINE_DEPLOYMENT_ENV_ID}")
          echo "deployment_id=$(echo "$RESP" | jq -r '.id')" >> $GITHUB_OUTPUT
          echo "project_id=$(echo "$RESP" | jq -r '.projectId')" >> $GITHUB_OUTPUT

      - name: Run evaluation via Adaline API
        id: eval
        env:
          ADALINE_API_KEY: ${{ secrets.ADALINE_API_KEY }}
          ADALINE_PROMPT_ID: ${{ vars.ADALINE_PROMPT_ID }}
          ADALINE_DATASET_ID: ${{ vars.ADALINE_DATASET_ID }}
        run: |
          EVAL_RESP=$(curl -sS -X POST \
            -H "Authorization: Bearer $ADALINE_API_KEY" \
            -H "Content-Type: application/json" \
            -d "{\"datasetId\": \"$ADALINE_DATASET_ID\"}" \
            "https://api.adaline.ai/v2/prompts/${ADALINE_PROMPT_ID}/evaluations")
          EVAL_ID=$(echo "$EVAL_RESP" | jq -r '.id')
          echo "EVAL_ID=$EVAL_ID" >> $GITHUB_ENV
          # Poll GET .../evaluations/$EVAL_ID until status=completed, then fetch results:
          # curl -sS -H "Authorization: Bearer $ADALINE_API_KEY" \
          #   "https://api.adaline.ai/v2/prompts/${ADALINE_PROMPT_ID}/evaluations/${EVAL_ID}/results"
          # Parse per-evaluator scores and set outputs accordingly
          echo "EVAL_HELPFULNESS=0.85" >> $GITHUB_OUTPUT
          echo "EVAL_TONE=0.9" >> $GITHUB_OUTPUT
          echo "EVAL_COST_PASS=1" >> $GITHUB_OUTPUT

      - name: Check thresholds
        run: |
          HELPFULNESS="${{ steps.eval.outputs.EVAL_HELPFULNESS }}"
          TONE="${{ steps.eval.outputs.EVAL_TONE }}"
          COST_PASS="${{ steps.eval.outputs.EVAL_COST_PASS }}"
          THRESHOLD_HELP=0.7
          THRESHOLD_TONE=0.8
          if (( $(echo "$HELPFULNESS < $THRESHOLD_HELP" | bc -l) )) || \
             (( $(echo "$TONE < $THRESHOLD_TONE" | bc -l) )) || \
             [ "$COST_PASS" != "1" ]; then
            echo "::error::Eval failed. Helpfulness=$HELPFULNESS (min $THRESHOLD_HELP), Tone=$TONE (min $THRESHOLD_TONE), Cost pass=$COST_PASS"
            exit 1
          fi

      - name: Checkout repo
        if: success()
        uses: actions/checkout@v4

      - name: Update prompt config JSON
        if: success()
        env:
          DEPLOYMENT_ID: ${{ steps.deployment.outputs.deployment_id }}
        run: |
          jq --arg id "$DEPLOYMENT_ID" '.response_generator.deployment_id = $id' config/prompts.json > tmp && mv tmp config/prompts.json

      - name: Commit and push
        if: success()
        run: |
          git config user.name "github-actions"
          git config user.email "actions@github.com"
          git add config/prompts.json
          git commit -m "chore: promote prompt deployment (evals passed)"
          git push

      - name: Trigger release pipeline
        if: success()
        run: |
          echo "Trigger release (e.g. workflow_dispatch or API call)"
```
The evaluation step above includes placeholder score values. In a production workflow, you would poll GET .../evaluations/$EVAL_ID until status is completed, then parse the evaluation results to extract real per-evaluator scores.
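One way to replace those placeholders is a helper that turns per-evaluator results into step outputs. The results shape (`[{"name": ..., "score": ...}, ...]`) is an assumption; check the Get evaluation results reference for the real field names.

```shell
# Convert an evaluation results array into GitHub Actions step outputs,
# one "EVAL_<NAME>=<score>" line per evaluator. Assumed input shape:
# [{"name":"helpfulness","score":0.85}, ...]
emit_eval_outputs() {
  echo "$1" | jq -r '.[] | "EVAL_\(.name | ascii_upcase)=\(.score)"'
}

# In the workflow step:
# emit_eval_outputs "$RESULTS" >> "$GITHUB_OUTPUT"
```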
When evaluation fails
When the threshold check fails, the job exits with exit 1 and the Checkout, Update prompt config, Commit and push, and Trigger release steps are all skipped. The ::error:: message surfaces in your GitHub Actions dashboard so the team knows exactly which evaluator failed and by how much. Fix the prompt or thresholds in Adaline and redeploy to try again.
GitLab CI
Use a webhook-triggered pipeline or a scheduled job that calls the Adaline API. The same API calls and threshold logic apply — fetch the deployment, run evaluations, check scores, and gate the release.
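A minimal `.gitlab-ci.yml` sketch of this pattern, under stated assumptions: the variable names mirror the GitHub Actions setup on this page and should be set as CI/CD variables, and the trigger rule is illustrative.

```yaml
# Hedged sketch: a trigger- or manually-run job that starts an Adaline
# evaluation and gates on its scores (polling and threshold logic as above).
prompt_gate:
  image: alpine:latest
  before_script:
    - apk add --no-cache curl jq bc
  script:
    - |
      EVAL_RESP=$(curl -sS -X POST \
        -H "Authorization: Bearer $ADALINE_API_KEY" \
        -H "Content-Type: application/json" \
        -d "{\"datasetId\": \"$ADALINE_DATASET_ID\"}" \
        "https://api.adaline.ai/v2/prompts/${ADALINE_PROMPT_ID}/evaluations")
      EVAL_ID=$(echo "$EVAL_RESP" | jq -r '.id')
      # Poll for completion, fetch results, then exit 1 if any evaluator
      # score is below its threshold -- the job failure blocks the release.
  rules:
    - if: '$CI_PIPELINE_SOURCE == "trigger" || $CI_PIPELINE_SOURCE == "web"'
```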
Jenkins
Trigger a Jenkins pipeline via a webhook endpoint or a polling job. Use curl in shell steps to call the Adaline API, then parse evaluation results and set the build status based on threshold checks.
Any CI/CD platform that supports webhooks or scheduled triggers and can make HTTP requests works with Adaline’s evaluation API. The integration pattern is always the same: receive trigger, call API, check scores, gate release.
Next steps
Configure webhooks Set up real-time deployment notifications for your CI pipeline.
Create evaluation API API reference for triggering evaluations programmatically.
Setup evaluators Configure the evaluators that power your CI/CD gate.