LLM-as-a-Judge is the most versatile evaluator in Adaline. It uses an LLM to assess your prompt outputs against a custom rubric that you define. This evaluator excels at qualitative assessment, where nuanced judgment matters more than simple metrics.
Setting up the rubric
Select the LLM as a Judge evaluator from the “Add evaluator” action menu.
Define Your Rubric
This step determines the quality of your evaluation. Your rubric should be specific, actionable, and aligned with your success metrics.
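Conceptually, the evaluator combines your rubric with the output under evaluation into a single judging prompt. Below is a rough sketch of that assembly step; the helper name and prompt layout are illustrative assumptions, not Adaline's actual implementation, which handles this for you when you select the evaluator.

```python
# Hypothetical sketch: how a rubric and a prompt output might be combined
# into a single judging prompt. Adaline does this internally; names here
# are illustrative, not part of Adaline's API.

def build_judge_prompt(rubric: str, output: str) -> str:
    """Combine a scoring rubric with the prompt output under evaluation."""
    return (
        f"{rubric}\n\n"
        f"Response to evaluate:\n{output}\n\n"
        "Return a score and a brief justification."
    )

rubric = (
    "Scoring Scale (1-4):\n"
    "4 - Excellent: Completely resolves the issue\n"
    "1 - Poor: Fails to address the issue"
)
print(build_judge_prompt(rubric, "Thanks for reaching out! Here is how to fix it..."))
```

The key design point is that the rubric travels verbatim inside the judge's prompt, so anything vague in the rubric becomes vague in the judgment.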
Examples of rubrics
Below are some example custom rubrics to get you started.
Customer Support Response Quality
Evaluating chatbot responses for accuracy and user satisfaction
Evaluate this customer support response using the following criteria:
Scoring Scale (1-4):
4 - Excellent: Completely resolves the issue, professional tone, anticipates follow-up needs
3 - Good: Addresses the main concern clearly and professionally
2 - Fair: Partially helpful but missing key information or context
1 - Poor: Fails to address the issue or uses inappropriate tone
Evaluation Factors:
- Problem resolution completeness
- Professional communication standards
- Information accuracy
- User experience quality
Provide a score and brief justification for your assessment.
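Because the rubric asks the judge for a score plus a justification, the reply has to be parsed back into structured data. The snippet below is a minimal sketch of that parsing step, assuming the judge leads with something like "Score: 3"; the function name and expected reply format are assumptions for illustration, not Adaline's internals.

```python
import re

def parse_judge_output(text: str) -> tuple[int, str]:
    """Extract a numeric score and justification from a judge reply.

    Assumes the reply contains something like 'Score: 3' followed by
    the justification text (an illustrative convention, not a spec).
    """
    match = re.search(r"\bscore\s*[:\-]?\s*([1-5])\b", text, re.IGNORECASE)
    if match is None:
        raise ValueError("no score found in judge output")
    score = int(match.group(1))
    # Everything after the score, minus leading/trailing punctuation.
    justification = text[match.end():].strip(" .:\n-")
    return score, justification

score, why = parse_judge_output("Score: 3 - Addresses the main concern clearly.")
print(score, why)  # prints: 3 Addresses the main concern clearly
```

In practice, asking the judge for a fixed output shape (e.g. "Score: N" on the first line) makes this parsing far more reliable than free-form replies.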
Content Marketing Effectiveness
Assessing blog content for engagement and value delivery
Rate this content piece on effectiveness for our target audience (1-5):
5 - Outstanding: Highly engaging, actionable insights, clear value proposition
4 - Strong: Good engagement with solid practical value
3 - Adequate: Informative but limited engagement or actionability
2 - Weak: Basic information with minimal practical value
1 - Poor: Lacks clarity, value, or relevance to target audience
Consider these dimensions:
- Audience alignment and relevance
- Practical value and actionability
- Engagement potential
- Brand positioning effectiveness
Product Feature Documentation
Evaluating technical documentation for clarity and completeness
Assess this feature documentation quality (1-4):
4 - Comprehensive: Clear explanation, complete coverage, excellent user guidance
3 - Good: Well-explained with adequate detail and guidance
2 - Acceptable: Basic explanation but missing important details or clarity
1 - Inadequate: Confusing, incomplete, or lacks necessary user guidance
Evaluation Areas:
- Technical accuracy and completeness
- User comprehension and clarity
- Implementation guidance quality
- Overall user experience
Brand Voice Consistency
Maintaining consistent brand communication across channels
Evaluate brand voice alignment (1-3 scale):
3 - Excellent Alignment: Perfect adherence to brand guidelines, authentic voice
2 - Good Alignment: Generally consistent with minor deviations
1 - Poor Alignment: Inconsistent with established brand voice
Assessment Criteria:
- Tone consistency with brand guidelines
- Language and terminology alignment
- Audience appropriateness
- Brand personality expression