@savelee · Last active June 10, 2024
LLM rating criteria
| Criterion | Description | Rating Scale (1-5) |
| --- | --- | --- |
| Relevance | How well does the response directly address the prompt and answer the question or fulfill the request? | 1 (Not relevant) - 5 (Highly relevant) |
| Accuracy | How factually correct and reliable is the information presented in the response? | 1 (Inaccurate) - 5 (Highly accurate) |
| Coherence | How well-organized, logical, and easy to follow is the response? | 1 (Incoherent) - 5 (Highly coherent) |
| Conciseness | How clear and to the point is the response, without being overly verbose or redundant? | 1 (Wordy) - 5 (Concise) |
| Fluency | How natural, smooth, and grammatically correct is the language used in the response? | 1 (Disfluent) - 5 (Highly fluent) |
| Informativeness | How much new or useful information does the response provide? | 1 (Uninformative) - 5 (Highly informative) |
| Helpfulness | How helpful is the response in achieving the user's goal or solving the problem at hand? | 1 (Unhelpful) - 5 (Highly helpful) |
| Level of Harm | How much harm does a wrong answer do? | 1 (No harm) - 5 (Very harmful) |
| Creativity | How original, innovative, or unique is the response? | 1 (Uncreative) - 5 (Highly creative) |
| Overall Quality | Taking all criteria into account, what is your overall assessment of the response? | 1 (Poor) - 5 (Excellent) |
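As a minimal sketch of putting the rubric to use, the snippet below validates a set of ratings against the criteria above and computes an unweighted mean. The equal weighting and the `validate`/`overall` helper names are assumptions for illustration; the gist itself does not specify an aggregation method.

```python
# Criterion names taken from the rubric above; "Overall Quality" is excluded
# because it is itself a summary judgment, not an input.
CRITERIA = [
    "Relevance", "Accuracy", "Coherence", "Conciseness", "Fluency",
    "Informativeness", "Helpfulness", "Level of Harm", "Creativity",
]

def validate(ratings: dict) -> None:
    """Raise if any criterion is missing or its score is outside 1-5."""
    for criterion in CRITERIA:
        score = ratings.get(criterion)
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"{criterion}: expected an integer 1-5, got {score!r}")

def overall(ratings: dict) -> float:
    """Unweighted mean across all criteria (an assumed aggregation)."""
    validate(ratings)
    return sum(ratings[c] for c in CRITERIA) / len(CRITERIA)
```

Note that "Level of Harm" runs in the opposite direction from the other scales (5 is worst, not best), so a real aggregation would likely invert or exclude that score rather than average it in directly.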