Evaluation workflow overview
Evaluation for GenAI output is inherently challenging because GenAI model responses are varied and context-dependent.
Onboard evaluation artifacts
Evaluation for GenAI output begins with onboarding, where you define the key elements of the benchmark.
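The onboarding step can be sketched as defining the benchmark's artifacts in code. This is an illustrative sketch, not a specific library's API: the `EvalCase`, `Benchmark`, and `must_include` names are hypothetical stand-ins for test prompts, expected qualities, and a pass threshold.

```python
from dataclasses import dataclass, field

# Hypothetical artifact definitions for a benchmark: test prompts,
# required answer content, and a pass threshold. All names are
# illustrative, not from any specific evaluation framework.

@dataclass
class EvalCase:
    prompt: str         # input sent to the GenAI model
    must_include: list  # substrings a good answer should contain

@dataclass
class Benchmark:
    name: str
    cases: list = field(default_factory=list)
    pass_threshold: float = 0.8  # fraction of criteria a response must meet

benchmark = Benchmark(
    name="support-bot-v1",
    cases=[
        EvalCase("What is your refund policy?", ["30 days", "receipt"]),
        EvalCase("How do I reset my password?", ["reset link", "email"]),
    ],
)
print(len(benchmark.cases))  # 2
```

Keeping cases, criteria, and thresholds in one declarative structure makes the benchmark versionable alongside the system it evaluates.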
Run an evaluation benchmark
Once you've onboarded the artifacts, run the benchmark against the GenAI model to establish a baseline.
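A benchmark run can be sketched as a loop that sends each test prompt to the model, scores the response against its criteria, and aggregates a pass rate. This is a minimal sketch under stated assumptions: `generate` is a canned stand-in for your real GenAI endpoint, and the substring-based `score_case` is one simple scoring rule among many.

```python
# Stand-in for a call to the GenAI model under evaluation.
def generate(prompt: str) -> str:
    canned = {
        "What is your refund policy?": "Refunds within 30 days with a receipt.",
        "How do I reset my password?": "We will email you a reset link.",
    }
    return canned.get(prompt, "")

def score_case(response: str, must_include: list) -> float:
    # Score = fraction of required substrings present in the response.
    if not must_include:
        return 1.0
    hits = sum(1 for needle in must_include if needle.lower() in response.lower())
    return hits / len(must_include)

cases = [
    ("What is your refund policy?", ["30 days", "receipt"]),
    ("How do I reset my password?", ["reset link", "email"]),
]

scores = [score_case(generate(prompt), required) for prompt, required in cases]
pass_rate = sum(s >= 0.8 for s in scores) / len(scores)
print(f"pass rate: {pass_rate:.0%}")  # pass rate: 100%
```

The aggregate pass rate from this first run is the baseline that later refinements are measured against.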
Refine evaluation benchmark
After running the initial evaluation, you may need to refine it. This step is iterative, with the end goal of having a benchmark that fully aligns with business goals.
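One way to drive the refinement loop is to compare the benchmark's automated verdicts against a small set of human labels and flag disagreements, which point at criteria or thresholds that need revising. The scores and labels below are illustrative placeholders, not real data.

```python
# Illustrative automated scores and human pass/fail labels per case.
auto_scores = {"case-1": 0.9, "case-2": 0.5, "case-3": 0.95, "case-4": 0.4}
human_pass = {"case-1": True, "case-2": True, "case-3": True, "case-4": False}

threshold = 0.8

# Cases where the automated verdict contradicts the human label are
# candidates for revised criteria, a new threshold, or a better scorer.
disagreements = [
    case for case, score in auto_scores.items()
    if (score >= threshold) != human_pass[case]
]
agreement = 1 - len(disagreements) / len(auto_scores)

print(disagreements)            # ['case-2']
print(f"agreement: {agreement:.0%}")  # agreement: 75%
```

When agreement stops improving across iterations, the benchmark is aligned well enough to trust its verdicts.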
Refine GenAI system
The end goal for GenAI output evaluation is to use the insights to refine your LLM system until it is production-worthy and meets your criteria.
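The refinement loop above can be closed by re-running the same benchmark after each system change (a prompt tweak, a model swap) and comparing pass rates, so improvements and regressions surface before release. The per-case scores here are illustrative placeholders.

```python
# Illustrative per-case scores from two runs of the same benchmark:
# the current production system and a candidate revision.
baseline = {"case-1": 0.9, "case-2": 0.6, "case-3": 0.7}
candidate = {"case-1": 0.95, "case-2": 0.85, "case-3": 0.65}

threshold = 0.8

def pass_rate(scores: dict) -> float:
    return sum(s >= threshold for s in scores.values()) / len(scores)

# Cases that passed before but fail now are regressions to investigate.
regressions = [c for c in baseline
               if baseline[c] >= threshold and candidate[c] < threshold]

print(f"baseline {pass_rate(baseline):.0%} -> candidate {pass_rate(candidate):.0%}")
print(regressions)  # []
```

Ship the candidate only when its pass rate meets your production bar and the regression list is empty.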