linkhut
Bookmarks tagged with: eval
2 days ago
Testing Agent Skills Systematically with Evals
https://developers.openai.com/blog/eval-skills
by shubxam, 2 days ago
Tags: llm, eval, blog-post, tutorial
28 Jan 25
Your dataset is the heart of your LLM eval. To the extent possible, it should closely represent true inputs into your LLM app.
https://www.promptfoo.dev/docs/configuration/datasets/
by ciwchris, 1 year ago
Tags: ai, prompt, eval
20 Jan 25
Tutorial: Evaluate an LLM's prompt completions
https://learn.microsoft.com/en-us/dotnet/ai/tutorials/llm-eval
by ciwchris, 1 year ago
Tags: ai, eval
12 Dec 24
Evaluation and monitoring metrics for generative AI
https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-metrics-built-in
by ciwchris, 1 year ago
Tags: ai, eval
11 Dec 24
Task-Specific LLM Evals that Do & Don't Work
https://eugeneyan.com/writing/evals/
by ciwchris, 1 year ago
Tags: ai, eval