linkhut
Bookmarks
tagged with:
  • eval

28 Jan 25

Your dataset is the heart of your LLM eval. To the extent possible, it should closely represent true inputs into your LLM app.

https://www.promptfoo.dev/docs/configuration/datasets/
by ciwchris 11 months ago
Tags:
  • ai
  • prompt
  • eval

20 Jan 25

Tutorial: Evaluate an LLM's prompt completions

https://learn.microsoft.com/en-us/dotnet/ai/tutorials/llm-eval
by ciwchris 11 months ago
Tags:
  • ai
  • eval

12 Dec 24

Evaluation and monitoring metrics for generative AI

https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-metrics-built-in
by ciwchris 1 year ago
Tags:
  • ai
  • eval

11 Dec 24

Task-Specific LLM Evals that Do & Don't Work

https://eugeneyan.com/writing/evals/
by ciwchris 1 year ago
Tags:
  • ai
  • eval
