caizer0x commented Apr 28, 2025

Added 85 QA pairs in a Jupyter notebook.

Performance:

GPT-4.1
Total questions: 85
Total time: 49.63 seconds, Average time per question: 0.58 seconds
Average score: 0.18

Inkeep
Total questions: 85
Total time: 67.78 seconds, Average time per question: 0.80 seconds
Average score: 0.52

`Inkeep performed worse on the gill and litesvm questions; possibly missing docs?`

# 20 randomized questions
Inkeep RAG + GPT-4.1
Total questions: 20
Total time: 87.67 seconds, Average time per question: 4.38 seconds
Average score: 0.60

Inkeep expert
Total questions: 20
Total time: 59.92 seconds, Average time per question: 3.00 seconds
Average score: 0.60

GPT-4.1
Total questions: 20
Total time: 45.33 seconds, Average time per question: 2.27 seconds
Average score: 0.15
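
For readers who want to reproduce the summary lines above, here is a minimal sketch of how such a block could be computed. It assumes each question yields one numeric score and that total time is measured as wall-clock seconds. The name `summarize` is hypothetical and is not taken from the notebook in this PR.

```python
from statistics import mean


def summarize(scores: list[float], total_seconds: float) -> str:
    """Format an eval summary in the same shape as the results above.

    Hypothetical helper: assumes one score per question and a
    wall-clock total for the whole run.
    """
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    return (
        f"Total questions: {n}\n"
        f"Total time: {total_seconds:.2f} seconds, "
        f"Average time per question: {total_seconds / n:.2f} seconds\n"
        f"Average score: {mean(scores):.2f}"
    )


if __name__ == "__main__":
    # Illustrative inputs only, not data from this PR.
    print(summarize([1.0, 0.0, 0.5, 0.5], 10.0))
```

The per-question averages reported above are consistent with this formula, for example 49.63 / 85 rounds to 0.58.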

Also added an MCP tool-selection eval.
Performance: 70/85 correct tool usage (about 82%)

Added LangSmith for observability.
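
A tool-selection score like the one above can be computed by comparing the tool the model picked against the expected tool for each question. The sketch below assumes exact string matching on tool names; `tool_accuracy` and the example tool names are hypothetical, and the actual eval in this PR may score differently.

```python
def tool_accuracy(expected: list[str], picked: list[str]) -> tuple[int, int]:
    """Return (correct, total) for a tool-selection eval.

    Hypothetical helper: a pick counts as correct only when the tool
    name matches the expected name exactly.
    """
    if len(expected) != len(picked):
        raise ValueError("expected and picked must have the same length")
    correct = sum(e == p for e, p in zip(expected, picked))
    return correct, len(expected)


if __name__ == "__main__":
    # Made-up tool names for illustration only.
    correct, total = tool_accuracy(
        ["search_docs", "get_account", "search_docs"],
        ["search_docs", "search_docs", "search_docs"],
    )
    print(f"Performance: {correct}/{total} correct tool usage")
```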

vercel bot commented Apr 28, 2025

@caizer0x is attempting to deploy a commit to the Solana Foundation team on Vercel, but is not a member of this team. To resolve this issue, you can:

  • Make your repository public. Collaboration is free for open source and public repositories.
  • Add @caizer0x as a member. A Pro subscription is required to access Vercel's collaborative features.
    • If you're the owner of the team, click here and add @caizer0x as a member.
    • If you're the user who initiated this build request, click here to request access.
    • If you're already a member of the Solana Foundation team, make sure that your Vercel account is connected to your GitHub account.

To read more about collaboration on Vercel, click here.
