Popular repositories
benchmarking-eng-takehome Public
Starter Code for Benchmarking Engineer Takehome
CSS 1
Repositories
Showing 10 of 10 repositories
- model-library Public
- finance-agent Public
- ioi-agent Public
- corp-fin-vals-sdk Public
- med-alignment Public
- petri Public, forked from safety-research/petri
  Co-opting Anthropic's red-teaming tool for therapy benchmarking
- SWE-agent Public archive, forked from SWE-agent/SWE-agent
  SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
- valsai-github-action Public
People
This organization has no public members. You must be a member to see who’s a part of this organization.