Stars
KodCode-AI / code-r1
Forked from ganler/code-r1
Reproducing R1 for Code with Reliable Rewards
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
DataSciBench: An LLM Agent Benchmark for Data Science
Official PyTorch implementation for "Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data" (ICLR 2025)
[AAAI'2025] Official PyTorch implementation of the paper "Identity-Text Video Corpus Grounding".
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
A curated list of awesome Deep Reinforcement Learning resources.
OpenChat: Advancing Open-source Language Models with Imperfect Data
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.
Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
Python Implementation of Reinforcement Learning: An Introduction
Measuring Massive Multitask Chinese Understanding
CTF framework and exploit development library
Playing Hollow Knight with reinforcement learning.
Instruction Tuning with GPT-4
Existing Literature about Machine Unlearning
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
A modular RL library to fine-tune language models to human preferences