Stars
We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference…
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…
An unofficial, typed, asynchronous Python SDK for Tastytrade!
Let Claude manage your tastytrade portfolio.
This repository contains the toolkit for replicating results from our technical report.
"what, how, where, and how well? a survey on test-time scaling in large language models" repository
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
TradingAgents: Multi-Agents LLM Financial Trading Framework
"AI-Trader: 100% Fully-Automated Agent-Native Trading"
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to production implementation. Qlib supports diverse ML modeling paradigms, i…
Implement a reasoning LLM in PyTorch from scratch, step by step
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
[arXiv 25] OCRGenBench: A Comprehensive Benchmark for Evaluating OCR Generative Capabilities
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
A search engine dedicated to CS conferences. It provides useful filters for conferences and year range.
This is an open-source toolkit for Heterogeneous Graph Neural Networks (OpenHGNN) based on DGL.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.5, GPT-OSS, Llama, and more!
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Implementation of Reinforcement Pre-Training (RPT) for Language Models - arXiv:2506.08007
Repo for paper "On The Design Choices of Next Level LLMs"
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
LLaMA 2 implemented from scratch in PyTorch
The simplest, fastest repository for training/finetuning medium-sized GPTs.