Starred repositories
PyTorch building blocks for the OLMo ecosystem
NeurIPS'24 DB (Spotlight) | Instruction Tuning Large Language Models to Understand Electronic Health Records
Python toolkit for building graph-enhanced GenAI applications
An interpretable large language model (LLM) for medical diagnosis.
DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue
Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search"
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
OctoTools: An agentic framework with extensible tools for complex reasoning
Training HuggingFace models on EHR data
[NeurIPS '25] Knowledge Graph Generation from Any Text
[ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
[ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models"
[NeurIPS 2024 Datasets and Benchmarks Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning
Code for "DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation" (ACL 2024)
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Medical o1, Towards medical complex reasoning with LLMs
Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models (CRFM) at Stanford for holistic, reproducible and transparen…
[COLM’25] DeepRetrieval — 🔥 Training Search Agent by RLVR with Retrieval Outcome
MedRAX: Medical Reasoning Agent for Chest X-ray - ICML 2025
Minimal reproduction of DeepSeek R1-Zero