- Salesforce Research
- Palo Alto
- https://azshue.github.io/
Stars
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
My learning notes for ML SYS.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Solve puzzles. Improve your PyTorch.
Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Minimal reproduction of DeepSeek R1-Zero
[EMNLP-2024] Build multimodal language agents for fast prototyping and production
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
An instruction data generation system for multimodal language models.
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
🍃 MINT-1T: A one trillion token multimodal interleaved dataset.
[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)
Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives
An open-source framework for training large multimodal models.
LAVIS - A One-stop Library for Language-Vision Intelligence
A Next-Generation Training Engine Built for Ultra-Large MoE Models
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Ongoing research training transformer models at scale
Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
GLIDE: a diffusion-based text-conditional image synthesis model
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
A framework for few-shot evaluation of language models.