- Google DeepMind
- https://yaqingwang.github.io/
- @Yaqing_Wang
Starred repositories
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
✨✨Latest Advances on Multimodal Large Language Models
An MBTI Exploration of Large Language Models
Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
Code and datasets for "Character-LLM: A Trainable Agent for Role-Playing"
Robust recipes to align language models with human and AI preferences
Paper List of Pre-trained Foundation Recommender Models
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
DeepSeek Coder: Let the Code Write Itself
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
ChatArena (or Chat Arena) is a library of multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
Reverse Instructions to generate instruction tuning data with corpus examples
A collection of awesome-prompt-datasets and awesome-instruction-dataset resources for training ChatLLM models such as ChatGPT. It gathers a wide variety of instruction datasets for training ChatLLM models.
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
Reference implementation for DPO (Direct Preference Optimization)
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks