- Alibaba
- Beijing, China
- https://sites.google.com/site/jhdubjtu/home
Starred repositories
Ghostty-based macOS terminal with vertical tabs and notifications for AI coding agents
An open-source, extensible AI agent that goes beyond code suggestions: install, execute, edit, and test with any LLM
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures
Post-training with Tinker
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." — Andrej Karpathy. A frontier, first-principles handbook inspi…
Awesome Deep Learning papers for industrial Search, Recommendation and Advertisement, focusing on Embedding, Matching, Pre-Ranking, Ranking (CTR/CVR prediction), Post-Ranking, Relevance, LLM, Rei…
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Awesome Unified Multimodal Models
Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".
This repo contains the code for a 1D tokenizer and generator
An easy-to-use framework for large scale recommendation algorithms.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Solve Visual Understanding with Reinforced VLMs
ReLE Evaluation: a capability benchmark for Chinese AI large models (continuously updated). It currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, …
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
A PyTorch Library for Multi-Task Learning
A playbook for effectively prompting post-trained LLMs
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
[ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
🍃 MINT-1T: A one trillion token multimodal interleaved dataset.
Training MLPs on Graphs without Supervision, WSDM 25
[ICLR 2025] The first multimodal search engine pipeline and benchmark for LMMs
TextGrad: Automatic "Differentiation" via Text — using large language models to backpropagate textual gradients. Published in Nature.
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)
OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24)
Eagle: Frontier Vision-Language Models with Data-Centric Strategies
🚀 Efficient implementations for emerging model architectures