Alibaba Cloud Intelligence Group, Hangzhou, China
Stars
GitHub Pages template based on HTML and Markdown for personal, portfolio-based websites.
[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Fast and memory-efficient exact k-means
🚀 Efficient implementations of emerging model architectures
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models, including LLMs, VLMs, and video generative models.
SGLang is a high-performance serving framework for large language models and multimodal models.
STEP-GUI: The top GUI agent solution in the galaxy. Developed by the StepFun-GELab team and powered by StepFun’s cutting-edge research capabilities.
VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs
Official implementation of "Pyramid Texture Filtering"
Archer2.0 builds on its predecessor by introducing ASPO, which addresses fundamental PPO-Clip limitations to prevent premature convergence and unlock greater RL potential.
[EMNLP 2025 Main] Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models
Official repository of the paper "QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression".
🔥 Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.
[NeurIPS 2025] HoliTom: Holistic Token Merging for Fast Video Large Language Models
Code for the paper "Optimizing Length Compression in Large Reasoning Models"
Open-source code for TokenCarve.
[NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
[ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression