    Repositories list

    • DECO

      Public
      Source code for paper "DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices".
      Python
      Other
      0100 Updated May 12, 2026
    • OPD

      Public
      Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
      Python
      2043020 Updated May 12, 2026
    • An LLM-based agent that predicts its tasks proactively.
      Python
      Apache License 2.0
      5860360 Updated May 12, 2026
    • LexRel

      Public
      Python
      0100 Updated May 7, 2026
    • CPMobius

      Public
      Python
      Apache License 2.0
      0110 Updated Apr 29, 2026
    • JustRL

      Public
      [ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
      Python
      1227100 Updated Apr 18, 2026
    • NOSA

      Public
      The official implementation of NOSA
      Python
      MIT License
      01700 Updated Apr 15, 2026
    • Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts
      Python
      23620 Updated Apr 9, 2026
    • APB

      Public
      Official Implementation of APB (ACL 2025 main Oral) and Spava (ACL 2026 main).
      C++
      53700 Updated Apr 6, 2026
    • KARL

      Public
      KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
      Python
      MIT License
      16810 Updated Apr 5, 2026
    • LexChain

      Public
      Python
      0410 Updated Mar 25, 2026
    • SE-Bench

      Public
      Official repo for "SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization"
      Python
      MIT License
      42750 Updated Mar 24, 2026
    • Python
      Apache License 2.0
      6387500 Updated Mar 5, 2026
    • ACDiT

      Public
      ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer
      Python
      MIT License
      14220 Updated Jan 29, 2026
    • Official implementation for the paper "KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs"
      Python
      12200 Updated Jan 18, 2026
    • H-Neurons

      Public
      The official implementation of the paper: H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs
      Python
      MIT License
      116210 Updated Jan 14, 2026
    • BlockFFN

      Public
      Source codes for paper "BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity".
      Python
      51900 Updated Jan 10, 2026
    • LLaVA-UHD

      Public
      LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs
      Python
      Apache License 2.0
      2142270 Updated Dec 20, 2025
    • [ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation
      Python
      57920 Updated Dec 8, 2025
    • StateX

      Public
      The official implementation of the paper "StateX: Enhancing RNN Recall via Post-training State Expansion".
      Python
      0300 Updated Oct 24, 2025
    • AgentRM

      Public
      [ACL 2025 main] AgentRM: Enhancing Agent Generalization with Reward Modeling
      Python
      0610 Updated Sep 29, 2025
    • The code of the paper Stuffed Mamba: Oversized States Lead to the Inability to Forget
      Python
      0100 Updated Sep 28, 2025
    • BurstEngine is an efficient framework designed to train LLMs on long-sequence data.
      Python
      3900 Updated Sep 25, 2025
    • The code for the paper "Cost-Optimal Grouped-Query Attention for Long-Context Modeling"
      Python
      1410 Updated Sep 14, 2025
    • SIR-Bench

      Public
      Python
      Apache License 2.0
      0510 Updated Sep 12, 2025
    • Seq1F1B

      Public
      Sequence-level 1F1B schedule for LLMs.
      Python
      Other
      4k3710 Updated Aug 26, 2025
    • FR-Spec

      Public
      [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling
      C++
      35430 Updated Jul 15, 2025
    • TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
      Python
      Apache License 2.0
      1412841 Updated Jun 14, 2025
    • Python
      01010 Updated Jun 11, 2025
    • Must-read Papers on Textual Adversarial Attack and Defense
      Python
      MIT License
      1941.6k31 Updated Jun 4, 2025