Starred repositories
MoCo: A One-Stop Shop for Model Collaboration Research
Manifold-Constrained Hyper-Connections with fused Triton kernels for efficient training
Official MLCC implementation for the paper “Compress, Cross, and Scale: Multi-Level Compression Cross Networks for Efficient Scaling in Recommender Systems”.
A Flexible Framework for Generative Recommendation
IntTravel: A Real-World Dataset and Generative Framework for Integrated Multi-Task Travel Recommendation
LLM Inference with Deep Learning Accelerator.
Scaling laws for different architectures have different slopes
Total Electron Content Prediction using Multi-modal Large Language Model - A hybrid deep learning model combining Graph Neural Networks, Temporal CNNs, and LLMs with LoRA fine-tuning for ionosphere…
PGTS with the SPAWN operation to enhance Qwen3b's reasoning in math domains
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
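[NeurIPS 2024] Official code for the paper "Automated Multi-level Preference for MLLMs"
A sparse attention kernel supporting mixed sparse patterns
LLM knowledge sharing that everyone can understand: a must-read before spring/autumn recruitment LLM interviews, so you can talk confidently with your interviewer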
NoMIRACL: A multilingual hallucination evaluation dataset to evaluate LLM robustness in RAG against first-stage retrieval errors on 18 languages.
A Qwen contrastive learning test based on JAX and EasyDel, intended for recommender systems.
Ring attention implementation with flash attention
A list of demo websites for automatic music generation research
Unveiling Inference Scaling for Difference-Aware User Modeling in LLM Personalization
Fast hierarchical embedding cache for recommenders
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
Official codebase for CuGRO: Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay