Starred repositories
An easy-to-set-up, easy-to-use packaging of "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model" as a Docker image that runs on GPU as well as CPU.
Simulator and scenario evaluation notebooks for "Diagnosing ML-Driven Queue Systems in Production" — ICDM 2026 Applied Track
YaFF is a high-performance C++ serialization library that provides a zero-copy wire format for the Protobuf ecosystem.
A PyTorch framework for training transformer language models with Mixture of Experts (MoE) architecture support, Mixture of Depths (MoD), and DeepSpeed integration. Implements models from 70M to 30…
RecStore: High-performance parameter storage for large-scale recommendation models, unifying heterogeneous memory as a scalable embedding pool.
Full fine-tuning project for Qwen3-VL-2B with HF parquet-to-JSON conversion and post-finetune evaluation
Experimental GPT-2-scale (~124M parameters) LLM trained from scratch on 22B tokens of the Cosmopedia dataset. Includes the full training pipeline, with SFT fine-tuning and log analysis tools with backen…
TOML-driven diffusion training on Linux: DeepSpeed, LoRA/LoKr/full finetune (SDXL, Cosmos Predict2), optional web UI — rengu CLI
An easy-to-configure and extensible veRL extension for agent RL training with skill co-evolution.
DPP + Sliding Window + GPU. A DPP diversity recommendation algorithm, accelerated with a sliding window and GPU.
AI agents for trading
Build the product: first Claude call → tools → evals → cost engineering → MCP. The 15-minute founder workshop.
Winner 🏆 (Agent-only) MLSys 2026 - FlashInfer AI Kernel Generation Contest for the DeepSeek Sparse Attention (DSA) track with an average speedup of 34.93x
This repository collects recent papers (RecSys, SIGIR, WWW, etc.) related to recommender systems.
Causal ML project using the public Hillstrom marketing dataset. The aim is to find the best intervention for each customer: whether to send a promotional email (and if so, which one) in order to maximise t…
AI-powered Resume Screening & Intelligent Shortlisting system with multi-signal ranking (SBERT, TF-IDF, Cross-Encoder, Learning-to-Rank), PDF upload, skill-based explainability, and a glassmorphic …
A generalist autonomous research agent — runs experiments, researches, and iteratively optimizes, autonomously.
Official source code for SIGIR 2026 paper: Fusion and Alignment Enhancement with Large Language Models for Tail-item Sequential Recommendation