SuperCB

🏠

Working from home

CuiBo SuperCB

Learning Machine Learning System

50 followers · 178 following

rednote-hilab
Beijing

Achievements

Lists (1)

MLsys

Starred repositories

dautroc / flash-vscode

Flash nvim for VSCode

TypeScript 8 2 Updated Dec 12, 2025

NVlabs / GatedDeltaNet

[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule

Python 450 25 Updated Sep 15, 2025

hhy-huang / HiRAG

[EMNLP'25 findings] This is the official repo for the paper, HiRAG: Retrieval-Augmented Generation with Hierarchical Knowledge.

Python 511 78 Updated Nov 19, 2025

getzep / graphiti

Build Real-Time Knowledge Graphs for AI Agents

Python 22,584 2,226 Updated Feb 6, 2026

ImprintLab / Medical-Graph-RAG

A Graph RAG System for Evidence-based Medical Information Retrieval [ACL 2025]

Python 723 121 Updated Oct 18, 2025

emmericp / ixy

A simple yet fast user space network driver for Intel 10 Gbit/s NICs written from scratch

C 1,287 138 Updated Feb 19, 2022

HW-whistleblower / True-Story-of-Pangu

The true story of the bitterness and darkness behind the development of the Noah Pangu large model.

11,388 1,341 Updated Jul 9, 2025

NoakLiu / PiKV

PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]

Python 48 7 Updated Oct 19, 2025

bytedance / InfiniStore

KV cache store for distributed LLM inference

C++ 390 34 Updated Nov 13, 2025

Victarry / PP-Schedule-Visualization

Pipeline Parallelism Emulation and Visualization

Python 77 9 Updated Jan 8, 2026

policy-gradient / GRPO-Zero

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,761 87 Updated Apr 18, 2025

AnthonyCalandra / modern-cpp-features

A cheatsheet of modern C++ language and library features.

21,436 2,257 Updated Apr 5, 2025

cunbidun / flash.vscode

Flash VSCode is a minimal port of the flash.nvim Neovim plugin

TypeScript 10 1 Updated May 3, 2025

McGill-NLP / nano-aha-moment

Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"

Jupyter Notebook 591 53 Updated Oct 7, 2025

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python 12,703 1,551 Updated Apr 24, 2025

codefuse-ai / CodeFuse-Embeddings

Python 281 46 Updated Jan 1, 2026

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,453 981 Updated Feb 6, 2026

MoonshotAI / MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 2,042 130 Updated Apr 3, 2025

yosefk / funtrace

A fast, small C/C++ function call tracer for x86-64/Linux, supports clang & gcc, ftrace, threads, exceptions & shared libraries

C++ 194 2 Updated Mar 25, 2025

slow-steppers / NeighborHash

A faster int-to-int hashmap implemented in C++.

C++ 50 9 Updated Jan 6, 2025

codefuse-ai / RepoFuse

Python 65 5 Updated Jan 16, 2025

glb400 / Toy-RecLM

A toy large model for recommender systems based on LLaMA2, SASRec, and Meta's generative recommenders. Also includes notes and experiments on the official implementation of Meta's generative recommenders.

Python 68 6 Updated Apr 25, 2024

fenbf / AwesomePerfCpp

A curated list of awesome C/C++ performance optimization resources: talks, articles, books, libraries, tools, sites, blogs. Inspired by awesome.

CSS 2,518 261 Updated Sep 22, 2022

uccl-project / uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,205 119 Updated Feb 6, 2026

aorwall / moatless-tree-search

Python 132 28 Updated Jun 6, 2025

k4black / codebleu

Pip compatible CodeBLEU metric implementation available for linux/macos/win

Python 130 28 Updated Mar 31, 2025

agiresearch / AIOS

AIOS: AI Agent Operating System

Python 5,036 674 Updated Jan 22, 2026

ashvardanian / less_slow.cpp

Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO

C++ 1,899 83 Updated Dec 23, 2025

wuye9036 / CppTemplateTutorial

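A Chinese-language tutorial on C++ templates. Unlike the well-known book C++ Templates, this series teaches C++ templates as a Turing-complete language, aiming to give readers a thorough grasp of metaprogramming. (Under construction)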

C++ 10,515 1,622 Updated Aug 20, 2024

FSoft-AI4Code / RepoExec

[NAACL 2025] Benchmark for Repository-Level Code Generation, focus on Executability, Correctness from Test Cases and Usage of Contexts from Cross-file Dependencies

Python 41 4 Updated Jan 8, 2026

Starred topics

anomaly-detection

polyhedral-model

embedded-machine-learning

Compiler

Emulator

Database