SuperCB
🏠
Working from home
  • rednote-hilab
  • Beijing

Starred repositories

[EMNLP'25 findings] This is the official repo for the paper, HiRAG: Retrieval-Augmented Generation with Hierarchical Knowledge.

Python 445 60 Updated Sep 28, 2025

Build Real-Time Knowledge Graphs for AI Agents

Python 19,799 1,866 Updated Nov 5, 2025

A Graph RAG System for Evidence-based Medical Information Retrieval [ACL 2025]

Python 633 107 Updated Oct 18, 2025

A simple yet fast user space network driver for Intel 10 Gbit/s NICs written from scratch

C 1,276 136 Updated Feb 19, 2022

The real story of the hardship and darkness behind the development of the Noah Pangu large model.

11,382 1,367 Updated Jul 9, 2025

PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]

Python 42 6 Updated Oct 19, 2025

KV cache store for distributed LLM inference

C++ 355 30 Updated Sep 10, 2025

Pipeline Parallelism Emulation and Visualization

Python 70 5 Updated Jun 12, 2025

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,651 76 Updated Apr 18, 2025

A cheatsheet of modern C++ language and library features.

21,208 2,237 Updated Apr 5, 2025

Flash VSCode is a minimal port of the flash.nvim Neovim plugin

TypeScript 10 1 Updated May 3, 2025

Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"

Jupyter Notebook 552 51 Updated Oct 7, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 12,355 1,523 Updated Apr 24, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,842 896 Updated Sep 30, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,950 116 Updated Apr 3, 2025

A fast, small C/C++ function call tracer for x86-64/Linux, supports clang & gcc, ftrace, threads, exceptions & shared libraries

C++ 189 2 Updated Mar 25, 2025

A faster int-to-int hashmap implemented in C++.

C++ 49 8 Updated Jan 6, 2025

Python 62 5 Updated Jan 16, 2025

A toy large model for recommender system based on LLaMA2/SASRec/Meta's generative recommenders. Besides, note and experiments of official implementation for Meta's generative recommenders.

Python 65 6 Updated Apr 25, 2024

A curated list of awesome C/C++ performance optimization resources: talks, articles, books, libraries, tools, sites, blogs. Inspired by awesome.

CSS 2,480 260 Updated Sep 22, 2022

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 822 73 Updated Nov 5, 2025

Pip compatible CodeBLEU metric implementation available for linux/macos/win

Python 117 26 Updated Mar 31, 2025

AIOS: AI Agent Operating System

Python 4,759 609 Updated Oct 25, 2025

Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO

C++ 1,870 75 Updated Sep 10, 2025

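A Chinese-language tutorial on C++ templates. Unlike the well-known book C++ Templates, this series teaches C++ templates as a Turing-complete language, aiming to help readers gain a thorough command of metaprogramming. (Work in progress)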

C++ 10,442 1,618 Updated Aug 20, 2024

[NAACL 2025] Benchmark for Repository-Level Code Generation, focus on Executability, Correctness from Test Cases and Usage of Contexts from Cross-file Dependencies

Python 34 3 Updated Mar 7, 2025

🚴 Call stack profiler for Python. Shows you why your code is slow!

Python 7,459 254 Updated Nov 3, 2025

Python 33 2 Updated Jun 5, 2025