Stars
AI agents automatically running research on single-GPU nanochat training
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
Asterinas aims to be a production-grade Linux alternative—memory safe, high-performance, and more.
Provides pre-built flash-attention package wheels for Linux and Windows, built using GitHub Actions
Manage your dotfiles across multiple diverse machines, securely.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
A scheduling framework for multitasking over diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs
FalconFS is a high-performance distributed file system (DFS) designed for AI workloads.
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU [to appear in SIGMOD'26]
A low-latency, billion-scale, and updatable graph-based vector store on SSD.
PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]
Distributed KV cache scheduling & offloading libraries
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
FlashInfer: Kernel Library for LLM Serving
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
Hackable and optimized Transformers building blocks, supporting a composable construction.