chiakicage
🦀
rusting
  • Zhejiang University

Highlights

  • Pro

The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.

Rust 173,429 104,924 Updated Apr 6, 2026
Cuda 31 5 Updated Dec 19, 2025

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 763 45 Updated Apr 6, 2026

Speed of Light Analysis for ML Model Runtime

Python 47 4 Updated Apr 2, 2026

A benchmark of real-world DL kernel problems

Python 161 13 Updated Apr 2, 2026

Universal LLM Deployment Engine with ML Compilation

Python 22,364 1,989 Updated Apr 6, 2026

Collection of memory microbenchmarks to investigate NVIDIA GPUs Network on Chip architectures

Cuda 13 2 Updated Jul 2, 2025

Learn Lean 4 with PLFA proofs.

Lean 107 8 Updated Apr 8, 2025

A garden of small programming language implementations 🪴

OCaml 314 8 Updated Mar 26, 2026

Modify implementations for Pierce's Types and Programming Languages to add a REPL, convert them into dune projects, and provide preconfigured development containers based on devfiles

OCaml 71 12 Updated Apr 11, 2023

dLLM: Simple Diffusion Language Modeling

Python 2,312 226 Updated Feb 27, 2026

Userspace eBPF runtime for Observability, Network, GPU & General Extensions Framework

C++ 1,451 169 Updated Mar 19, 2026

Perplexity open source garden for inference technology

Rust 388 36 Updated Dec 25, 2025

DELTA-pytorch: DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation

C++ 12 3 Updated Apr 16, 2024
C++ 10 2 Updated Sep 22, 2025

We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstra…

C++ 193 12 Updated Jan 28, 2025

TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

Cuda 106 6 Updated Jun 28, 2025

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 691 40 Updated Mar 8, 2026

gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling

Python 56 4 Updated Mar 31, 2026

Curated collection of papers in machine learning systems

533 36 Updated Feb 7, 2026

Rust version of THU uCore OS. Linux compatible.

Rust 3,668 377 Updated Aug 24, 2023

[NeurIPS 2025] ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive

Cuda 66 6 Updated Dec 11, 2025
Python 250 25 Updated Jul 27, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,536 1,772 Updated Apr 2, 2026

The Higher-Order Intermediate Representation

C++ 163 19 Updated Mar 13, 2026

Random stuff.

Cuda 6 Updated Dec 29, 2025

Training neural networks in TensorFlow 2.0 with 5x less memory

Python 137 17 Updated Feb 21, 2022