- Tsinghua University
- Shanghai
Stars
Benchmarking coding-agent paradigms (plain prompt, /goal, skills, autoresearch...) on one fixed task: hand-writing the fastest fp16 GEMM kernel for the NVIDIA A100.
A deep learning training framework in MoonBit with tape-based autograd, a PyTorch-like API, and tagless final backends for CPU/GPU. Accelerated by CUDA/cuDNN.
A visual, example-driven guide to Claude Code — from basic concepts to advanced agents, with copy-paste templates that bring immediate value.
A curated collection of awesome MoonBit tools, frameworks, libraries and articles.
A simple, declarative, functional web UI framework
agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
GO Simple Tunnel - a simple tunnel written in golang
Universal instructions for running a K8s cluster with various container runtimes inside a Proxmox LXC container.
A container platform that requires no Kubernetes expertise: build, deploy, assemble, and manage apps on Kubernetes entirely through a graphical interface.
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE, WebAssembly, VSX, RISC-V)
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
🗻 Log-structured, embeddable key-value storage engine written in Rust
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model