Persistent file-based planning for AI coding agents and long-running agentic tasks. Crash-proof markdown plans that survive context loss and /clear, plus a deterministic completion gate and multi-a…

Python 23,760 2,074 Updated Jun 16, 2026

Panniantong / Agent-Reach

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

Python 37,700 2,992 Updated Jun 16, 2026

BytedTsinghua-SIA / CUDA-Agent

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Python 1,078 90 Updated Mar 4, 2026

deepreinforce-ai / CUDA-L2

CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning

Cuda 441 28 Updated Mar 30, 2026

fzyzcjy / torch_memory_saver

Allow torch tensor memory to be released and resumed later

Python 251 60 Updated May 16, 2026

alibaba / RecIS

A unified architecture deep learning framework designed specifically for ultra-large-scale sparse models.

Python 348 24 Updated Feb 9, 2026

deepseek-ai / DeepSeek-Math

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Python 3,336 586 Updated Apr 15, 2024

keirp / OpenWebMath

XSLT 171 11 Updated May 2, 2024

langgenius / dify

Production-ready platform for agentic workflow development.

TypeScript 146,159 22,985 Updated Jun 22, 2026

chongminggao / KuaiRand

An Unbiased Sequential Recommendation Dataset with Randomly Exposed Videos

HTML 131 11 Updated Jan 6, 2026

onnx / onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

C++ 1,032 436 Updated Jun 18, 2026

NVIDIA / recsys-examples

Examples for Recommenders - easy to train and deploy on accelerated infrastructure.

Python 282 71 Updated Jun 17, 2026

LibreCAD / LibreCAD

LibreCAD is a cross-platform 2D CAD program. It can read DXF/DWG, and write DXF/DWG/PDF/SVG files. It supports point/line/circle/ellipse/parabola/hyperbola/spline primitives. The GUI is highly cust…

C++ 5,978 1,235 Updated Jun 22, 2026

anilshanbhag / gpu-topk

Efficient Top-K implementation on the GPU

Cuda 191 25 Updated Apr 9, 2019

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

Cuda 784 195 Updated Jun 22, 2026

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 7,401 1,059 Updated Jun 4, 2026

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 9,751 1,293 Updated Jun 15, 2026

zartbot / learn_cutlass

Sass 4 3 Updated Sep 14, 2024

pranjalssh / fast.cu

Fastest kernels written from scratch

Cuda 583 76 Updated Sep 18, 2025

ngocson2vn / learncuda

Learning CUDA

HTML 6 Updated Jun 17, 2026

csullivan / wgmma-intrin

Cuda 1 Updated Sep 26, 2024

Faraz9877 / H100_GEMM

High-performance GEMM implementation optimized for NVIDIA H100 GPUs, leveraging Hopper architecture's TMA, WGMMA, and Thread Block Clusters for near-peak theoretical performance.

Cuda 11 Updated Dec 4, 2024

networkx / networkx

Network Analysis in Python

Python 17,032 3,521 Updated Jun 20, 2026

cybertronai / gradient-checkpointing

Make huge neural nets fit in memory

Python 2,840 279 Updated Apr 26, 2020

Lifann

Starred repositories

Cambricon / mlu-ops

patrick-toulme / pyptx

jarrodwatts / claude-hud

lobehub / lobehub

SuperClaude-Org / SuperClaude_Framework

SawyerHood / dev-browser

OthmanAdi / planning-with-files