iofu728
😶 Focusing

Organizations

@QwenLM


High-performance linear attention kernel library built on TileLang

Python 342 24 Updated Apr 30, 2026

A kernel library written in tilelang

Python 1,363 114 Updated Apr 23, 2026

FlashKDA: high-performance Kimi Delta Attention kernels

Cuda 403 30 Updated Apr 22, 2026

An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

Python 341 32 Updated Apr 30, 2026

CUDA Kernel Benchmarking Library

Cuda 859 105 Updated Apr 22, 2026

CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

Python 481 50 Updated Apr 24, 2026
Python 81 5 Updated Apr 29, 2026

A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.

Python 300 23 Updated Apr 27, 2026

Connect to any agents with WeChat ClawBot.

Go 1,352 161 Updated Apr 1, 2026

Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.

Python 1,331 128 Updated Mar 19, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,181 552 Updated Apr 30, 2026

AI agents running research on single-GPU nanochat training automatically

Python 78,027 11,377 Updated Mar 26, 2026

A lightweight inference engine supporting speculative speculative decoding (SSD).

Python 900 63 Updated Mar 22, 2026

A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular configuration across SFT, RLVR, and evaluation workflows.

Python 560 40 Updated Apr 28, 2026

A simple, fast and robust program-aware agentic inference system.

Python 278 23 Updated Mar 16, 2026

FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels

Python 161 130 Updated Apr 26, 2026

Building the Virtuous Cycle for AI-driven LLM Systems

Python 227 40 Updated Apr 28, 2026

A rejection-sampling based distribution alignment method for extreme actor-policy mismatch RL Training

Python 15 1 Updated Feb 11, 2026

FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.

Rust 60 7 Updated Feb 6, 2026
Python 65 6 Updated Feb 5, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 366,644 75,271 Updated Apr 30, 2026

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 4,359 330 Updated Jan 14, 2026

DFlash: Block Diffusion for Flash Speculative Decoding

Python 2,431 175 Updated Apr 26, 2026

OpenAI Frontier Evals

Python 1,183 149 Updated Apr 21, 2026

A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science.

C++ 194 32 Updated Apr 27, 2026