View vasqu's full-sized avatar
Python 265 21 Updated Jun 13, 2026

High-performance linear attention kernel library built on TileLang.

Python 536 47 Updated May 7, 2026

A pure-Python implementation of the Nvidia CuTe layout algebra intended to be approachable and easy to learn.

Python 185 12 Updated May 15, 2026

Open ABI and FFI for Machine Learning Systems

C++ 411 80 Updated Jun 13, 2026

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python 204 14 Updated Jun 11, 2026

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 4,455 340 Updated Jan 14, 2026

Accelerating MoE with IO and Tile-aware Optimizations

Python 713 89 Updated Jun 13, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,395 698 Updated May 17, 2026

PyTorch building blocks for the OLMo ecosystem

Python 1,292 256 Updated Jun 14, 2026

TTS model capable of streaming conversational audio in realtime.

Python 1,143 98 Updated Nov 29, 2025

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 884 153 Updated Jun 13, 2026

A Quirky Assortment of CuTe Kernels

Python 1,012 136 Updated Jun 14, 2026

Open-Source Frontier Voice AI

Python 49,329 5,488 Updated May 6, 2026

Build compute kernels and load them from the Hub.

Python 692 105 Updated Jun 12, 2026

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 20,162 2,094 Updated Jun 9, 2026

[CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network

Python 111 8 Updated Jul 18, 2025
Python 135 6 Updated Feb 4, 2026

Nano vLLM

Python 14,020 2,211 Updated Apr 26, 2026

An extremely fast Python type checker and language server, written in Rust.

Python 18,939 302 Updated Jun 12, 2026

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 478 26 Updated May 17, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 27,299 1,991 Updated Jan 9, 2026

A PyTorch native platform for training generative AI models

Python 5,436 860 Updated Jun 14, 2026

An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs

Python 943 106 Updated Jun 14, 2026

A Conversational Speech Generation Model

Python 14,666 1,485 Updated May 27, 2025

🤖FFPA: Extends FlashAttention-2 via Split-D for large headdims, 1.5x~3×↑🎉 vs SDPA, up to 430T🎉 on H200.

Python 310 20 Updated Jun 12, 2026

Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch

Python 1,960 208 Updated Jun 6, 2026

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 3,433 328 Updated Jul 7, 2025

Karabiner-Elements is a powerful tool for customizing keyboards on macOS

C++ 22,312 913 Updated Jun 14, 2026