Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction

C++ 2,569 506 Updated Jun 12, 2026

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 12,978 1,485 Updated Jun 12, 2026

AI Tensor Engine for ROCm

Python 460 351 Updated Jun 14, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,553 256 Updated Jun 14, 2026

List of papers related to neural network quantization in recent AI conferences and journals.

832 66 Updated Mar 27, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,561 317 Updated Jul 17, 2025

High Performance LLM Inference Operator Library

C++ 935 96 Updated Jun 11, 2026

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Jupyter Notebook 1,752 292 Updated Jun 11, 2026

A Python DSL to write Nvidia PTX for Hopper and Blackwell in JAX and PyTorch

Python 311 26 Updated May 8, 2026

GPUOcelot: A dynamic compilation framework for PTX

C++ 226 18 Updated Feb 9, 2025

PTX on XPUs

Rust 130 3 Updated Jun 14, 2026

LLM Architecture Gallery source data

1,300 110 Updated Jun 14, 2026

Open-source, low-cost 10.5 GHz PLFM phased array RADAR system

PLSQL 21,636 5,097 Updated May 29, 2026

WonderTrader: a one-stop framework for quantitative research, development, and trading

C++ 6,139 1,163 Updated Sep 30, 2025

A Lightweight LLM Post-Training Library

Python 2,338 307 Updated Jun 13, 2026

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Python 66,503 5,961 Updated Jun 14, 2026

Qwen3.5-Thor — High-performance BF16/NVFP4 inference engine for Qwen3.5 model family on NVIDIA Jetson AGX Thor (SM110a Blackwell). C++17/CUDA, Ollama/OpenAI compatible API.

C++ 10 Updated Apr 2, 2026

A high-performance inference engine for LLM, VLM, DiT and REC models, optimized for diverse AI accelerators.

C++ 1,329 228 Updated Jun 13, 2026

llama.cpp fork with additional SOTA quants and improved performance

C++ 2,737 350 Updated Jun 14, 2026

Distribute and run LLMs with a single file.

C++ 24,943 1,392 Updated Jun 9, 2026

Port of OpenAI's Whisper model in C/C++

C++ 50,712 5,663 Updated Jun 9, 2026

Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++

C++ 6,262 663 Updated Jun 14, 2026

C inference for Qwen3-ASR 0.6B and 1.7B transcription models

C 563 66 Updated Feb 17, 2026

Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch — just C and BLAS. Supports 0.6B and 1.7B models, 9 voices, 10 languages.

C 54 10 Updated Jun 7, 2026

On-device AI across mobile, embedded and edge for PyTorch

Python 4,730 1,030 Updated Jun 14, 2026

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 24,960 4,808 Updated Jun 14, 2026

The VESC motor control firmware

C 3,199 1,839 Updated Jun 13, 2026

Run a 1-billion parameter LLM on a $10 board with 256MB RAM

C 1,649 205 Updated Feb 22, 2026

An awesome & curated list of best LLMOps tools for developers

Shell 5,843 838 Updated May 21, 2026

Packages related to gathering, viewing, and analyzing diagnostics data from robots.

C++ 154 203 Updated Jun 12, 2026