Stars
87 tools for Korean law — statutes, precedents, ordinances, interpretations | MCP Server · CLI · npm
REAP: Router-weighted Expert Activation Pruning for SMoE compression
Implementation of Fast Weight Attention
Official implementation for Training LLMs with MXFP4
Implements harmful/harmless refusal removal using pure HF Transformers
tokenbender / parameter-golf
Forked from openai/parameter-golf
Train the smallest LM you can that fits in 16MB. Best model wins!
Comparative study and experimentation on standard vs mHC vs attention residual (full and block)
A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.
Original reference implementation of the CUDA rasterizer from the paper "StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering"
An unofficial implementation of absGS
Open-source framework for turning expert knowledge into PII-free synthetic conversational data and production-ready LoRA adapters.
An agent for CUDA compute-communication kernel co-design
A lightweight inference engine supporting speculative speculative decoding (SSD).
Open-source CUDA compiler targeting multiple GPU architectures. Compiles .cu to AMD and Tenstorrent GPUs
Voice-to-text app for macOS to transcribe what you say to text almost instantly
A collection of research papers on low-precision training methods
The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".
Shared Middle-Layer for Triton Compilation
Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)
Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service
Pure C inference of Mistral Voxtral Realtime 4B speech to text model
Implementation of the fast weight product key memory from Sakana AI