- Yandex Research
- Moscow, Russia
Stars
Crystal ML library: Autograd, Tensors, Neural Networks, Optimizers
cuda-oxide is an experimental Rust-to-CUDA compiler that lets you write (SIMT) GPU kernels in safe(ish), idiomatic Rust. It compiles standard Rust code directly to PTX — no DSLs, no foreign languag…
DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
Official implementation of AsymFlow, pi-Flow, GMFlow
DFlash: Block Diffusion for Flash Speculative Decoding
Claude Code skill — autonomous ML intern (port of huggingface/ml-intern) with Telegram + Slack notifications
Official Codebase: LT2: Linear-Time Looped Transformers.
CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs
A Datacenter Scale Distributed Inference Serving Framework
Official PyTorch Implementation of Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention
Academic Research Skills for Claude Code: research → write → review → revise → finalize
Show usage stats for OpenAI Codex and Claude Code, without having to login.
Nemu-x / ClashFest
Forked from MetaCubeX/ClashMetaForAndroid
A rule-based tunnel for Android.
🦀 A complete roadmap for learning Rust, in Russian, plus a large list of resources. Telegram: t.me/rust_code
A kernel library written in tilelang
[ICLR 2026] Taming large-scale few-step training with self-adversarial flows! 👏🏻
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A collection of deep learning research experiments focused on transformer model internals, weight analysis, and layer-level properties.
Benchmark showing all major LLMs exhibit measurable decision biases, worsened by structured outputs that reduce safety refusals.
reverse engineering Gemini's SynthID detection
Sorted heap table AM for PostgreSQL with zone map scan pruning
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.