michalwols

Starred repositories

MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models

Python 20 1 Updated Mar 20, 2026
Python 1,408 165 Updated Jul 14, 2025

Fast and memory-efficient exact kmeans

Python 482 24 Updated Mar 17, 2026

Build ultra-fast, tiny, cross-platform desktop apps with TypeScript.

TypeScript 10,583 249 Updated Mar 22, 2026

Pure MLX implementations of UMAP, t-SNE, PaCMAP, TriMap, DREAMS, CNE, MMAE, and NNDescent for Apple Silicon. Metal GPU for computation and video rendering.

Python 76 2 Updated Mar 20, 2026

A Quirky Assortment of CuTe Kernels

Python 863 98 Updated Mar 23, 2026

UMAP in pure MLX for Apple Silicon. 30x faster than umap-learn.

Python 40 4 Updated Mar 5, 2026

100M tokens. Infinite compute. Lowest val loss wins.

Python 373 47 Updated Mar 23, 2026

Official implementation of VLANeXt.

Python 145 3 Updated Mar 23, 2026

The Adana algorithm official repository

Python 3 Updated Mar 9, 2026
Jupyter Notebook 13 Updated Feb 27, 2026

Algorithms for latent compaction

Python 176 20 Updated Feb 19, 2026
Python 45 2 Updated Mar 1, 2026
Python 83 2 Updated Mar 23, 2026

Various ML tidbits in Python/PyTorch and C++

Python 85 7 Updated Mar 8, 2026

Official implementation of ViT-5: Vision Transformers for The Mid-2020s

Python 93 7 Updated Feb 16, 2026

Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Python 28 2 Updated Feb 4, 2026

Helpful kernel tutorials and examples for tile-based GPU programming

Python 682 55 Updated Mar 23, 2026

FLA but cuTile

Python 28 Updated Feb 22, 2026

LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group.

378 21 Updated Feb 12, 2026

nanoTabPFN in X minutes

Python 19 7 Updated Mar 22, 2026

An interface library for RL post training with environments.

Python 1,306 212 Updated Mar 20, 2026

Kernels, of the mega variety :)

Python 693 46 Updated Mar 23, 2026

Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)

Cuda 83 6 Updated Feb 10, 2026

An open source library designed to provide community examples of Joint Embedding Predictive Architectures (JEPAs). It contains code and examples for learning representations from images, video, and…

Python 524 50 Updated Feb 4, 2026

Bf-Tree is a modern read-write-optimized concurrent larger-than-memory range index in Rust from MS Research.

Rust 998 34 Updated Mar 9, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 2,116 206 Updated Jan 30, 2026

Fast, flexible LLM inference

Rust 6,725 549 Updated Mar 21, 2026

High-performance zero-copy tensor serialization for Fastest Transmission

Python 75 2 Updated Feb 17, 2026