Mike (michalwols)

200 followers · 1.9k following

New York
michal.io

Starred repositories

chen-hao-chao / mdm-prime-v2

MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models

Python · 20 stars · 1 fork · Updated Mar 20, 2026

svg-project / flash-kmeans

Fast and memory-efficient exact kmeans

Python · 482 stars · 24 forks · Updated Mar 17, 2026

blackboardsh / electrobun

Build ultra fast, tiny, and cross-platform desktop apps with TypeScript.

TypeScript · 10,583 stars · 249 forks · Updated Mar 22, 2026

hanxiao / mlx-vis

Pure MLX implementations of UMAP, t-SNE, PaCMAP, TriMap, DREAMS, CNE, MMAE, and NNDescent for Apple Silicon. Metal GPU for computation and video rendering.

Python · 76 stars · 2 forks · Updated Mar 20, 2026

Dao-AILab / quack

A Quirky Assortment of CuTe Kernels

Python · 863 stars · 98 forks · Updated Mar 23, 2026

hanxiao / umap-mlx

UMAP in pure MLX for Apple Silicon. 30x faster than umap-learn.

Python · 40 stars · 4 forks · Updated Mar 5, 2026

qlabs-eng / slowrun

100M tokens. Infinite compute. Lowest val loss wins.

Python · 373 stars · 47 forks · Updated Mar 23, 2026

DravenALG / VLANeXt

Official implementation of VLANeXt.

Python · 145 stars · 3 forks · Updated Mar 23, 2026

ScalingIntelligence / tokasaurus

Python · 470 stars · 36 forks · Updated Nov 25, 2025

mlexpos / adana

The Adana algorithm official repository

Python · 3 stars · Updated Mar 9, 2026

minxin-zhg / namo

Jupyter Notebook · 13 stars · Updated Feb 27, 2026

adamzweiger / compaction

Algorithms for latent compaction

Python · 176 stars · 20 forks · Updated Feb 19, 2026

mvparakhin / ml-tidbits

Various ML tidbits in Python/PyTorch and C++

Python · 85 stars · 7 forks · Updated Mar 8, 2026

wangf3014 / ViT-5

Official implementation of ViT-5: Vision Transformers for The Mid-2020s

Python · 93 stars · 7 forks · Updated Feb 16, 2026

thunlp / hybrid-linear-attention

Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Python · 28 stars · 2 forks · Updated Feb 4, 2026

NVIDIA / TileGym

Helpful kernel tutorials and examples for tile-based GPU programming

Python · 682 stars · 55 forks · Updated Mar 23, 2026

YJMSTR / cutile-flash-linear-attention

Forked from fla-org/flash-linear-attention

FLA but cuTile

Python · 28 stars · Updated Feb 22, 2026

inclusionAI / LLaDA2.X

LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group.

378 stars · 21 forks · Updated Feb 12, 2026

borawhocodess / modded-nanotabpfn

nanoTabPFN in X minutes

Python · 19 stars · 7 forks · Updated Mar 22, 2026

meta-pytorch / OpenEnv

An interface library for RL post training with environments.

Python · 1,306 stars · 212 forks · Updated Mar 20, 2026

HazyResearch / Megakernels

Kernels, of the mega variety :)

Python · 693 stars · 46 forks · Updated Mar 23, 2026

Infatoshi / MegaQwen

Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)

Cuda · 83 stars · 6 forks · Updated Feb 10, 2026

facebookresearch / eb_jepa

An open source library designed to provide community examples of Joint Embedding Predictive Architectures (JEPAs). It contains code and examples for learning representations from images, video, and…

Python · 524 stars · 50 forks · Updated Feb 4, 2026

microsoft / bf-tree

Bf-Tree is a modern read-write-optimized concurrent larger-than-memory range index in Rust from MS Research.

Rust · 998 stars · 34 forks · Updated Mar 9, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python · 2,116 stars · 206 forks · Updated Jan 30, 2026

EricLBuehler / mistral.rs

Fast, flexible LLM inference

Rust · 6,725 stars · 549 forks · Updated Mar 21, 2026

Khushiyant / tenso

High-performance zero-copy tensor serialization for Fastest Transmission

Python · 75 stars · 2 forks · Updated Feb 17, 2026
