michalwols

Starred repositories

LoMa: Local Feature Matching Revisited

Python 119 2 Updated Apr 7, 2026

Efficient Universal Perception Encoder: a single on-device vision encoder with versatile representations that match or exceed specialized experts across multiple task domains.

Python 446 26 Updated Apr 6, 2026

Embedding Vector Oriented Clustering

Python 217 16 Updated Apr 7, 2026

A disk cache for using DuckDB to access data lakes (DuckLake, Iceberg, Delta)

C++ 23 3 Updated Mar 26, 2026

⚡ Super fast clustering for high-dimensional vectors on CPUs (x86, ARM) and GPUs — for Python and C++. 100x faster clustering of vector embeddings than FAISS

C++ 58 4 Updated Apr 8, 2026

MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models

Python 25 3 Updated Mar 27, 2026
Python 1,434 172 Updated Jul 14, 2025

Fast and memory-efficient exact kmeans

Python 526 27 Updated Mar 26, 2026

Build ultra-fast, tiny, cross-platform desktop apps with TypeScript.

TypeScript 11,202 273 Updated Apr 8, 2026

Pure MLX implementations of UMAP, t-SNE, PaCMAP, TriMap, DREAMS, CNE, MMAE, and NNDescent for Apple Silicon. Metal GPU for computation and video rendering.

Python 79 2 Updated Mar 20, 2026

A Quirky Assortment of CuTe Kernels

Python 918 107 Updated Apr 8, 2026

UMAP in pure MLX for Apple Silicon. 30x faster than umap-learn.

Python 42 4 Updated Mar 5, 2026

100M tokens. Infinite compute. Lowest val loss wins.

Python 403 57 Updated Apr 9, 2026

Official implementation of VLANeXt.

Python 154 5 Updated Mar 23, 2026

The Adana algorithm official repository

Python 2 Updated Mar 9, 2026
Jupyter Notebook 14 Updated Feb 27, 2026

Algorithms for latent compaction

Python 196 23 Updated Mar 31, 2026
Python 45 2 Updated Apr 1, 2026
Python 94 2 Updated Mar 24, 2026

Various ML tidbits in Python/PyTorch and C++

Python 87 7 Updated Mar 8, 2026

Official implementation of ViT-5: Vision Transformers for The Mid-2020s

Python 106 7 Updated Feb 16, 2026

Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Python 32 2 Updated Apr 9, 2026

Helpful kernel tutorials and examples for tile-based GPU programming

Python 695 59 Updated Apr 9, 2026

FLA but cuTile

Python 28 Updated Apr 9, 2026

LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group.

394 21 Updated Feb 12, 2026

nanoTabPFN in X minutes

Python 22 7 Updated Apr 2, 2026

An interface library for RL post training with environments.

Python 1,581 319 Updated Apr 8, 2026

Kernels, of the mega variety :)

Python 700 55 Updated Apr 9, 2026

Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)

Cuda 86 6 Updated Feb 10, 2026