- Warsaw / San Francisco
- mirkowski.dev
- in/franek-mirkowski-a0abb3330
Stars
https://www.shoufachen.com/Awesome-Diffusion-Transformers/
🎒 Token-Oriented Object Notation (TOON) – JSON for LLM prompts at half the tokens. Spec, benchmarks & TypeScript implementation.
DeepEP: an efficient expert-parallel communication library
Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"
TOFlow: Video Enhancement with Task-Oriented Flow
Improving Convolutional Networks via Attention Transfer (ICLR 2017)
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
[NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Evaluating text-to-image/video/3D models with VQAScore
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
The collection of awesome papers on alignment of diffusion models.
[CVPR 2025] Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment
Source code for the ECCV 2018 paper "A Style-Aware Content Loss for Real-time HD Style Transfer"
[ICCV 2025] SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS 2024] VFIMamba: Video Frame Interpolation with State Space Models
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Type-safe TypeScript SDK for docling-serve with first-class Bun support. The client is generated from the official OpenAPI schema and wraps every endpoint with a small, DX-focused API surface.
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.
GPU-controlled Hetzner Cloud worker swarm for the Crawling@Home project
Large-scale text-video dataset. 10 million captioned short videos.
Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
A curated list of events, hackathons, and communities focused on AI and tech in Poland