fmirkowski


Native Multimodal Models are World Learners

Python 1,130 39 Updated Nov 5, 2025

https://www.shoufachen.com/Awesome-Diffusion-Transformers/

HTML 151 8 Updated Mar 6, 2024

🎒 Token-Oriented Object Notation (TOON) – JSON for LLM prompts at half the tokens. Spec, benchmarks & TypeScript implementation.

TypeScript 10,495 352 Updated Nov 5, 2025

Python 545 52 Updated Sep 23, 2025

Python 1,152 106 Updated Oct 11, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,691 972 Updated Nov 5, 2025

Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"

Python 309 17 Updated Dec 23, 2024

TOFlow: Video Enhancement with Task-Oriented Flow

MATLAB 468 92 Updated Nov 11, 2019

Improving Convolutional Networks via Attention Transfer (ICLR 2017)

Jupyter Notebook 1,462 274 Updated Jul 11, 2018

[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

Python 365 14 Updated May 30, 2025

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 1,264 42 Updated Jun 12, 2025

[NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Python 166 8 Updated Nov 18, 2024

Evaluating text-to-image/video/3D models with VQAScore

Jupyter Notebook 354 28 Updated Sep 22, 2025

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,546 648 Updated Nov 5, 2025

The collection of awesome papers on alignment of diffusion models.

357 16 Updated Oct 27, 2025

[CVPR 2025] Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Python 77 2 Updated Jun 4, 2025

source code for the ECCV18 paper A Style-Aware Content Loss for Real-time HD Style Transfer

Python 746 142 Updated Jan 11, 2021

[ICCV 2025] SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models

Jupyter Notebook 38 1 Updated Nov 5, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 15,982 1,259 Updated Oct 27, 2025

[NeurIPS 2024] VFIMamba: Video Frame Interpolation with State Space Models

Python 131 10 Updated Sep 26, 2024

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,372 103 Updated Oct 31, 2025

RAM is all you need

Python 213 21 Updated Nov 5, 2025

Type-safe TypeScript SDK for docling-serve with first-class Bun support. The client is generated from the official OpenAPI schema and wraps every endpoint with a small, DX-focused API surface.

TypeScript 10 Updated Oct 2, 2025

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Python 1,165 36 Updated Oct 26, 2025

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 4,207 361 Updated Oct 19, 2025

GPU controlled Hetzner Cloud workers swarm for Crawling@Home project

Jupyter Notebook 57 11 Updated Oct 9, 2022

Large-scale text-video dataset. 10 million captioned short videos.

Python 662 39 Updated Aug 14, 2024

Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini

JavaScript 23,350 3,556 Updated Nov 1, 2025

A curated list of events, hackathons, and communities focused on AI and tech in Poland

29 Updated Aug 21, 2025