Stars
Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Wan: Open and Advanced Large-Scale Video Generative Models
Official implementation of the paper "Transfer between Modalities with MetaQueries"
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
[ICCV 2025] MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
Processed / Cleaned Data for Paper Copilot
A next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
🚀🚀 "Large model": train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entr…
Plugin configuration manager for BepInEx
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
A modern GUI client based on Tauri, designed to run on Windows, macOS and Linux for a tailored proxy experience
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"
A fast cross-platform HTTP file server (a small, lightweight, quick-to-learn cross-platform HTTP file server for transferring files between devices)