LongXiao2001
  • Beihang University
  • Beijing, China

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python 6,905 1,216 Updated May 22, 2026

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,908 2,188 Updated Apr 13, 2026

Ideogram 4: Open image model at the forefront of design

Python 2,057 201 Updated Jun 4, 2026

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 54,343 6,353 Updated Sep 18, 2024

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Python 3,735 297 Updated Jun 15, 2026

ControlFoley: Unified and Controllable Video-to-Audio Generation with Cross-Modal Conflict Handling

Python 135 3 Updated Jun 11, 2026

Evolve your language agent with Agentic Context Engineering (ACE)

Python 1,154 148 Updated May 19, 2026

[CVPR 2026] PersonaLive! : Expressive Portrait Image Animation for Live Streaming

Python 3,326 467 Updated May 15, 2026

The codebase of Cola DLM

Python 228 13 Updated Jun 11, 2026

Wan2.2-Lightning: Speed up wan2.2 model with distillation

Python 304 17 Updated Nov 7, 2025

OpenViking is an open-source context database designed specifically for AI Agents (such as openclaw). OpenViking unifies the management of context (memory, resources, and skills) that Agents need th…

Python 25,670 1,983 Updated Jun 15, 2026

Unofficial extension implementation of CausVid

Python 77 5 Updated Apr 28, 2025

Flow Map OPD for AnyStep Video Diffusion

Python 366 8 Updated May 23, 2026
Python 312 18 Updated May 6, 2026

JoyAI-Image is the unified multimodal foundation model for image understanding, text-to-image generation, and instruction-guided image editing.

Python 2,173 157 Updated Jun 12, 2026

[ICML'26] Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis

Python 512 19 Updated May 23, 2026

Decoupled Weight Decay Regularization (ICLR 2019)

Lua 296 26 Updated Jan 9, 2019

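🎓 A systematic course on building large language models | 🛠️ Covers pretraining data engineering, tokenizers, Transformers, MoE, GPU programming (CUDA/Triton), distributed training, scaling laws, inference optimization, and alignment (SFT/RLHF/GRPO) | 🚀 6 progressive, code-driven assignments that build a full-stack understanding of LLMs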

Jupyter Notebook 953 101 Updated Jun 10, 2026

[ICLR 2026] Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?

Python 252 13 Updated Dec 15, 2025

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,649 96 Updated Mar 16, 2025

Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation

Python 713 28 Updated Jun 9, 2026
Python 2,234 160 Updated Nov 8, 2024

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,848 109,961 Updated Jun 8, 2026

Raster to Vector Graphics Converter

Rust 6,204 416 Updated Mar 23, 2026

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Python 1,065 78 Updated Apr 26, 2024

[NeurIPS 2025] OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from sim…

Python 2,531 95 Updated Mar 1, 2026

DiagramBank: A Dataset of Diagram Design Exemplars with Paper Metadata for Retrieval-Augmented Generation.

Python 8 1 Updated Jun 3, 2026

Official implementation of AnimateDiff.

Python 12,142 1,076 Updated Jul 31, 2024

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,918 147 Updated Jul 3, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 13,353 1,489 Updated May 19, 2026