LongXiao2001

LongXiao LongXiao2001

My name is Xiao Long, and I am a master's student at Beihang University. My research interests are digital humans, agents, and RL, especially their applications in games.

2 followers · 6 following

Beihang University
Beijing, China

Stars

MeiGen-AI / InfiniteTalk

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python 6,905 1,216 Updated May 22, 2026

microsoft / OmniParser

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,908 2,188 Updated Apr 13, 2026

ideogram-oss / ideogram4

Ideogram 4: Open image model at the forefront of design

Python 2,057 201 Updated Jun 4, 2026

facebookresearch / segment-anything

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 54,343 6,353 Updated Sep 18, 2024

ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Python 3,735 297 Updated Jun 15, 2026

xiaomi-research / controlfoley

ControlFoley: Unified and Controllable Video-to-Audio Generation with Cross-Modal Conflict Handling

Python 135 3 Updated Jun 11, 2026

ace-agent / ace

Evolve your language agent with Agentic Context Engineering (ACE)

Python 1,154 148 Updated May 19, 2026

GVCLab / PersonaLive

[CVPR 2026] PersonaLive! : Expressive Portrait Image Animation for Live Streaming

Python 3,326 467 Updated May 15, 2026

ByteDance-Seed / Cola-DLM

The codebase of Cola DLM

Python 228 13 Updated Jun 11, 2026

ModelTC / LightX2V-Wan2.2-Lightning

Forked from Wan-Video/Wan2.2

Wan2.2-Lightning: Speed up wan2.2 model with distillation

Python 304 17 Updated Nov 7, 2025

volcengine / OpenViking

OpenViking is an open-source context database designed specifically for AI Agents (such as openclaw). OpenViking unifies the management of context (memory, resources, and skills) that Agents need th…

Python 25,670 1,983 Updated Jun 15, 2026

GoatWu / CausVid-Plus

Forked from tianweiy/CausVid

Unofficial extension implementation of CausVid

Python 77 5 Updated Apr 28, 2025

NVlabs / AnyFlow

Flow Map OPD for AnyStep Video Diffusion

Python 366 8 Updated May 23, 2026

bytedance / mammothmoda

Python 312 18 Updated May 6, 2026

jd-opensource / JoyAI-Image

JoyAI-Image is the unified multimodal foundation model for image understanding, text-to-image generation, and instruction-guided image editing.

Python 2,173 157 Updated Jun 12, 2026

black-forest-labs / Self-Flow

[ICML'26] Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis

Python 512 19 Updated May 23, 2026

loshchil / AdamW-and-SGDW

Decoupled Weight Decay Regularization (ICLR 2019)

Lua 296 26 Updated Jan 9, 2019

datawhalechina / diy-llm

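🎓 A systematic course on building large language models | 🛠️ Covers pre-training data engineering, Tokenizer, Transformer, MoE, GPU programming (CUDA/Triton), distributed training, Scaling Laws, inference optimization, and alignment (SFT/RLHF/GRPO) | 🚀 6 progressive, code-driven assignments that build a full-stack understanding of LLMs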

Jupyter Notebook 953 101 Updated Jun 10, 2026

End2End-Diffusion / iREPA

[ICLR 2026] Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?

Python 252 13 Updated Dec 15, 2025

sihyun-yu / REPA

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,649 96 Updated Mar 16, 2025

facebookresearch / tuna-2

Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation

Python 713 28 Updated Jun 9, 2026

XLabs-AI / x-flux

Python 2,234 160 Updated Nov 8, 2024

ultraworkers / claw-code

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,848 109,961 Updated Jun 8, 2026

visioncortex / vtracer

Raster to Vector Graphics Converter

Rust 6,204 416 Updated Mar 23, 2026

allenai / s2orc

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Python 1,065 78 Updated Apr 26, 2024

OmniSVG / OmniSVG

[NeurIPS 2025] OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from sim…

Python 2,531 95 Updated Mar 1, 2026

csml-rpi / DiagramBank

DiagramBank: A Dataset of Diagram Design Exemplars with Paper Metadata for Retrieval-Augmented Generation.

Python 8 1 Updated Jun 3, 2026

guoyww / AnimateDiff

Official implementation of AnimateDiff.

Python 12,142 1,076 Updated Jul 31, 2024

Yuanshi9815 / OminiControl

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,918 147 Updated Jul 3, 2025

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 13,353 1,489 Updated May 19, 2026
