DingYikang

Yikang Ding DingYikang

51 followers · 40 following

Tsinghua University
Beijing

Lists (28)

3D-from-mono

13 repositories

3d_recon

49 repositories

3DGS

49 repositories

AIGC

61 repositories

Anything-Model

autonomous driving

40 repositories

C++

CG

chatgpt

CUDA_tools

deep-learning

diffusion

28 repositories

digital-avatar

image_edit

LLM-AI

32 repositories

Localization

mono-depth

24 repositories

MVS

10 repositories

nerf

72 repositories

occupancy-network

30 repositories

optical flow

pose_est

sample_utils

simulation

SLAM

12 repositories

stylization

tools

16 repositories

world-model

Stars

caojiaolong / spaces-index

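🌟 This project automatically crawls and indexes article metadata from 科学空间 (Scientific Spaces), classifies the articles by research topic using rules, and makes it easy to browse them on GitHub and jump to the original posts.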

Python 245 7 Updated Jun 8, 2026

Jiawei-Yang / FD-Loss

Python 526 12 Updated May 1, 2026

inspatio / worldfm

Python 798 83 Updated May 6, 2026

black-forest-labs / Self-Flow

[ICML'26] Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis

Python 511 19 Updated May 23, 2026

dreamzero0 / dreamzero

Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals

Python 2,263 194 Updated Apr 19, 2026

zai-org / GLM-Image

GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation.

Python 927 75 Updated Mar 20, 2026

Anionex / banana-slides

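An AI-native slides (PPT) generator based on nano banana pro🍌, moving toward "Vibe PPT": upload any template image, upload any material with intelligent parsing, generate slides automatically from a one-line prompt, an outline, or per-page descriptions, revise specified regions by spoken instruction, and export an editable PPT in one click.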

Python 14,936 1,743 Updated Jun 15, 2026

thu-ml / TurboDiffusion

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,533 265 Updated Apr 15, 2026

NVIDIA / cutile-python

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 2,072 140 Updated Jun 13, 2026

lumalabs / tvm

Terminal Velocity Matching

Python 87 1 Updated Feb 14, 2026

chenfengxu714 / StreamDiffusionV2

StreamDiffusion, Live Stream APP

Python 490 57 Updated May 19, 2026

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,524 66 Updated Dec 30, 2025

AlmondGod / tinyworlds

A minimal implementation of DeepMind's Genie world model

Python 1,311 102 Updated Apr 15, 2026

NVlabs / LongLive

LongLive 2.0: Infra - Long Video Gen

Python 2,336 210 Updated Jun 13, 2026

Tencent-Hunyuan / HunyuanImage-3.0

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 3,127 166 Updated Feb 3, 2026

fudan-generative-vision / PPFlow

[ICLR 2026] Pyramidal Patchification Flow for Visual Generation (PPFlow)

Python 7 1 Updated Jul 20, 2025

nv-tlabs / lyra

Project Lyra: Open Generative 3D World Models

Python 2,089 224 Updated Jun 11, 2026

QwenLM / Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,832 265 Updated Apr 23, 2026

gojasper / LBM

Latent Bridge Matching for Fast Image-to-Image Translation (ICCV 2025 Highlight)

Python 839 57 Updated Jul 24, 2025

facebookresearch / map-anything

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Python 3,490 262 Updated Jun 3, 2026

yangzhou24 / OmniWorld

[ICLR 2026] OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Python 482 8 Updated Apr 16, 2026

NJU-3DV / SpatialVID

[CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations

Python 570 20 Updated Apr 22, 2026

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 21,652 2,498 Updated May 25, 2026

VITA-MLLM / VITA

✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,514 181 Updated Mar 28, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

17,885 1,128 Updated May 1, 2026

Dorniwang / UniVerse-1-code

The official UniVerse-1 code.

Python 129 11 Updated Oct 13, 2025

nv-tlabs / vipe

ViPE: Video Pose Engine for Geometric 3D Perception

Python 1,981 161 Updated Jun 9, 2026

facebookresearch / dinov3

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 10,673 876 Updated Jun 12, 2026

facebookresearch / sapiens

High-resolution models for human tasks.

Python 5,385 319 Updated May 26, 2026

FoundationVision / Waver

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

939 121 Updated Aug 27, 2025