Zilong Zhang zhangzilongc

🎯

Focusing

Ph.D. Student | Xi'an Jiaotong University | Self-supervised Learning / Unified Vision / Anomaly Detection / Vision-based Industrial Inspection

25 followers · 21 following

Xi'an Jiaotong University
XI'AN CHINA

Lists (2)

dataset

1 repository

✨ Inspiration

1 repository

Stars

QingYuanQu / insightface

Forked from deepinsight/insightface

State-of-the-art 2D and 3D Face Analysis Project (study notes)

Python 3 1 Updated Dec 8, 2025

XuAdventurer / PIN-WM

[RSS 2025] PIN-WM : Learning Physics-INformed World Models for Non-Prehensile Manipulation

Python 44 5 Updated Aug 20, 2025

Tongyi-MAI / Z-Image

Python 7,524 444 Updated Dec 14, 2025

facebookresearch / sam3

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 6,263 728 Updated Dec 21, 2025

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 28,136 2,815 Updated Apr 30, 2025

IDEA-Research / Rex-Omni

Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)

Jupyter Notebook 1,012 66 Updated Dec 15, 2025

openvla / openvla

Forked from TRI-ML/prismatic-vlms

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 4,784 575 Updated Mar 23, 2025

facebookresearch / EdgeTAM

[CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model"

Jupyter Notebook 842 66 Updated Dec 8, 2025

YuZhaoshu / Efficient-VLAs-Survey

🔥This is a curated list of "A survey on Efficient Vision-Language Action Models" research. We will continue to maintain and update the repository, so follow us to keep up with the latest developmen…

101 5 Updated Nov 13, 2025

TheShadow29 / awesome-grounding

awesome grounding: A curated list of research papers in visual grounding

1,123 105 Updated Sep 21, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 17,299 1,445 Updated Nov 28, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,826 1,813 Updated Oct 13, 2025

robrosinc / REALTIME_SAM2

Python 39 3 Updated Aug 18, 2025

luanshiyinyang / awesome-multiple-object-tracking

Resources for Multiple Object Tracking (MOT)

1,421 186 Updated Oct 7, 2025

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 12,026 1,271 Updated Oct 11, 2025

ZhuiyiTechnology / roformer

Rotary Transformer

Python 1,065 59 Updated Mar 21, 2022

HW-whistleblower / True-Story-of-Pangu

The true story of the hardship and darkness behind the development of the Noah Pangu large model.

11,368 1,348 Updated Jul 9, 2025

facebookresearch / habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Python 2,748 614 Updated Oct 12, 2025

yakhyo / head-pose-estimation

Real Time Head Pose Estimation: Accurate head pose estimation using ResNet 18/34/50 and MobileNet V2/V3 models. Evaluate yaw, pitch, and roll with pre-trained weights for quick integration.

Python 57 7 Updated Mar 28, 2025

Redhwan-A / HPE_5D

Head Pose Estimation Based on 5D Rotation Representation

Python 5 Updated Sep 14, 2024

thohemp / 6DRepNet360

Official Pytorch implementation of "Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation" IEEE TIP 24

Python 154 7 Updated Jun 10, 2024

Tencent-Hunyuan / Hunyuan3D-2

High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.

Python 12,704 1,266 Updated Oct 28, 2025

VAST-AI-Research / TripoSG

TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

Python 1,476 158 Updated Apr 18, 2025

xiaobiaodu / 3DRealCar_Toolkit

[ICCV2025] 3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

Jupyter Notebook 158 9 Updated Mar 12, 2025

cvlab-epfl / multicam-gt

Our Webapp to annotate multi-camera pedestrian detection datasets.

JavaScript 23 4 Updated Jul 16, 2025

crowdbotp / OpenTraj

Human Trajectory Prediction Dataset Benchmark (ACCV 2020)

Python 539 111 Updated Apr 4, 2024

chengche6230 / ReST

[ICCV 2023] ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking

Python 163 18 Updated Mar 27, 2024

asw91666 / TRG-Release

Official PyTorch implementation of "6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry," ECCV 2024

Python 97 5 Updated Jun 17, 2025

AvaLovelace1 / BrickGPT

Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.

Python 1,548 94 Updated Nov 9, 2025

SherryJYC / paper-MTMC

A repo of awesome papers about multi target multi camera tracking

202 19 Updated Nov 30, 2022