- East China University of Science and Technology
- xiarho.github.io
Stars
Official code for our ICCV2025 paper "SDMatte: Grafting Diffusion Models for Interactive Matting"
Reference PyTorch implementation and models for DINOv3
A library for calculating the FLOPs in the forward() process based on torch.fx
ACL 2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs, and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning
A SOTA open-source image editing model that aims to deliver performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.
[NeurIPS 2025] Direct3D‑S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
[NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding
[ICLR'25] SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
Official implementation of "DepthMaster: Taming Diffusion Models for Monocular Depth Estimation".
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
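2018/2019/campus recruitment/spring recruitment/autumn recruitment/Natural Language Processing (NLP)/Deep Learning/Machine Learning/C/C++/Python/interview notes. It also includes every question from the machine learning and deep learning interview write-ups the creator has come across. Besides the DL/ML-related questions, other computer science knowledge relevant to algorithm positions is recorded as well. Questions for positions such as front-end/testing/Java/Android are not included.
GitHub Pages template based on HTML and Markdown for personal, portfolio-based websites.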
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
[ICCV23] Official Implementation of CMDA: Cross-Modality Domain Adaptation for Nighttime Semantic Segmentation
Open source implementation of CVPR 2020 "Video to Events: Recycling Video Dataset for Event Cameras"
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
[TPAMI 2023 ESI Highly Cited Paper] SePiCo: Semantic-Guided Pixel Contrast for Domain Adaptive Semantic Segmentation https://arxiv.org/abs/2204.08808