- University of North Carolina at Chapel Hill
- Chapel Hill
- https://owenh-unc.github.io/
- @owenhuang117
Stars
VEFX-Bench: A Holistic Benchmark for Generic Video Editing and Visual Effects
Official implementation of paper "PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation"
2026 AI/ML internship & new graduate job list updated daily
Official implementation of paper "Planning with Sketch-Guided Verification for Physics-Aware Video Generation"
GH200 drone build templates (pytorch, torchvision, triton, vllm...)
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and follow me if you like what you see🤩.
[IROS 2024] Official implementation of paper "DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences"
The world's simplest facial recognition api for Python and the command line
Character Animation (AnimateAnyone, Face Reenactment)
A one-stop library to standardize the inference and evaluation of all the conditional image generation models. [ICLR 2024]
📖 A curated list of resources dedicated to talking face.
[TMLR] Official implementation of UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
[CVPR 2024] Official implementation of "Inversion-Free Image Editing with Natural Language"
[ICLR 2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"
[ICRA 2024] Chat with NeRF enables users to interact with a NeRF model by typing in natural language.
[NeurIPS 2023] Official Code for CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation
A collection of diffusion model papers categorized by their subareas
Summer 2026 software engineering, data science, AI, quant, product management, and hardware internship postings. Updated daily by Simplify and Pitt CSC.
Official Code for DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents (Findings of EMNLP 2022)
[PRICAI 2023] A-ESRGAN aims to provide better super-resolution images by using multi-scale attention U-net discriminators.
A modular RL library to fine-tune language models to human preferences
Programmer's guide about how to cook at home.