Starred repositories
[NeurIPS 2025 Spotlight🔥] Official Implementation of "Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery"
[CVPR 2023] BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
[ICLR2025] Official Implementation of "AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction"
A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io
An extremely fast Python package and project manager, written in Rust.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Official repository accompanying the CVPR 2022 paper "EMOCA: Emotion Driven Monocular Face Capture And Animation". EMOCA takes a single image of a face as input and produces a 3D reconstruction. EMOCA …
PPSNet: Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos (ECCV, 2024)
[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting
[CVPR 2025] "DiC: Rethinking Conv3x3 Designs in Diffusion Models", a performant & speedy Conv3x3 diffusion model.
Modelling human health trajectories using generative transformers
SGLang is a high-performance serving framework for large language models and multimodal models.
Source code for ICCV 2025 paper "FlowSeek: Optical Flow Made Easier with Depth Foundation Models and Motion Bases"
An aggregation of human motion understanding research.
Arc Virtual Cell Atlas
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs o…
[CVPR 2024] AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Code of [CVPR 2024] "Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling"
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour; code and checkpoints
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
A curated list of awesome ARKit projects and resources. Feel free to contribute!
Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models
Cross-platform, customizable ML solutions for live and streaming media.
A Conversational Speech Generation Model
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Machine learning software for extracting information from scholarly documents