Stars
GELab: GUI Exploration Lab. One of the best GUI agent solutions in the galaxy, built by the StepFun-GELab team and powered by Step’s research capabilities.
Official repository for Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs
OpenTSLM: Time-Series Language Models for Reasoning over Multivariate Medical Text- and Time-Series Data
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
12 Weeks, 24 Lessons, AI for All!
Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model
Official implementation of SIGGRAPH 2025 paper "Image-GS: Content-Adaptive Image Representation via 2D Gaussians"
Spec-Driven Development MCP Server, not just Vibe Coding
Hypernetworks that adapt LLMs for specific benchmark tasks using only a textual task description as input
Convert safetensors to GGUF q4_0 - q8_0 on Windows
[TMLR 2025] Thera: Aliasing-Free Arbitrary-Scale Super-Resolution with Neural Heat Fields
[CVPR 2025] "A Distractor-Aware Memory for Visual Object Tracking with SAM2"
The repo for "Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator"
Official Code for "MITracker: Multi-View Integration for Visual Object Tracking"
An ongoing, curated collection of awesome software best practices and techniques, libraries and frameworks, e-books and videos, websites, blog posts, links to GitHub repositories, technical guidel…
SSEPy: Implementation of searchable symmetric encryption in pure Python
Two conversational AI agents switching from English to sound-level protocol after confirming they are both AI agents
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
LLM agents built for control. Designed for real-world use. Deployed in minutes.
Original implementation of "Radiant Foam: Real-Time Differentiable Ray Tracing"
PyTorch implementation of "TryOnDiffusion: A Tale of Two UNets", a virtual try-on diffusion-based network by Google
[AAAI 2025] MV-VTON: Multi-View Virtual Try-On with Diffusion Models
[CVPR 2024] Official repository for the paper "Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On"