Stars
GitNexus: The Zero-Server Code Intelligence Engine - GitNexus is a client-side knowledge graph creator that runs entirely in your browser. Drop in a GitHub repo or ZIP file, and get an interactive …
Your GitHub profile as a 3D pixel art building in an interactive city
Real-time global intelligence dashboard. AI-powered news aggregation, geopolitical monitoring, and infrastructure tracking in a unified situational awareness interface
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Reference PyTorch implementation and models for DINOv3
Easily train a good VC model with voice data <= 10 mins!
Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents, and a message gateway, it handles different levels of…
No fortress, purely open ground. OpenManus is Coming.
Train your AI self, amplify you, bridge the world
[ICCV 2025] Referring to any person or object given a natural language description. Code base for RexSeek and the HumanRef Benchmark
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
Official implementation of the WACV 2025 (Oral) paper RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision
Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496
Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed
EVE Series: Encoder-Free Vision-Language Models from BAAI
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos