Stars
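Qiushi Skill: one general principle and nine methodological tools, distilled from classical materialist dialectics and the philosophy of practice, to arm the AI's mind. Qiushi-Skill: Build agents that investigate first, focus on the main contradiction, validate in practice, and keep pushing until the work is actually d…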
Official implementation of "ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving"
[CVPR 2026] Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
Sharp Monocular View Synthesis in Less Than a Second
A FOSS Git multiplatform client for newbies and pros
This is the official project repository for "TopoStreamer: Temporal Lane Segment Topology Reasoning in Autonomous Driving"
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
[ICLR 2025] DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and the GPT-4 API, with support for top-ranked, commonly used large models such as gpt | deepseek | claude | gemini | grok.
Universal Monocular Metric Depth Estimation
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
[AAAI 2025, Oral] DepthFM: Fast Monocular Depth Estimation with Flow Matching
Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
There can be more than Notion and Miro. AFFiNE (pronounced [ə‘fain]) is a next-gen knowledge base that brings planning, sorting and creating all together. Privacy first, open-source, customizable an…
Vision-Centric BEV Perception: A Survey
Summary of related papers on visual attention. Related code will be released gradually, based on Jittor.
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
A modern GUI client based on Tauri, designed to run on Windows, macOS and Linux for a tailored proxy experience