Stars
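VCP is deployed between AI model APIs and front-end applications. Through a unified instruction protocol, multi-tier persistent memory, a distributed plugin engine, and a multi-agent collaboration framework, it turns large language models that are "stateless, memoryless, and incapable of tool calls" into a complete agent system with permanent self-awareness, the ability to operate on the physical world, and collective collaborative intelligence.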
🐿️ Sirchmunk: Raw data to self-evolving intelligence, real-time.
[ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
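An open Python framework implementing Chan theory (缠论), supporting morphological/dynamic buy and sell point analysis, multi-level K-line combination, interval-nesting strategies, visualization and plotting, multiple data sources, strategy development, and trading system integration.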
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Modeling, training, eval, and inference code for OLMo
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3x+ generation speedup on reasoning tasks
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Real-time face swap and one-click video deepfake with only a single image
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
🧙AutoDev: the AI-native Multi-Agent development platform built on Kotlin Multiplatform, covering all 7 phases of SDLC.
Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
[ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".
The #1 open-source voice interface for desktop, mobile, and ESP32 chips.
An open source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data
VMamba: Visual State Space Models; code is based on Mamba
chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication feat…
[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval
GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)
Code for "Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed", CVPR 2024
Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, SAM3, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds