Stars
Official implementation of AlignGS: Aligning Geometry and Semantics for Robust Indoor Reconstruction from Sparse Views
Fully open-source search engine for SMEs (small and medium-sized enterprises)
MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, BrowserComp and xBench.
DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models
🤖 AutoAudit: Intelligent Audit Decision System. An intelligent audit platform based on large language models (LLMs) that integrates knowledge graphs, retrieval-augmented generation (RAG), reinforcement learning, and other cutting-edge technologies to provide intelligent support for audit work. 🎯 Core value: breaking through…
SAG - SQL-Driven RAG Engine · Automatically Builds a Knowledge Graph During Querying
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.
A lightweight browser-to-NAS pipeline for capturing and downloading web videos. It integrates a Chrome Extension with a NAS-hosted Docker backend (FastAPI, workers, FFmpeg) to automatically detect,…
A real-time interactive Omni Avatar built on LiveKit, which allows you to seamlessly integrate with any open source Avatar components (real-time model, visual, voice, memory, search, etc.).
MiroThinker is a series of open-source agentic models trained for deep research and complex tool use scenarios.
A type-safe, elegant iMessage SDK for macOS with zero dependencies
🚀 A minimal and lightweight video streaming management platform
LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505
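PageEyes Agent is a lightweight UI agent driven by natural-language instructions. It carries out UI automation tasks on Web and Android platforms without requiring any scripts.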
Enterprise-grade, commercial-friendly agentic workflow platform for building next-generation SuperAgents.
MAD-Former: A Traceable Interpretability Model for Alzheimer’s Disease Recognition based on Multi-patch Attention
A cross-platform instant messaging client application built with Tauri and Vue 3, featuring one-to-one chat, group chat, file transfer, audio/video calling, screen recording, screenshot capture, an…
(ECCV 2024) Open-Vocabulary Camouflaged Object Segmentation
Tego is a pluggable Node.js framework for building customizable development platforms. It enables developers to create their own no-code/low-code systems or event-driven applications, while the cor…
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Awesome Literature Graph Learning Challenges
Nexent is a zero-code platform for auto-generating agents — no orchestration, no complex drag-and-drop required. Nexent also offers powerful capabilities for agent running control, data processing …