shiyongde

3 followers · 22 following

Stars

bytedance / UI-TARS

Pioneering Automated GUI Interaction with Native Agents

Python 11,004 830 Updated Jan 27, 2026

wgcyeo / WorldMM

[CVPR 2026 Highlight] WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning

Python 85 6 Updated Jun 18, 2026

360CVGroup / RzenEmbed

Embedding model geared toward multimodal RAG; ranked first both overall and on VisDoc on the MMEB benchmark

Python 36 1 Updated Jun 16, 2026

facebookresearch / sam3

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 10,613 1,595 Updated Jun 15, 2026

Jintao-Huang / llmscope

Forked from modelscope/ms-swift

Python 9 Updated Jun 18, 2026

Weiyun1025 / verl-internvl

Python 52 8 Updated Oct 20, 2025

RUC-NLPIR / Tool-Star

🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning

Python 397 23 Updated Apr 3, 2026

Xinyi-0724 / SmartHome-Bench-LLM

SmartHome-Bench: A Comprehensive Benchmark for Video Anomaly Detection in Smart Homes Using Multi-Modal Foundation Models

Python 32 3 Updated Nov 7, 2025

hsliuping / TradingAgents-CN

A Chinese-language financial trading framework based on multi-agent LLMs - the enhanced Chinese edition of TradingAgents

Python 28,666 6,067 Updated Apr 20, 2026

OpenGVLab / VeBrain

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

86 7 Updated Jun 6, 2025

inclusionAI / Ming

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Jupyter Notebook 657 58 Updated Mar 17, 2026

zwl666666 / Skip-Vision

skip-vision: efficient and scalable acceleration of vision-language models via adaptive token skipping

Python 12 1 Updated Oct 31, 2025

huggingface / huggingface-gemma-recipes

Inference, Fine Tuning and many more recipes with Gemma family of models

Jupyter Notebook 305 47 Updated Apr 2, 2026

HW-whistleblower / True-Story-of-Pangu

The true story of the hardship and darkness behind the development of the Noah Pangu large model.

11,531 1,314 Updated Jul 9, 2025

Visual-AI / 3DRS

[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Python 158 Updated Dec 9, 2025

Haochen-Wang409 / ross3d

[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness

Python 70 1 Updated Jul 22, 2025

flashslam / FlashSLAM

21 2 Updated Dec 6, 2024

ybgdgh / VLN-Game

A new zero-shot framework for exploring and searching for language-described targets in unknown environments, based on a large vision-language model.

Python 74 5 Updated Nov 28, 2024

OpenDriveLab / AgiBot-World

[IROS 2025 Best Paper Award Finalist & IEEE TRO 2026] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems

Python 3,065 207 Updated May 29, 2026

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 10,061 783 Updated Sep 22, 2025

DongSky / MR-GDINO

Python 54 6 Updated Dec 23, 2024

NVIDIA / Isaac-GR00T

NVIDIA Isaac GR00T N1.7 - A Foundation Model for Generalist Robots.

Python 7,378 1,269 Updated Jun 17, 2026

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python 6,444 542 Updated Jun 17, 2026

OpenManus / OpenManus-RL

Live-streamed development of RL tuning for LLM agents

Python 4,103 578 Updated May 5, 2026

langchain-ai / local-deep-researcher

Fully local web research and report writing assistant

Python 9,218 966 Updated Jun 9, 2026

camel-ai / owl

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 19,865 2,288 Updated Jun 12, 2026

nnnth / UFO

[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface"

Python 275 12 Updated Nov 5, 2025

iSEE-Laboratory / LLMDet

(CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"

Python 598 31 Updated Feb 4, 2026

deepseek-ai / DeepSeek-R1

91,981 11,720 Updated Jun 27, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,705 1,062 Updated Apr 30, 2026
