Lists (28)
agent
ai
ai_infra
ASR
books
cv
dataset
DG
distill
docker
Drone
infer
iqa
konwGraph
mcp
mm
ocr
pruner
pruning
rag
rl
sr
super resolution
tinyMLLM
tools
trending
tts
VLA
world model
Starred repositories
High-fidelity world models for general embodied intelligence, such as data engines and world simulators.
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
ArduPlane, ArduCopter, ArduRover, ArduSub source
A modern GUI client based on Tauri, designed to run on Windows, macOS, and Linux for a tailored proxy experience
😼 Elegantly use a clash/mihomo-based proxy environment
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
AutoClip: AI-powered video clipping and highlight generation · An intelligent highlight extraction and editing tool for derivative video creation
Open standard for machine learning interoperability
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Code Release for "CLIPood: Generalizing CLIP to Out-of-Distributions" (ICML 2023), https://arxiv.org/abs/2302.00864
[CVPR 2024] Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification
An app that brings language models directly to your phone.
[ECCV 2024] Soft Prompt Generation for Domain Generalization
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
A PyTorch toolbox for domain generalization, domain adaptation and semi-supervised learning.
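A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.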
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Training library for Megatron-based models with bidirectional Hugging Face conversion capability
The simplest, fastest repository for training/finetuning small-sized VLMs.
Fully Open Framework for Democratized Multimodal Training