Stars
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Hierarchical Reasoning Model Official Release
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)"
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Mobile-Agent: The Powerful GUI Agent Family
A lightweight LMM-based Document Parsing Model
A True Instrumentable Binary Emulation Framework
Klavis AI (YC X25): MCP integration platforms that let AI agents use tools reliably at any scale
The open source platform for AI-native application development.
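cube studio: an open-source, cloud-native, one-stop AI platform for machine learning, deep learning, and large models, covering the full mlops algorithm pipeline: compute leasing platform, online notebook development, drag-and-drop task-flow pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving with VGPU virtualization, edge computing, annotation platform with automated labeling, sft fine-tuning / reward model / reinforcement learning training for large models such as deepseek, multi-node large-model inference with vllm/ollama/mindie, private knowledge base, AI model marketplace…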
[ECCV 2024] Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
This inventory management system is based on the current Ford Asia Pacific after-sales logistics warehousing supply chain process. After I left Ford, I started this project. You can share your vacant wa…
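Recommendation Algorithm: a large-scale recommendation algorithm library covering classic and recent recommender-system algorithms: LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESM…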
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs o…
Nexent is a zero-code platform for auto-generating agents — no orchestration, no complex drag-and-drop required. Nexent also offers powerful capabilities for agent running control, data processing …
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
The next generation deep reinforcement learning toolkit
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
The easiest and laziest way to build multi-agent LLM applications.
DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
Represent, send, store and search multimodal data
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / veRL / Swift / Ultra…
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.