Creative Minds
Starred repositories
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
An Autonomous LLM Agent for Complex Task Solving
A lightweight LMM-based Document Parsing Model
Klavis AI: MCP integration platforms that let AI agents use tools reliably at any scale
The open source platform for AI-native application development.
HunyuanVideo-1.5: A leading lightweight video generation model
Align Anything: Training All-modality Model with Feedback
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs o…
Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…
The next generation deep reinforcement learning toolkit
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.
🔥 [ICCV 2025 Highlight] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
[EMNLP-2024] Build multimodal language agents for fast prototype and production
SDG is a specialized framework designed to generate high-quality structured tabular data.
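cube studio: an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform / MaaS / MLOps / AI platform / training and inference platform, covering the full algorithm pipeline, compute rental platform, drag-and-drop task pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving, vGPU virtualization, cloud-edge-device collaboration, edge computing, automated labeling platform, SFT fine-tuning / reward model / reinforcement learning training for large models such as deepseek, multi-node large-model inference with vllm/ollama/mindie, private…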
Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"
Res-SAM Framework for GPR Underground Hazard Detection
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
(TPAMI 2022 / CVPR 2019 Oral) Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation
[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer, ICLR 2026
[NeurIPS 2025 Main] SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications
A blueprint for building production-ready RAG systems that minimize hallucination, featuring switchable 3-step (Speed) and 4-step (Precision) pipelines.