- 02:55 (UTC +08:00)
- wechat: Indian_mi_fans
- https://scholar.google.com/citations?user=Xy7aIcwAAAAJ&hl=zh-CN
- https://huggingface.co/OrlandoHugBot
Stars
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
The ICDAR 2019 cTDaR competition evaluates the performance of methods for table detection (Track A) and table recognition (Track B). For the first track, document images containing one or several tables a…
Official implementation of TrajBooster
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Enjoy the magic of Diffusion models!
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
slime is an LLM post-training framework for RL Scaling.
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo…
Open Source Implementation of Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
✨✨ Latest advancements of RL in generative AI
Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Model Compression Toolbox for Large Language Models and Diffusion Models