- HCMC University of Technology
- Saigon Metropolitan Area
- @anhduyle1603
- https://scholar.google.com/citations?user=VSj_iOQAAAAJ&hl=vi
LLM
No fortress, purely open ground. OpenManus is Coming.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [EMNLP 2025]
This is the repository for the Tool Learning survey.
Agent Laboratory is an end-to-end autonomous research workflow meant to assist you, the human researcher, in implementing your research ideas.
Fully open reproduction of DeepSeek-R1
[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges
A course on aligning smol models.
Train transformer language models with reinforcement learning.
🚀 The fast, Pythonic way to build MCP servers and clients
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
Generic MCP Client to use any MCP tool in a chat
The simplest, fastest repository for training/finetuning small-sized VLMs.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
[arXiv: 2505.17163] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning
A Comprehensive Toolkit for High-Quality PDF Content Extraction
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
DataComp: In search of the next generation of multimodal datasets
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers
Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision