HNA Group
- Guangzhou
Stars
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]
Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
awesome game security [Welcome to PR]
Build multimodal language agents for fast prototype and production
Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…
[ICCV2025] LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds
https://dev.to/answeryt/the-demo-spell-and-production-dilemma-of-ai-agents-how-i-built-a-self-learning-agent-system-4okk
SDG is a specialized framework designed to generate high-quality structured tabular data.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
LLM-based data scientist, AI-native data application. AI-driven infinite thinking redefines BI.
🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents
Your Automatic Prompt Engineering Assistant for GenAI Applications
Application self-hosting and DevOps platform for running open-source software: a web-based Linux panel and lite PaaS
PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.
A library for writing (research experiment) configurations in Python dict or JSON format, reading and writing parameter values via dot notation in code, and reading parameters from the command line…
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert
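airda (Air Data Agent) is a multi-agent system for data analysis that can understand data development and data analysis requirements, understand data, and generate SQL and Python code for tasks such as data querying, data visualization, and machine learning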
Train your Agent model via our easy and efficient framework
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"