Continuously updated paper list on advancements in Data Agents. Companion repo to our paper "A Survey of Data Agents: Emerging Paradigm or Overstated Hype?"

Python · 229 stars · 11 forks · Updated Oct 29, 2025

HKUSTDial / DeepFund

🔥[NeurIPS'25] DeepFund: Pilot for Your Next Fund Investment

Python · 214 stars · 35 forks · Updated Oct 30, 2025

PzySeere / MetaSpatial

MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, realistic, and adaptive scene generation for applications in…

Python · 193 stars · 7 forks · Updated May 5, 2025

iie-ycx / DEER

This is the repository of DEER, a Dynamic Early Exit in Reasoning method for Large Reasoning Language Models.

Python · 175 stars · 7 forks · Updated Jul 7, 2025

KANABOON1 / MemGen

MemGen: Weaving Generative Latent Memory for Self-Evolving Agents

Python · 161 stars · 13 forks · Updated Nov 1, 2025

HKUSTDial / NL2SQL360

🔥[VLDB'24] Official repository for the paper “The Dawn of Natural Language to SQL: Are We Fully Ready?”

Python · 139 stars · 17 forks · Updated Oct 2, 2025

HKUSTDial / Alpha-SQL

🔥[ICML'25] Official repository for the paper "Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search"

Python · 116 stars · 12 forks · Updated Oct 23, 2025

mitmedialab / vizml

Plotly dataset-visualization pairs, feature extraction scripts, and model training code for VizML (CHI 2019)

Python · 110 stars · 30 forks · Updated May 20, 2021

RainBowLuoCS / OpenOmni

[NeurIPS'25] OpenOmni: Official implementation of "Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis"

Python · 107 stars · 6 forks · Updated Sep 22, 2025

DataArcTech / SQL-R1

[NeurIPS'25] Official Repository for the Paper "SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning"

Python · 105 stars · 12 forks · Updated Nov 6, 2025

chenyuxin1999 / S-DPO

[NeurIPS 2024] Implementation of the paper "On Softmax Direct Preference Optimization for Recommendation"

Python · 87 stars · 4 forks · Updated Nov 29, 2024

IBM-HRL-MLHLS / IBM-Causal-Inference-Benchmarking-Framework

Data derived from the Linked Births and Deaths Data (LBIDD); simulated pairs of treatment assignment and outcomes; scoring code

Python · 84 stars · 13 forks · Updated May 23, 2018

LightChen233 / AutoPR

This is the official implementation of "AUTOPR: LET'S AUTOMATE YOUR ACADEMIC PROMOTION!".

Python · 80 stars · 4 forks · Updated Oct 16, 2025

Peixian Ma MPX0222