A structured library of 121,245 machine-learning paper analyses covering arXiv publications from 2023–2025. Each paper is distilled into a standardised markdown report so you can survey findings, compare mechanisms, and spot trends without reading thousands of PDFs.
| Year | Analyses | Unique papers |
|---|---|---|
| 2023 | 29,961 | 29,961 |
| 2024 | 39,185 | 38,027 |
| 2025 | 52,099 | 51,517 |
Every file follows the same template. Here's a trimmed example (`2511.21730_a-benchmark-for-procedural-memory-retrieval_…md`):
```markdown
# frontmatter
arxiv_id: '2511.21730'
core_contribution: >
  Introduces the first benchmark for evaluating procedural
  memory retrieval in language agents, isolating retrieval
  from execution …
tags: [procedural, retrieval, memory, …]  # ⚠ see caveat below

## Quick Facts — arXiv link, authors, headline numbers
## Executive Summary — what the paper does in one paragraph
## Method Summary — experimental setup, models, data
## Key Results — quantitative findings
## Mechanism Analysis — *why* the approach works (multiple sub-sections)
## Reproduction Notes — hyperparameters, compute, data details
## Limitations & Confidence
```

Tag caveat: auto-generated `tags` are noisy (every record shares a long generic tail). Prefer searching `core_contribution`, titles, and body text. Tag regeneration is planned.
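As a minimal sketch of that advice (the function name and regex are my own, not part of `scripts/`), the following scans a year folder and matches a keyword against each file's `core_contribution` front-matter field rather than its tags:

```python
import re
from pathlib import Path

# Matches the core_contribution value in the YAML front matter: the text after
# "core_contribution:" (optionally a ">" folded-scalar marker) up to the next
# top-level "key:" line or end of file.
CORE_RE = re.compile(r"core_contribution:\s*>?\s*(.*?)(?=\n\w+:|\Z)", re.S)

def files_mentioning(root: str, keyword: str):
    """Yield markdown files under `root` whose core_contribution mentions `keyword`."""
    needle = keyword.lower()
    for path in Path(root).rglob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        match = CORE_RE.search(text)
        if match and needle in match.group(1).lower():
            yield path
```

This is a heuristic string scan, not a YAML parser; for anything beyond quick filtering, `scripts/search_topic.py` (described below) is the supported route.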
Pick a year folder and search with ripgrep:
```shell
# Find all papers mentioning mixture-of-experts
rg -l "mixture of experts|MoE" ml_research_analysis_2025/

# Full-text search across every year
rg -n "speculative decoding" ml_research_analysis_202*/
```

The search script works around noisy tags by matching across title, `core_contribution`, and filename:
```shell
python scripts/search_topic.py --topic "mixture of experts" --alias moe
python scripts/search_topic.py --topic "reinforcement learning" --alias rl --limit 25 --json
```

`analysis_outputs/research_index.sqlite` indexes the 2025 bucket (52,099 rows) with columns: title, arxiv_id, core_contribution, tags, filename, file_size.
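Since the index is plain SQLite, it can also be queried from Python's standard-library `sqlite3` module. This helper is a sketch (the function name is mine; the `papers` table and its columns are those listed above):

```python
import sqlite3

def papers_mentioning(db_path: str, keyword: str, limit: int = 10):
    """Return (arxiv_id, title) rows whose core_contribution mentions `keyword`.

    A parameterised query avoids quoting/injection issues; note that any
    "%" or "_" inside `keyword` still acts as a LIKE wildcard.
    """
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "SELECT arxiv_id, title FROM papers "
            "WHERE core_contribution LIKE ? LIMIT ?",
            (f"%{keyword}%", limit),
        )
        return cur.fetchall()
    finally:
        conn.close()

# e.g. papers_mentioning("analysis_outputs/research_index.sqlite", "distillation")
```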
```shell
# papers whose core contribution mentions "distillation"
sqlite3 analysis_outputs/research_index.sqlite \
  "SELECT arxiv_id, title FROM papers WHERE core_contribution LIKE '%distillation%' LIMIT 10"

# look up a specific paper
sqlite3 analysis_outputs/research_index.sqlite \
  "SELECT filename FROM papers WHERE arxiv_id = '1706.03762'"
```

The `spot_analyses/` directory and the `spot_analysis_paper_groups` table contain deep-dive clusters across eight research themes:
| Group | Theme |
|---|---|
| `test_time_compute_scaling` | Scaling compute at inference |
| `reasoning_distillation` | Distilling reasoning capabilities |
| `multi_agent_debate` | Multi-agent argumentation |
| `process_reward_models` | Step-level reward modelling |
| `agentic_workflow_pipeline_design` | LLM agent architectures |
| `adaptive_compute_allocation` | Dynamic compute budgets |
| `test_time_adaptation` | Adapting models at test time |
| `continual_online_tta` | Continual / online TTA |
The website/ directory contains a static site with full-text search. See website/README.md for build and deploy instructions.
```
ml_research_analysis_2023/   Per-paper markdown analyses
ml_research_analysis_2024/
ml_research_analysis_2025/
analysis_outputs/            SQLite index, digests, assessment outputs
scripts/                     index_frontmatter.py, search_topic.py
spot_analyses/               Curated topic deep-dives (8 groups, 1,824 papers)
website/                     Static browse/search UI
docs/                        Internal reference documents
archive/                     Superseded v1 analyses
```
A three-phase FlatAgents pipeline produces each report:
- Prep — download arXiv PDF, extract text, match against ML terminology corpus
- Expensive — parallel LLM calls for mechanism analysis, reproduction notes, and open questions
- Wrap — limitations/confidence, tagging, report assembly, quality judge + auto-repair
The 2025 batch used GLM-5 (pony-alpha) for the expensive phase; 2023–2024 used Trinity Large throughout. Pipeline code, configs, and execution databases live in the pipeline repo — this repository is output only.
- ~190 permanent failures across all years: PDF 404s (~106), context overflow >256k (~60), provider errors (~9), PDF parse errors (~15). No pending retries.
- Tags are unreliable: the tail of every tag list contains generic terms. Use `core_contribution` and full-text search instead.
- Duplicate filenames exist where papers were rerun (1,158 in 2024, 582 in 2025). The SQLite index and filenames are deduplicated by `(arxiv_id, timestamp)`.
After adding or removing analysis files, rebuild the SQLite index:
```shell
python scripts/index_frontmatter.py ml_research_analysis_2025
python scripts/index_frontmatter.py ml_research_analysis_2025 --prune  # also remove deleted files
```