`futureProofR`

R software for robust inference with LLMs. It accompanies the following working paper by Bisbee and Spirling:

What to Do When Humans Are No Longer the Gold Standard: Large Language Models, State of the Art and Robustness

The abstract is as follows:

In this short paper, we consider the research implications of large language model (LLM) capabilities approaching, perhaps exceeding, those of highly-trained humans. Specifically, we note that frontier LLMs demonstrate near-expert performance for many data annotation tasks, and they are getting better over time. We show what this will mean for inference in downstream tasks: optimistically, it is that estimated treatment effects will become larger, although claimed null effects may be more dubious. We argue that authors should focus more on sensitivity and robustness with respect to future technological change, and we demonstrate how to use local calibration for such problems. We discuss how our findings, combined with the fact that performance is inherently bounded above (at 100%), should affect debates on the importance of using proprietary “State of the Art” versus open-weight, replicable LLMs. We make available fast and free software (futureProofR) for implementing our suggestions.
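One way to see why better annotation can make estimated treatment effects larger is a small calculation. The sketch below is illustrative only: it is not the paper's model and not part of the `futureProofR` API. It assumes a binary outcome labeled by an annotator whose errors are symmetric and unrelated to treatment status; the function name `observed_effect` and the rates 0.30 and 0.50 are hypothetical.

```python
def observed_effect(p_control, p_treated, accuracy):
    """Difference in mean annotated labels between treated and control.

    Assumption (not from the paper): each true binary label is reported
    correctly with probability `accuracy`, regardless of its value or of
    treatment status.
    """
    def observed_rate(p_true):
        # An annotated 1 is either a true 1 labeled correctly
        # or a true 0 labeled incorrectly.
        return accuracy * p_true + (1 - accuracy) * (1 - p_true)

    return observed_rate(p_treated) - observed_rate(p_control)


if __name__ == "__main__":
    # Hypothetical true outcome rates: 0.30 under control, 0.50 under treatment.
    for accuracy in (0.70, 0.80, 0.90, 1.00):
        estimate = observed_effect(0.30, 0.50, accuracy)
        print(f"accuracy={accuracy:.2f}  observed effect={estimate:.3f}")
```

Under these assumptions the observed difference equals the true difference multiplied by (2 × accuracy − 1), so it grows toward the true effect as accuracy approaches 100% and cannot grow beyond it.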

Comments are very welcome!

Folders and files:

R
man
vignettes
Bisbee_Spirling_human_gold_standard_9_23_2025.pdf
DESCRIPTION
LICENSE
LICENSE.md
NAMESPACE
README.md

Repository: ArthurSpirling/futureProofR