A Comparison of Agentic AI Systems and Human Economists


This paper compares agentic AI systems and human economists performing the same causal inference tasks. AI systems and humans generally obtain similar median causal effect estimates. While there is substantial dispersion of estimates across model instances, the human distributions of estimates have wider tails. Using AI models as reviewers to compare and rank “submissions,” the following ranking emerges regardless of reviewer model: (1) Codex GPT-5.4, (2) Codex GPT-5.3-Codex, (3) Claude Code Opus 4.6, and (4) Human Researchers. These findings suggest that agentic AI systems will allow us to scale empirical research in economics.

I enjoy the name of the author, namely Serafin Grundl.  Here is the paper, via Ethan Mollick.  You could interpret these results as showing the AIs have fewer hallucinations.  And just to reiterate a key point from the paper:

The second part of this paper is an AI review tournament in which “submissions” (codes and write-ups) from humans and the AI models are compared and ranked against each other. The reviewers are the following AI models: Gemini 3.1 Pro Preview, Opus 4.6 and GPT-5.4. For each review the reviewer is asked to write a report comparing four submissions (human, Opus 4.6, GPT-5.3-Codex, GPT-5.4). Each reviewer model writes comparison reports for the same 300 comparison groups. The average rankings are strikingly similar across reviewer models: (1) Codex GPT-5.4, (2) Codex GPT-5.3-Codex, (3) Claude Code Opus 4.6, and (4) Human Researchers.

Who comes in last?  Hi people!
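
For readers who want to see what "average rankings across reviewer models" could mean mechanically, here is a minimal sketch. It is not the paper's code. It assumes each review report has already been reduced to an ordered list of submissions, best first. The function name, the reviewer names, the submission labels A through D, and the toy rankings are all hypothetical placeholders rather than results from the paper.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(reports):
    """Average rank position of each submission, computed per reviewer.

    `reports` is an iterable of (reviewer, group_id, ranking) tuples, where
    `ranking` lists submission labels from best to worst. This input shape
    is an assumption for illustration, not the paper's data format.
    """
    positions = defaultdict(lambda: defaultdict(list))
    for reviewer, _group_id, ranking in reports:
        for position, label in enumerate(ranking, start=1):
            positions[reviewer][label].append(position)
    return {
        reviewer: {label: mean(pos) for label, pos in by_label.items()}
        for reviewer, by_label in positions.items()
    }


# Made-up toy input: two hypothetical reviewers, two comparison groups each.
toy_reports = [
    ("reviewer_1", 0, ["A", "B", "C", "D"]),
    ("reviewer_1", 1, ["B", "A", "C", "D"]),
    ("reviewer_2", 0, ["A", "C", "B", "D"]),
    ("reviewer_2", 1, ["A", "B", "D", "C"]),
]

result = average_ranks(toy_reports)
for reviewer, by_label in result.items():
    # Lower average rank means the submission was placed higher on average.
    print(reviewer, sorted(by_label.items(), key=lambda item: item[1]))
```

Comparing the per-reviewer averages side by side is one simple way to check whether different reviewers order the submissions similarly, which is the pattern the paper reports.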



