Because it’s Friday: Wrong in Real Time


A nonprofit research group called METR (Model Evaluation & Threat Research) ran a randomized controlled trial earlier this year to measure what AI coding tools actually do to developer productivity. Sixteen experienced developers, working on real open-source projects they’d contributed to for years, completed 246 tasks – some with AI tools, some without, randomly assigned.

Before the study, developers predicted AI would make them 24% faster. After finishing, they estimated it had made them 20% faster. The actual result was that AI made them 19% slower.

The more interesting detail is that developers were wrong about this in real time, while it was happening. When AI was allowed, they spent less time coding and more time prompting, waiting, and reviewing outputs – and didn’t seem to register the difference.

Worth noting: METR studies AI partly to understand whether it is approaching the point where it could meaningfully accelerate AI research itself. This study was their attempt to measure exactly that in a real-world setting.

METR has since updated the study design because a growing share of developers refused to work without AI tools, making the original methodology difficult to sustain. That said, 69% of developers in the study continued using the tools afterward, which suggests time saved may not be the only thing people are getting from them. One participant put it plainly: “I’m torn. I’d like to help provide updated data on this question but also I really like using AI.”

Read the full post on METR’s website. Have a good weekend.
