Because it’s Friday: Wrong in Real Time

A nonprofit research group called METR (Model Evaluation & Threat Research) ran a randomized controlled trial earlier this year to measure what AI coding tools actually do to developer productivity. Sixteen experienced developers, working on real open-source projects they’d contributed to for years, completed 246 tasks – some with AI tools, some without, randomly assigned.

Before the study, developers predicted AI would make them 24% faster. After finishing, they estimated it had made them 20% faster. The actual result was that AI made them 19% slower.

The more interesting detail is that developers were wrong about this in real time, while it was happening. When AI was allowed, they spent less time coding and more time prompting, waiting, and reviewing outputs – and didn’t seem to register the difference.

Worth noting: METR studies AI partly to understand whether it is approaching the point where it could meaningfully accelerate AI research itself. This study was their attempt to measure exactly that in a real-world setting.

METR has since updated the study design because a growing share of developers refused to work without AI tools, making the original methodology difficult to sustain. That said, 69% of developers in the study continued using the tools afterward, which suggests time saved may not be the only thing people are getting from them. One participant put it plainly: “I’m torn. I’d like to help provide updated data on this question but also I really like using AI.”

Read the full post on METR’s website. Have a good weekend.
