Software engineer workflows have been transformed in recent years by an influx of AI coding tools like Cursor and GitHub Copilot, which promise to boost productivity by automatically writing lines of code, fixing bugs, and testing changes. These tools are powered by AI models from OpenAI, Google DeepMind, Anthropic, and xAI, which have rapidly improved their performance on a range of software engineering benchmarks in recent years.
However, a new study published Thursday by the non-profit AI research group METR calls into question the extent to which today's AI coding tools boost productivity for experienced developers.
METR conducted a randomized controlled trial for this study by recruiting 16 experienced open-source developers and having them complete 246 real tasks on large code repositories they regularly contribute to. The researchers randomly assigned roughly half of those tasks as "AI-allowed," giving developers permission to use state-of-the-art AI coding tools such as Cursor Pro, while the other half of the tasks forbade the use of AI tools.
Before completing their assigned tasks, the developers forecasted that using AI coding tools would reduce their completion time by 24%. That wasn't the case.
"Surprisingly, we find that allowing AI actually increases completion time by 19% — developers are slower when using AI tooling," the researchers said.
Notably, only 56% of the developers in the study had experience using Cursor, the main AI tool offered in the study. While nearly all of the developers (94%) had experience using some web-based LLMs in their coding workflows, this study was the first time some of them used Cursor specifically. The researchers note that developers were trained on using Cursor in preparation for the study.
Still, METR's findings raise questions about the supposed universal productivity gains promised by AI coding tools in 2025. Based on the study, developers shouldn't assume that AI coding tools, especially what have come to be known as "vibe coders," will immediately speed up their workflows.
METR researchers point to a few potential reasons why AI slowed developers down rather than speeding them up.
First, developers spend far more time prompting the AI and waiting for it to respond when using vibe coders than they do actually coding. AI also tends to struggle in large, complex codebases, which this test used.
The study's authors are careful not to draw any strong conclusions from these findings, explicitly noting that they don't believe AI systems currently fail to speed up many or most software developers. Other large-scale studies have shown that AI coding tools do speed up software engineer workflows.
The authors also note that AI progress has been substantial in recent years, and that they wouldn't expect the same results even three months from now. METR has also found that AI coding tools have significantly improved their ability to complete complex, long-horizon tasks in recent years.
However, the research offers yet another reason to be skeptical of the promised gains of AI coding tools. Other studies have shown that today's AI coding tools can introduce mistakes, and in some cases, security vulnerabilities.