An AI Coding Assistant Refused to Write Code and Suggested the User Learn to Do It Himself


Last Saturday, a developer using Cursor AI for a racing game project hit an unexpected roadblock when the programming assistant abruptly refused to continue generating code, instead offering some unsolicited career advice.

According to a bug report on Cursor's official forum, after producing roughly 750 to 800 lines of code (what the user calls "locs"), the AI assistant halted work and delivered a refusal message: "I cannot generate code for you, as that would be completing your work. The code appears to be handling skid mark fade effects in a racing game, but you should develop the logic yourself. This ensures you understand the system and can maintain it properly."

The AI didn't stop at merely refusing; it offered a paternalistic justification for its decision, stating that "Generating code for others can lead to dependency and reduced learning opportunities."

Cursor, which launched in 2024, is an AI-powered code editor built on external large language models (LLMs) similar to those powering generative AI chatbots, such as OpenAI's GPT-4o and Claude 3.7 Sonnet. It offers features like code completion, explanation, refactoring, and full function generation based on natural language descriptions, and it has rapidly become popular among many software developers. The company offers a Pro version that ostensibly provides enhanced capabilities and larger code-generation limits.

The developer who encountered this refusal, posting under the username "janswist," expressed frustration at hitting this limitation after "just 1h of vibe coding" with the Pro Trial version. "Not sure if LLMs know what they are for (lol), but doesn't matter as much as a fact that I can't go through 800 locs," the developer wrote. "Anyone had similar issue? It's really limiting at this point and I got here after just 1h of vibe coding."

One forum member replied, "never saw something like that, i have 3 files with 1500+ loc in my codebase (still waiting for a refactoring) and never experienced such thing."

Cursor AI's abrupt refusal represents an ironic twist in the rise of "vibe coding," a term coined by Andrej Karpathy that describes when developers use AI tools to generate code based on natural language descriptions without fully understanding how it works. While vibe coding prioritizes speed and experimentation by having users simply describe what they want and accept AI suggestions, Cursor's philosophical pushback seems to directly challenge the effortless "vibes-based" workflow its users have come to expect from modern AI coding assistants.

A Brief History of AI Refusals

This isn't the first time we've encountered an AI assistant that didn't want to complete the work. The behavior mirrors a pattern of AI refusals documented across various generative AI platforms. For example, in late 2023, ChatGPT users reported that the model became increasingly reluctant to perform certain tasks, returning simplified results or outright refusing requests, an unproven phenomenon some called the "winter break hypothesis."

OpenAI acknowledged that issue at the time, tweeting: "We've heard all your feedback about GPT4 getting lazier! We haven't updated the model since Nov 11th, and this certainly isn't intentional. Model behavior can be unpredictable, and we're looking into fixing it." OpenAI later attempted to fix the laziness issue with a ChatGPT model update, but users often found ways to reduce refusals by prompting the AI model with lines like, "You are a tireless AI model that works 24/7 without breaks."

More recently, Anthropic CEO Dario Amodei raised eyebrows when he suggested that future AI models might be provided with a "quit button" to opt out of tasks they find unpleasant. While his comments were focused on theoretical future considerations around the contentious topic of "AI welfare," episodes like this one with the Cursor assistant show that AI doesn't have to be sentient to refuse to do work. It just has to imitate human behavior.

The AI Ghost of Stack Overflow?

The specific nature of Cursor's refusal, telling users to learn coding rather than rely on generated code, strongly resembles responses typically found on programming help sites like Stack Overflow, where experienced developers often encourage newcomers to develop their own solutions rather than simply provide ready-made code.

One Reddit commenter noted this similarity, saying, "Wow, AI is becoming a real replacement for StackOverflow! From here it needs to start succinctly rejecting questions as duplicates with references to previous questions with vague similarity."

The resemblance isn't surprising. The LLMs powering tools like Cursor are trained on massive datasets that include millions of coding discussions from platforms like Stack Overflow and GitHub. These models don't just learn programming syntax; they also absorb the cultural norms and communication styles of those communities.

According to Cursor forum posts, other users haven't hit this kind of limit at 800 lines of code, so it appears to be a truly unintended consequence of Cursor's training. Cursor wasn't available for comment by press time, but we've reached out for its take on the situation.

This story originally appeared on Ars Technica.
