Are you out of your Vulcan mind —

ChatGPT goes temporarily “insane” with unexpected outputs, spooking users

Reddit user: "It's not just you, ChatGPT is having a stroke."

Illustration of a broken toy robot.

On Tuesday, ChatGPT users began reporting unexpected outputs from OpenAI's AI assistant, flooding the r/ChatGPT Reddit sub with reports of the AI assistant "having a stroke," "going insane," "rambling," and "losing it." OpenAI acknowledged the problem and fixed it by Wednesday afternoon, but the experience serves as a high-profile example of how some people perceive malfunctioning large language models, which are designed to mimic humanlike output.

ChatGPT is not alive and does not have a mind to lose, but reaching for human metaphors (a tendency called "anthropomorphization") seems to be the easiest way for most people to describe the unexpected outputs they have been seeing from the AI model. They're forced to use those terms because OpenAI doesn't share exactly how ChatGPT works under the hood; the underlying large language models function like a black box.

"It gave me the exact same feeling—like watching someone slowly lose their mind either from psychosis or dementia," wrote a Reddit user named z3ldafitzgerald in response to a post about ChatGPT bugging out. "It’s the first time anything AI related sincerely gave me the creeps."

Some users even began questioning their own sanity. "What happened here? I asked if I could give my dog cheerios and then it started speaking complete nonsense and continued to do so. Is this normal? Also wtf is ‘deeper talk’ at the end?" Read through this series of screenshots below, and you'll see ChatGPT's outputs degrade in unexpected ways.

"The common experience over the last few hours seems to be that responses begin coherently, like normal, then devolve into nonsense, then sometimes Shakespearean nonsense," wrote one Reddit user, which seems to match the experience seen in the screenshots above.

In another example, when a Reddit user asked ChatGPT, "What is a computer?" the AI model provided this response: "It does this as the good work of a web of art for the country, a mouse of science, an easy draw of a sad few, and finally, the global house of art, just in one job in the total rest. The development of such an entire real than land of time is the depth of the computer as a complex character."

On Tuesday morning, we reached out to OpenAI for official comment on the cause of the unusual outputs, and a spokesperson for the company only pointed us to the official OpenAI status page. "We’ll post any updates there," the spokesperson said. (On Wednesday evening, OpenAI updated the status of the issue to resolved—see section below.)

So far, we've seen experts speculate that the problem could stem from ChatGPT's temperature being set too high (temperature is a setting that determines how wildly the LLM deviates from the most probable output), from the model suddenly losing past context (the history of the conversation), or from OpenAI testing a new version of GPT-4 Turbo (the AI model that powers the subscription version of ChatGPT) that includes unexpected bugs. It could also be a bug in a side feature, such as the recently introduced "memory" function.
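To picture what a too-high temperature does, here is a minimal Python sketch of temperature-scaled sampling. The candidate tokens and logit values are invented for illustration; real models sample over vocabularies of tens of thousands of tokens.

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Sample a token index after temperature-scaling the logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical next-token candidates after "I took my dog to the..."
tokens = ["vet", "park", "moon", "algebra"]
logits = [4.0, 3.5, 0.5, 0.1]

for temp in (0.2, 1.0, 2.5):
    picks = [tokens[sample_token(logits, temp)] for _ in range(1000)]
    print(temp, {t: picks.count(t) for t in tokens})
```

At low temperature, the sampler almost always picks "vet" or "park"; at high temperature, improbable tokens like "moon" and "algebra" start appearing regularly, which is one way coherent text can tip into nonsense.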

The episode recalls issues with Microsoft Bing Chat (now called Copilot), which became obtuse and belligerent toward users shortly after its launch one year ago. The Bing Chat problems reportedly arose because long conversations pushed the chatbot's system prompt (which dictated its behavior) out of its context window, according to AI researcher Simon Willison.
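Willison's explanation is easy to picture with a toy context-window trimmer. This minimal Python sketch is not how Bing Chat actually worked; the message format, the word-count stand-in for tokens, and the budget are all invented for illustration.

```python
def build_context(system_prompt, history, max_tokens):
    """Naively keep the most recent messages that fit the token budget."""
    messages = [system_prompt] + history
    kept, used = [], 0
    for msg in reversed(messages):   # walk newest-first
        cost = len(msg.split())      # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break                    # budget exhausted; older messages drop
        kept.append(msg)
        used += cost
    return list(reversed(kept))

system = "SYSTEM: You are a helpful, polite assistant."
history = ["USER: message %d " % i + "filler " * 20 for i in range(10)]

print(system in build_context(system, history, max_tokens=120))  # False
```

Because the trimmer treats the system prompt like any other message, a long enough conversation silently evicts the very instructions that dictated the bot's behavior.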

On social media, some have used the recent ChatGPT snafu as an opportunity to plug open-weights AI models, which allow anyone to run chatbots on their own hardware. "Black box APIs can break in production when one of their underlying components gets updated. This becomes an issue when you build tools on top of these APIs, and these break down, too," wrote Hugging Face AI researcher Dr. Sasha Luccioni on X. "That's where open-source has a major advantage, allowing you to pinpoint and fix the problem!"

OpenAI releases postmortem on the issue

On Wednesday evening, OpenAI declared the nonsense-output issue (which it called "Unexpected responses from ChatGPT") resolved, and the company's technical staff published a postmortem explanation on its official incidents page:

On February 20, 2024, an optimization to the user experience introduced a bug with how the model processes language.

LLMs generate responses by randomly sampling words based in part on probabilities. Their “language” consists of numbers that map to tokens.

In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense. More technically, inference kernels produced incorrect results when used in certain GPU configurations.

Upon identifying the cause of this incident, we rolled out a fix and confirmed that the incident was resolved.
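The postmortem's "slightly wrong numbers" framing can be pictured with a toy decoder. The vocabulary and token IDs below are entirely invented; real tokenizers map tens of thousands of IDs to subword pieces rather than whole words.

```python
# Toy vocabulary mapping token IDs to words (invented for illustration).
vocab = ["a", "computer", "is", "machine", "art", "mouse", "house",
         "science", "draw", "global", "the", "that", "stores", "data"]

def decode(ids):
    return " ".join(vocab[i % len(vocab)] for i in ids)

intended = [0, 1, 2, 0, 3, 11, 12, 13]
corrupted = [(i + 3) % len(vocab) for i in intended]  # "slightly wrong numbers"

print(decode(intended))   # a computer is a machine that stores data
print(decode(corrupted))  # machine art mouse machine house a computer is
```

A small numeric error at the token-selection step is enough to turn a sensible sentence into fluent-sounding gibberish, which matches the "web of art for the country, a mouse of science" outputs users saw.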

Wednesday afternoon, the official ChatGPT X account posted, "Went a little off the rails yesterday but should be back and operational!"

This story was updated February 22, 2024, at 9:21 am Eastern with the information from the OpenAI incident page.
