OpenAI Research Shows Why Chatbots Guess Wrong Under Current Tests

OpenAI, working with Georgia Tech, has released new research examining why chatbots keep making confident errors. The study argues that the root cause lies less in how the systems are built than in how they are trained and scored. Current evaluations grade answers as simply right or wrong and give no credit for admitting a lack of knowledge. As a result, models such as OpenAI's ChatGPT and DeepSeek-V3 learn to guess with confidence instead of holding back when unsure.
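The incentive described above can be shown with a little arithmetic. The sketch below is a hypothetical illustration, not code from the study: it assumes a grader that gives 1 point for a correct answer and 0 for anything else, including "I don't know." The function name and the probabilities are made up for the example.

```python
# Hypothetical illustration: right/wrong grading awards 1 point for a
# correct answer and 0 for a wrong answer or an "I don't know".
def binary_expected_score(p_correct: float, guess: bool) -> float:
    """Expected score on one question under right/wrong grading."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    # Guessing earns 1 point with probability p_correct; abstaining earns 0.
    return p_correct if guess else 0.0


# Compare guessing with abstaining at a few assumed confidence levels.
for p in (0.05, 0.25, 0.5):
    print(p, binary_expected_score(p, guess=True), binary_expected_score(p, guess=False))
```

Under this assumed grader, guessing never scores worse than abstaining, even when the model is very unlikely to be right, so a model tuned to maximize the score has no reason to hold back.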

The team shows that hallucinations, meaning plausible but incorrect answers, follow the same statistical rules as ordinary classification errors. For instance, if a fact appears only once in the training data, the model is likely to get it wrong later. In one test, even leading models gave several different wrong birthdays for one of the paper's authors rather than saying they did not know. This shows how the push to answer outweighs the push to pause.

Proposed Fix and What It Means for Trust

The researchers suggest that the fix lies in how answers are scored. They propose a scoring system that awards points for correct answers, deducts points for wrong ones, and gives zero for a clear “I don’t know.” In trials, models that skipped answers more often ended up with fewer errors overall, even though their accuracy rate looked lower on paper.
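A minimal sketch of such a scoring rule follows. The article does not give the exact point values, so the penalty of 1 point per wrong answer, the function name `score_answers`, and the two toy answer sheets are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def score_answers(
    answers: Sequence[Optional[str]],
    truths: Sequence[str],
    wrong_penalty: float = 1.0,  # assumed value; the article does not specify it
) -> dict:
    """Score an answer sheet: +1 if correct, -wrong_penalty if wrong, 0 for None."""
    if len(answers) != len(truths):
        raise ValueError("answers and truths must have the same length")
    correct = wrong = abstained = 0
    for answer, truth in zip(answers, truths):
        if answer is None:  # None stands for a clear "I don't know"
            abstained += 1
        elif answer == truth:
            correct += 1
        else:
            wrong += 1
    total = len(truths)
    return {
        "accuracy": correct / total,
        "error_rate": wrong / total,
        "abstain_rate": abstained / total,
        "score": correct - wrong_penalty * wrong,
    }


# Toy data: one model always answers, the other abstains when unsure.
truths = ["a", "b", "c", "d", "e"]
guesser = score_answers(["a", "b", "c", "x", "y"], truths)
cautious = score_answers(["a", "b", None, None, None], truths)
print("guesser: ", guesser)
print("cautious:", cautious)
```

In this toy case the cautious model has lower accuracy than the guesser but makes no errors, so it comes out ahead once wrong answers cost points.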

For investors and users, the study suggests that AI errors stem more from training and scoring rules than from hidden faults in the models. It also shows that better scoring could build more trust in AI systems used in fields such as finance, health, and law. Trust is central to every AI system: the more users trust a chatbot, the more they are likely to rely on it, and the greater the potential boost to the company’s top line.
