OpenAI has revealed that its latest large language model, GPT-4.5, hallucinates 37 percent of the time on SimpleQA, the company's in-house factuality benchmark. In other words, when answering the benchmark's questions, the model confidently generates inaccurate information more than a third of the time.
The admission raises concerns about the reliability of AI outputs, especially coming from a company valued at hundreds of billions of dollars. OpenAI's other models fare even worse on the same benchmark: GPT-4o hallucinates 61.8 percent of the time, while the smaller o3-mini model posts a staggering 80.3 percent.
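For context on what these percentages measure, a SimpleQA-style score boils down to grading each answer and counting the failures. The sketch below is a minimal illustration of that arithmetic, assuming each answer is graded as correct, incorrect, or not attempted; the grading labels, function name, and sample data are hypothetical and are not OpenAI's actual evaluation pipeline.

```python
# Minimal sketch of how a SimpleQA-style hallucination rate could be
# computed. The grade labels and sample data are illustrative
# assumptions, not OpenAI's actual grading pipeline.
from collections import Counter

def hallucination_rate(grades: list[str]) -> float:
    """Return the share of attempted answers that are wrong.

    Each grade is assumed to be one of:
    "correct", "incorrect", or "not_attempted".
    """
    counts = Counter(grades)
    attempted = counts["correct"] + counts["incorrect"]
    if attempted == 0:
        return 0.0
    return counts["incorrect"] / attempted

# Hypothetical per-question grades from a small evaluation run.
grades = ["correct", "incorrect", "not_attempted", "incorrect", "correct"]
print(f"Hallucination rate: {hallucination_rate(grades):.1%}")  # 50.0%
```

Note that on this definition, declining to answer lowers the hallucination rate without adding any correct answers, which is one reason a benchmark score and real-world reliability can diverge.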
AI hallucination is not unique to OpenAI. Research suggests that even the best models generate factually accurate text only about 35 percent of the time, and experts caution that users should treat AI-generated content skeptically given inaccuracies of this magnitude.
The situation underscores the challenge facing the AI industry as it strives to build systems that rival human intelligence. With its models plateauing in performance, OpenAI may need genuine breakthroughs, not incremental improvements, to hold its leading position in the market.
For more information, visit the original article on Futurism.