Whisper, OpenAI's audio transcription tool, has a hallucination problem, according to recent research.

A recent story by the Associated Press has raised concerns among software engineers, developers, and academic researchers about the accuracy of transcriptions produced by OpenAI's Whisper.

The propensity of generative AI to "hallucinate," or fabricate information, has been debated at length. But it is somewhat surprising to see the problem surface in transcription, where the output is expected to closely track the audio being transcribed.

According to the Associated Press, researchers found that Whisper transcriptions have included everything from unprompted racial commentary to fabricated medical treatments. That poses a significant risk as Whisper is adopted in sensitive settings such as healthcare.

A study by a researcher at the University of Michigan found hallucinations in 80% of audio transcriptions from public meetings. A machine learning engineer who reviewed over 100 hours of Whisper transcriptions found hallucinations in more than half of them. And a developer reported finding hallucinations in nearly all of the 26,000 transcriptions he created with Whisper.

A spokesperson for OpenAI said the company is continually working to improve the accuracy of its models and reduce hallucinations, and emphasized that its policies prohibit using Whisper in high-stakes decision-making contexts.

The spokesperson also thanked the researchers for sharing their findings.
