Outlook Business

OpenAI Launches Whisper API for Multilingual Speech-to-Text Transcription

Despite its limitations, the AI-powered language learning app Speak is already using the Whisper API to power a new in-app virtual speaking companion

Outlook Start-Up Desk

POSTED ON March 04, 2023 11:53 AM

OpenAI has launched Whisper API, a hosted version of its open-source Whisper speech-to-text model, which was released in September 2022. Priced at $0.006 per minute, the API provides automatic speech recognition in multiple languages, as well as translation from those languages into English.
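To put the per-minute price in context, the short sketch below estimates the cost of transcribing a recording of a given length. It assumes the charge scales linearly with audio duration; the article does not say how partial minutes are billed or rounded, so the function name and the proration are illustrative only.

```python
# Rate quoted in the article, in US dollars per minute of audio.
PRICE_PER_MINUTE_USD = 0.006


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the transcription cost for a recording.

    Assumption: cost is proportional to duration. Actual billing
    granularity and rounding are not described in the article.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return round(duration_seconds / 60 * PRICE_PER_MINUTE_USD, 6)


if __name__ == "__main__":
    # A one-hour recording at the quoted rate.
    print(f"${estimate_cost_usd(3600):.2f}")
```

Under that assumption, an hour of audio comes to roughly 36 cents.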

OpenAI claims the system allows for “robust” transcription across languages, even in the presence of unusual accents, background noise and technical jargon. It accepts various file formats including M4A, MP3, MP4, MPEG, MPGA, WAV and WEBM. The system has been trained on 680,000 hours of multilingual and “multitask” data gathered from the web, which OpenAI says has improved its recognition of accents and technical jargon. The Whisper API is optimised for convenience and speed.
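A developer might want to check a file against the formats listed above before uploading it. The following is a minimal sketch of such a check. The helper name is hypothetical, and it looks only at the file extension, not at the actual contents of the file, so it is a convenience filter rather than a guarantee that the service will accept the upload.

```python
from pathlib import Path

# File formats named in the article, as lowercase extensions.
ACCEPTED_EXTENSIONS = {"m4a", "mp3", "mp4", "mpeg", "mpga", "wav", "webm"}


def is_accepted_format(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    Assumption: the extension reflects the real format. This does not
    inspect the file's contents.
    """
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in ACCEPTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("interview.MP3", "lecture.webm", "notes.flac"):
        print(name, is_accepted_format(name))
```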

Enterprise adoption of voice transcription technology has been slowed by several barriers, including cost, accuracy and difficulty recognising accents and dialects. OpenAI acknowledges that Whisper has limitations of its own. Because the model works by predicting the next word in the audio recording, it might transcribe words that were not spoken. Additionally, the system does not perform equally well across languages and has a higher error rate for underrepresented languages.
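The article does not say which metric lies behind the "error rate" it mentions. A common choice in speech recognition is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below assumes that metric and uses a simple whitespace tokeniser, which is itself a simplification for many languages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute word error rate via word-level edit distance.

    Assumption: words are separated by whitespace and compared
    case-insensitively. Real evaluations usually normalise text further.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the first i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # An extra word that was never spoken counts as one insertion.
    print(word_error_rate("the cat sat on the mat",
                          "the cat sat on the mat today"))
```

In this example the transcript contains one word that was not in the reference, the kind of error the next-word-prediction limitation can produce, and that single insertion is measured against the six reference words.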

OpenAI sees Whisper’s transcription capabilities as being useful for improving existing products and tools. The AI-powered language learning app Speak is already using the Whisper API to power a new in-app virtual speaking companion. If OpenAI is able to break into the speech-to-text market, it could be highly profitable for the Microsoft-backed company. The segment is projected to be worth $5.4bn by 2026, up from $2.2bn in 2021.

OpenAI’s goal is to become a “universal intelligence” that can take in any kind of data and perform any task. While the company acknowledges the issue of bias in speech recognition systems, it is optimistic about the potential for Whisper and believes that it will be a force multiplier for attention.
