OpenAI introduces ChatGPT-4o with enhanced Version Of OpenAI. OpenAI CTO Mira Murati announced this model during a live steaming event on Monday.
GPT-4o is a step towards much more natural human-computer interaction. It accepts any combination of text, audio, image, and video as inputs and produces any combination of text, audio, and image outputs. It can reply to audio inputs in 232 milliseconds, with an average of 320 milliseconds, which is comparable to human response time in a conversation. It matches GPT-4 Turbo’s performance on English and code text, with a large improvement on non-English text, while also being significantly faster and 50% cheaper in the API. GPT-4o outperforms prior models in terms of vision and audio understanding.
What is GPT-4o
GPT-4o is OpenAI’s new flagship model, capable of reasoning across voice, vision, and text in real time. It is designed to be freely accessible to all users. GPT-4o is now available to anyone with an OpenAI API account and may be used in the Chat Completions API, Assistants API, and Batch API. This model also supports function calls and JSON mode, and you can get started using the Playground.
ALSO READ: World AIDS Vaccine Day 2024: History And Importance
Prior to GPT-4o, you could communicate with ChatGPT via Voice Mode with average latencies of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). To do this, Voice Mode consists of three different models: one simple model transcribes audio to text, GPT-3.5 or GPT-4 takes in text and outputs text, and a third simple model turns that text back to audio. This process causes GPT-4, the primary source of intelligence, to lose a lot of information—it cannot directly perceive tone, many speakers, or background noises, nor can it emit laughter, singing, or express emotion.
Using GPT-4o, we trained a single new model end-to-end spanning text, vision, and audio, which means that the same neural network processes all inputs and outputs. Because GPT-4o is our first model to include all of these modalities, we are only scratching the surface of what it can do and its limitations.
ALSO READ: Could You Identify The Quiet Signs Of Stress
Click here, to check out HNN’s latest post.
Image credit: Google