OpenAI unveils new artificial intelligence model named GPT-4o
OpenAI introduced its latest model, GPT-4o, showcasing significant improvements in speed and versatility. Dubbed "omni" for its ability to process text, audio, and image inputs and generate outputs in the same modalities, GPT-4o marks a major step towards natural human-computer interaction.
- Tech
- Anadolu Agency
- Published Date: 07:06 | 14 May 2024
- Modified Date: 07:06 | 14 May 2024
American artificial intelligence (AI) company OpenAI unveiled its new model, GPT-4o, on Monday, which is much faster than its previous models.
GPT-4o, in which the "o" stands for "omni," is a step towards more natural human-computer interaction: it accepts any combination of text, audio, and image as input and generates any combination of text, audio, and image as output, the company said in a statement.
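For developers, a minimal sketch of what a text request to GPT-4o looks like through OpenAI's official Python SDK is shown below. This example is not from the company's statement; it assumes the `openai` package is installed and an API key is set in the environment.

```python
# Minimal sketch: a text-only request to GPT-4o via OpenAI's Python SDK.
# Assumes the `openai` package is installed and the OPENAI_API_KEY
# environment variable is set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Summarize multimodal AI in one sentence."},
    ],
)

print(response.choices[0].message.content)
```

The same endpoint also accepts image inputs alongside text; audio input and output are exposed through separate, dedicated interfaces.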
"It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation," it added.
In addition, GPT-4o is better at vision and audio understanding than existing models and can reason across audio, vision, and text in real time, according to the company.
While GPT-4 loses a lot of information because it cannot directly observe tone, multiple speakers, or background noise, and it cannot output laughter, singing, or expressed emotion, GPT-4o processes all inputs and outputs with the same neural network.
Microsoft-backed OpenAI said GPT-4o has also undergone extensive external red teaming with more than 70 experts in domains such as social psychology, bias and fairness, and misinformation to identify risks introduced by the newly added modalities.