By Anadolu Agency
May 14, 2024, 6:43 am
ISTANBUL
American artificial intelligence (AI) company OpenAI on Monday unveiled its new model, GPT-4o, which is much faster than its previous models.
GPT-4o, in which the "o" stands for "omni," is a step towards more natural human-computer interaction, as it accepts any combination of text, audio, and image as input and generates any combination of text, audio, and image as output, the company said in a statement.
“It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation,” it added.
In addition, GPT-4o is better at vision and audio understanding compared to existing models, while it can reason across audio, vision, and text in real time, according to the company.
While GPT-4 loses a lot of information because it cannot directly observe tone, multiple speakers, or background noise, and cannot output laughter, singing, or expressions of emotion, GPT-4o processes all inputs and outputs with the same neural network.
Microsoft-backed OpenAI said GPT-4o has also undergone extensive red teaming with more than 70 external experts in domains such as social psychology, bias and fairness, and misinformation, to identify risks introduced by the newly added modalities.