OpenAI recently announced the launch of its newest large language model, GPT-4o. The company describes GPT-4o as its fastest and most capable model yet, and it is expected to improve ChatGPT’s language abilities and make the chatbot easier to use. Access to OpenAI’s previous flagship model, GPT-4, required a paid subscription; GPT-4o, by contrast, is available to all users free of charge.
What is GPT-4o?
GPT-4o, where the “o” stands for “omni”, has been described as a significant breakthrough in how AI systems interact with humans and computers. Unlike previous models, it is natively multimodal: it can accept input in any combination of text, audio, and images, and respond in the same three formats. Describing the new model, OpenAI CTO Mira Murati emphasized the significant leap in ease of use it represents.
GPT-4o interacts via text and vision, so it can assess and discuss screenshots, photos, documents, or diagrams that users upload. According to OpenAI, the new ChatGPT model will also have more extensive memory capabilities and will learn from past interactions with users.
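For developers, these same image-and-text conversations are exposed through OpenAI’s Chat Completions API. The sketch below builds a multimodal GPT-4o request as a plain dictionary so its shape is easy to inspect without a network call; the helper function name and the image URL are illustrative, and actually sending the request would require the `openai` SDK and an API key.

```python
def build_image_question(prompt: str, image_url: str) -> dict:
    """Return a GPT-4o chat request asking a question about an image.

    The dict mirrors the message format accepted by OpenAI's Chat
    Completions API, where one user message can mix text parts and
    image parts. The URL below is a placeholder, not a real resource.
    """
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    # A text part carrying the user's question
                    {"type": "text", "text": prompt},
                    # An image part pointing at the uploaded picture
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


request = build_image_question(
    "What does this diagram show?",
    "https://example.com/diagram.png",  # placeholder URL
)
print(request["model"])  # gpt-4o
```

In practice this dictionary would be passed to the SDK’s chat-completion call; building it separately keeps the multimodal message structure visible at a glance.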
Technology Behind GPT-4o
AI chatbots are built on large language models (LLMs), which learn from vast data sets. Unlike prior versions, which required separate models to be trained for different tasks, GPT-4o was designed for multimodality from the start: a single model is trained end to end across text, vision, and audio.
Features and Abilities
It is fast and efficient: it can respond to audio input in as little as 232 milliseconds, with an average of 320 milliseconds, comparable to human response time in conversation, and it is constantly available. Multi-language support has also been expanded, improving its ability to function in languages other than English.
Availability
Text and image capabilities began rolling out in ChatGPT at launch. Audio capabilities will follow, with a new Voice Mode arriving in alpha for ChatGPT Plus users in the coming weeks, and video capabilities are planned for a later, larger-scale rollout.
Limitations and Safety Concerns
Even at launch, audio output will be limited in capability and restricted to a selection of preset voices. This cautious approach reduces the surface area that safety and usage evaluations must cover. OpenAI has taken significant precautions to evaluate risks including cybersecurity, misinformation, and bias. While GPT-4o is currently assessed as posing no more than Medium-level risk in these areas, ongoing efforts are underway to identify and mitigate emerging risks.