AI companies are now quietly trying out experimental models on the LMSYS Chatbot Arena, deploying them under unusual names and without release notes. Over the past few days, many users have noticed that ChatGPT has become better at coding and creative tasks. The speculation is that a new OpenAI model (perhaps something on the level of Project Strawberry, which is rumored to be an advanced reasoning engine) was behind these improvements.
More About the New Major Update in ChatGPT
OpenAI has now officially confirmed that it released a new model in ChatGPT, which is an updated iteration of GPT-4o and not on the same scale as its frontier models. According to the release note, this new model, called chatgpt-4o-latest, has been optimized for chat and fine-tuned using qualitative feedback and experimental results.
The organization is also working to improve the training dataset by scrubbing bad data and adding more good information, as well as testing out new research methods. This resembles the post-training approach that, according to the rumors, Project Strawberry uses to boost reasoning, so the new ChatGPT model could arguably already be using this engine.
Indeed, numerous X users attest that ChatGPT now uses multi-step reasoning to get the correct answers. This method involves generating a rationale for each step of an answer until the model arrives at the correct solution.
OpenAI tested the new ChatGPT model on LMSYS under the label “anonymous chatbot,” where it received more than 11,000 votes. chatgpt-4o-latest now holds the top spot, overtaking all AI models from Google, Anthropic, and Meta and scoring 1314 on the LMSYS Arena for the first time. In tests of the updated ChatGPT model using reasoning prompts, however, there was little noticeable difference from the older version.
For instance, when asked to compare 9.11 and 9.9, it correctly identified the larger number, just as before. Other commonsense reasoning questions produced similar results. However, the model still struggles with certain prompts. For example, when asked how to stack a book, 9 eggs, a laptop, a bottle, and a nail, it suggested placing the 9 eggs on top of the bottle, which is impossible. In another test, it incorrectly stated that the word “strawberry” contains only two “R”s, when it actually contains three.
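For readers who want to check a model's answers to these two prompts themselves, here is a minimal Python sketch. The helper names `larger` and `count_letter` are made up for this illustration and have nothing to do with how ChatGPT works internally; they simply compute the ground truth.

```python
from decimal import Decimal


def larger(a: str, b: str) -> str:
    """Return whichever of two decimal strings is numerically larger."""
    # Decimal compares the values as numbers, so "9.9" beats "9.11"
    # even though "11" looks bigger than "9" after the decimal point.
    return a if Decimal(a) > Decimal(b) else b


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(larger("9.11", "9.9"))            # 9.9
print(count_letter("strawberry", "r"))  # 3
```

Comparing a chatbot's reply against these values is a quick way to see whether it still falls for the same traps.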
It’s possible that the new ChatGPT model has not yet been fully rolled out. Regardless, further improvements in key areas are expected with OpenAI’s new model. Feel free to share any questions in the comments.
FAQs
What’s new in the latest ChatGPT update?
The update introduces an improved GPT-4o model optimized for chat performance.
Is the new ChatGPT model using Project Strawberry?
It is speculated, but not confirmed, that the model uses the Project Strawberry engine.