‘Her’ Is Here: Everything You Need To Know About GPT-4o, Announced By OpenAI This Morning



Remember the 2013 movie Her about a lonely writer played by Joaquin Phoenix, who falls in love with his AI assistant, voiced by Scarlett Johansson? Well, reality just took a major step closer to that fiction with today’s announcement of GPT-4o from OpenAI.

At 3am Tuesday AEST (10am Monday Pacific Time), OpenAI Chief Technology Officer Mira Murati took to the stage at the OpenAI offices for a live presentation of the new GPT-4o model, which will be free for all users.

The o stands for omni, and that’s exactly what this new AI model is: an omni-talented AI that can seamlessly interact using audio, images, and text in real time. Put simply, it can see, hear, and understand language in a way that’s eerily close to human cognition. It can detect the subtleties of your speech and see your slight smirk. It’s basically Her. And OpenAI CEO Sam Altman is leaning in.

Imagine having a conversation with your computer or smartphone and it responding verbally, naturally, while also understanding the photos you show it, all with human-like speed and fluency. That’s the promise of GPT-4o.

Planning a trip to Italy but don’t speak Italian? Non c’è problema. GPT-4o can understand and communicate in multiple languages with near-human fluency, thanks to its real-time translation capabilities. It can even look at an image that contains text in a different language, and make sense of it all. With GPT-4o, language barriers could soon be a thing of the past. 

Staying on language, this morning’s announcement has major implications for non-English-speaking users of ChatGPT, with GPT-4o marking a significant improvement in its understanding and generation of non-English languages.

But GPT-4o’s understanding goes beyond language. It doesn’t just comprehend the images you show it; it can analyse them in depth, picking out details, recognising objects and scenes, and even drawing insights. Don’t know what to wear on your date? You now have an AI companion with you to provide commentary on each outfit in real time, and it won’t ever lose patience. Great news for the group chat that was just about to block you.

Because GPT-4o can work with text, audio, and images simultaneously, the potential uses are nearly limitless. It could be a tutor that walks you through complex concepts with a combination of verbal explanations, written examples, and illustrative diagrams. It could be a design assistant that collaborates with you on visual projects in real time. It could even be a composer that writes music to accompany your lyrics. The thing can even sing.

And when you ask GPT-4o a question out loud, it responds in about a third of a second, roughly the same amount of time it takes a human to reply. This near-instantaneous response will make interacting with GPT-4o feel less like using a computer and more like having a conversation with a friend. A friend with the power of all human knowledge at its metaphorical fingertips, but a friend nonetheless.

Perhaps what’s most extraordinary about GPT-4o is how it combines all these capabilities to enable entirely new ways of interacting with AI. Altman was right — the demos released by OpenAI this morning do seem like magic. 

This morning provided a glimpse into a future where computers aren’t just tools we use, but intelligent companions we converse and collaborate with. A future where accessing the vast knowledge and capabilities of AI is as easy as having a chat.

As for when you can get your hands on it, GPT-4o is already being rolled out to ChatGPT users, with access on the free tier and higher message limits for Plus subscribers. Developers can start integrating GPT-4o into their applications via the API, with extended features like real-time audio and visual processing expected to be available soon.

Of course, as with any powerful new technology, GPT-4o raises important questions about safety and responsible use. OpenAI seems to be taking this seriously. The company said in today’s announcement that it intends to take a cautious approach to the model’s release, gradually rolling out features while continuing to test and refine its safeguards. 

Even with a careful rollout, GPT-4o is set to make waves. It represents a major leap forward in AI’s ability to understand and interact with the world the way humans do. As developers begin integrating GPT-4o into applications and services, we can expect to see a surge of innovation in areas including education, health care, and entertainment.

Our devices are about to become a whole lot more helpful, intelligent, and easy to talk to. The age of truly conversational AI has arrived. 

Remember this day — it’s one for the history books.
