GPT-4o: What Is It, How to Use It, And Is It Free?


This article is for general information purposes only and isn’t intended to be financial product advice. You should always obtain your own independent advice before making any financial decisions. The Chainsaw and its contributors aren’t liable for any decisions based on this content.



OpenAI has once again pushed the boundaries of artificial intelligence with the release of GPT-4o, its most advanced and capable model to date. This multimodal marvel has the potential to revolutionise a huge number of industries and change the way we interact with AI.

But what exactly is GPT-4o, how does it work, and can you access it without breaking the bank? Let’s take a closer look.

What is GPT-4o?

GPT-4o, short for Generative Pre-trained Transformer 4 Omni, is OpenAI’s latest AI model, and it shatters the boundaries of previous text-based AI interactions. This isn’t just another AI chatbot — it’s the first of its kind to truly bring us closer to the world envisioned in movies like Her. Stuff that genuinely seemed like science fiction just a few years ago is now here.

GPT-4o is multimodal, meaning it can process and generate both text and images. But the real game-changer is its ability to interact through voice and even video, opening up a whole new dimension of communication with AI. 

Imagine having a natural conversation with your AI assistant, where it not only understands your words but also your tone of voice and facial expressions. It can see what you see, hear what you hear, and respond in kind, creating a truly immersive and personalised experience.

This isn’t just about convenience or novelty — it’s a fundamental shift in how we interact with technology. 

When was GPT-4o released?

GPT-4o was officially unveiled by OpenAI’s CTO, Mira Murati, during a live-streamed demo on May 13, 2024. In a move that delighted AI and tech nerds worldwide, OpenAI made GPT-4o available to the public on the same day, allowing users to experience some of its capabilities first-hand, although not all the features were made available on day dot. Most notably, the Her-like voice chat feature is yet to be rolled out.

How to use GPT-4o

Ready to harness the power of GPT-4o? Here’s a step-by-step guide to get you started:

Step 1: Sign in to ChatGPT

Head over to the ChatGPT website or download the app, and sign into your account. If you don’t have one, simply sign up and join the GPT-4o revolution.

Step 2: Download the ChatGPT macOS app (optional)

For all you Apple freaks out there, OpenAI has launched a sleek macOS app that seamlessly integrates GPT-4o. 

Step 3: Check your model choices

Once you’re in, look for the drop-down menu at the top of the screen. If you’re one of the lucky ones with access to GPT-4o (people worldwide are getting staggered entry, and there is no confirmed timeline for a full rollout at the time of writing), it should be listed among your model choices. On mobile, you’ll see “ChatGPT 4o” in the navigation bar if it’s available to you.

Step 4: Start chatting

If GPT-4o is at your disposal, dive right in and start chatting away. Keep in mind there are rate limits, especially for free plan users, so make each interaction count. According to OpenAI, ChatGPT Plus subscribers can send up to 80 GPT-4o messages every three hours, with free users getting a smaller allowance, and these limits may be tightened during peak hours to maintain service availability.
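To make the rate limit concrete, here’s a toy sliding-window limiter in the spirit of the cap described above (80 messages per three-hour window). This is our own illustrative sketch, not OpenAI’s actual enforcement mechanism:

```python
import time
from collections import deque

# Toy sliding-window rate limiter, illustrating a cap like
# "80 messages every 3 hours". Not OpenAI's real implementation.
class SlidingWindowLimiter:
    def __init__(self, max_messages: int = 80, window_seconds: float = 3 * 3600):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self.timestamps: deque = deque()  # send times of recent messages

    def allow(self, now: float = None) -> bool:
        """Return True and record the message if it fits in the window."""
        now = time.time() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_messages:
            self.timestamps.append(now)
            return True
        return False
```

The key design choice is the sliding window: instead of resetting the count on a fixed schedule, each message "expires" exactly one window-length after it was sent, which is how "80 messages every three hours" is usually interpreted.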

Step 5: Upload files

One of GPT-4o’s coolest features is its ability to analyse uploaded files, including images and documents like PDFs. Free plan users can now enjoy this perk (within their usage limits), so go ahead and feed GPT-4o all the juicy data you want analysed.
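For developers, the same multimodal ability is exposed through OpenAI’s API. As a rough illustration, here’s a sketch of how a request mixing text and an image is structured in the shape of OpenAI’s Chat Completions API; the field layout reflects the public API at the time of writing, and the image URL is a placeholder:

```python
# Sketch of a multimodal chat request body in the general shape of
# OpenAI's Chat Completions API. The image URL is a placeholder.
def build_multimodal_payload(question: str, image_url: str) -> dict:
    """Build a request body asking the model about an image."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                # A single user message can mix text and image parts.
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_multimodal_payload(
    "What's in this picture?",
    "https://example.com/photo.jpg",  # placeholder URL
)
```

Sending this payload (with an API key) would return the model’s answer about the image; building the body separately just makes the structure easy to see.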

Is GPT-4o free?

The good news is that GPT-4o is available to free ChatGPT users, albeit with some limitations. Free users get a lower message cap and tighter limits on advanced features like vision, file uploads, and data analysis. But hey, it’s still an incredible opportunity to experience the power of GPT-4o.

If you want the full GPT-4o experience with far fewer restrictions, you might want to consider upgrading to ChatGPT Plus, priced at US$20 a month (so about AU$30, give or take, depending on the exchange rate). Paid subscribers get much higher usage limits and access to all of GPT-4o’s capabilities, most notably the Her-like voice assistant once it rolls out.

GPT-4o vs. GPT-4

GPT-4 and GPT-4o are both highly advanced language models developed by OpenAI, but there are key differences. GPT-4 is primarily focused on text-based tasks and shines in areas like written content generation, summarisation, and translation. It has a strong grasp of language and can produce high-quality, coherent text across a wide range of domains.

On the other hand, GPT-4o is a multimodal model, so it can process text, visual, and audio inputs. This allows it to perform all the impressive tasks showcased in OpenAI’s demo: real-time conversation, image recognition, the ability to pick up on subtle audio and visual cues, and pretty much anything else that involves ‘hearing’ and ‘seeing’. The ability to work beyond just text makes GPT-4o more versatile and opens up new possibilities for applications requiring the integration of language and vision.

However, the multimodal nature of GPT-4o comes with a trade-off. Due to its additional capabilities, GPT-4o may have slightly lower performance on purely text-based tasks. This is because the model’s capacity is split between processing text and images, whereas GPT-4 can dedicate all its resources to text-based tasks.

How does GPT-4o work?

GPT-4o works by using a complex network of artificial neurons that have been trained on massive amounts of text, images, and audio data. This training process allows it to recognise patterns, understand the relationships between different forms of media, and generate appropriate responses.

Think of it like a giant puzzle:

Input: You give it a piece of the puzzle, like a picture, a sentence, or a sound clip.

Analysis: GPT-4o breaks down that input into smaller pieces, such as the colours in an image, the words in a sentence, or the frequencies in a sound.

Pattern Recognition: It compares these smaller pieces to the millions of examples it has seen (borrowed permanently without recompense) during training to identify patterns and understand what it’s looking at or hearing.

Prediction: Based on these patterns, it predicts the most likely response, whether that’s a description of an image, an answer to a question, or a continuation of a conversation.

Output: Finally, it puts those smaller pieces back together to create a coherent response in the form of text, an image, or even audio.

This process occurs incredibly quickly, allowing GPT-4o to respond in real time to a wide range of requests. The deployed model doesn’t learn mid-conversation, but OpenAI continues to train and refine its models over time, making GPT-4o a powerful tool for understanding and generating different forms of media.
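The puzzle analogy above can be sketched in code. This toy example is emphatically not how GPT-4o works internally; it’s a minimal illustration of the same input → analysis → pattern recognition → prediction → output loop, using a simple bigram word model trained on a few sentences:

```python
from collections import Counter, defaultdict

# Toy bigram model illustrating the pipeline described above.
# A real transformer learns vastly richer patterns; this just shows
# the analysis -> pattern recognition -> prediction loop in miniature.
def train(corpus: list) -> dict:
    """Pattern recognition: count which word tends to follow which."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()  # analysis: break input into pieces
        for a, b in zip(words, words[1:]):
            follows[a][b] += 1
    return follows

def predict_next(follows: dict, word: str):
    """Prediction: return the most common continuation seen in training."""
    options = follows.get(word.lower())
    return options.most_common(1)[0][0] if options else None

corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

The output step here is trivial (a single word), but the shape is the same: break the input into pieces, match those pieces against patterns seen in training, and emit the most likely continuation.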

Main image: Getty