GPT-4: OpenAI’s New Model, Explained

Disclaimer

This article is for general information purposes only and isn’t intended to be financial product advice. You should always obtain your own independent advice before making any financial decisions. The Chainsaw and its contributors aren’t liable for any decisions based on this content.

Yesterday morning, OpenAI released GPT-4, the latest version of the AI model behind its chatbot, ChatGPT.

GPT-4 is a significant step up from ChatGPT: OpenAI says the chatbot “surpasses ChatGPT in its reasoning capabilities.” But what exactly does it do, and how does it differ from GPT-3.5?

Bigger, better, stronger

According to OpenAI, GPT-4 can handle over 25,000 words of text, “allowing for use cases like long form content creation, extended conversations, and document search and analysis”, as in this example OpenAI shared about Rihanna’s Super Bowl performance.

Screenshot of GPT-4 explaining why Rihanna’s Super Bowl performance was special. Source: OpenAI

GPT-3.5, the version that powered the ‘OG’ ChatGPT, could only handle around 8,000 words of text. That means the latest version can handle roughly three times as much.

Moments before GPT-4 went live, OpenAI CEO Sam Altman shared a cheeky tweet hinting at the new model’s launch. “Excited 4 today,” he wrote. Get it? 4, as in GPT-4?

GPT-4 is a smart student

GPT-4 is described as a “large multimodal model”. This means it can process information in different formats, including text and images, as we explain below.

Since ChatGPT’s launch, researchers have put it through medical school exams, the bar exam, and an MBA exam to test its intelligence. It didn’t disappoint, and nearly passed all of them.

OpenAI is seemingly aware of this and similarly tested GPT-4’s performance in simulated exams. The team behind it spent six months “iteratively aligning GPT-4 using lessons from our adversarial testing program” — this means they spent time teaching GPT-4 how to follow instructions more faithfully. The outcome, OpenAI says, is “best-ever results (although far from perfect) on factuality, steerability, and refusing to go outside of guardrails.”

In other words, it’s a good student. Have a look at how it scored in exams such as the AP and SAT tests compared with GPT-3.5:

Screenshot of GPT-4’s performance across different simulated exams. Source: OpenAI

From the graph above, you can see that GPT-4 landed in roughly the top 10 per cent of test takers in the GRE Verbal test, a section of an admissions exam used by graduate programs worldwide, including many MBA programs. GPT-3.5, in comparison, scored around the 60th percentile.

GPT-4 also “passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%.”

Try uploading images 

You can probably forget about AI art generators like Stable Diffusion and Midjourney for now, because GPT-4 includes visual input. Users can upload an image to GPT-4 alongside a query, and GPT-4 will produce “captions, classifications, and analyses.”

Screenshot of GPT-4’s visual input feature. Source: OpenAI

For example, upload a funny photo and ask GPT-4 what it shows, and it will first describe the photo, then identify which part of it is funny:

ChatGPT understands some humour. Source: OpenAI

GPT-4’s impressive feats (so far)

In the 48 hours immediately following launch, users had already used GPT-4 to pull off some impressive tasks. For example, a director at crypto exchange Coinbase dumped code for an Ethereum smart contract into GPT-4, and the chatbot flagged a number of problems, including that the contract is a Ponzi scheme and that it’s not secure enough to protect itself against hacks.

The boss of a company that provides legal services via a chatbot shared that he got his “new hire”, GPT-4, to draft a lawsuit. Joshua Browder, the CEO of DoNotPay, says the company plans to use GPT-4 to offer customers “one-click lawsuits” to sue robocallers.

Browder says “GPT-3.5 was not good enough, but GPT-4 handles the job extremely well”, as shown in a video he shared. Would you consult an AI lawyer?

A design manager at financial products company Brex, Ammaar Reshi, even got it to write code for the classic Snake game – game devs, beware!

Does it have limitations?

Of course. OpenAI’s researchers stressed in their technical report that although GPT-4 is a significant upgrade on its predecessors, it can still “suffer from ‘hallucinations’”:

This tendency can be particularly harmful as models become increasingly convincing and believable, leading to overreliance on them by users… Counterintuitively, hallucinations can become more dangerous as models become more truthful, as users build trust in the model when it provides truthful information in areas where they have some familiarity.

OpenAI

Presumably, they mean that it remains capable of making shit up, including crucial medical information.

How to use GPT-4

Wanna hop on and try it? Sure, but it’ll cost you: only users subscribed to ChatGPT Plus – OpenAI’s monthly subscription service – currently have access to GPT-4. That service sets users back US$20 (~AU$30) per month.

But is paying for an AI chatbot assistant worth it? We discussed it here.

Developers who want to access GPT-4 through the API can join a waitlist.