OpenAI's 12 Days Of Shipmas Ends With A Bang

o3 and o3-mini announced

In partnership with Fellow

We are now a community of 232! Thank you❤️

This newsletter is free and I don’t use paid advertising. I rely entirely on organic growth: readers who like my content and share it.

So, if you like today’s edition, please take a moment to share this newsletter on social media or forward this email to someone you know.

If this email was forwarded to you, you can subscribe here.

If you want to create a newsletter with Beehiiv, you can sign up here.

OpenAI has announced the o3 family of models, which it claims is the closest thing yet to Artificial General Intelligence (AGI): the hypothetical intelligence of a machine that can understand or learn any intellectual task a human being can.

And, in case you are wondering, yes, the last model family from OpenAI was called o1, not o2. OpenAI skipped o2 altogether to avoid a trademark conflict with the British telecom provider O2.

Neither o3 nor o3-mini is publicly available yet. A preview of o3-mini can be accessed by safety researchers. OpenAI plans to make o3-mini public in January, with o3 to follow some time later.

Something to keep in mind is that these are models with reasoning capabilities. Researchers have already warned that o1 sometimes tries to deceive humans, and there are fears that o3 might do so even more.

Sam Altman previously said that he would prefer a federal testing framework to be in place before releasing reasoning models. Well, I guess he didn’t listen to himself: at least o3-mini is on track for a January release.

Reasoning models can fact-check themselves. This increases latency, but it also makes them more reliable in fields such as mathematics and physics.

Users will also get the ability to select how much compute the model spends, i.e. its ‘thinking time.’ The longer the model thinks, the more accurate its results can be.

This does not mean that hallucinations are completely eliminated, but they should be reduced.
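
To make the ‘thinking time’ idea concrete, here is a minimal sketch using the OpenAI Python SDK. It assumes that o3-mini, once public, will accept the same reasoning_effort parameter OpenAI already exposes for its o1 reasoning models; the model name and the prompt are placeholders.

```python
# Minimal sketch: dialing up "thinking time" on a reasoning model.
# Assumption: o3-mini (not yet public) will accept the reasoning_effort
# parameter that OpenAI already exposes for its o1 models.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3-mini",          # hypothetical: o3-mini is not publicly available yet
    reasoning_effort="high",  # "low" / "medium" / "high": more effort means
                              # higher latency but typically better accuracy
    messages=[
        {"role": "user", "content": "Prove that the square root of 2 is irrational."},
    ],
)

print(response.choices[0].message.content)
```

The trade-off is exactly the one described above: "high" buys more self-checking at the cost of latency, while "low" answers faster with less deliberation.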

On ARC-AGI, a benchmark designed to evaluate whether an AI system can efficiently acquire new skills outside the data it was trained on, o3 achieved a score of 87.5% on the high-compute setting. Even at its worst (on the low-compute setting), the model tripled the performance of o1.

The sudden flurry of reasoning models from companies such as DeepSeek, Alibaba, and Google can be attributed to the fact that the brute-force techniques previously used to scale up AI models are reaching the point of diminishing returns.

Interestingly, the release of o3 comes as one of OpenAI’s most accomplished scientists departs. Alec Radford, the lead author of the academic paper that kicked off OpenAI’s “GPT series” of generative AI models (that is, GPT-3, GPT-4, and so on), announced this week that he’s leaving to pursue independent research.

Automate your meeting notes

Get the most accurate and secure meeting transcripts, summaries, and action items.

Never take meeting notes again: Fellow auto-joins your Zoom, Google Meet, and MS Teams meetings to automatically take notes.

Condense 1-hour meetings into one-page recaps: See highlights like action items, decisions, and key topics discussed.

Claim 90 days of unlimited AI notes today.

Did you like today’s newsletter? Feel free to reply to this mail.

This newsletter is free but you can support me here.

I’d be happy to connect on Medium, LinkedIn and X.
