OpenAI Announces New Open‑Weight AI Models
OpenAI dropped a bombshell on Tuesday, unveiling two brand‑new open‑weight AI models that it says approach the capabilities of its o‑series reasoning models. The company calls them “state of the art” when measured against common benchmarks for open models.
Meet the Two Sizes
- gpt‑oss‑120b – A powerhouse capable of running on a single Nvidia GPU.
- gpt‑oss‑20b – A lightweight model that can run on a consumer laptop with 16 GB of RAM.
Both are free to download from the Hugging Face hub, ready to drop into your projects.
It’s Been a Long Time Coming
This marks OpenAI’s first “open” language model since GPT‑2, released more than five years ago. The company’s early days were defined by openness, but in recent years it has leaned on closed, proprietary models that fuel its profitable API business.
Open Models With a Cloud Fallback
But there’s an interesting twist: if the open model can’t handle a demanding task (say, processing an image), developers can hook it up to one of OpenAI’s more capable closed models hosted in the cloud. It’s like having a versatile Swiss‑army knife that can call in a bigger tool when needed.
Why the Shift?
CEO Sam Altman has admitted the company has been “on the wrong side of history” with its closed‑source strategy. The latest move also comes amid mounting pressure from rival Chinese labs such as DeepSeek, Alibaba’s Qwen, and Moonshot AI, which have been rolling out highly capable open models. Meta’s Llama line, once the open‑model darling, has struggled to keep pace in recent years.
Political Playbook
In July, the Trump administration urged U.S. AI developers to open source more of their technology to promote the global spread of AI aligned with American values. So while the industry remains in a tug‑of‑war between commercial interests and open innovation, OpenAI is now nudging along a new wave of openness.
Tech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital, Elad Gil — just a few of the heavy hitters joining the Disrupt 2025 agenda. They’re here to deliver the insights that fuel startup growth and sharpen your edge. Don’t miss the 20th anniversary of TechCrunch Disrupt, and a chance to learn from the top voices in tech — grab your ticket now and save up to $600+ before prices rise.
OpenAI Drops GPT‑OSS – A New Tune for Developers & Politicians
Why GPT‑OSS Matters
- Built in the U.S., the new open‑weight stack is pitched as carrying democratic values.
- It gives developers a free, ready‑to‑use foundation that can level the playing field with emerging rivals.
- It’s also a nod to policymakers watching China’s AI surge, positioning American open models as the ones to build on.
Altman’s Take
When OpenAI was founded in 2015, CEO Sam Altman and his co‑founders framed the mission as building AGI that benefits everyone. In announcing the release, Altman said the company is excited for the world to build on an open AI stack created in the United States, based on democratic values, and available for free.
What’s Next?
With GPT‑OSS officially live, the horizon looks expansive. Developers will get the tools to innovate, while policymakers might find a new ally in ensuring our future AI is as inclusive as it is powerful.

Image Credits
- Photo: Tomohiro Ohsumi via Getty Images
How the models performed
Why the Buzz Around Open Models?
OpenAI set out with a bold goal: make its open‑weight lineup the best‑performing of all open AI models. The company says it hit that goal, and the benchmark numbers back up at least some of the hype.
Codeforces Showdown – The Numbers in a Nutshell
- gpt‑oss‑120b – 2622 points
- gpt‑oss‑20b – 2516 points
- Both scores beat DeepSeek’s R1 but still trail OpenAI’s own o3 and o4‑mini.
What That Means for Us
Even if they’re not topping every contender, OpenAI’s open models are carving out a respectable lane. For developers wrestling with code, these numbers show the open lineup is genuinely competitive, though the closed frontier models still lead.
Takeaway
OpenAI’s open‑weight models have taken a sturdy step ahead of several rivals, even if they’re still chasing the leaders of the pack. Keep an eye on how the leaderboard shifts as new open models arrive.

OpenAI’s Open Models Take on Codeforces (and Face the HLE)
When it came time to put OpenAI’s open models through a harder test, the results were more mixed. gpt‑oss‑120b and gpt‑oss‑20b took on Humanity’s Last Exam (HLE), a challenging set of crowdsourced questions spanning many subjects, scoring 19% and 17.3%, respectively.
How Do They Stack Up?
- Behind o3: OpenAI’s o3 still pulls ahead, a small but noteworthy gap.
- Ahead of open rivals: Against leading open models from DeepSeek and Qwen, OpenAI’s open models come out on top.
What This Means for the Future
While the scores don’t scream “superhuman,” beating some of the biggest names in open AI shows promise. It’s a sign OpenAI’s open models are competitive out of the gate, and these benchmarks are only the first round.
Takeaway
If OpenAI keeps refining its open models, future HLE runs could look considerably better. For now, the wider community has a strong new baseline to build on.

OpenAI’s Open Models Are Short on Facts
When it comes to sticking to the facts, the new open models fall short of OpenAI’s latest reasoning models. o3 and o4‑mini keep their hallucinations comparatively in check, while the open‑weight models are noticeably more creative with the truth.
What’s Going Wrong?
OpenAI’s own white paper admits that smaller models simply don’t know everything. With fewer parameters, they don’t see the world as clearly and end up “hallucinating” more.
Hallucination Numbers That Shock
- gpt-oss-120b hallucinated in response to 49% of questions on PersonQA, OpenAI’s in‑house benchmark of questions about people.
- gpt-oss-20b fared worse, hallucinating 53% of the time.
- By comparison, the older o1 model hallucinated on just 16% of PersonQA questions.
- Even o4‑mini did better, at 36%.
Why the Gap?
With fewer parameters, the open models simply carry less world knowledge, and that gap shows up as more made‑up answers. OpenAI notes this outcome is expected for smaller models.
Bottom Line
OpenAI’s data suggests its closed reasoning models do a better job of staying grounded in reality. The open‑weight variants are still useful for many applications, but remember: they’re more likely to spin up a story than a fact.
Image Credits: OpenAI
Training the new models
OpenAI’s Open Models: A Fresh Take on AI Power
What’s the Big Deal?
OpenAI has rolled out two open‑weight models, gpt-oss-120b and gpt-oss-20b, under the permissive Apache 2.0 license. That means companies can build on and monetize these models without paying a license fee or asking for special permission. Pretty sweet, right?
Under the Hood
- Mixture‑of‑Experts (MoE): Instead of activating all 117 billion parameters, gpt-oss-120b routes each token through a subset of expert networks, activating only about 5.1 billion parameters per token. Think of it as a smart dispatcher that only calls in the specialists a given token needs.
- High‑Compute Reinforcement Learning: After pre‑training, OpenAI post‑trained the models in simulated environments on large clusters of Nvidia GPUs, a reality check that sharpens their behavior.
- Chain‑of‑Thought: The models take extra time and compute to work through complex answers step by step, which also helps when they need to call tools such as web search or Python code execution. They’re the kind of calculator that says, “let me think for a minute,” before answering.
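The MoE idea above can be sketched in a few lines: a router scores every expert for each token, but only the top few actually run, so most parameters stay idle. This is a toy illustration in plain Python with made‑up sizes (8 experts, top‑2 routing, 4‑dimensional vectors); the real models’ expert counts and routing details aren’t specified in the announcement.

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8  # hypothetical expert count for illustration
TOP_K = 2        # only this many experts run per token
DIM = 4          # toy hidden size

# Toy "experts": each is just a weight vector here, not a real network.
experts = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
router = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token_vec):
    # The router scores every expert, but only the top-k are executed.
    scores = [sum(w * x for w, x in zip(row, token_vec)) for row in router]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    gates = softmax([scores[i] for i in top])
    # Output is the gate-weighted sum of only the selected experts.
    out = [0.0] * DIM
    for g, i in zip(gates, top):
        out = [o + g * w * x for o, w, x in zip(out, experts[i], token_vec)]
    return out, top

output, active = moe_forward([0.5, -0.2, 0.1, 0.9])
print(f"active experts: {sorted(active)} of {NUM_EXPERTS}")
```

Only `TOP_K` of the `NUM_EXPERTS` expert networks do any work per token, which is how a 117B‑parameter model can behave like a much smaller one at inference time.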
What They Can (and Can’t) Do
- Text‑Only: The open models stick to text. No image or audio input or output, just prose, which still covers chat and most LLM tasks.
- Agent Capabilities: With RL post‑training and MoE efficiency, GPT‑OSS can act as an AI agent that calls external tools. Picture a model that can search the web live or debug code on the fly.
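An agent loop of the kind described above can be sketched as follows. The model call and the tool‑call format here are stand‑ins (a stubbed function instead of real gpt‑oss inference), since the announcement doesn’t specify the actual interface; the pattern is what matters: the model either returns an answer or requests a tool, and the loop feeds tool results back until it answers.

```python
# Stubbed "model" standing in for a gpt-oss chat call; real inference would
# go through a local runtime or an API. The dict-based tool-call format is
# hypothetical, chosen only to illustrate the loop.
def fake_model(messages):
    last = messages[-1]
    if last["role"] == "user" and "weather" in last["content"].lower():
        return {"tool": "web_search", "arguments": {"query": "weather Tokyo"}}
    return {"content": "Done."}

def web_search(query):
    # Stand-in for a real search tool the agent could invoke.
    return f"results for: {query}"

TOOLS = {"web_search": web_search}

def run_agent(user_msg):
    messages = [{"role": "user", "content": user_msg}]
    reply = fake_model(messages)
    while "tool" in reply:
        # Execute the requested tool and hand the result back to the model.
        result = TOOLS[reply["tool"]](**reply["arguments"])
        messages.append({"role": "tool", "content": result})
        reply = fake_model(messages)
    return reply["content"]

print(run_agent("What's the weather in Tokyo?"))
```

The same loop shape works whatever the real tool‑call schema turns out to be: only the parsing of the model’s reply changes.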
Why the Delay? Safety First
OpenAI took its time with this release, repeatedly re‑examining the safety angle. Its white paper studied whether bad actors could fine‑tune GPT‑OSS to aid cyberattacks or the development of biological weapons.
Findings
- Fine‑tuning produced a marginal increase in biological “capabilities,” but the models never crossed the “high‑danger” threshold.
- There was no evidence that downstream users could weaponize the models to a worrying degree.
What’s Next in the Open‑AI Ecosystem
While GPT‑OSS may now top the open‑model leaderboard, the ecosystem keeps moving:
- DeepSeek R2: The next reasoning model from DeepSeek is generating buzz.
- Meta’s Superintelligence Lab: A fresh open model from Meta could disrupt the scene.
Bottom Line
OpenAI’s new open‑weight models are efficient, well‑trained, text‑only workhorses for developers. They’re free to use and build on, though not without some caution over hallucinations and potential misuse. Keep an eye on what the community builds around them; they could become the backbone of next‑gen AI tools.