You’re wrong about DeepSeek
Last week I told you about the Chinese AI company DeepSeek’s recent model releases and why they’re such a technical achievement. The DeepSeek team seems to have gotten great mileage out of teaching their model to figure out quickly what answer it would have given if it had been allowed lots of time to think, a technique behind several previous machine learning breakthroughs because it allows for rapid, cheap improvement.
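To make that idea concrete, here is a toy analogy in code, not DeepSeek’s actual training pipeline: a “slow” solver that grinds through many refinement steps (standing in for a model that thinks at length), whose answers are then recorded so a “fast” solver can return them instantly. The function names and the square-root task are my own illustrative choices.

```python
def slow_solver(x):
    """Deliberate mode: iteratively refine a guess at sqrt(x).

    The 50 refinement steps stand in for a model spending a long
    time 'thinking' before it answers.
    """
    guess = float(x)
    for _ in range(50):
        guess = 0.5 * (guess + x / guess)  # Newton's method step
    return guess

# "Distillation," crudely: record what the slow solver says on
# a set of training inputs...
training_answers = {x: slow_solver(x) for x in range(1, 101)}

def fast_solver(x):
    """...then answer instantly from what was learned, skipping
    the long deliberation entirely."""
    return training_answers[x]
```

The fast solver gives (on its training inputs) the same answers as the slow one at a fraction of the cost, which is the flavor of the shortcut described above: learn to jump straight to the answer the long computation would have produced.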
This week I want to jump to a related question: Why are we all talking about DeepSeek? It’s been called America’s AI Sputnik moment. It’s at the top of the iPhone App Store, displacing OpenAI’s ChatGPT. The CEOs of major AI companies are defensively posting on X about it. People who usually ignore AI are saying to me, hey, have you seen DeepSeek?
I have, and don’t get me wrong, it’s a good model. But so are OpenAI’s most advanced models, o1 and o3, and the current best-performing LLM on the Chatbot Arena leaderboard is actually Google’s Gemini (DeepSeek’s R1 is fourth).
All of which raises a question: What makes some AI developments break through to the general public, while other, equally impressive ones are only noticed by insiders?
The lesson of ChatGPT
Several months before the launch of ChatGPT in late 2022, OpenAI released GPT-3.5, the model that would later underlie ChatGPT. Anyone could access GPT-3.5 for free through OpenAI’s sandbox, a website for experimenting with their latest LLMs.
GPT-3.5 was a big step forward for large language models; I explored what it could do and was impressed. So were many other people who closely followed AI advances. And yet, virtually no one else heard about it or discussed it.
When OpenAI launched ChatGPT, it was a different story.

© Vox