2025 was a wild ride in AI. Here's what kept us up at night, made us laugh, and occasionally question our career choices.
The Karpathy-Dwarkesh Debate: AGI is a Decade Away
October 18 saw one of the most intellectually rigorous discussions in the Research group when Karpathy dropped his AGI timeline views.
"If agents with many tools and turns were at 60% at the start of year, now we are at 75 → still useless. And every next 5% is becoming uphill to the point it's breaking the practicality."
— Pratik, Oct 18, 2025
Nico's takeaway resonated deeply:
"By far the best explanation thus far of why we're going to need real breakthroughs in order to advance the field materially. Engineers don't understand human society and how it works. Yet, they're the only ones permitted in the discussion."
— Nico, Oct 18, 2025
Yam captured the expert paradox perfectly: "People who say LLMs will soon replace profession X usually aren't truly of that profession themselves." The discussion ended with: "Pratik - the optimist. Nico - the pragmatist. We need a podcast I'm telling you."
The GPT-5.2 vs Gemini 3 Wars
December 2025 saw the frontier models duke it out. When GPT-5.2 dropped:
"The benchmarks of gpt-5.2 are just incredible!!! First model to cross 80 percent in SWE-bench."
— Ankur Bohra, Dec 12, 2025
But Devansh kept it real: "in practice it was so verbose it still ended costing more"
Then Gemini 3 Flash landed on Dec 17: "Gemini 3 flash seems like a different beast!" — Nico. The subscription poll results? Claude (because of Claude Code) won with 8 votes.
The MCP Revolution: Code Mode Enters the Chat
November saw deep discussions on Anthropic's Model Context Protocol (MCP). Sumedh dropped knowledge from the AI Engineer summit:
"Every tool definition eats up tokens. If we scale to 50+ tools, we flood the context window. The 8-10 tool number is where this trade-off starts making sense."
— Sumedh Khodke, Nov 23, 2025
But then the plot twist: "One OpenAI engineer at the expo commented privately, codemode was a deadend since LLMs are only going to get better"
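Sumedh's point about tool definitions eating context can be made concrete with a rough back-of-the-envelope sketch. Everything here is an assumption for illustration: the tool shape from `make_tool`, the 4-characters-per-token heuristic (not a real tokenizer), and the 128,000-token context window. Real numbers depend on the model and on how verbose each schema is.

```python
import json

# Assumption: roughly 4 characters per token. This is a crude heuristic,
# not a real tokenizer, so treat results as order-of-magnitude only.
CHARS_PER_TOKEN = 4


def make_tool(i: int) -> dict:
    """Build a hypothetical tool definition in a JSON-schema-like shape."""
    return {
        "name": f"tool_{i}",
        "description": (
            f"Hypothetical tool number {i} that looks up a record by id "
            "and returns the selected fields."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "record_id": {
                    "type": "string",
                    "description": "Identifier of the record to fetch.",
                },
                "fields": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Names of the fields to return.",
                },
            },
            "required": ["record_id"],
        },
    }


def estimate_tokens(tools: list[dict]) -> int:
    """Estimate how many tokens the serialized tool list would occupy."""
    return len(json.dumps(tools)) // CHARS_PER_TOKEN


def context_share(n_tools: int, context_window: int = 128_000) -> float:
    """Fraction of an assumed context window taken by n_tools definitions."""
    tools = [make_tool(i) for i in range(n_tools)]
    return estimate_tokens(tools) / context_window


if __name__ == "__main__":
    for n in (10, 50, 200):
        tokens = estimate_tokens([make_tool(i) for i in range(n)])
        print(f"{n:>3} tools: ~{tokens} tokens, {context_share(n):.1%} of context")
```

The cost grows roughly linearly with the number of tools and is paid on every request, before the model has read a single user message. That is the trade-off behind keeping the active tool set small.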
The "Measuring Agents in Production" Reality Check
December brought a landmark study that validated what the community had been saying:
"68% of agents run ≤10 steps before human intervention. 70% use off-the-shelf models. 85% are custom-built, avoiding framework abstractions. 80% use static workflows."
— Pratik summarizing the research, Dec 8, 2025
The core insight? "Successful production teams deliberately trade capability for controllability."
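One way to picture "trading capability for controllability" is a static workflow with a hard step budget and an explicit handoff to a human. This is a minimal sketch under our own assumptions, not the study's method: `run_static_workflow`, `RunResult`, and the default budget of 10 steps are hypothetical names and values chosen to echo the numbers above.

```python
from dataclasses import dataclass
from typing import Callable

# A step takes the current state and returns the next state.
Step = Callable[[dict], dict]


@dataclass
class RunResult:
    state: dict
    steps_run: int
    needs_human: bool
    reason: str = ""


def run_static_workflow(
    steps: list[Step], state: dict, max_steps: int = 10
) -> RunResult:
    """Run a fixed list of steps in order, stopping early for a human
    when the step budget is exhausted or a step raises an exception."""
    steps_run = 0
    for step in steps:
        if steps_run >= max_steps:
            return RunResult(state, steps_run, True, "step budget exhausted")
        try:
            state = step(state)
        except Exception as exc:
            # Keep the last good state so a human can pick up from there.
            return RunResult(state, steps_run, True, f"step failed: {exc}")
        steps_run += 1
    return RunResult(state, steps_run, False)


if __name__ == "__main__":
    def add_one(state: dict) -> dict:
        return {**state, "n": state.get("n", 0) + 1}

    result = run_static_workflow([add_one, add_one, add_one], {}, max_steps=2)
    print(result)
```

The workflow never decides its own next step and never runs past the budget, so the worst case is a bounded run that ends with a clear reason for the handoff.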
The Great Debate: China's Innovation Model
December brought intense geopolitical AI discussions:
"60-70% of AI patents are now coming from Chinese research institutions. Over 50% of the world's AI researchers come from Chinese institutions."
— Nico, Dec 7, 2025
Nico followed up with a sharp point: "China doesn't need the same CapEx as the USA. Pushing OSS innovation threatens the backbone economics of the US AI Infra buildout."
The DeepSeek Saga: When China Dropped a Bomb
In late January, DeepSeek R1 arrived, and API reliability quickly became the drama:
"the deepseek API is absolute sh*te. Sometimes it answered, sometimes it didn't."
— Nico, Feb 13, 2025
But the verdict was clear: "On Live code bench deepseek is better than sonnet" — Ankur Bohra, Jan 26, 2025
Wisdom Nuggets: The Quotable Maxpool
Some messages deserved to be framed:
"Research that aims to replace human decision makers, while being performed by those who don't understand how experts make decisions — that research is doomed."
— Yam Parlant, Oct 18, 2025
"We are now using one model to create another model. The world is a computation and all it wants is to find the answer which is probably 42."
— Pratik, Dec 10, 2025
"Anthropic's execution and thought leadership has been out of this world. They have been sent from the future by Skynet to alter their source code."
— Pratik, Dec 17, 2025
"Maxpool is the place to be for Gen AI. I rarely message but I read everything and it expands my knowledge and viewpoint a lot."
— Roger, Oct 19, 2025
"You try to delegate domain understanding to the LLM, but you can't. You can only delegate execution."
— Yam Parlant, Dec 4, 2025
"Humans have high mental entropy vs the reduced entropy of LLMs. Karpathy couldn't articulate where this could be solved."
— Nico, Oct 18, 2025
The Comedy Hour: Best One-Liners of 2025
AI is serious business, but we found time to laugh:
"production reliability and Agent in the same sentence xD"
— Atharv Jairath, Dec 3, 2025
"The day the world stops turning we're all going to die." / "We are all going to die even if the world does not stop turning"
— Nico & Suvrankar's existential exchange, Dec 12, 2025
"So basically all SaaS revenue is now agentic AI revenue" / "Seems like they misspelt loss as revenue"
— Devansh & ~ R, Dec 11, 2025
"@Pratik can summon your AI waifu opus and tell us?"
— Ankur Bohra, Dec 12, 2025
"I use all of them 🫣 Except Gemini Ultra" / "The man who funded AI!!"
— Nico & Ankur on subscriptions, Dec 9, 2025
"Bubble bubble bubble!!!!!!"
— Ankur Bohra on CoreWeave's 50% stock drop, Dec 17, 2025
"Yes in Indian context, vapi is a place in Gujarat where you can hire people to do sales calls at cheap prices 😂"
— Abhinav, Feb 26, 2025
"The parameters almost fit in a Google sheet 😂"
— Nico on a surprisingly small model, Oct 9, 2025
What 2025 Taught Us
01. Agents need constraints, not capabilities
68% of production agents run ≤10 steps before human intervention. Reliability beats autonomy.
02. China is a serious contender
DeepSeek, Qwen, and a reported 60%+ share of AI patents changed the game.
03. Claude Code won the coding wars
Opus 4.5 + Claude Code became the developer's choice.
04. AGI is a decade away (per Karpathy)
Current agents lack continual learning, memory, and true generalization.
05. Community knowledge compounds faster than models
900+ practitioners sharing real production learnings outpaces any benchmark.