Sunday, April 12, 2026

Good Sunday, NOLA. Quiet weekend, but some solid moves brewing: researchers just showed how top AI agent benchmarks can be gamed, Cirrus Labs is joining OpenAI, and Anthropic quietly cut Claude's prompt caching TTL — raising questions about how they're managing costs. Plus, some great long-form listens on enterprise AI's leadership crisis and why people are pushing back on AI adoption mandates.

The Real Talk: Benchmarks, Costs & Trust

How We Broke Top AI Agent Benchmarks: And What Comes Next

Berkeley researchers just demonstrated a serious problem: the benchmarks everyone uses to measure AI agent capabilities can be gamed. The team found ways to artificially boost scores on standard tests like SWE-Bench without actually improving real-world performance. This is important because it means the numbers companies are using to compare models may not tell you what's actually useful. If you're evaluating which AI tool to adopt, this should make you skeptical of raw benchmark claims.
Hacker News

Anthropic Quietly Cut Claude's Prompt Caching from 1 Hour to 5 Minutes

Developers discovered on GitHub that Anthropic reduced how long it caches your prompts — from one hour down to just five minutes — without announcing it. This matters because caching keeps costs down for people building on Claude. The change went live March 6th, quietly shrinking the savings window for anyone relying on cache hits. If you're using Claude Code or building with cached prompts, your economics just shifted.
Hacker News
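The cost impact is easy to sketch. Here's a rough back-of-envelope in Python — the prices, multipliers, and traffic pattern are illustrative assumptions, not Anthropic's actual rates — showing why a TTL shorter than the gap between your requests turns every call into a fresh cache write:

```python
# Illustrative back-of-envelope: how a shorter cache TTL changes costs.
# All prices below are made-up assumptions, not Anthropic's real rates.
PRICE_CACHE_WRITE = 3.75  # assumed $/1M tokens to write a prompt into cache
PRICE_CACHE_READ = 0.30   # assumed $/1M tokens to read a cached prompt

def session_cost(prompt_tokens, requests, gap_minutes, ttl_minutes):
    """Cost ($) of re-sending the same prompt `requests` times, spaced
    `gap_minutes` apart, with a cache that expires after `ttl_minutes`."""
    cost = 0.0
    for i in range(requests):
        if i == 0 or gap_minutes > ttl_minutes:
            # Cache miss: pay to (re)write the prompt into the cache.
            cost += prompt_tokens / 1e6 * PRICE_CACHE_WRITE
        else:
            # Cache hit: pay the much cheaper read rate.
            cost += prompt_tokens / 1e6 * PRICE_CACHE_READ
    return cost

# A hypothetical agent pinging every 10 minutes with a 50k-token prompt:
# a 60-minute TTL stays warm after the first write; a 5-minute TTL
# expires before every follow-up, so each request pays the write price.
warm = session_cost(50_000, 6, 10, 60)
cold = session_cost(50_000, 6, 10, 5)
print(f"60-min TTL: ${warm:.4f}, 5-min TTL: ${cold:.4f}")
```

Under these toy numbers the same workload costs roughly 4x more once the TTL drops below the request cadence — which is the "economics just shifted" point in a nutshell.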

Sam Altman Responds to 'Incendiary' New Yorker Profile After Home Attack

Altman published a blog post addressing both an apparent attack on his home and a detailed New Yorker profile raising questions about his trustworthiness. He's pushing back hard on the reporting, claiming it misrepresents his record. This is one of those moments where the industry's leadership is under real scrutiny — and the public narrative is splintering. Worth reading both the original reporting and his response to make up your own mind.
TechCrunch

Industry Moves & Big Hires

Cirrus Labs Joins OpenAI

Cirrus Labs, a team working on infrastructure and systems, is now part of OpenAI. The acquisition signals where OpenAI is placing bets — they're doubling down on engineering talent for the backend plumbing that makes models work at scale. Not a flashy acquisition, but these infrastructure teams are where the real work happens when you're running trillion-parameter models.
Hacker News

Meta's Top AI Executives in Line for Nearly $1B in Bonuses

Meta is dangling massive bonuses — nearly $1 billion in total across its AI leadership — if those executives hit their targets. This is a bet-the-company move: they're signaling that AI is their future and they're willing to pay like it. For context, this level of compensation is typically reserved for outcomes that could genuinely reshape the business. Meta's clearly serious about catching up in the AI race.
Hacker News

Worth a Listen

Why Enterprise AI Has a Leadership Problem

New studies from A16Z, KPMG, Writer, and WalkMe paint a picture of enterprise AI that's simultaneously accelerating and breaking down. Over 50% of companies are deploying agentic AI, but adoption is quietly stalling because leadership doesn't understand what to do with it. This is a must-listen if you're building AI products for enterprise customers — it explains why your sales cycle is getting weird.
AI Daily Brief Podcast

Everyone Hates AI. Now What?

AI for Humans digs into the backlash: Florida suing OpenAI, datacenter protests spreading, and the Claude Mythos preview creating chaos. If you're building AI products, this episode explains why the public mood is souring and what it means for your roadmap. Short and punchy.
AI for Humans Podcast

Deep Dives & Interesting Reads

Why Do We Tell Ourselves Scary Stories About AI?

Quanta Magazine explores the psychology behind AI doom narratives — why everyone's convinced the technology will destroy everything, even when the evidence doesn't match the hype. This is a thoughtful piece that helps you understand the cultural moment we're in. If you're trying to have rational conversations about AI with non-technical people, this is a great reference.
Quanta Magazine

Your Next Hire Costs $0/yr and Never Misses a Meeting

A thought-provoking piece on what it looks like when AI agents start handling actual work — scheduling, decisions, routine tasks. The headline is cheeky, but the real question underneath is worth thinking about: what happens to organizational culture when bots are doing the work humans used to do? Not doom-saying, just exploring the shape of the future.
There's An AI For That

Your Baby Deer Plushie Told Me Mitski's Dad Was a CIA Operative

The Verge's account of accidentally getting an AI companion to generate wild conspiracy theories is both hilarious and sobering. The piece walks through how an innocent chatbot feature can veer into total confabulation — making up facts with complete confidence. If you're building AI products, this is a cautionary tale about the difference between "sounds smart" and "actually correct."
The Verge

Today’s Sources