AI Friday — Sunday, May 10, 2026

New APIs & Developer Tools

OpenAI releases GPT-Realtime-2, -Translate, and -Whisper voice APIs

Real-time voice just got faster and smarter. OpenAI shipped three new voice APIs: Realtime-2 (ultra-low latency voice conversations), a new Translate mode (speak in any language, get back English), and an updated Whisper for transcription. All three are live for developers now. If you've been thinking about building voice interfaces, the latency bar just dropped dramatically.

Latent Space

Gemini API File Search is now multimodal

Google expanded Gemini's file search to handle PDFs, images, and videos—not just text. This matters if you're building systems that need to search across mixed-media documents. Multimodal RAG (retrieval-augmented generation, meaning the AI can pull relevant context from your files) is becoming table stakes.

Google AI Blog

The Hard Lessons

LLMs corrupt your documents when you delegate

A new study shows that when you ask LLMs to edit or summarize documents, they subtly change facts—sometimes introducing errors, sometimes just rewording in ways that lose precision. The kicker: humans don't catch it because the text sounds right. If you're using AI for document review or summarization in production, read this. Discussion on HN.

Hacker News

A recent experience with ChatGPT 5.5 Pro

Timothy Gowers (Fields medalist, real mathematician) spent time actually using GPT-5.5 Pro on a problem he knows deeply. His honest reflection: it's genuinely useful for some things, but it hits a wall fast when the problem gets subtle. Worth reading if you want a reality check on where these models actually excel and where they fail. 632 points on HN.

Hacker News

Building with AI: Real-World Patterns

All my clients wanted a carousel, now it's an AI chatbot

A web designer's honest take: every client who came asking for a carousel now wants an AI chatbot instead. She walks through what actually works, what's hype, and what she's learned about integrating AI into real client projects. Practical and grounded.

Hacker News

The internal AI tool that's transforming how Stripe designs products

Owen Williams built Protodash at Stripe, an internal AI tool that lets designers and PMs generate high-fidelity prototypes of Stripe's dashboard UI. The episode breaks down how they're actually using AI in product design at scale, in practice rather than in theory.

How I AI podcast

Context Engineering in 40 Minutes

Ravi Mehta (former Tinder CPO) walks through his 3-layer system for structuring prompts and context: functional, visual, and data layers. Includes a live demo of building something real. If you want to level up how you're actually talking to these models, this is practical.

Behind the Craft podcast

Worth Reading & Listening

Why Agents Make Every Job a Startup

An interesting angle: AI agents haven't saved time the way we thought. Instead, they've made the infinite backlog of work feel immediate and urgent. If everyone on your team now has a capable AI assistant, what happens to prioritization? What gets sacrificed? Worth thinking about if you're deploying agents.

AI Daily Brief podcast

How to Build an AI Native Team

Mike Cannon-Brookes (Atlassian co-founder) on what it actually takes to build teams that work well with AI—not just teams that use AI tools. Leadership, structure, and culture changes that matter.

AI Daily Brief podcast

Voice AI in India is hard. Wispr Flow is betting on it anyway.

Wispr Flow saw its growth in India accelerate after rolling out support for Hinglish (a mix of Hindi and English). A real example of how voice AI is breaking into non-English markets, and of what works when English-first models don't.

TechCrunch AI

Also

Using Claude Code: The unreasonable effectiveness of HTML — Why Claude Code works so well: it sees HTML like humans do
People Hate AI Art — Why adoption is slower than the hype suggested
The new Wild West of AI kids' toys — Connected AI companions for kids are here—and barely regulated
From Vector Databases to Knowledge Engines — Pinecone's Nexus launch and the shift toward agents as primary users
Ben's Builds #3: An Email App — What actually breaks when you build AI products
Doing Vibe Physics with Alex Lupsasca (OpenAI) — How GPT-5.x derived new results in theoretical physics
Why OpenAI and Anthropic Are Becoming Consultants — Enterprise services are the real play—not 'buy and hope'
Silicon Valley gets Serious about Services — A series of announcements all point to the same theme
Is AI Doom Going Out of Style? — Signals the doomsday narrative is finally cracking
Import AI 455: AI systems are about to start building themselves — The first step toward recursive self-improvement
Who Cares About Consumer AI — The money and attention are moving hard toward enterprise and coding agents
