
Vanishing Gradients

Hugo Bowne-Anderson

72 episodes

  • Vanishing Gradients

    Episode 72: Why Agents Solve the Wrong Problem (and What Data Scientists Do Instead)

    20/03/2026 | 1h 33 mins.
I often see what I would consider to be b******t evals, especially in data, like “write this dumb SQL.” Almost every one of these dumb SQL questions I’ve seen in benchmarks is either obviously easy or overwhelmingly adversarial. They just don’t feel valuable as a data scientist; it’s something you would probably never ask a real data scientist to do. So I went out of my way to create real ones. Let me read one to you.
    Bryan Bischof, Head of AI at Theory Ventures, joins Hugo to talk about what happened when 150 people spent six hours using AI agents to answer real data science questions across SQL tables, log files, and 750,000 PDFs.
    They Discuss:
* Failure Funnels: pinpoint where agent reasoning breaks down using causal-chain binary evaluations instead of vague 1-5 scales;
* Median Score of 23 out of 65: what happened when world-class engineers turned agents loose on real data work, and why general-purpose coding agents with human prodding beat fancy frameworks;
* Zero-Cost Submissions Kill Trust: without a penalty for wrong answers, agents hill-climb to correct submissions through brute force instead of building confidence;
* Data Science is “Zooming”: moving beyond binary decisions to iterative problem framing, refining “does our inventory suck?” into a tractable hypothesis;
* MCP as Semantic Layer: model your organization’s proprietary knowledge once and distribute it to whatever LLM interface your team prefers;
* The Subagent vs. Tool Debate: a distinction that adds cognitive load without hiding complexity;
* The Self-Orchestration Gap: agents don’t yet realize they should trigger specialized extraction frameworks like DocETL instead of reading 750K PDFs one by one;
* The Future of Evals: from vibe checks to objective functions and continuous user feedback that lets systems converge on reliability.
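The causal-chain binary evaluation idea can be sketched in a few lines: instead of a vague 1-5 score, run pass/fail checks in causal order and report the first step where the agent broke. This is a hypothetical illustration (the check names and trace fields are invented), not Bryan’s actual harness:

```python
# Hypothetical "failure funnel": binary checks applied in causal order,
# stopping at the first failure so you know where reasoning broke down.

def failure_funnel(trace, checks):
    """Return (checks_passed, first_failure_name) for an agent trace."""
    for i, (name, check) in enumerate(checks):
        if not check(trace):
            return i, name
    return len(checks), None

# Toy checks over a toy trace dict (names invented for illustration).
checks = [
    ("retrieved_right_table", lambda t: t["table"] == "orders"),
    ("wrote_valid_sql",       lambda t: t["sql_parses"]),
    ("answer_matches_truth",  lambda t: t["answer"] == t["expected"]),
]

trace = {"table": "orders", "sql_parses": True, "answer": 41, "expected": 42}
passed, failed_at = failure_funnel(trace, checks)
print(passed, failed_at)  # 2 answer_matches_truth
```

Because each check is binary and ordered, aggregating over many traces gives a funnel: you can see exactly which step loses the most agents, rather than averaging everything into one opaque score.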
    You can also find the full episode on Spotify, Apple Podcasts, and YouTube.
You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments!
👉 Want to learn more about Building AI-Powered Software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort has started and registration is still open; all sessions are recorded, so don’t worry about having missed any. Here is a 25% discount code for readers. 👈
    LINKS
    * Bryan Bischof on Twitter/X
    * Bryan Bischof on LinkedIn
    * Theory Ventures
    * The Hunt for a Trustworthy Data Agent (blog post)
    * America’s Next Top Modeler GitHub repo
    * Hamel’s evals FAQ: How do I evaluate agentic workflows?
    * DocETL
    * LLM Judges and AI Agents at Scale (Hugo’s podcast with Shreya Shankar)
    * When Your Metrics Are Lying (Cimo Labs)
    * Lessons from a Year of Building with LLMs (livestream on YouTube)
    * Bryan Bischof: The Map is Not the Territory (YouTube)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube



    Get full access to Vanishing Gradients at hugobowne.substack.com/subscribe
  • Vanishing Gradients

    Episode 71: Durable Agents - How to Build AI Systems That Survive a Crash with Samuel Colvin

    18/02/2026 | 51 mins.
Our thesis is that AI is still just engineering… those people who tell us, for fun and profit, that somehow AI is so, so profound, so new, so different from anything that’s gone before that it somehow eclipses the need for good engineering practice are wrong. We need that good engineering practice still, and for the most part, most things are not new. But there are some things that have become more important with AI. One of those is durability.
    Samuel Colvin, Creator of Pydantic AI, joins Hugo to talk about applying battle-tested software engineering principles to build durable and reliable AI agents.
    They Discuss:
    * Production agents require engineering-grade reliability: Unlike messy coding agents, production agents need high constraint, reliability, and the ability to perform hundreds of tasks without drifting into unusual behavior;
    * Agents are the new “quantum” of AI software: Modern architecture uses discrete “agentlets”: small, specialized building blocks stitched together for sub-tasks within larger, durable systems;
    * Stop building “chocolate teapot” execution frameworks: Ditch rudimentary snapshotting; use battle-tested durable execution engines like Temporal for robust retry logic and state management;
    * AI observability will be a native feature: In five years, AI observability will be integrated, with token counts and prompt traces becoming standard features of all observability platforms;
    * Split agents into deterministic workflows and stochastic activities: Ensure true durability by isolating deterministic workflow logic from stochastic activities (IO, LLM calls) to cache results and prevent redundant model calls;
    * Type safety is essential for enterprise agents: Sacrificing type safety for flexible graphs leads to unmaintainable software; professional AI engineering demands strict type definitions for parallel node execution and state recovery;
    * Standardize on OpenTelemetry for portability: Use OpenTelemetry (OTel) to ensure agent traces and logs are portable, preventing vendor lock-in and integrating seamlessly into existing enterprise monitoring.
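The deterministic/stochastic split above can be sketched without any framework: keep orchestration logic pure, and route every stochastic call (IO, LLM) through a journal keyed by step name, so a crash-and-replay reuses recorded results instead of re-calling the model. This is a minimal hand-rolled sketch of the idea, not Temporal’s or Pydantic AI’s actual API; `fake_llm` stands in for a real model call:

```python
# Minimal sketch of durable execution: deterministic workflow + journaled
# stochastic activities. Replaying from the journal never re-calls the model.

import json

class DurableContext:
    def __init__(self, journal=None):
        self.journal = journal or {}        # persisted results of past activities

    def run_activity(self, key, fn, *args):
        if key in self.journal:             # replay: reuse the recorded result
            return self.journal[key]
        result = fn(*args)                  # first run: perform the stochastic call
        self.journal[key] = result          # record it (persist to disk/DB in real life)
        return result

def fake_llm(prompt):                       # stand-in for a real LLM call
    return f"summary of: {prompt}"

def workflow(ctx, doc):
    # Deterministic orchestration: same journal -> same path, no duplicate calls.
    summary = ctx.run_activity("summarize", fake_llm, doc)
    return ctx.run_activity("refine", fake_llm, summary)

ctx = DurableContext()
first = workflow(ctx, "quarterly report")
# Simulate a crash, then replay from the persisted journal:
replay = workflow(DurableContext(json.loads(json.dumps(ctx.journal))), "quarterly report")
assert first == replay
```

The design choice is the point of the episode: because the workflow function is deterministic given the journal, a real engine like Temporal can kill and resume it anywhere without repeating expensive or non-reproducible LLM calls.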
    You can also find the full episode on Spotify, Apple Podcasts, and YouTube.
You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments!

    LINKS
    * Samuel Colvin on LinkedIn
    * Pydantic
    * Pydantic Stack Demo repo
    * Deep research example code
    * Temporal
    * DBOS (Postgres alternative to Temporal)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube


  • Vanishing Gradients

    Episode 70: 1,400 Production AI Deployments

    12/02/2026 | 1h 9 mins.
There’s a company that spent almost $50,000 because an agent went into an infinite loop and they forgot about it for a month.
It had no failures, and I guess no one was monitoring these costs. It’s nice that people write about that in the database as well. After it happened, they said: watch out for infinite loops. Watch out for cascading tool failures. Watch out for silent failures, where the agent reports it has succeeded when it didn’t!
    We Discuss:
    * Why the most successful teams are ripping out and rebuilding their agent systems every few weeks as models improve, and why over-engineering now creates technical debt you can’t afford later;
    * The $50,000 infinite loop disaster and why “silent failures” are the biggest risk in production: agents confidently report success while spiraling into expensive mistakes;
    * How ELIOS built emergency voice agents with sub-400ms response times by aggressively throwing away context every few seconds, and why these extreme patterns are becoming standard practice;
    * Why DoorDash uses a three-tier agent architecture (manager, progress tracker, and specialists) with a persistent workspace that lets agents collaborate across hours or days;
    * Why simple text files and markdown are emerging as the best “continual learning” layer: human-readable memory that persists across sessions without fine-tuning models;
    * The 100-to-1 problem: for every useful output, tool-calling agents generate 100 tokens of noise, and the three tactics (reduce, offload, isolate) teams use to manage it;
    * Why companies are choosing Gemini Flash for document processing and Opus for long reasoning chains, and how to match models to your actual usage patterns;
    * The debate over vector databases versus simple grep and cat, and why giving agents standard command-line tools often beats complex APIs;
* What “re-architect” as a job title reveals about the shift from 70% scaffolding / 30% model to 90% model / 10% scaffolding, and why knowing when to rip things out may be the most important skill today.
    You can also find the full episode on Spotify, Apple Podcasts, and YouTube.
You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments!

    Show Notes Links
    * Alex Strick van Linschoten on LinkedIn
    * Alex Strick van Linschoten on Twitter/X
    * LLMOps Database
    * LLMOps Database Dataset on Hugging Face
    * Hugo’s MCP Server for LLMOps Database
    * Alex’s Blog: What 1,200+ Production Deployments Reveal About LLMOps in 2025
    * Previous Episode: Practical Lessons from 750 Real-World LLM Deployments
    * Previous Episode: Tales from 400 LLM Deployments
    * Context Rot Research by Chroma
    * Hugo’s Post: AI Agent Harness - 3 Principles for Context Engineering
    * Hugo’s Post: The Rise of Agentic Search
    * Episode with Nick Moy: The Post-Coding Era
    * Hugo’s Personal Podcast Prep Skill Gist
    * Claude Tool Search Documentation
    * Gastown on GitHub (Steve Yegge)
    * Welcome to Gastown by Steve Yegge
    * ZenML - Open Source MLOps & LLMOps Framework
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast livestream on YouTube
    * Join the final cohort of our Building AI Applications course in March, 2026 (25% off for listeners)


  • Vanishing Gradients

    Episode 69: Python is Dead. Long Live Python! With the Creators of pandas & Parquet

    03/02/2026 | 55 mins.
> It’s the agent writing the code. And it’s the development loop of writing the code, building, testing, and iterating. And so I do think we’ll see, for many types of software, a shift away from Python towards other programming languages. I think Go is probably the best language for those other types of software projects. And like I said, I haven’t written a line of Go code in my life.
– Wes McKinney (creator of pandas, Principal Architect at Posit)
    Wes McKinney, Marcel Kornacker, and Alison Hill join Hugo to talk about the architectural shift for multimodal AI, the rise of “agent ergonomics,” and the evolving role of developers in an AI-generated future.
    We Discuss:
    * Agent Ergonomics: Optimize for agent iteration speed, shifting from human coding to fast test environments, potentially favoring languages like Go;
    * Adversarial Code Review: Deploy diverse AI models to peer-review agent-generated code, catching subtle bugs humans miss;
    * Multimodal Data Verbs: Make operations like resizing and rotating native to your database to eliminate data-plumbing bottlenecks;
    * Taste as Differentiator: Value “taste”—the ability to curate and refine the best output from countless AI-generated options—over sheer execution speed;
    * 100x Software Volume: Embrace ephemeral, just-in-time software; prioritize aggressive generation and adversarial testing over careful planning for quality.
    You can also find the full episode on Spotify, Apple Podcasts, and YouTube.
You can also interact directly with the transcript of the workshop & fireside chat here in NotebookLM. If you do so, let us know anything you find in the comments!
This was a fireside chat at the end of a livestreamed workshop we did on building multimodal AI systems with Pixeltable. Check out the full workshop (all code here on GitHub).
    Links and Resources
    * Wes McKinney on LinkedIn
    * Marcel Kornacker on LinkedIn
    * Alison Hill on LinkedIn
    * Spicy Takes
    * Palmer Penguins
    * Pixeltable
    * Posit
    * Positron
    * Building Multimodal AI Systems Workshop Repository
    * Pixeltable Docs: LLM Tool Calling with MCP Servers
    * Pixeltable Docs: Working with Pydantic
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
    * Join the final cohort of our Building AI Applications course in March, 2026 (25% off for listeners)

    What people said during the workshop
    “I think the interface looks amazing/simple. Strong work! 🦾” — @goldentribe

“This is quite amazing. Watching this I felt the same way as when I first learnt pandas, NumPy and scikit, and how well I was able to manipulate and wrangle data. PixelTable feels seamless and looks as good as those legendary frameworks but for Multimodal Data.” — @vinod7

    “This is all extremely cool to see, I love the API and the approach.” — @steveb4191

    “Thanks so much, Hugo! That was very insightful! Great work Alison and Marcel!” — @vinod7

    “Just wrapped up watching a replay of the Pixeltable workshop. So cool!! Love the notebooks and working examples. The important parts were covered and worked beautifully 🕺” — @therobbrennan



  • Vanishing Gradients

    Episode 68: A Builder’s Guide to Agentic Search & Retrieval with Doug Turnbull & John Berryman

    23/01/2026 | 1h 28 mins.
    The best way to build a horrible search product? Don’t ever measure anything against what a user wants.
    Search veterans Doug Turnbull (Led Search at Reddit + Shopify; Wrote Relevant Search + AI Powered Search) and John Berryman (Early Engineer on Github Copilot; Author of Relevant Search + Prompt Engineering for LLMs), join Hugo to talk about how to build Agentic Search Applications.
    We Discuss:
* The evolution of information retrieval as it moves from traditional keyword search toward “agentic search,” and what this means for builders;
* John’s five-level maturity model (you can prototype today!) for AI adoption, moving from Trad Search to conversational AI to asynchronous research assistants that reason about result quality;
* The Agentic Search Builders’ Playbook, including why and how you should “hand-roll” your own agentic loops to maintain control;
* The “revealed preferences” that LLM judges often miss: evaluations must use real clickstream data, because semantic relevance alone cannot capture what users actually prefer;
* Patterns and Anti-Patterns for Agentic Search Applications;
* Learning and teaching Search in the Age of Agents.
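On the “hand-roll your own agentic loop” point (the theme of Doug’s linked post, A Simple Agentic Loop with Just Python Functions), the core really is just a loop that asks a model for a tool call and dispatches it. The sketch below is hypothetical: the tool names are invented and a scripted stub stands in for a real LLM so it stays runnable; in practice you would swap in an actual model call that returns a tool name and arguments:

```python
# A hand-rolled agentic loop: no framework, just plain Python functions.

def search(query):
    """Toy retrieval tool."""
    return [f"doc about {query}"]

def answer(text):
    """Terminal tool: produce the final answer."""
    return f"ANSWER: {text}"

TOOLS = {"search": search, "answer": answer}

def scripted_model(history):
    """Stub policy standing in for an LLM: search first, then answer."""
    if not any(step[0] == "search" for step in history):
        return ("search", "agentic retrieval")
    last_result = history[-1][1]
    return ("answer", last_result[0])

def agent_loop(model, max_steps=5):
    history = []
    for _ in range(max_steps):
        tool, arg = model(history)          # model decides the next tool call
        result = TOOLS[tool](arg)           # we dispatch it ourselves
        history.append((tool, result))
        if tool == "answer":                # terminal tool ends the loop
            return result
    return None                             # step budget exhausted

print(agent_loop(scripted_model))  # ANSWER: doc about agentic retrieval
```

Owning this loop is the control Doug argues for: the step budget, the tool registry, and the termination condition are all ordinary code you can inspect and test, rather than framework internals.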
    You can find the full episode on Spotify, Apple Podcasts, and YouTube.
You can also interact directly with the transcript here in NotebookLM. If you do so, let us know anything you find in the comments!

    Doug and Hugo are also doing a free lightning lesson on Feb 20 about How To Build Your First Agentic Search Application! You’ll walk away with a framework & code to build your first agentic search app. Register here to join live or get the recording after.

    Links and Resources
    Guests
    * Arcturus Labs (John’s website)
    * Software Doug (Doug’s website)
    * John Berryman on LinkedIn
    * Doug Turnbull on LinkedIn
    Books
    * Relevant Search by Doug Turnbull & John Berryman (Manning)
    * AI-Powered Search by Doug Turnbull (Manning)
    * Prompt Engineering for LLMs by John Berryman (O’Reilly)
    Blog Posts
    * Incremental AI Adoption for E-commerce by John Berryman
    * Roaming RAG – RAG without the Vector Database by John Berryman
    * Agents Turn Simple Keyword Search into Compelling Search Experiences by Doug Turnbull
    * A Simple Agentic Loop with Just Python Functions by Doug Turnbull
    * Agentic Code Generation to Optimize a Search Reranker by Doug Turnbull
* LLM Judges Aren’t the Shortcut You Think by Doug Turnbull (Hugo’s 5-minute video below)
* Malleable Software by Ink & Switch (inc. Geoffrey Litt)
    * Patterns and Anti-Patterns for Building with AI by Hugo Bowne-Anderson
    Other Resources
    * The Rise of Agentic Search, a recent VG Podcast with Jeff Huber
    * Karpathy on Cognitive Core LLMs
    * Cheat at Search with Agents course by Doug Turnbull (use code: vanishinggradients for $200 off)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
    * Join the final cohort of our Building AI Applications course in Q1, 2026 (25% off for listeners)

    Timestamps (for YouTube livestream)
    00:00 How to Build Agentic Search & Retrieval Systems
    02:48 Defining Search and AI
03:26 Evolution of Search Technologies
08:46 Search in E-commerce and Other Domains
    12:15 Combining Search and AI: RAG and LLMs
    23:50 User Intent and Search Optimization
    29:47 Levels of AI Integration in Search
    32:25 Exploring the Complexity of Search in Various Domains
    33:49 The Evolution and Impact of Agentic Search
    34:07 Defining Terms: RAG and Agentic Search
    34:52 The Research Loop and Tool Interaction
    35:55 Formal Protocols and Structured Outputs
    38:39 Building Agentic Search Experiences: Tips and Advice
    41:50 The Importance of Empathy in AI and Search Development
    54:30 The Role of UX in Search Applications
    01:01:15 Future of Search: Malleable User Interfaces
    01:02:38 Exploring Malleable Software
    01:04:20 The Coordination Challenge in Software Development
    01:05:23 The Impact of Claude Code & Claude Cowork
    01:06:22 The Future of Knowledge Work with AI
    01:12:39 Evaluating Search Algorithms with AI
    01:15:15 The Role of Agents in Search Optimization
    01:29:55 Teaching AI and Search Techniques
    01:34:25 Final Thoughts and Farewell




About Vanishing Gradients

A podcast for people who build with AI. Long-format conversations with people shaping the field about agents, evals, multimodal systems, data infrastructure, and the tools behind them. Guests include Jeremy Howard (fast.ai), Hamel Husain (Parlance Labs), Shreya Shankar (UC Berkeley), Wes McKinney (creator of pandas), Samuel Colvin (Pydantic) and more. hugobowne.substack.com