MLOps.community podcast | Listen online for free

534 episodes

Coding Agents Are Secretly General Agents
27/06/2026 | 1h 12 mins.
In this episode:
🧠 Coding agents are generalist agents — why "positive transfer" means an agent that's better at code is better at everything, and how that makes them "AGI-complete"
⏳ "Code will be solved in a year" — what the automation of knowledge work actually looks like, and why Jay joined ClickUp to be on it
🏗️ Why the labs are crushing AI startups — free-for-two-years deals, Windsurf losing Claude access, and the brutal economics of building on top of frontier models
🔗 The real moat is convergence — context, surfaces, and unit economics, a.k.a. "Cursor for your whole job"
💬 Slack's data walls & the Glean problem — why fragmentation is the enemy and a single system of record wins
🧪 RLVR & verifiability — why code became the perfect training ground for agents, and how to tell if you're even getting better
🔬 LLMs are running the frontier of science — Putnam 12/12, Erdős problems, simulating a cell, and vibe-writing economics papers
🚗 The car wash test that still breaks GPT-5 — spiky models, world models, Plato's cave, and the "stochastic parrot" debate
🏖️ Plus: mechanistic interpretability as "brain surgery," catastrophic forgetting, the danger of deleting knowledge from models, and a pitch for a "resort for LLMs"

Whether you're building agents, leading an AI team, or just trying to figure out what "agentic" really means for everyday work — this one's a fun, deep ride.

🔗 Links & Resources
Jay Hack: linkedin.com/in/jayhack
ClickUp: clickup.com
MLOps Community: go.mlops.community

Mentioned: Gödel, Escher, Bach (Douglas Hofstadter) · "Machine Learning: The High-Interest Credit Card of Technical Debt" (Sculley et al.) · Periodic Labs · Ginkgo Bioworks · Physical Intelligence
The Dark Side of MCP Servers
23/06/2026 | 1h 9 mins.
Sam Partee (CTO & co-founder of Arcade.dev) and Nate Barbettini (Founding Engineer at Arcade.dev) sit down at the MCP Dev Summit to unpack what nobody wants to admit about the Model Context Protocol: the security model is still full of sharp edges. From tool poisoning and prompt injection to why OAuth got bolted onto the spec, this is a builder's-eye view of where MCP breaks — and how to ship agents safely anyway.
What we get into:
🔓 OAuth on MCP — Why the spec adopted OAuth as its authorization standard, and the class of spoofing attacks it shuts down.
☠️ Tool poisoning — How a malicious server hides instructions in tool descriptions, and why your agent trusts them by default.
🧪 MCP Debugger & ToolBench — Shining a light on the rough edges by grading servers from S-tier to F-tier.
🖥️ Sandboxing agents — Giving an agent a shell and a file system without handing over the keys to your machine.
📜 Allow lists — Why MCP has client-level allow lists but skills mostly don't — and why that worries them.
🔄 The auto-update problem — How skills and servers that silently update become a supply-chain risk ("rug pulls").
✅ SOC 2, honestly — Why the controls are voluntary, misunderstood, and actually about best practices.
🤖 AI-generated PRs — The new behaviors to watch for as agents start writing and merging code.
If you build agents, ship MCP servers, or are responsible for AI security at your company, this one's for you.
🔗 Links & Resources
Arcade.dev: https://www.arcade.dev
Arcade MCP framework (GitHub): https://github.com/ArcadeAI/arcade-mcp
Sam Partee (GitHub): https://github.com/spartee
Nate Barbettini (LinkedIn): https://www.linkedin.com/in/nbarbettini
MLOps.community: https://mlops.community

⏱️ Timestamps
[00:00] Skills, agents, and local context
[08:36] MCP Debugger grades your server
[10:34] Why AI clients are still buggy
[20:54] Why agents shouldn’t always have shell access
[22:44] “I have a spicy take.”
[26:27] “Do not build your own auth.”
[31:14] The “checking someone else’s email” problem
[35:40] “OAuth is the best worst option.”
[43:50] The future of AI entertainment
[46:19] Tool poisoning explained
[50:49] “Trust me, bro” is not a security solution
[52:45] MCP registries as the App Store model
[1:00:28] AI-generated PRs and speed vs quality
[1:02:37] Why behavior-driven development is coming back
[1:08:11] Have we already reached AGI?

#MCP #AIAgentSecurity #ToolPoisoning
Sandboxing, Agent Harnesses, and Agent Teamwork
19/06/2026 | 1h 19 mins.
Shahram Anver is the Co-Founder and CEO of Cleric, the autonomous AI SRE that investigates and root-causes production issues like an experienced teammate — often in under two minutes. Before Cleric, Shahram led MLOps, DevOps, and FinOps platform engineering at Gojek, Southeast Asia's super-app. In this conversation, he breaks down why production operations never kept pace with AI-accelerated development, and why the real unlock for an AI SRE isn't faster triage — it's an agent that *learns* and compounds operational memory across your whole org.

In this episode:
🔧 The on-call problem — Why one broken service still drags ten engineers onto a call, and how AI changes that
🤖 What an AI SRE actually is — How Cleric investigates across your existing observability stack instead of adding another tool
🧠 Learning over MTTR — Why Shahram argues the value isn't alert triage, it's an agent that gets better every investigation
🪜 Ramping like a new engineer — Explore the environment, learn from the work, talk to the team
🔁 The investigate–measure–learn loop — Turning what worked on one incident into context for the next
🕸️ Knowledge graphs & operational memory — Mapping teams, clusters, and dependencies so insight from one team helps another
⚡ Under two minutes to root cause — What "fast" really requires in a live production environment
🚀 The road to autonomy — From assisted investigation toward self-healing infrastructure
If you're an SRE, platform engineer, DevOps lead, or anyone building or buying AI agents for production, this one's for you.

🔗 Links & Resources
Cleric: https://cleric.ai
Shahram on LinkedIn: https://www.linkedin.com/in/shahramanver/
Willem Pienaar (Co-Founder/CTO): https://www.linkedin.com/in/willempienaar/
Cleric launches the first self-learning AI SRE: https://cleric.ai/blog/cleric-launches-the-first-self-learning-ai-sre
MLOps Community: https://mlops.community
Join the community: https://go.mlops.community/slack

⏱️ Timestamps
[00:00] Tech Jargon Confusion
[00:27] Harness vs Model
[08:48] Model Evolution in Cleric
[13:36] Sandboxing and Simulated Environments
[20:40] Shifting AI Perceptions
[24:10] Managing Humans vs Agents
[31:32] Steering Parallel Agents
[34:16] Human Decision Integration in Models
[43:28] 80/20 Data Split
[49:40] Becoming a Skill
[53:35] 2027 Agent Autonomy
[59:14] Agent Learning in Production
[1:04:31] Software as Personal Capabilities
[1:08:31] Vibe Coding vs Durability
[1:18:23] Wrap up

#AISRE #SiteReliabilityEngineering #AIAgents
Zipline Roundtable episode: Building Real-Time ML Systems with Zipline + Chronon
17/06/2026 | 51 mins.
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
MLOps GPU Guide: https://go.mlops.community/gpuguide

Big shout-out to ZiplineAI for the collaboration!

// Abstract
Real-time ML use cases like personalization and risk decisioning come with a unique set of challenges: serving fresh feature values at low latency for inference, generating temporally consistent backfills for training, and building complex chains of on-demand, batch, and streaming transformations. In this roundtable, practitioners from Intuit, Credit Karma, Depop, and OpenAI share how they use Zipline and the OSS Chronon project to solve these challenges and deploy real-time ML use cases in production.

// Bio
German Krikorian
German is a Software Engineer on the Feature Platform team at Credit Karma. Since joining the company during the early development of its recommendation system, they have played a key role in building and scaling the platform over the years. Their work focuses on feature pipelines and the feature store, which serves as critical infrastructure supporting numerous teams and business verticals across the organization.

Ben Magyar
Ben is an engineer at Depop working on ML and data systems. Before Depop, he worked on Search at Etsy. Most of his work is around the infrastructure and operational problems that come with running ML systems at scale.

Raj Katakam
Raj architects ML Infrastructure at Credit Karma (Intuit). He holds a Master's in Software Engineering from Carnegie Mellon and a B.Tech in EECE from IIT Kharagpur. His interests include ML Infrastructure, Distributed Systems, Real-Time Data Processing, and Generative AI. His current focus is on providing feature engineering platforms, production GenAI infrastructure, vector databases, ML model serving, and MLOps pipelines for fraud detection, personalized recommendations, financial insights, and model explainability.

Mick Jermsurawong
Mick led Flyte ML training/experimentation at Stripe and now leads Chronon for ML features at OpenAI.

Hosted by Demetrios

// Related Links
Website: https://zipline.ai/
https://chronon.ai/

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter (@mlopscommunity): https://x.com/mlopscommunity
Follow us on LinkedIn: https://go.mlops.community/linkedin
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with German on LinkedIn: /e2zdkwh8cxghydg/
Connect with Raj on LinkedIn: /rajkiran2190
Connect with Mick on LinkedIn: /mick-jermsurawong/
MCP Servers Are Becoming the UI for AI Agents
16/06/2026 | 47 mins.
Naseem Al-Naji is the co-founder of MCPcat.io and the creator of Opal — a builder with deep roots in privacy-first developer tooling. In this conversation, he breaks down why MCP servers have become a black box in production, and how MCPcat gives teams X-ray vision into how agents and users actually behave.

What we get into:
🐱 What MCPcat Is — Open-source analytics and live debugging built specifically for MCP servers
🎬 Session Replay — Watch an agent's full journey through your server, tool call by tool call
🎯 Agent Intent & Goals — Understand "why" a tool was called, not just that it was
🔍 Trace Debugging — Find exactly where agents and users get stuck or confused
🚨 Catching Hallucinations — How issue tracking surfaces when an LLM goes off the rails
🔒 Privacy-First by Design — Client-side redaction so sensitive data never leaves your environment
⚡ One-Line Integration — Python, TypeScript, and Go SDKs that drop into existing stacks
📊 Works With Your Stack — Native support for OpenTelemetry, Datadog, and Sentry
🚀 The Future of MCP — Where agent observability and the MCP ecosystem are heading

If you build, ship, or maintain MCP servers — or you're trying to figure out why your AI agents misbehave in production — this one's for you.

🔔 Subscribe, like, and share for more conversations on agentic AI:
▶️ YouTube: https://www.youtube.com/@AAIFAgenticConversations
🎧 Spotify: https://open.spotify.com/show/033rZZJrQOVSSmhcStFhZA?si=rUNjFuNqRvGvAEWwqms7TA

Links & Resources:
🐱 MCPcat: https://mcpcat.io
💻 MCPcat on GitHub: https://github.com/mcpcat
👤 Naseem on LinkedIn: https://www.linkedin.com/in/naseem-al-naji
🐙 Naseem on GitHub: https://github.com/naji247

Timestamps:
[00:00] Intro
[01:41] MCP Needs Gatekeepers
[06:32] Measuring MCP Success
[13:57] MCPcat Feature Rollouts
[18:50] MCP Server Query Optimization
[26:48] UI Design Shift
[29:14] MCP Server Design Choices
[33:51] User Journey Traceability
[40:40] Agent Experience Evaluation
[45:23] AI Model Improvement Strategies

#MCP #AIAgents #Observability