
Inference by Turing Post


Available Episodes (5 of 5)
  • When Will We Train Once and Learn Forever? Insights from Dev Rishi, CEO and co-founder @ Predibase
    What it actually takes to build models that improve over time. In this episode, I sit down with Devvret Rishi, CEO and co-founder of Predibase, to talk about the shift from static models to continuous learning loops, the rise of reinforcement fine-tuning (RFT), and why the real future of enterprise AI isn't chatty generalists – it's focused, specialized agents that get the job done.
    We cover:
    – The real meaning behind "train once, learn forever"
    – How RFT works (and why it might replace traditional fine-tuning)
    – What makes inference so hard in production
    – Open-source model gaps – and why evaluation is still mostly vibes
    – Dev's take on agentic workflows, intelligent inference, and the road ahead
    If you're building with LLMs, this conversation is packed with hard-earned insights from someone who's doing the work – and shipping real systems. Dev is super structured! I really enjoyed this conversation.
    Did you like the video? You know what to do: 📌 subscribe for more deep dives with the minds shaping AI, leave a comment if you have something to say, and like it if you liked it. That's it. Oh, one more thing: thank you for watching and sharing this video. We truly appreciate you.
    Guest: Devvret Rishi, co-founder and CEO at Predibase – https://predibase.com/
    If you don't see a transcript, subscribe to receive our edited conversation as a newsletter: https://www.turingpost.com/subscribe
    Chapters:
    00:00 - Intro
    00:07 - When Will We Train Once and Learn Forever?
    01:04 - Reinforcement Fine-Tuning (RFT): What It Is and Why It Matters
    03:37 - Continuous Feedback Loops in Production
    04:38 - What's Blocking Companies From Adopting Feedback Loops?
    05:40 - Upcoming Features at Predibase
    06:11 - Agentic Workflows: Definition and Challenges
    08:08 - Lessons From Google Assistant and Agent Design
    08:27 - Balancing Product and Research in a Fast-Moving Space
    10:18 - Pivoting After the ChatGPT Moment
    12:53 - The Rise of Narrow AI Use Cases
    14:53 - Strategic Planning in a Shifting Landscape
    16:51 - Why Inference Gets Hard at Scale
    20:06 - Intelligent Inference: The Next Evolution
    20:41 - Gaps in the Open Source AI Stack
    22:06 - How Companies Actually Evaluate LLMs
    23:48 - Open Source vs. Closed Source Reasoning
    25:03 - Dev's Perspective on AGI
    26:55 - Hype vs. Real Value in AI
    30:25 - How Startups Are Redefining AI Development
    30:39 - Book That Shaped Dev's Thinking
    31:53 - Is Predibase a Happy Organization?
    32:25 - Closing Thoughts
    Turing Post is a newsletter about AI's past, present, and future. Publisher Ksenia Semenova explores how intelligent systems are built – and how they're changing how we think, work, and live.
    Sign up: https://www.turingpost.com
    FOLLOW US
    Devvret and Predibase: https://devinthedetail.substack.com/ | https://www.linkedin.com/company/predibase/
    Ksenia and Turing Post: https://x.com/TheTuringPost | https://www.linkedin.com/in/ksenia-se | https://huggingface.co/Kseniase
    --------  
    28:16
  • When Will We Give AI True Memory? A conversation with Edo Liberty, CEO and founder @ Pinecone
    What happens when one of the architects of modern vector search asks whether AI can remember like a seasoned engineer, not a goldfish savant? In this episode, Edo Liberty – founder and CEO of Pinecone and one-time Amazon scientist – joins me to discuss true memory in LLMs. We unpack the gap between raw cognitive skill and workable knowledge, why RAG still feels pre-ChatGPT, and the breakthroughs needed to move from demo-ware to dependable memory stacks. Edo explains why a vector database needs to be built from the ground up (and then rebuilt many times), argues that storage – not compute – has become the next hardware frontier, and predicts a near-term future where ingesting a million documents is table stakes for any serious agent. We also touch on the thorny issues of truth, contested data, and whether knowledgeable AI is an inevitable waypoint on the road to AGI.
    Whether you wrangle embeddings for a living, scout the next infrastructure wave, or simply wonder how machines will keep their facts straight, this conversation will sharpen your view of "memory" in the age of autonomous agents. Let's find out when tomorrow's AI will finally remember what matters.
    (CORRECTION: the opening slide introduces Edo Liberty as a co-founder. We apologize for this error: Edo Liberty is the founder and CEO of Pinecone.)
    Did you like the video? You know what to do: subscribe to the channel, leave a comment if you have something to say, and like it if you liked it. That's all. Thanks.
    Guest: Edo Liberty, CEO and founder at Pinecone
    Website: https://www.pinecone.io/
    Additional reading: https://www.turingpost.com/
    Chapters:
    00:00 Intro & The Big Question: When will we give AI true memory?
    01:20 Defining AI Memory and Knowledge
    02:50 The Current State of Memory Systems in AI
    04:35 What's Missing for "True Memory"?
    06:00 Hardware and Software Scaling Challenges
    07:45 Contextual Models and Memory-Aware Retrieval
    08:55 Query Understanding as a Task, Not a String
    10:00 Pinecone's Full-Stack Approach
    11:00 Commoditization of Vector Databases?
    13:00 When Scale Breaks Your Architecture
    15:00 The Rise of Multi-Tenant & Micro-Indexing
    17:25 Dynamically Choosing the Right Indexing Method
    19:05 Infrastructure for Agentic Workflows
    20:15 The Hard Questions: What Is Knowledge?
    21:55 Truth vs. Frequency in AI
    22:45 What Is "Knowledgeable AI"?
    23:35 Is Memory a Path to AGI?
    24:40 A Book That Shaped a CEO: Endurance (Shackleton)
    26:45 What Excites or Worries You About AI's Future?
    29:10 Final Thoughts: Sea Change Is Here
    At Turing Post, we love machine learning and AI so deeply that we cover them extensively from every perspective: their past, their present, and our joint future. We explain what's happening in a way you will understand.
    Sign up: https://www.turingpost.com
    FOLLOW US
    Edo Liberty: https://www.linkedin.com/in/edo-liberty-4380164/
    Pinecone: https://x.com/pinecone
    Ksenia and Turing Post: https://huggingface.co/Kseniase | https://x.com/TheTuringPost | https://x.com/Kseniase_ | https://www.linkedin.com/company/theturingpost | https://www.linkedin.com/in/ksenia-se
    --------  
    31:01
  • When Will We Stop Coding? A conversation with Amjad Masad, CEO and co-founder @ Replit
    What happens when the biggest advocate for coding literacy starts telling people not to learn to code? In this episode, Amjad Masad, CEO and co-founder at Replit, joins me to talk about his controversial shift in thinking – from teaching millions how to code to building agents that do it for you. Are we entering a post-coding world? What even is programming when you're just texting with a machine?
    We talk about Replit's evolving vision, how software agents are already powering real businesses, and why the next billion-dollar startups might be solo founders augmented by AI. Amjad also shares what still stands in the way of fully autonomous agents, how AGI fits into his long-term view, and why open source still matters in the age of AI. Whether you're a developer, founder, or just AI-curious, this conversation will make you rethink what it means to "build software" in 2025.
    Did you like the video? You know what to do: subscribe to the channel, leave a comment if you have something to say, and like it if you liked it. That's all. Thanks.
    Guest: Amjad Masad, CEO and co-founder at Replit
    Website: https://replit.com/~
    Additional reading: https://www.turingpost.com/p/amjad
    Chapters:
    00:00 Why Amjad changed his mind about coding
    00:55 From code to agents: the next abstraction layer
    02:05 Cognitive dissonance and the birth of Replit agents
    03:38 Agent V3: toward fully autonomous software developers
    04:51 Engineering platforms for long-running agents
    05:30 Do agents actually work in 2025?
    05:48 Real-world examples: Replit agents in action
    06:36 Is Replit still a coding platform?
    07:43 Why code generation beats no-code platforms
    08:22 Can AI agents really create billionaires?
    10:59 Every startup is now an AI startup
    12:31 Solo founders and the rise of one-person AI companies
    14:00 What Amjad thinks AGI really is
    17:46 Replit as a habitat for AI
    19:50 Open source tools vs. internal no-code systems
    21:02 Replit's evolving community vision
    22:19 MCP vs. A2A: who's winning the protocol game
    23:48 The books that shaped Amjad's thinking about AI
    25:47 What excites Amjad most about an AI-powered future
    Turing Post is a newsletter about AI's past, present, and future. Publisher Ksenia Semenova explores how intelligent systems are built – and how they're changing how we think, work, and live.
    Sign up: https://www.turingpost.com
    FOLLOW US
    Amjad: https://x.com/amasad
    Replit: https://x.com/replit
    Ksenia and Turing Post: https://huggingface.co/Kseniase | https://x.com/TheTuringPost | https://x.com/Kseniase_ | https://www.linkedin.com/company/theturingpost | https://www.linkedin.com/in/ksenia-se
    --------  
    20:19
  • When Will We Solve AI Hallucinations? A conversation with Sharon Zhou, CEO @ Lamini
    In episode 001: the incredible Sharon Zhou, co-founder and CEO of Lamini. She's a generative AI trailblazer, a Stanford-trained protégé of Andrew Ng – who, along with Andrej Karpathy and others, is also an investor in her company Lamini. From co-creating one of Coursera's top AI courses to making MIT's prestigious "35 under 35" list, Sharon turns complex tech into everyday magic. She is also super fun to talk to!
    We discussed:
    – How to empower developers to understand and work with AI
    – Lamini's technical approach to AI hallucinations (it's solvable!)
    – Why benchmarks ≠ reality
    – A notable industry use case and the importance of focusing on objective outputs: subjective goals confuse the model!
    – One of my favourite moments: Sharon crushes two of the hottest topics – agents and RAG. Turns out researchers don't understand why there's all this hype around these two.
    – Open source and why it matters
    – And last but not least, Sharon (who teaches millions on Coursera) shared how to fight the lack of knowledge about AI. Her recipe: lower the barrier to entry, help people level up – plus memes!
    Please give this video a watch and tell us what you think! Likes and subscriptions to the channel are hugely appreciated.
    Chapters:
    00:00 Intro & Sharon Zhou's Early Days in GenAI
    01:25 Maternal Instincts for AI Models
    02:42 From Classics to Code: Language, Product, and AI
    04:30 The Spark Behind Lamini
    07:45 Solving Hallucinations at a Technical Level
    09:20 Benchmarks That Matter to Enterprises
    11:58 Staying Technical as a Founder
    13:27 The Agent & RAG Hype: Industry Misconceptions
    18:44 Use Cases: From Colgate to Cancer Research
    20:07 The Power of Objective Use Cases
    22:28 What Comes After Hallucinations?
    23:21 Following AI Research (and When It's Useful)
    26:23 Open Source & Model Ownership Philosophy
    28:06 Bringing AI Education to Everyone
    32:36 AI Natives & Edutainment for the Next Gen
    34:18 Outro
    Lamini: https://www.lamini.ai | https://x.com/laminiai
    Sharon Zhou: https://www.linkedin.com/in/zhousharon/ | https://x.com/realSharonZhou/
    Turing Post: https://www.turingpost.com/ | https://x.com/TheTuringPost
    Ksenia Se (publisher): https://www.linkedin.com/in/ksenia-se | https://x.com/kseniase_
    --------  
    34:16
  • When Will We Speak Without Language Barrier? A conversation with Mati Staniszewski, CEO @ ElevenLabs
    In this episode of Inference, I sit down with Mati Staniszewski, co-founder and CEO of ElevenLabs, to explore the future of AI voice, real-time multilingual translation, and emotionally rich speech synthesis. We dive into what still makes dubbing hard, how Lex Fridman's podcast was localized, and what it takes to preserve tone, timing, and emotion across languages. Mati shares why speaker detection in noisy rooms is tricky, how fast their models really are (70 ms TTS!), and the deeper strategy behind partnering with creators and enterprises to show – not just tell – what the tech can do.
    What needs to happen for natural, free-flowing multilingual conversations to become reality? Mati says: give it two or three years. Watch to learn more!
    Guest: Mati Staniszewski, co-founder and CEO at ElevenLabs
    Website: https://elevenlabs.io/
    Additional reading: https://www.turingpost.com/p/mati
    Chapters:
    0:00 Real-time voice translation
    0:11 Language barriers and AI
    0:29 Why ElevenLabs started
    0:36 Dubbing in Poland
    0:45 Preserving emotion in translation
    1:06 Tech challenges in real-time translation
    1:17 Ideal device setup
    2:32 Speaker diarization and emotional nuance
    3:04 Speech-to-text to LLM to TTS pipeline
    5:51 Concrete examples: healthcare & customer support
    7:05 Real-time AI dubbing use cases
    8:02 Lex Fridman podcast dubbing challenge
    13:01 Audio model performance & latency
    14:44 Conversational AI & multimodal future
    16:57 Product vs. research focus at ElevenLabs
    20:42 Why ElevenLabs didn't open source (yet)
    21:28 Strategy: creators, enterprises & brand building
    Turing Post is a newsletter about AI's past, present, and future. Publisher Ksenia Semenova explores how intelligent systems are built – and how they're changing how we think, work, and live.
    Sign up: https://www.turingpost.com
    FOLLOW US
    Mati: https://x.com/matistanis
    ElevenLabs: https://x.com/elevenlabsio
    Turing Post: https://x.com/TheTuringPost | https://www.linkedin.com/company/theturing...
    Ksenia: https://x.com/Kseniase_ | https://www.linkedin.com/in/ksenia-se
    SUBSCRIBE TO OUR CHANNEL, SHARE YOUR FEEDBACK
    --------  
    22:22


About Inference by Turing Post

Inference is Turing Post's way of asking the big questions about AI – and refusing easy answers. Each episode starts with a simple prompt: "When will we…?" – and follows it wherever it leads.
Host Ksenia Se sits down with the people shaping the future firsthand: researchers, founders, engineers, and entrepreneurs. The conversations are candid, sharp, and sometimes surprising – less about polished visions, more about the real work happening behind the scenes.
It's called Inference for a reason: opinions are great, but we want to connect the dots – between research breakthroughs, business moves, technical hurdles, and shifting ambitions.
If you're tired of vague futurism and ready for real conversations about what's coming (and what's not), this is your feed. Join us – and draw your own inference.