The Generative AI Meetup Podcast

75 episodes

Has China finally caught up?
27/07/2026 | 1h 47 mins.
https://novacut.ai/

In this episode, we break down the biggest stories shaping the AI landscape — from Anthropic's regulatory stance and OpenAI's monetization shift to the latest open-source breakthroughs and model pricing wars.

0:00 Anthropic’s Frustrating Stance
5:32 AI Access and the Intelligence Gap
14:17 Defending Anthropic’s Regulation Approach
25:10 Google DeepMind and Cybersecurity Risks
29:27 GPT-5.6 Sol and Coding Abilities
35:24 Alignment and the Knife Analogy
39:42 Google Gemma, Flash, and Market Value
51:11 AI Lab Focus: Speed vs. Specialization
58:48 Inkling: Mira Murati’s New Model
1:06:21 Open Source and Fine-Tuning
1:12:43 On-Premise Hardware Costs
1:19:54 Open Source Models Hit Frontier
1:23:32 Model Pricing Comparison
1:32:43 The Model Routing Problem
1:35:08 Grok 4.5 and Cursor Partnership
1:40:45 OpenAI’s Monetization Shift
1:42:42 Sponsor: Nova Cut AI
The Chinese DoorDash just entered the AI Race
07/07/2026 | 1h 33 mins.
https://novacut.ai/
https://genaimeetup.com/

0:00 Longcat: 1.6T Model Without US GPUs
1:08 Meituan: The Super App Behind Longcat
4:54 China's Exploding AI Competitor Scene
8:23 Inside Huawei's Ascend GPU Architecture
17:14 Cost & Energy: Huawei vs Nvidia
29:21 OpenAI's Custom Inference Chip Strategy
36:23 Software Optimizations: The Path to 10,000x
45:38 GPT-5.6: Sol, Terra, Luna Models
58:02 CursorBench & the New Coding Benchmarks
1:22:16 Meta's Non-Invasive Brain-to-Text
1:26:30 Anthropic Science: AI for Researchers
1:30:59 Outro & Community Ask

China just dropped a 1.6-trillion-parameter model without access to US GPUs — and it's running on Huawei's homegrown Ascend chips. In this episode, we break down:

🔹 Longcat: the massive model built by Meituan, China's super app giant
🔹 China's exploding AI competitor ecosystem
🔹 Inside the Huawei Ascend GPU architecture: specs, costs, and energy tradeoffs vs. Nvidia
🔹 OpenAI's custom inference chip strategy
🔹 The software optimizations driving a 10,000x efficiency leap
🔹 GPT-5.6: Sol, Terra, and Luna models explained
🔹 New coding benchmarks with CursorBench
🔹 Meta's non-invasive brain-to-text research
🔹 Anthropic Science: AI built for researchers
What happened to my Fable?
23/06/2026 | 1h 29 mins.
https://novacut.ai/

Description:

Anthropic pulls access to Fable, and China responds the same day with GLM 5.2. In this episode we break down the escalating AI arms race, US export controls on chips and frontier models, and whether the "Great Firewall of America" is already here.

⏱️ Topics:

Anthropic restricts Fable — what happened and why

China's GLM 5.2 release and how close they're catching up

US trust, surveillance, and AI gatekeeping

Token pricing chaos — cost per task vs. cost per token

Model routing, loop engineering, and autonomous agents

Anthropic's Mythos model and Fable safeguard philosophy

Xiaomi MiMo V2.5 Pro UltraSpeed

Midjourney's bizarre health spa pivot

AI Engineer Conference wrap-up

🔗 Links & Resources:

Fable

https://www.anthropic.com/news/claude-fable-5-mythos-5

https://www.theregister.com/security/2026/06/15/feds-freaked-over-fable-5-after-simple-fix-this-code-prompt-not-jailbreak-says-researcher/5255827 “Fix this code”
https://support.claude.com/en/articles/14328960-identity-verification-on-claude

Midjourney

https://www.midjourney.com/medical/blogpost
Full body ultrasound CT scanner

Xiaomi 1000tps

https://mimo.xiaomi.com/blog/mimo-tilert-1000tps (MiMo-V2.5-Pro-UltraSpeed: Pushing 1T-Parameter Model Generation Speed to 1000 TPS)

Best open-source model

https://z.ai/blog/glm-5.2

📌 Timestamps are listed below.

#AIPodcast #Anthropic #Fable #GLM52 #AIArmsRace #LLM #GenAI

0:00 Intro: Anthropic restricts Fable access
1:00 China's response and GLM 5.2 release
1:55 US trust and AI model reliability
2:37 Geopolitics and AI regulations
4:17 AI arms race and export control limits
5:49 Fable usage experience and value
6:30 Z.AI subscription and pricing comparison
9:07 Subscription limits vs. API usage
10:30 GLM 5.2 token limits and utility
13:52 China catches up: GLM vs. US models
15:06 Intelligence index and model cost trend
17:40 Token pricing complexity and value
23:30 Cost per task vs. cost per token
24:35 Model routing and usage optimization
29:43 Loop engineering and autonomous agents
33:23 NovaCut: AI video editor and ad loops
37:07 Anthropic's Fable re-release timeline
38:41 US gatekeeping and China's advantage
40:11 Great Firewall of America risks
44:07 Mass surveillance and free speech
46:01 Global AI trust and market shift
47:48 Enforcing identity checks on AI
48:54 Export bans on chips and hardware
51:52 US restricts allies from frontier models
52:39 Anthropic's talent and Mythos model
53:51 Fable safeguards and US government view
1:00:12 Xiaomi MiMo V2.5 Pro UltraSpeed model
1:02:39 Optimizing for intelligence, cost, speed, and size
1:13:19 Midjourney's unexpected health spa pivot
1:28:30 AI Engineer conference and podcast wrap
The Best Open Source US Model (Right behind China)
07/06/2026 | 1h 54 mins.
https://novacut.ai/

https://genaimeetup.com/

Anthropic has officially closed a $65 billion Series H at a $965 billion valuation, roughly 2.5x its valuation from just 100 days ago. Meanwhile, funding is flowing across the ecosystem: Fireworks AI at $15B, Baseten at $11B, OpenRouter's $113M Series B, and Cognition AI's $1B Series D.

NVIDIA went on an open-source super week with Nemotron 3 Ultra, Cosmos 3, and Nemotron 3.5 ASR. Microsoft dropped 5 new MAI models. Google released Gemma 4 12B, and Anthropic shipped Opus 4.8.

On the benchmarks front, DeepSWE crowns GPT-5.5 as the leader in long-horizon coding tasks, while ITBench shows even frontier models struggle with real-world SRE incidents — Claude Opus 4.7 tops out at just 47%.

Plus: Cloudflare acquires VoidZero to build the future of AI-native edge development, and Google is paying SpaceX $920M/month for compute.

Topics covered:
• Anthropic's $65B Series H and path to $1T
• Fireworks AI, Baseten, OpenRouter & Cognition funding rounds
• Microsoft's 5 new MAI models
• NVIDIA's open-source super week (Nemotron, Cosmos 3)
• MiniMax M3, Gemma 4 12B, JetBrains Mellum2, Opus 4.8
• DeepSWE benchmark: GPT-5.5 leads long-horizon coding
• ITBench: Frontier models under 50% on real SRE tasks
• Cloudflare + VoidZero for AI-native edge dev
• Google's $920M/month SpaceX compute deal

#AI #Anthropic #NVIDIA #OpenAI #AInews #TechNews #LLM

Funding rounds
Anthropic formally confirmed the close of its $65 billion Series H funding round at a post-money valuation of $965 billion. This is roughly a 2.5-fold increase over its $380 billion Series G valuation from February 2026, adding $585 billion in value in approximately 100 days.

https://www.anthropic.com/news/series-h

Fireworks AI is raising at a $15 billion valuation, a near-fourfold increase from its $4 billion Series C valuation recorded in October 2025.

It processes 15 trillion tokens daily for major production clients including Cursor, Notion, and Perplexity.

https://finance.yahoo.com/sectors/technology/articles/fireworks-ai-eyes-15-billion-174609357.html

Baseten is raising $1 billion at an $11 billion valuation.

Its annualized revenue skyrocketed from $200 million to $600 million over a single quarter.

https://techstartups.com/2026/05/26/ai-inference-startup-baseten-in-talks-to-raise-1-billion-at-11-billion-valuation/

OpenRouter has secured a $113 million Series B funding round.

OpenRouter's traffic has grown rapidly, with weekly production throughput expanding fivefold from 5 trillion to 25 trillion tokens over six months.

https://www.businesswire.com/news/home/20260526953416/en/OpenRouter-Raises-%24113-Million-CapitalG-led-Series-B-as-Weekly-Volume-Explodes-to-25T-Tokens

Further up the stack, Cognition AI secured a $1 billion Series D round led by Lux Capital and 8VC.

https://cognition.ai/blog/series-d
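
The growth multiples quoted in the funding notes above can be sanity-checked with a few lines of arithmetic. This is only a minimal sketch in Python: the `multiple` helper and the dictionary layout are illustrative assumptions, and the figures are copied from these notes rather than independently verified.

```python
# Figures copied from the show notes above; nothing here is independently verified.
# The helper name and data layout are illustrative assumptions.

def multiple(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before

# (before, after) pairs as quoted in the notes
claims = {
    "Anthropic valuation ($B)": (380, 965),
    "Fireworks AI valuation ($B)": (4, 15),
    "Baseten annualized revenue ($M)": (200, 600),
    "OpenRouter weekly tokens (T)": (5, 25),
}

multiples = {name: round(multiple(b, a), 2) for name, (b, a) in claims.items()}
anthropic_added = 965 - 380  # value added since the Series G, in $B

for name, m in multiples.items():
    print(f"{name}: {m}x")
print(f"Anthropic value added: ${anthropic_added}B")
```

Run as-is, this gives about 2.54x for Anthropic, 3.75x for Fireworks AI (the "near fourfold" figure), 3x for Baseten, and 5x for OpenRouter, with $585B added to Anthropic's valuation, in line with the rounded numbers quoted above.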

Model Releases
MAI models:

MAI-Code-1-Flash: A 5-billion active parameter model optimized for ultra-low latency within GitHub Copilot and VS Code.

MAI-Image-2.5: A high-fidelity image generation model ranking third on global image evaluation arenas, outperforming competing architectures like Nano Banana Pro.

MAI-Transcribe-1.5: A multi-lingual speech processing engine offering fivefold speed improvements across 43 languages.

MAI-Voice-2: Natural audio and voice generation across 15 languages, available at a highly competitive price point.

Web IQ: A search-grounding API engineered to directly compete with Perplexity.

https://microsoft.ai/models/

https://www.peoplematters.in/news/ai-and-emerging-tech/uber-imposes-dollar1500-monthly-ai-spending-limit-on-employees-amid-rising-costs-50073

Nvidia has executed an "Open-Source Super Week," positioning itself as a dominant software and model publisher:

Nemotron 3 Ultra (the best US open-weights model, though still behind China's): A massive 550-billion parameter MoE (55 billion active) designed with a 1-million token context window, optimized specifically for high-throughput, cyclical agent loops. It achieved peak throughput rates of 400 tokens per second on day-zero optimized clusters.

Cosmos 3: A physical AI world-modeling framework comprising 16-billion Nano and 64-billion Super variants. Built on a Mixture-of-Transformers (MoT) architecture, Cosmos 3 natively binds textual, visual, auditory, and physical kinetic vectors.

Nemotron 3.5 ASR: A highly compact 0.6-billion parameter streaming speech recognition model pushing sub-100 millisecond latencies across 40 language locales.

https://www.minimax.io/models/text/m3

MiniMax M3: A 1-million token context model hitting 59.0% on SWE-Bench Pro and 74.2% on MCP Atlas, though noted for high token consumption due to intensive internal self-validation loops.

https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/

Gemma 4 12B: Google's Apache 2.0 on-device model, which utilizes an encoder-free architecture that projects vision and audio vectors directly into the text-token space, bypassing separate CLIP-style encoders to minimize local memory footprints.

https://www.jetbrains.com/mellum/

JetBrains Mellum2: A compact 12-billion parameter MoE (2.5 billion active) engineered for ultra-low latency routing and retrieval-augmented generation (RAG) sub-agents within developer IDEs.

Opus 4.8

https://www.anthropic.com/news/claude-opus-4-8

https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai-compute-capacity.html

Benchmarks:

https://deepswe.datacurve.ai/blog
https://venturebeat.com/technology/deepswe-blows-up-the-ai-coding-leaderboard-crowns-gpt-5-5-and-finds-claude-opus-exploiting-a-benchmark-loophole (GPT-5.5 is the winner in long-horizon tasks)

DeepSWE is a highly complex software engineering benchmark focused on original, long-horizon tasks across five distinct programming languages. Comprising 113 chaotic tasks across 91 live, production-grade repositories, DeepSWE forces agents to generate 5.5 times more code and modify an average of 7 separate files per task compared to standard evaluations. On this challenging leaderboard, GPT-5.5 leads with a score of 70%, establishing a significant 16-percentage-point lead over contemporary alternatives.

I think older benchmarks where models reach ~90% accuracy can be considered saturated. A difference of a few percentage points doesn't give us any useful signal.

https://research.ibm.com/publications/developing-ai-agents-for-it-automation-tasks-with-itbench

ITBench-AA is an evaluation framework focusing on live Kubernetes incident response and Site Reliability Engineering (SRE) operations. It comprises 59 live, containerized SRE incident snapshots, and the results are sobering: every frontier model scored under 50% on successful incident resolution, with Claude Opus 4.7 leading at 47% and GPT-5.5 following closely at 46%.

Edge AI announcements:
https://www.cloudflare.com/press/press-releases/2026/cloudflare-acquires-voidzero-to-build-the-future-of-the-ai-native-web/

The consolidation of the AI-native developer stack has reached the runtime virtualization layer. Cloudflare recently completed the acquisition of VoidZero, the development group responsible for Vite, Vitest, Rolldown, and Oxc, backing the transaction with a $1 million open-source ecosystem fund. This acquisition is highly strategic; as autonomous agents write an increasing proportion of production software, local development environments, compilation pipelines, and bundlers must be optimized for execution speeds that match agent speeds.

Cloudflare's goal is to construct a localized, full-stack edge playground. In this sandbox, AI agents can generate, test, bundle (utilizing the highly parallelized, Rust-based Oxc and Rolldown engines), and deploy entire web applications end-to-end within milliseconds. This architecture completely bypasses traditional local machine container bottlenecks, enabling high-velocity agent loops to execute in a fully sandboxed, web-scale edge runtime.
Karpathy Joins Anthropic and the AI Compute Gold Rush
24/05/2026 | 1h 39 mins.
This week on AI Meta, we break down Andrej Karpathy’s move to Anthropic, Claude’s growing developer
mindshare, and why recursive self-improvement may be the next major frontier in AI. We also cover
Google’s latest Gemini announcements, Anthropic’s reported compute deal with xAI/SpaceX, the rise of
gray-market Claude API access in China, OpenAI’s ongoing drama, Cerebras, Nvidia, Intel, and Leopold
Aschenbrenner’s massive AI infrastructure bets.

Plus: SpaceX IPO speculation, Cursor, Grok, and why the AI economy increasingly looks like a global
casino. Not financial advice.

https://novacut.ai

About The Generative AI Meetup Podcast

Hosted by Mark and Shashank, software engineers and organizers in Silicon Valley. Get their grounded perspective each week as they explore the generative AI landscape through news analysis, tech discussions, hands-on experiments, and clear explanations. Dive into the latest language models, AI agent capabilities, and RAG techniques. Understand the hardware race, key research, startup trends, benchmarks, and the real-world impact of AI across industries like healthcare, robotics, and creative work. We also test AI limits, explain core concepts, discuss ethics, and interview builders shaping the field. For engineers, developers, researchers, and anyone seeking a practical understanding of AI’s rapid evolution and its applications.
