
DataTalks.Club

DataTalks.Club - the place to talk about data!

Available Episodes

5 of 181
  • Data Intensive AI - Bartosz Mikulski
    In this podcast episode, we talked with Bartosz Mikulski about Data Intensive AI.

    About the Speaker:
    Bartosz is an AI and data engineer. He specializes in moving AI projects from the good-enough-for-a-demo phase to production by building a testing infrastructure and fixing the issues detected by tests. On top of that, he teaches programmers and non-programmers how to use AI. He contributed one chapter to the book 97 Things Every Data Engineer Should Know, and he has spoken at several conferences, including Data Natives, Berlin Buzzwords, and Global AI Developer Days.

    In this episode, we discuss Bartosz's career journey, the importance of testing in data pipelines, and how AI tools like ChatGPT and Cursor are transforming development workflows. From prompt engineering to building Chrome extensions with AI, we dive into practical use cases, tools, and insights for anyone working on data-intensive AI projects. Whether you're a data engineer, an AI enthusiast, or just curious about the future of AI in tech, this episode offers valuable takeaways and real-world experiences.

    0:00 Introduction to Bartosz and his background
    4:00 Bartosz's career journey from Java development to AI engineering
    9:05 The importance of testing in data engineering
    11:19 How to create tests for data pipelines
    13:14 Tools and approaches for testing data pipelines
    17:10 Choosing Spark for data engineering projects
    19:05 The connection between data engineering and AI tools
    21:39 Use cases of AI in data engineering and MLOps
    25:13 Prompt engineering techniques and best practices
    31:45 Prompt compression and caching in AI models
    33:35 Thoughts on DeepSeek and open-source AI models
    35:54 Using AI for lead classification and LinkedIn automation
    41:04 Building Chrome extensions with AI integration
    43:51 Comparing Cursor and GitHub Copilot for coding
    47:11 Using ChatGPT and Perplexity for AI-assisted tasks
    52:09 Hosting static websites and using AI for development
    54:27 How blogging helps attract clients and share knowledge
    58:15 Using AI to assist with writing and content creation

    🔗 CONNECT WITH Bartosz
    LinkedIn: https://www.linkedin.com/in/mikulskibartosz/
    GitHub: https://github.com/mikulskibartosz
    Website: https://mikulskibartosz.name/blog/

    🔗 CONNECT WITH DataTalksClub
    Join the community - https://datatalks.club/slack.html
    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
    Check other upcoming events - https://lu.ma/dtc-events
    LinkedIn - https://www.linkedin.com/company/datatalks-club/
    Twitter - https://twitter.com/DataTalksClub
    Website - https://datatalks.club/
    --------
    54:54
  • MLOps in Corporations and Startups - Nemanja Radojkovic
    In this podcast episode, we talked with Nemanja Radojkovic about MLOps in Corporations and Startups.

    About the Speaker:
    Nemanja Radojkovic is Senior Machine Learning Engineer at Euroclear.

    In this event, we're diving into the world of MLOps, comparing life in startups versus big corporations. Joining us again is Nemanja, a seasoned machine learning engineer with experience spanning Fortune 500 companies and agile startups. We explore the challenges of scaling MLOps on a shoestring budget, the trade-offs between corporate stability and startup agility, and practical advice for engineers deciding between these two career paths, whether they are navigating legacy frameworks or experimenting with cutting-edge tools.

    1:00 MLOps in corporations versus startups
    6:03 The agility and pace of startups
    7:54 MLOps on a shoestring budget
    12:54 Cloud solutions for startups
    15:06 Challenges of cloud complexity versus on-premise
    19:19 Selecting tools and avoiding vendor lock-in
    22:22 Choosing between a startup and a corporation
    27:30 Flexibility and risks in startups
    29:37 Bureaucracy and processes in corporations
    33:17 The role of frameworks in corporations
    34:32 Advantages of large teams in corporations
    40:01 Challenges of technical debt in startups
    43:12 Career advice for junior data scientists
    44:10 Tools and frameworks for MLOps projects
    49:00 Balancing new and old technologies in skill development
    55:43 Data engineering challenges and reliability in LLMs
    57:09 On-premise vs. cloud solutions in data-sensitive industries
    59:29 Alternatives like Dask for distributed systems

    🔗 CONNECT WITH NEMANJA
    LinkedIn - / radojkovic
    GitHub - https://github.com/baskervilski

    🔗 CONNECT WITH DataTalksClub
    Join the community - https://datatalks.club/slack.html
    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
    Check other upcoming events - https://lu.ma/dtc-events
    LinkedIn - / datatalks-club
    Twitter - / datatalksclub
    Website - https://datatalks.club/
    --------
    58:03
  • Trends in Data Engineering - Adrian Brudaru
    In this podcast episode, we talked with Adrian Brudaru about the past, present and future of data engineering.

    About the speaker:
    Adrian Brudaru studied economics in Romania but soon got bored with how creative the industry was, and chose to go instead for the more factual side. He ended up in Berlin at the age of 25 and started a role as a business analyst. At the age of 30, he had had enough of startups and decided to join a corporation, but quickly found out that it did not provide the challenge he wanted.

    As going back to startups was not a desirable option either, he decided to postpone his decision by taking freelance work and has never looked back since. Five years later, he co-founded a company in the data space to try new things. This company is also looking to release open source tools to help democratize data engineering.

    0:00 Introduction to DataTalks.Club
    1:05 Discussing trends in data engineering with Adrian
    2:03 Adrian's background and journey into data engineering
    5:04 Growth and updates on Adrian's company, DLT Hub
    9:05 Challenges and specialization in data engineering today
    13:00 Opportunities for data engineers entering the field
    15:00 The "Modern Data Stack" and its evolution
    17:25 Emerging trends: AI integration and Iceberg technology
    27:40 DuckDB and the emergence of portable, cost-effective data stacks
    32:14 The rise and impact of dbt in data engineering
    34:08 Alternatives to dbt: SQLMesh and others
    35:25 Workflow orchestration tools: Airflow, Dagster, Prefect, and GitHub Actions
    37:20 Audience questions: Career focus in data roles and AI engineering overlaps
    39:00 The role of semantics in data and AI workflows
    41:11 Focusing on learning concepts over tools when entering the field
    45:15 Transitioning from backend to data engineering: challenges and opportunities
    47:48 Current state of the data engineering job market in Europe and beyond
    49:05 Introduction to Apache Iceberg, Delta, and Hudi file formats
    50:40 Suitability of these formats for batch and streaming workloads
    52:29 Tools for streaming: Kafka, SQS, and related trends
    58:07 Building AI agents and enabling intelligent data applications
    59:09 Closing discussion on the place of tools like dbt in the ecosystem

    🔗 CONNECT WITH ADRIAN BRUDARU
    LinkedIn - / data-team
    Website - https://adrian.brudaru.com/

    🔗 CONNECT WITH DataTalksClub
    Join the community - https://datatalks.club/slack.html
    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
    Check other upcoming events - https://lu.ma/dtc-events
    LinkedIn - /datatalks-club
    Twitter - /datatalksclub
    Website - https://datatalks.club/
    --------
    56:59
  • Competitive Machine Learning And Teaching - Alexander Guschin
    In this podcast episode, we talked with Alexander Guschin about launching a career off Kaggle.

    About the Speaker:
    Alexander Guschin is a Machine Learning Engineer with 10+ years of experience, a Kaggle Grandmaster ranked 5th globally, and a teacher to 100K+ students. He leads DS and SE teams and contributes to open-source ML tools.

    0:00 Starting with Machine Learning: Challenges and Early Steps
    13:05 Community and Learning Through Kaggle Sessions
    17:10 Broadening Skills Through Kaggle Participation
    18:54 Early Competitions and Lessons Learned
    21:10 Transitioning to Simpler Solutions Over Time
    23:51 Benefits of Kaggle for Starting a Career in Machine Learning
    29:08 Teamwork vs. Solo Participation in Competitions
    31:14 Schoolchildren in AI Competitions
    42:33 Transition to Industry and MLOps
    50:13 Encouraging teamwork in student projects
    50:48 Designing competitive machine learning tasks
    52:22 Leaderboard types for tracking performance
    53:44 Managing small-scale university classes
    54:17 Experience with Coursera and online teaching
    59:40 Convincing managers about Kaggle's value
    61:38 Secrets of Kaggle competition success
    63:11 Generative AI's impact on competitive ML
    65:13 Evolution of automated ML solutions
    66:22 Reflecting on competitive data science experience

    🔗 CONNECT WITH ALEXANDER GUSCHIN
    LinkedIn - https://www.linkedin.com/in/1aguschin/
    Website - https://www.aguschin.com/

    🔗 CONNECT WITH DataTalksClub
    Join DataTalks.Club: https://datatalks.club/slack.html
    Our events: https://datatalks.club/events.html
    Datalike Substack - https://datalike.substack.com/
    LinkedIn: / datatalks-club
    --------
    53:27
  • Redefining AI Infrastructure: Open-Source, Chips, and the Future Beyond Kubernetes - Andrey Cheptsov
    In this podcast episode, we talked with Andrey Cheptsov about the future of AI infrastructure.

    About the Speaker:
    Andrey Cheptsov is the founder and CEO of dstack, an open-source alternative to Kubernetes and Slurm, built to simplify the orchestration of AI infrastructure. Before dstack, Andrey worked at JetBrains for over a decade helping different teams make the best developer tools.

    During the event, Andrey discussed the complexities of AI infrastructure. We explore topics like the challenges of using Kubernetes for AI workloads, the need to rethink container orchestration, and the future of hybrid and cloud-only infrastructures. Andrey also shares insights into the role of on-premise and bare-metal solutions, edge computing, and federated learning.

    00:00 Andrey's Career Journey: From JetBrains to DStack
    5:00 The Motivation Behind DStack
    7:00 Challenges in Machine Learning Infrastructure
    10:00 Transitioning from Cloud to On-Prem Solutions
    14:30 Reflections on OpenAI's Evolution
    17:30 Open Source vs Proprietary Models: A Balanced Perspective
    21:01 Monolithic vs. Decentralized AI businesses
    22:05 The role of privacy and control in AI for industries like banking and healthcare
    30:00 Challenges in training large AI models: GPUs and distributed systems
    37:03 DeepSpeed's efficient training approach vs. brute force methods
    39:00 Challenges for small and medium businesses: hosting and fine-tuning models
    47:01 Managing Kubernetes challenges for AI teams
    52:00 Hybrid vs. cloud-only infrastructure
    56:03 On-premise vs. bare-metal solutions
    58:05 Exploring edge computing and its challenges

    🔗 CONNECT WITH ANDREY CHEPTSOV
    Twitter - / andrey_cheptsov
    LinkedIn - / andrey-cheptsov
    GitHub - https://github.com/dstackai/dstack/
    Website - https://dstack.ai/

    🔗 CONNECT WITH DataTalksClub
    Join DataTalks.Club: https://datatalks.club/slack.html
    Our events: https://datatalks.club/events.html
    Datalike Substack - https://datalike.substack.com/
    LinkedIn: / datatalks-club
    --------
    56:55


v7.11.0 | © 2007-2025 radio.de GmbH
Generated: 3/24/2025 - 9:34:24 PM