Data Augmentation in Natural Language Processing

The Data Exchange with Ben Lorica

81 subscribers

Added six years ago

Content provided by Ben Lorica. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Ben Lorica or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://he.player.fm/legal

4 years ago · 51:44

This week’s guests are Steven Feng, a graduate student, and Ed Hovy, a research professor, both at the Language Technologies Institute of Carnegie Mellon University. We discussed their recent survey paper on data augmentation approaches in NLP (GitHub). Data augmentation is an active field of research on techniques for increasing the diversity of training examples without explicitly collecting new data. One key reason such strategies matter is that augmented data can act as a regularizer, reducing overfitting when training models.
Subscribe: Apple • Android • Spotify • Stitcher • Google • RSS.
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
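
As a rough illustration of the idea summarized above (creating varied training examples without collecting new data), here is a minimal Python sketch of two simple token-level perturbations: swapping two words and randomly dropping words. It is not taken from the episode or the survey paper. The function names (random_swap, random_delete, augment) and the default parameters are hypothetical choices made for this example only.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n_copies=3, seed=0):
    """Produce n_copies perturbed variants of a whitespace-tokenized sentence."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    tokens = sentence.split()
    variants = []
    for _ in range(n_copies):
        perturbed = random_swap(tokens, n_swaps=1, rng=rng)
        perturbed = random_delete(perturbed, p=0.1, rng=rng)
        variants.append(" ".join(perturbed))
    return variants


if __name__ == "__main__":
    for variant in augment("the movie was surprisingly good"):
        print(variant)
```

Each variant would normally keep the label of the original sentence, which is only safe when the perturbation does not change the meaning. Whether that holds depends on the task, so these settings are a starting point to check against your own data, not a recommendation.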

294 episodes

#Business News #News #Tech News #Ben Lorica #Data #Machine Learning #Data Science #Data Engineering #Cloud Computing #Tech

All episodes

The Quantum Advantage Is Real—But Where's the Infrastructure? 45:53

4 days ago

Jennifer Prendki explains that while universal quantum computers are a decade away, specialized quantum accelerators are already tackling AI problems in finance and pharma. She argues that the biggest hurdle is not the hardware but the profound software and infrastructure gap, because fundamental principles like the “no-cloning theorem” break traditional MLOps.

Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple · Spotify · Overcast · Pocket Casts · AntennaPod · Podcast Addict · Amazon · RSS.
Detailed show notes, with links to many references, can be found on The Data Exchange web site.

From Human-Readable to Machine-Usable: The New API Stack 38:23

11 days ago

Sagar Batchu, CEO of Speakeasy, joins the podcast to discuss the critical shift in API development as AI agents become primary consumers.

Why Voice Security Is Your Next Big Problem 41:37

18 days ago

In this episode, Yishay Carmiel and Roy Zanbel of Apollo Defend discuss the rapidly evolving landscape of voice AI and its emerging security threats.

Unlocking Unstructured Data with LLMs 27:46

25 days ago

Shreya Shankar is a PhD student at UC Berkeley in the EECS department. This episode explores how Large Language Models (LLMs) are revolutionizing the processing of unstructured enterprise data like text documents and PDFs. It introduces DocETL, a framework using a MapReduce approach with LLMs for semantic extraction, thematic analysis, and summarization at scale.

Building Production-Grade RAG at Scale 31:24

5 weeks ago

Douwe Kiela, Founder and CEO of Contextual AI, discusses why RAG isn’t obsolete despite massive context windows, explaining how RAG 2.0 represents a fundamental shift to treating retrieval-augmented generation as an end-to-end trainable system.

Unlocking AI Superpowers in Your Terminal 44:59

6 weeks ago

Zach Lloyd, Founder/CEO of Warp, joins the podcast to discuss how Warp is revolutionizing the command-line terminal by integrating AI.

From Vibe Coding to Autonomous Agents 51:16

7 weeks ago

Jackie Brosamer and Brad Axen from Block discuss codename goose (Goose), their open-source AI agent designed to automate complex engineering and knowledge work.

How a Public-Benefit Startup Plans to Make Open Source the Default for Serious AI 48:45

8 weeks ago

Oumi Labs CEO Manos Koukoumidis lays out a vision for “unconditionally open” foundation models (where data, code, weights, and recipes are all transparent and reproducible), arguing that this is the only path to production-grade, trustworthy AI.

The Highly Uncertain Future of OpenAI’s Dominance 54:07

9 weeks ago

Dan Schwarz (CEO & Co-Founder, Futuresearch) explains why his firm finds OpenAI’s $125 billion revenue projection highly implausible, citing fierce competition, pressure on ChatGPT/API revenue, and fleeting technical advantages.

Beyond Guardrails: Defending LLMs Against Sophisticated Attacks 44:31

10 weeks ago

Jason Martin is an AI Security Researcher at HiddenLayer. This episode explores “policy puppetry,” a universal attack technique bypassing safety features in all major language models using structured formats like XML or JSON.

Navigating the Generative AI Maze in Business 49:35

11 weeks ago

Evangelos Simoudis is Managing Director at Synapse Partners, a firm that helps corporations apply AI and invests in startups developing data-driven AI applications. This episode explores the current state of enterprise AI adoption, distinguishing between the steady progress of traditional AI and the experimental phase of generative AI.

The Practical Realities of AI Development 37:30

12 weeks ago

Lin Qiao, CEO of Fireworks AI, dives into the practical challenges AI developers face, from UX/DX hurdles to complex systems engineering. Discover key trends like the convergence of open-source and proprietary models, the rise of agentic workflows, and strategies for optimizing quality, speed, and cost.

Beyond the Demo: Building AI Systems That Actually Work 27:36

13 weeks ago

Hamel Husain is the founder of Parlance Labs. He discusses how successful AI implementation requires fundamental data science skills often overlooked in current educational resources that focus too heavily on tools and frameworks.

Support our work by leaving a small tip: https://buymeacoffee.com/gradientflow

Vibe Coding and the Rise of AI Agents: The Future of Software Development is Here 36:35

14 weeks ago

Steve Yegge is an evangelist at Sourcegraph, a startup that is industrializing software development with AI agents. This episode explores the paradigm shift in software development with the rise of “vibe coding” and AI agents, moving beyond traditional code completion. It discusses how developers are transitioning from line-by-line coding to orchestrating AI, emphasizing the crucial need for trust, verification, and new skill sets like AI engineering and humanities.

2025 Artificial Intelligence Index 51:44

15 weeks ago

Nestor Maslej is a Research Manager at Stanford's HAI and editor-in-chief of the 2025 AI Index Report.
