Multimodal Video Understanding
Jae Lee is the cofounder and CEO of Twelve Labs, which is building video understanding infrastructure to help developers build programs that can see, hear, and understand the world. He was previously the Lead Data Scientist at the Ministry of National Defense in South Korea. He has a bachelor's degree in computer science from UC Berkeley.
In this episode, we cover a range of topics including:
- What is multimodal video understanding
- State of play in multimodal video
- The founding of Twelve Labs
- The launch of Pegasus-1
- Four core principles: Efficient Long-form Video Processing, Multimodal Understanding, Video-native Embeddings, Deep Alignment between Video and Language Embeddings
- Differences between multimodal and traditional video analysis
- In what ways can malicious actors misuse this technology?
- The future of multimodal video understanding
Jae's favorite books:
- Deep Learning (Authors: Ian Goodfellow, Yoshua Bengio, Aaron Courville)
- The Giving Tree (Author: Shel Silverstein)
--------
Where to find Prateek Joshi:
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19
Twitter: https://twitter.com/prateekvjoshi
171 episodes
All episodes
Converting Cameras into Autonomous AI Agents | Rish Gupta, CEO of Spot AI 38:50
Are AI Phone Agents Ready for Prime Time? | Alex Levin, CEO of Regal 45:21
What it Takes to Build a BI Platform | Colin Zima, CEO of Omni 40:07
Building Billing Infrastructure for AI Companies | Alvaro Morales, CEO of Orb 38:21
Turning Legal Services to APIs | Jay Madheswaran, CEO of Eve 41:02
Is LLM the New Operating System? | Anant Bhardwaj, CEO of Instabase 45:37
Building AI Agents That Actually Work | Malte Kosub, CEO of Parloa 33:54
3000 Customers, One Bold Pivot: Building the First Generative AI Copilot for Lawyers | Scott Stevenson, CEO of Spellbook 44:07
The Outer Loop of AI-Powered Coding | Merrill Lutsky, CEO of Graphite 41:26
Behind the Scenes of AI Video | Amit Jain, founder of Luma AI 48:19
Building an AI-Powered Terminal | Zach Lloyd 38:06
When Robots Go Haywire, Who Picks Up The Tab? | Amias Gerety 48:54
Building MotherDuck to a $400M Company 49:18
AI Agents Have Brains, But Where Are Their Wallets? 47:27
AI's Role In Physics, Chemistry, and Beyond 39:27
Discovering New Materials With AI 39:35
Designing Printed Circuit Boards With AI 39:26
Modifying Speech Accents In Real Time With AI 34:34
MANG VC "Round Trip" Phenomenon in AI 40:38
Building and Investing in Consumer AI 40:51
Building Autonomous Greenhouses with AI and Robotics 37:45
Developing Battery Materials with AI 33:27
Voice-to-Voice Foundation Models 39:08
Digital Replicas That Can Have Real Conversations 37:40
Breaking New Ground With Collaborative Robots 49:22
How to extract intelligence from speech data with AI 44:56
The Long Tail of AI: Understanding and Resolving Edge Cases 37:53
How Symbolic AI is Transforming Critical Infrastructure 38:08
AI Disruption: Startups vs Incumbents in the Tech Stack 46:57
Unpacking AI Startups: Metrics, Playbooks, and the Future 33:09
Biosimulation for Drug Development 32:03
Evolution of intelligence, Digital life, AGI | Flo Crivello, founder and CEO of Lindy 30:57
Thoughts on OpenAI DevDay announcements | Vikram Sreekanti, cofounder and CEO of RunLLM 33:24
Artificial Specialized Intelligence, Executive Order on AI, Open Source AI | Douwe Kiela, cofounder and CEO of Contextual AI 41:41
AI compute market, LLM infrastructure, Industrial AI | Ville Tuulos, cofounder and CEO of Outerbounds 38:02
Prateek talks about Nvidia's new AI agent that can train robots 25:07
Prateek talks about Generative AI in Biology 27:38