Artwork

תוכן מסופק על ידי Demetrios. כל תוכן הפודקאסטים כולל פרקים, גרפיקה ותיאורי פודקאסטים מועלים ומסופקים ישירות על ידי Demetrios או שותף פלטפורמת הפודקאסט שלהם. אם אתה מאמין שמישהו משתמש ביצירה שלך המוגנת בזכויות יוצרים ללא רשותך, אתה יכול לעקוב אחר התהליך המתואר כאן https://he.player.fm/legal.
Player FM - אפליקציית פודקאסט
התחל במצב לא מקוון עם האפליקציה Player FM !

Efficient GPU infrastructure at LinkedIn // Animesh Singh // MLOps Podcast #299

59:13
 
שתפו
 

Manage episode 473967142 series 3241972
תוכן מסופק על ידי Demetrios. כל תוכן הפודקאסטים כולל פרקים, גרפיקה ותיאורי פודקאסטים מועלים ומסופקים ישירות על ידי Demetrios או שותף פלטפורמת הפודקאסט שלהם. אם אתה מאמין שמישהו משתמש ביצירה שלך המוגנת בזכויות יוצרים ללא רשותך, אתה יכול לעקוב אחר התהליך המתואר כאן https://he.player.fm/legal.

Building Trust Through Technology: Responsible AI in Practice // MLOps Podcast #299 with Animesh Singh, Executive Director, AI Platform and Infrastructure of LinkedIn.

Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter

// AbstractAnimesh discusses LLMs at scale, GPU infrastructure, and optimization strategies. He highlights LinkedIn's use of LLMs for features like profile summarization and hiring assistants, the rising cost of GPUs, and the trade-offs in model deployment. Animesh also touches on real-time training, inference efficiency, and balancing infrastructure costs with AI advancements. The conversation explores the evolving AI landscape, compliance challenges, and simplifying architecture to enhance scalability and talent acquisition.

// BioExecutive Director, AI and ML Platform at LinkedIn | Ex IBM Senior Director and Distinguished Engineer, Watson AI and Data | Founder at Kubeflow | Ex LFAI Trusted AI NA Chair

Animesh is the Executive Director leading the next-generation AI and ML Platform at LinkedIn, enabling the creation of the AI Foundation Models Platform, serving the needs of 930+ Million members of LinkedIn. Building Distributed Training Platforms, Machine Learning Pipelines, Feature Pipelines, Metadata engines, etc. Leading the creation of the LinkedIn GAI platform for fine-tuning, experimentation and inference needs. Animesh has more than 20 patents and 50+ publications.

Past IBM Watson AI and Data Open Tech CTO, Senior Director, and Distinguished Engineer, with 20+ years experience in the Software industry, and 15+ years in AI, Data, and Cloud Platform. Led globally dispersed teams, managed globally distributed projects, and served as a trusted adviser to Fortune 500 firms. Played a leadership role in creating, designing, and implementing Data and AI engines for AI and ML platforms, led Trusted AI efforts, and drove the strategy and execution for Kubeflow, OpenDataHub, and execution in products like Watson OpenScale and Watson Machine Learning.

// Related LinksComposable Memory for GPU Optimization // Bernie Wu // Pod #270 - https://youtu.be/ccaDEFoKwko

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExploreJoin our slack community [https://go.mlops.community/slack]Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register]MLOps Swag/Merch: [https://shop.mlops.community/]Connect with Demetrios on LinkedIn: /dpbrinkmConnect with Animesh on LinkedIn: /animeshsingh1

Timestamps:[00:00] Animesh's preferred coffee[00:16] Takeaways[02:12] What is working? [07:00] What's not working?[13:40] LLM vs Rexis Efficiency[21:49] GPU Utilization and Architecture[27:32] GPU reliability concerns[36:50] Memory Bottleneck in AI[41:06] Optimizing LLM Checkpointing[46:51] Checkpoint Offloading and Platform Design[54:55] Workflow Divergence Points[58:41] Wrap up

  continue reading

446 פרקים

Artwork
iconשתפו
 
Manage episode 473967142 series 3241972
תוכן מסופק על ידי Demetrios. כל תוכן הפודקאסטים כולל פרקים, גרפיקה ותיאורי פודקאסטים מועלים ומסופקים ישירות על ידי Demetrios או שותף פלטפורמת הפודקאסט שלהם. אם אתה מאמין שמישהו משתמש ביצירה שלך המוגנת בזכויות יוצרים ללא רשותך, אתה יכול לעקוב אחר התהליך המתואר כאן https://he.player.fm/legal.

Building Trust Through Technology: Responsible AI in Practice // MLOps Podcast #299 with Animesh Singh, Executive Director, AI Platform and Infrastructure of LinkedIn.

Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter

// AbstractAnimesh discusses LLMs at scale, GPU infrastructure, and optimization strategies. He highlights LinkedIn's use of LLMs for features like profile summarization and hiring assistants, the rising cost of GPUs, and the trade-offs in model deployment. Animesh also touches on real-time training, inference efficiency, and balancing infrastructure costs with AI advancements. The conversation explores the evolving AI landscape, compliance challenges, and simplifying architecture to enhance scalability and talent acquisition.

// BioExecutive Director, AI and ML Platform at LinkedIn | Ex IBM Senior Director and Distinguished Engineer, Watson AI and Data | Founder at Kubeflow | Ex LFAI Trusted AI NA Chair

Animesh is the Executive Director leading the next-generation AI and ML Platform at LinkedIn, enabling the creation of the AI Foundation Models Platform, serving the needs of 930+ Million members of LinkedIn. Building Distributed Training Platforms, Machine Learning Pipelines, Feature Pipelines, Metadata engines, etc. Leading the creation of the LinkedIn GAI platform for fine-tuning, experimentation and inference needs. Animesh has more than 20 patents and 50+ publications.

Past IBM Watson AI and Data Open Tech CTO, Senior Director, and Distinguished Engineer, with 20+ years experience in the Software industry, and 15+ years in AI, Data, and Cloud Platform. Led globally dispersed teams, managed globally distributed projects, and served as a trusted adviser to Fortune 500 firms. Played a leadership role in creating, designing, and implementing Data and AI engines for AI and ML platforms, led Trusted AI efforts, and drove the strategy and execution for Kubeflow, OpenDataHub, and execution in products like Watson OpenScale and Watson Machine Learning.

// Related LinksComposable Memory for GPU Optimization // Bernie Wu // Pod #270 - https://youtu.be/ccaDEFoKwko

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExploreJoin our slack community [https://go.mlops.community/slack]Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register]MLOps Swag/Merch: [https://shop.mlops.community/]Connect with Demetrios on LinkedIn: /dpbrinkmConnect with Animesh on LinkedIn: /animeshsingh1

Timestamps:[00:00] Animesh's preferred coffee[00:16] Takeaways[02:12] What is working? [07:00] What's not working?[13:40] LLM vs Rexis Efficiency[21:49] GPU Utilization and Architecture[27:32] GPU reliability concerns[36:50] Memory Bottleneck in AI[41:06] Optimizing LLM Checkpointing[46:51] Checkpoint Offloading and Platform Design[54:55] Workflow Divergence Points[58:41] Wrap up

  continue reading

446 פרקים

כל הפרקים

×
 
Loading …

ברוכים הבאים אל Player FM!

Player FM סורק את האינטרנט עבור פודקאסטים באיכות גבוהה בשבילכם כדי שתהנו מהם כרגע. זה יישום הפודקאסט הטוב ביותר והוא עובד על אנדרואיד, iPhone ואינטרנט. הירשמו לסנכרון מנויים במכשירים שונים.

 

מדריך עזר מהיר

האזן לתוכנית הזו בזמן שאתה חוקר
הפעלה