
Content provided by Valentino Stoll and Joe Leo. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Valentino Stoll and Joe Leo or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://he.player.fm/legal.

Evaluating LLMs with Leva

1:00:00
 

In this episode of the Ruby AI Podcast, host Valentino Stoll talks with special guest Kieran Klaassen, a prominent figure in the Ruby AI space. Kieran recently gave a talk at the San Francisco Ruby Meetup about his new gem, Leva, which focuses on LLM evaluations in Ruby. He discusses his background, his passion for AI and Ruby, and his journey building AI products, including Cora, a tool that helps manage email inboxes by using AI to categorize and summarize emails. Together, Valentino and Kieran explore the process, challenges, and best practices of creating AI-driven gems and tools in Ruby, the importance of evaluations, and the fun and creative aspects of integrating AI into Ruby on Rails projects.

Mentioned in the show:

  • Kieran Klaassen – Ruby developer, creator of Cora and Leva.
  • Leva gem – Kieran's LLM evaluation framework for Rails.
  • Jumpstart Pro – described as “the best Ruby on Rails SaaS template out there”.
  • Stepper / Stepper Motor (workflow engine) – a “journey” with steps for background jobs.
  • Jaccard Index – A metric for set similarity (|A∩B|/|A∪B|).
  • LangSmith – a platform for building production-grade LLM applications.
  • Morph LLM – The Fastest Way to Apply AI Edits (4500+ tokens/sec).
  • Friday AI Agent – An AI-powered coding agent that handles PRs from start to finish.
  • DSPy.rb – Framework for building AI agents and optimizing prompts.
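
The Jaccard index listed above, |A∩B|/|A∪B|, can be sketched in a few lines. This is only an illustration in Python, not code from the episode or from Leva: the function name `jaccard_index`, the choice to score two empty sets as 1.0, and the sample label sets are all assumptions made for the example.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical labels: what a model predicted vs. what a reviewer expected.
predicted = {"newsletter", "receipt", "urgent"}
expected = {"receipt", "urgent", "personal"}

# 2 shared labels out of 4 distinct labels
print(jaccard_index(predicted, expected))  # 0.5
```

A score of 1.0 means the two sets match exactly and 0.0 means they share nothing, which makes the metric a simple way to compare a set of predicted labels against an expected set.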

Highlights:

00:00 Introduction and Guest Welcome

00:53 Kieran's Background and AI Journey

01:20 Building AI Tools and the Leva Gem

03:47 Challenges and Best Practices in AI Development

07:16 Evaluations and Real-World Applications

07:36 Community Recognition and Adoption

12:37 Prompt Engineering and Model Testing

22:06 Leveraging AI for Workflow Optimization

28:35 Visualizing Workflows and Tools

31:44 Exploring Hybrid Orchestration Layers

33:15 Debating Deterministic Workflows vs. Agent Flows

34:28 The Fun of Experimenting with AI and Ruby

34:55 Building Gems and Learning Through Creation

40:03 The Value of Rails in AI Development

46:28 Evaluating AI Outputs and Metrics

50:40 Annotation and Continuous Improvement

53:50 Future of AI and Rails Integration

54:54 Closing Thoughts and Recommendations


Chapters

1. Evaluating LLMs with Leva (00:00:00)

2. Kieran's Background and AI Journey (00:00:53)

3. Building AI Tools and the Leva Gem (00:01:10)

4. Challenges and Best Practices in AI Development (00:03:47)

5. Evaluations and Real-World Applications (00:07:16)

6. Community Recognition and Adoption (00:07:36)

7. Prompt Engineering and Model Testing (00:12:37)

8. Leveraging AI for Workflow Optimization (00:22:06)

9. Visualizing Workflows and Tools (00:28:35)

10. Exploring Hybrid Orchestration Layers (00:31:44)

11. Debating Deterministic Workflows vs. Agent Flows (00:33:15)

12. The Fun of Experimenting with AI and Ruby (00:34:28)

13. Building Gems and Learning Through Creation (00:34:55)

14. The Value of Rails in AI Development (00:40:03)

15. Evaluating AI Outputs and Metrics (00:46:28)

16. Annotation and Continuous Improvement (00:50:40)

17. Future of AI and Rails Integration (00:53:50)

18. Closing Thoughts and Recommendations (00:54:54)


