Automated Reasoning to Prevent LLM Hallucination with Byron Cook - #712
Today, we're joined by Byron Cook, VP and distinguished scientist in the Automated Reasoning Group at AWS, to dig into the underlying technology behind the newly announced Automated Reasoning Checks feature of Amazon Bedrock Guardrails. Automated Reasoning Checks uses mathematical proofs to help LLM users safeguard against hallucinations. We explore recent advancements in the field of automated reasoning and some of the ways it is applied, both broadly and across AWS, where it is used to enhance security, cryptography, virtualization, and more. We discuss how the new feature helps users generate, refine, validate, and formalize policies, and how those policies can be deployed alongside LLM applications to ensure the accuracy of generated text. Finally, Byron shares the benchmarks they’ve applied, the use of techniques like ‘constrained coding’ and ‘backtracking,’ and the future co-evolution of automated reasoning and generative AI.
The complete show notes for this episode can be found at https://twimlai.com/go/712.
750 episodes
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
All episodes

From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731 1:01:25
How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730 1:07:27
CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729 56:18
Generative Benchmarking with Kelly Hong - #728 54:17
Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727 1:34:06
Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726 51:45
Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725 1:09:07
Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724 50:32
Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723 58:38
Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722 42:11
Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721 49:29
Accelerating AI Training and Inference with AWS Trainium2 with Ron Diamant - #720 1:07:05
π0: A Foundation Model for Robotics with Sergey Levine - #719 52:30
AI Trends 2025: AI Agents and Multi-Agent Systems with Victor Dibia - #718 1:44:59
Speculative Decoding and Efficient LLM Inference with Chris Lott - #717 1:16:30